WebML: State of Machine Learning for Frontend Developers

Introduction and Overview of AI Integration for Web Developers

An overview of the session, touching on the importance and relevance of AI in today's technological landscape, and how it can be leveraged by front-end and full-stack developers without needing extensive knowledge in mathematics or machine learning. The speaker, Shivay Lamba, introduces the topic of WebML and demonstrates the use of AI models in web applications using JavaScript.

Demonstration of AI Models in Web Applications

A demonstration of integrating AI models into web applications is shown, specifically using AI for real-time interaction and pose detection. This section showcases the simplicity and accessibility of using pre-trained models for developers, highlighting TensorFlow.js as a tool for integrating machine learning capabilities into JavaScript projects.

Privacy and Local Processing with AI Models

The speaker discusses the benefits of processing data locally through browser-based AI models, emphasizing privacy and cost-efficiency. This section highlights the use of TensorFlow.js for leveraging pretrained models and the potential for developers to implement AI without compromising user data privacy.

Powering Browser-based AI with Advanced Technologies

This chapter delves into the technical foundations that enable AI in the browser, such as WebGL, WebAssembly, and WebGPU. The speaker explains how these technologies allow for high-performance AI operations within web applications, bypassing the limitations of JavaScript for complex computations.

Real World Use Cases and Adoption by Companies

Successful implementations of AI in web applications by startups and large corporations are highlighted. This includes examples from healthcare, image optimization, and AR face recognition, showcasing how TensorFlow.js and other JavaScript-based machine learning libraries are being leveraged in the industry.

Web Workers for Performance Enhancement

An introduction to the concept and advantages of using web workers for running AI models is provided. The speaker demonstrates a sentiment analysis application that utilizes web workers to offload AI computation from the main thread, illustrating a practical use case of enhancing web app performance with AI.

Closing Thoughts and Resources for Learning

The session concludes with encouragement for web developers to explore and integrate AI capabilities into their projects, dispelling the notion that AI is exclusive to data scientists or Python developers. The speaker shares resources for further learning and emphasizes the promising potential of AI in web development.

I hope everyone is having a wonderful day.

It's been really wonderful for me as well.

It's my first time in Australia and I'm super excited to be presenting about a favorite topic of mine.

And that's super relevant today because of course we have seen such a huge rise in the popularity of AI models and in the ability for people to interact with AI models with the help of ChatGPT and other such tools.

So it makes sense that front-end developers can also now start to use AI models, and not have to be scared that they would need to learn things like mathematics along the way.

I'll be kicking off today's session around WebML and how all of you who are front end developers, or even full stack developers, can leverage some open tooling or even some APIs in order to build full stack AI applications in your web projects with JavaScript.

It's a quick introduction.

I'm Shivay Lamba, TensorFlow.js SIG and Working Group Lead from India, and so far it's been great.

I got a day off yesterday to just roam in Sydney, so it's been wonderful.

Now, of course, we know that Halloween is also coming up, so I had some ideas about how Halloween might be celebrated in Australia.

These are some fun things that I found online: some koala bears and then, of course, the kangaroos.

I thought it would be fun to build this kind of fun project, to also highlight how easily all of you can integrate AI models into your web applications without having to actually pay for these AI models.

And all of these AI models will run essentially in your browser itself, locally in your browser.

So of course, the last talk was about using AI to help you code, but this talk is about how you can leverage AI models in your web applications without any AI knowledge.

So drawing inspiration from Halloween in Australia, I created a fun project yesterday, and this is how it looks.

So I'll just quickly go ahead and start to run this.

And again, this is all running in the browser, so no secret sauce.

It's just a simple JavaScript application, and I'll just open this up quickly.

So this is me, and I have a Koala bear mask, as you can see, and a skeleton I can dab.

And as you can see, all of this is happening in real time.

I'll probably also be able to show you the legs, and they work pretty well.

So what you can see is that building something like this is not complex, so it does not require you to know a lot of the mathematics that goes behind machine learning.

So the model that's powering this particular example is essentially just body pose detection, where the machine learning model is able to detect the different joints in your body.

Essentially, we call these landmarks across your entire body.

And I'm just using some basic mathematics so that, as I'm rotating my hand, these artifacts that you see follow along; all the different skeletal artifacts are essentially PNG images.

And I'm just overlaying them on these body parts.

And that's just some basic mathematics that's involved.

But again, the idea is that I could build this within, let's say, 200 to 300 lines of code, and I'll quickly show you what the code actually looks like, so let me go over to this particular code base.

So over here, what you'll see is that I'm essentially going to be using the pose detection model.

And I'll be talking more about TensorFlow.JS, which is the library that I'm using.

And the specific model that I'm using is pose detection.

So this pose detection model is also actually used quite a lot, in a lot of healthcare startups that have started to use on device machine learning.

And they are not keeping your data on a remote server; all of your healthcare data is actually processed locally in your browser or on your device itself.

And you can do things like rep counts.

So there are a lot of AI startups that are leveraging these pre built models.

And again, you don't need to know about neural networks or any of the technical jargon related to machine learning.

It's just simple JavaScript that I'm using.

And of course, I'll initialize my camera and as soon as I get my camera, I will basically start to use my model itself.

And over here, what I do is that I detect my initial pose through which it is able to now detect all the different landmarks on my body.

And then what I do is I just superimpose these images.

So you'll see that I have a bunch of these images, and I'm just superimposing them at the different points that the model is able to detect on my hand in real time.

And then I am able to do that.
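To give a concrete idea of the kind of logic involved, here is a minimal sketch using the pre-trained @tensorflow-models/pose-detection package; the mask image, the confidence threshold, and the choice of the nose keypoint are illustrative assumptions, not the exact demo code.

import * as poseDetection from '@tensorflow-models/pose-detection';
import '@tensorflow/tfjs-backend-webgl';

const detector = await poseDetection.createDetector(
	poseDetection.SupportedModels.MoveNet
);

async function drawOverlay(video, ctx, maskImg) {
	// Each detected pose exposes keypoints with pixel coordinates, a score and a name.
	const poses = await detector.estimatePoses(video);
	const nose = poses[0]?.keypoints.find((k) => k.name === 'nose');
	if (nose && nose.score > 0.5) {
		// Centre the PNG mask roughly over the detected landmark.
		ctx.drawImage(maskImg, nose.x - maskImg.width / 2, nose.y - maskImg.height / 2);
	}
	requestAnimationFrame(() => drawOverlay(video, ctx, maskImg));
}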

So I had another one, a kangaroo, so I'll basically delete this one.

And I'll use the kangaroo one just for fun's sake.

So I think this should be good and I think we should be able to now see a kangaroo face.

So yeah, that is a quick demonstration of what is possible.

And again, the main logic is what you'll see over here; I'll be happy to explore more about how this code works with any of you who are interested.

But all of the main skeleton structure resides in this particular code base, where I'm mainly just superimposing the different parts of the skeleton on the actual human body, which I'm able to detect.

And of course the machine learning model is doing all of this, powering up all of these different, landmarks that I see on my body.

So that is a quick demonstration.

I'll quickly close this one, but, yeah, that is my inspiration for Halloween.

And I hope that all of you enjoy it.

And if you want to, you can try it out yourself. I'll probably not be able to visit a Halloween party, but I decided to just have a bit of fun with Halloween on my own over here at the conference.

Because in India, we don't really celebrate Halloween, at all.

So that was a fun project to work on yesterday.

Now, of course, I would have asked all of you on Slido: have you ever actually worked with AI as a JavaScript developer?

But I think we can probably just have a quick raise of hands.

So how many of you have worked, as a JavaScript developer, with AI tools?

So, like, with ChatGPT.

So I guess quite a few, because we had the last talk about that.

And I guess my second question to all of you is: how many of you have interacted with software developers who build machine learning models in Python and then write APIs, so that you call an endpoint in Python and are then able to use it from JavaScript?

So how many of you have actually done that.

So again, a few hands are being raised. And how many of you have actually leveraged or used AI models directly in JavaScript, or built models in JavaScript?

So a few folks, like very few, like just, a couple of hands raised.

So the main target for me in this particular talk would be to have a lot more hands raised for leveraging these AI models directly in JavaScript.

So that's one of the main objectives for me personally.

And I hope that I'm able to justify that.

Of course, I've already mentioned this earlier: we are in a very innovative space right now where you're seeing AI tooling literally everywhere.

For every particular use case that you can think of, there is some AI application being developed.

So whether it's ChatGPT or Midjourney, for that matter.

And of course, as a JavaScript developer, I personally don't want to learn another language stack altogether.

And this is something that kind of started off for me personally.

I started my machine learning journey when I was in my fourth year of college, and I started with learning Python and a framework like Flask, where I would train the model and then of course create an API endpoint in Flask.

And then use that with a front end application.

But of course, if I had the capability of not having to learn Python and just have everything in JavaScript, that is what I really want.

And of course, that also goes without saying that JavaScript is the most popular programming language.

No doubt about that.

And of course that also makes it very versatile, because you can actually run JavaScript on a host of different types of devices.

So you're looking at the browser, servers, mobile applications, or even IoT.

So you can actually run Node.js on a Raspberry Pi.

What that essentially allows us to do is bring in all those AI capabilities with this universal build-once, run-everywhere approach.

And you could technically just write one JavaScript application with a machine learning model inside of it, and then run it anywhere that you want, without having to worry about supporting each particular device architecture for running your machine learning models, because of the help of JavaScript.

And these are some of the top 50 generative AI products.

I bet like a lot of you might have used some of these.

Of course, these do include Google Bard and ChatGPT, which help you with things like programming, but a lot of these are also AI tools that are being used as productivity tools.

And whereas it would have taken you hours and hours at a time to build certain things, these AI models and these generative AI companies allow you to just make it super easy to build something, right.

So whether it's Notion, including Notion AI, which allows you to create summaries or write for you, or it's ChatGPT.

So of course, it's clear by now that we are in an AI era. Whether or not the AI hype is real is debatable, but let's face it: we are in that kind of phase where AI is being used by literally every other company.

So there's no shying away.

And in fact, for becoming more productive, AI is the right way to go.

So of course, you could start by using some external APIs, including the OpenAI JavaScript SDK. The ChatGPT APIs can actually be used directly.

You just have to create an account on OpenAI, create an API key and start to use that.

Although it is paid for use. You can also use Replicate.

It's another online tool or service that hosts a lot of different open source models, but you don't have to run these locally or actually train them on your own.

You just pay for the service.

Get an API key, choose what type of model you want to use, and then you can start to build applications.
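As a rough sketch of that workflow, here is what a call through the OpenAI SDK can look like; this uses the v3-style Configuration/OpenAIApi interface that also appears on the slides later (newer SDK versions expose a different surface), and the prompt is just an example.

import { Configuration, OpenAIApi } from 'openai';

const openai = new OpenAIApi(
	new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

// Ask a hosted model for a chat completion.
const completion = await openai.createChatCompletion({
	model: 'gpt-3.5-turbo',
	messages: [{ role: 'user', content: 'Summarise WebML in one sentence.' }],
});

console.log(completion.data.choices[0].message.content);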

I will come back to this particular aspect a bit later, but of course, the biggest drawback of using something like this is that A, you are sending all of your data to OpenAI or to Replicate, and you're using their servers in order to generate results.

There might be a lot of use cases where you are working in a company that is super privacy focused, and you don't want to share your data; you want everything to be processed locally.

So for that, we have a number of different open source tools available to us.

So if you want to actually do all this AI inference directly inside of your front end, in your browser itself, there are a lot of different open source libraries, including ONNX, TensorFlow.js, MediaPipe, and Transformers.js.

And again, some of you might have heard of these; some of these terminologies might be completely new for you.

But I would hope that at least by the end of this talk, you are able to get a basic idea as I introduce some of these libraries to all of you, primarily focusing on the TensorFlow.js library.

It started back in 2018; it's an open source machine learning library by Google, primarily designed for JavaScript.

Of course, it extends from the core TensorFlow library that's a very popular machine learning framework for building deep learning neural network applications.

This is essentially the JavaScript interface, and you can just see by this graph on NPM that over the past one to two years, it has skyrocketed with more than 250,000 NPM installs.

So that just goes to show how popular this is actually becoming.

And I'll be showing you some examples of real-world companies, both large MNCs and startups, who are actually leveraging these kinds of open source libraries in order to build AI capabilities into front-end web applications.

Apart from this, if we focus on what exactly the TensorFlow.js library offers, I've given you a basic overview, but the biggest benefit of running this inside of your browser is that you're not hosting your model on a dedicated server.

So you are saving up a lot of your server costs.

You get low latency as well, because you don't have to wait for an API call to go all the way to the server and then wait for the response to be generated and shown to you.

And of course, the biggest one is the privacy, right.

Because these models are executing right inside of your browser.

They're not being hosted somewhere in a remote server.

So any processing of your data stays locally on your machine itself.

And it is powered with the help of a lot of the browser's capabilities.

Of course, today our browsers are extremely capable, and we'll be seeing some of the ways in which browsers use things like WebGL and WebAssembly to be able to run some of these more powerful models.

But that does not mean that you cannot use this with Node.js.

If you are having a lot of like large models that probably might not be able to run directly in a browser, there is support for NodeJS as well.

So you can use dedicated GPUs and CPUs to train and do inference with larger machine learning models that can run on Node.js.

And you also benefit from the just-in-time compilation that is offered by Node.js.

And in fact, for a lot of popular models, Node.js actually performs better in comparison even with Python, which is considered to be the de facto language for running machine learning models.

So there are a lot of great benefits that you get when you combine AI capabilities or AI models with JavaScript.

Primarily, TensorFlow.js allows you to use the library in three different ways.

So the first one, which is of most importance to a lot of front-end developers, is using existing models.

So these are pre trained models.

And what I mean by pre trained is that these models have already been trained on a dataset.

They're just ready off the shelf, and you can just integrate them inside of your applications without having to worry about training them.

So the entire data-scientist workflow of training a machine learning model does not apply to these pre-trained models.

You can use them right off the shelf: include them in your application and start using them.

Apart from that, there is also something known as transfer learning.

So in the entire ML ecosystem, transfer learning basically means that you have an existing model, but you now retrain that model over your own data.

And the reason could be that your data is very custom.

And if you want the model to work better with your own custom data, then you can also use something like retraining of your model.

But of course, that would be something that would be more advanced.
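One common lightweight pattern for this in the browser, shown here only as an illustrative sketch, combines a pre-trained MobileNet with a KNN classifier from @tensorflow-models/knn-classifier; full retraining looks different, but this captures the transfer-learning idea of reusing learned features on your own examples (exampleImg and newImg stand for image elements you supply).

import * as mobilenet from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

const net = await mobilenet.load();
const classifier = knnClassifier.create();

// Use MobileNet's intermediate activations as features and attach your own label.
const activation = net.infer(exampleImg, true);
classifier.addExample(activation, 'my-custom-class');

// Later, classify a new image against the examples you added.
const result = await classifier.predictClass(net.infer(newImg, true));
console.log(result.label, result.confidences);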

Or, of course, you can write your own models directly in JavaScript itself, just as you would write them in Python.

There are a lot of APIs provided by the TensorFlowJS library that allows you to write these models directly in JavaScript.

So if you are more inclined and you're more like, focused on machine learning as well, you can do that with TensorFlowJS.

But looking at some of the pre trained models, and I'll be giving an example and a code walkthrough of one of these.

Of course, there is a range of different models.

You can just go to tensorflow.org/js/models to look at all of these pre trained models.

All of these are open source because it's an open source library.

So of course, there are a lot of common types of AI applications that you will come across, including vision, for things like image detection, or even the human body. Pose estimation, the one that's highlighted and bolded there, is actually one of the machine learning models that I used: the pose detection model, which is a pre-trained model.

So again, I didn't have to write any custom AI logic myself, right.

Because I was just using this human body pose detection, model, which allows you to detect different landmarks in your entire body and you can make different poses and it will be able to detect that.

Or, of course, you can, do some basic text classification.

I have a demo planned ahead that will demonstrate this, and some other ones.

Of course, one of these examples is Face Mesh, which is able to recognize 468 different facial landmarks.

And this is actually used by L'Oréal for AR face try-on: you can do a custom lipstick try-out using this face mesh model, which is able to uniquely detect your lips, and you can just change the color of the lipstick and it will show it on your face.
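For reference, a hedged sketch of loading the face landmarks model with @tensorflow-models/face-landmarks-detection might look like the following; the exact options vary between versions of the package, and videoElement stands for a video or image element you supply.

import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';

const detector = await faceLandmarksDetection.createDetector(
	faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh,
	{ runtime: 'tfjs' }
);

// Each detected face exposes hundreds of keypoints (x, y, z), including the lip region,
// which is what makes virtual lipstick try-on possible.
const faces = await detector.estimateFaces(videoElement);
console.log(faces[0]?.keypoints.length);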

Another example is Coco-SSD.

If you're from an AI background, or even if you're not, Coco-SSD is one of the most basic vision-based models; it is essentially trained over 90 different object classes.

So you are able to detect unique things, for example, humans or like mobile phones or whatever.

So we'll take a quick look at an example of a Coco-SSD model, how you can integrate it inside of your application in JavaScript, and what the output actually looks like.

We'll start by installing certain libraries.

Now, the example that I'll be showing right now is a very basic HTML, CSS, and JavaScript example, but of course you can also use this with NPM modules.

So what you see is that we have three different CDN scripts.

The first one is for the TensorFlowJS itself, and then the next one that we are importing is the actual Coco-SSD pre trained model, right.

So these are the CDN scripts. In a React application or in a TypeScript application, you could just do npm install @tensorflow/tfjs and then @tensorflow-models/coco-ssd.

Once we do that, the main point over here is that you have to wait for the actual model to load inside of your web browser.

So there is an initial load time where the model loads inside of your browser, and it then stays in the browser's memory itself.

That is why we'll write some logic over here where we use the load function that comes from our Coco-SSD library, and we just wait for the model to load, because otherwise, if you try to run it, it will not actually run.

So we just wait for the model to load.

And once it's done, then we can proceed with the rest of our application.

The next thing that you'll see is that we are simply uploading an image.

It's simple JavaScript logic, and the main thing that I'd like to highlight over here is probably the last two to three lines, where we use model.detect.

model.detect is a function that comes built in with the Coco-SSD model.

So the model.detect is basically taking an image.

And trying to detect what's there inside of that image, right.

So it'll run some predictions.

It'll try to identify the specific coordinates within that particular image if it's able to detect one of the existing classes.

For example, if it is able to detect a human, it'll make a prediction and kind of show you those particular coordinates of where that human is located inside that image.
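For reference, the predictions returned by model.detect look roughly like this; the values shown are illustrative.

const predictions = await model.detect(image);
// predictions is an array like:
// [
//   {
//     bbox: [x, y, width, height], // pixel coordinates of the bounding box
//     class: 'person',             // one of the COCO classes
//     score: 0.92                  // confidence between 0 and 1
//   }
// ]
predictions.forEach((p) => console.log(p.class, p.score, p.bbox));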

So again, I didn't have to learn anything about how this Coco-SSD model is running.

I simply just used this pre-trained model and the model.detect function to get the actual result.

And then what I just do is I draw a rectangle or a bounding box.

This bounding box is generated primarily from the different coordinates that you get. From the four coordinates that come back, we are just rendering them as a box to showcase the detection.

So essentially, from the predictions that we get from our model, we are just drawing a rectangle on the actual image.

Similar to what we see in this particular example, where there's a dog and we have created a rectangle with the specific key points, or coordinates, of the object detected by the model.

So we're just writing the code logic for that.

And again, what you see is that most of this is standard canvas and JavaScript.

So the actual AI logic is just two to three lines, at least in this example.

And then we just describe the label.

And what you essentially end up getting is this: I was able to get a picture with John, and I'm not sure if he's here.

But it is like really cool to meet him.

And you can see that I was able to get this person detection, where it is able to detect that it's a person.

Of course, you could also do this in real time with a running webcam.

In this case, I didn't integrate a webcam.

I simply just had a simple file uploader, but of course you can do whatever you want in this example.

But this is a simple example of like, how you can use a pre-trained model.

But now we'll proceed further and understand some of the other ways in which you can interact with WebML.

And of course, one of the biggest points is the building blocks of how we are actually able to run these models in the browser. Yes, JavaScript is amazing, but JavaScript is not perfect.

In comparison to programming languages like C++, Java, and Rust, JavaScript does fall short in terms of raw performance, especially when it comes to complex mathematics. And of course, machine learning is all about complex mathematics and a lot of calculations.

JavaScript cannot handle it on its own, right.

So typically, we have to use technologies like WebGL or WebAssembly.

These are different types of technologies; WebAssembly, for example, allows you to take code logic written in Rust or C++, compile it down into the WebAssembly format, and then run that inside of your browser.

And that WebAssembly module itself is essentially doing a lot of the heavy lifting for you.

And WebAssembly is also used for things like in-browser video editing, which until a certain point in time was not even possible to think about.

So you can leverage tools like WebGL, which is primarily used for 3D rendering, and which essentially allows your browser to interact directly with your system resources, like the CPU or the GPU of your machine, and leverage that hardware to accelerate the AI inferencing.

So when you're writing JavaScript applications and you're leveraging machine learning models, you can actually choose which type of backend you want. Over here, backend does not mean a server, but essentially how your browser interacts directly with the hardware, whether using WebGL, WebGPU, or WebAssembly.
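In TensorFlow.js, picking a backend is only a few lines; here is a minimal sketch (the WebAssembly and WebGPU backends ship as separate packages, and which ones are available depends on the browser).

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-wasm'; // only needed if you want the WebAssembly backend

// Request a specific backend, then wait until it is initialised.
await tf.setBackend('webgl'); // or 'wasm', 'cpu', or 'webgpu' where supported
await tf.ready();
console.log('Using backend:', tf.getBackend());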

And WebGPU is also one of the most recent announcements, made at Google I/O this year; it was in an experimental stage in the Chrome browser before, but now it's generally supported.

So there are a lot of AI capabilities where people are actually using Rust programs: they're basically compiling the Rust programs down to WebAssembly and then using WebGPU to be able to do things like real-time speech recognition as well.

An example of that, which I'd like to quickly point out: you probably didn't notice, but all this time I've been doing one hundred percent local audio transcription. Right now, as I'm speaking, it is able to transcribe the audio in real time.

And all of this again is happening locally in your system.

This is of course a dedicated link, but, this is using something known as transformers.js.

The OpenAI Whisper API is very famous, but all of this transcription that you see happening right now is happening live, locally on your system itself.

So even if you don't have access to the internet or to an external service, all of this happens locally on your system.
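A minimal sketch of that idea with Transformers.js looks something like this; the Xenova/whisper-tiny.en checkpoint is just an example model id from its model hub, and audioSource stands for a URL or Float32Array of audio samples you supply.

import { pipeline } from '@xenova/transformers';

// Downloads the Whisper weights once, then runs speech recognition fully client side.
const transcriber = await pipeline(
	'automatic-speech-recognition',
	'Xenova/whisper-tiny.en'
);

const output = await transcriber(audioSource);
console.log(output.text);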

So this kind of shows how far we have actually come in being able to leverage some of these existing models and create really powerful web applications.

So those are the building blocks of what really powers all of these different models in your web applications.

And of course, speaking about like some real world use cases.

So IncludeHealth is one such very popular startup that uses TensorFlow.js.

You can see it is essentially using the pose detection model and a bunch of different pose models, which allow you to do things like rep counts; or if you're doing a custom activity or exercise, you can track all of that, again locally on your system itself.

Then of course, Adobe very recently launched Adobe Photoshop for web, and that also uses TensorFlow.js for things like object detection.

LinkedIn also uses TensorFlow.js, but primarily on Node, in order to optimize the images that get rendered on the LinkedIn platform.

So of course, like these are some examples where you can see that even large companies are leveraging these libraries in order to build, very performant machine learning capable applications.

I've covered this particular slide before, but just to summarize why it's great to be able to run these AI models directly in your browser: primarily the low cost, and of course being able to do everything inside of your browser itself.

And of course, the scale is great, because you don't have to worry about paying a cost for the server.

Everyone who uses a Mac or any other laptop today has a capable enough device that can run most of these models.

So the scalability is not an issue at all, right.

Most modern browsers and most modern CPU architectures basically support a lot of these models, as I'll be showing in some of the examples.

And of course, if you're interested in learning more about TensorFlow.js itself, there's a free course on YouTube: the Zero to Hero TensorFlow.js course.

I would definitely recommend all of you check it out, if you are motivated enough to try out machine learning in the browser after this particular talk.

And of course, I gave an example of the TensorFlowJS library, but there are a lot of different other libraries as well.

So things like LangChain, if you're interested in being able to run local large language models, LLMs, in the browser itself.

These are some of the tools that are capable of doing that.

And of course, you get some native browser APIs that have AI capability.

So you're not just like limited to the TensorFlowJS library, which I have been talking about so far.

So of course, some of the examples include diffusion models, a very popular kind of generative AI model, and we have the capability today, with the help of ONNX and WebGPU, to actually use these diffusion models directly in the browser.

Apart from this, of course, if you're interested in large language models, there is a lot of support where you can use LangChain.js, which is the JavaScript client for LangChain, and then build AI-capable applications.

So typically, if you're using something like OpenAI, you can leverage OpenAI and LangChain to create question-and-answer chatbots directly in your web application.
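As a very rough sketch of what that looks like from JavaScript (LangChain.js import paths have moved around between versions, so treat the exact paths and class names here as assumptions, and the prompt is just an example):

import { ChatOpenAI } from 'langchain/chat_models/openai';
import { HumanMessage } from 'langchain/schema';

// A chat model wrapper around the OpenAI API, driven entirely from JavaScript.
const chat = new ChatOpenAI({ openAIApiKey: process.env.OPENAI_API_KEY });
const res = await chat.call([new HumanMessage('Answer this question from the given context...')]);
console.log(res.content);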

And of course, if you want to try to run large language models directly in your browser, that is still in an experimental phase, but people have found some success actually running these large language models directly inside of the browser itself.

I definitely recommend you check this out if you're interested.

But again, just showing you like how far we have gone ahead and, how capable AI has become just running inside of your browser itself.

Of course, you can also use something like OpenAI and Replicate.

I highlighted these, earlier in my session.

And perhaps some of you might have heard of Nutlope on Twitter. He became extremely popular; he made some really wild, very famous applications.

And again, a lot of them like did not have a lot of AI code itself, right.

They use Next.js, Vercel Edge Functions, and then the OpenAI API in order to build things like roomGPT.

And of course, here's a code example where what you'll see is that initially we just use the OpenAI API, and then use one of the models; in this case, the example that you see is for chat completion.

So we just use a chat completion model, where we specify that we are using GPT-3.5, and then we just wait for the response to load.

So again, no knowledge of AI is required.

You just have to pick the AI model itself and the API endpoint that you wish to use.

And of course, as web developers and as JavaScript developers, there's something that I've personally experienced, having spoken to a lot of folks, which is that we don't truly appreciate the power of web worker threads.

How many of you use web workers on a day-to-day basis, or at least in your work?

I guess fairly few folks, but web workers are a really great way to build performant applications.

So the entire idea really is that you're not loading up the main thread that is powering most of your JavaScript; essentially, you're offloading a lot of the work to these workers, which are basically running in parallel to your main thread.

So essentially, even if you're not using something like WebGPU, WebGL, or WebAssembly, you could use web worker threads, and the workers will basically take on the entire job of loading your model and doing the inference directly from these parallel threads that are running.

And your main application keeps running independently, so you don't have to worry.

Like all the heavy lifting is being done by these worker threads.
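In its simplest form, the pattern looks something like this (a minimal sketch; the classification inside the worker is just a stand-in for a real model call, like the Transformers.js pipeline shown later).

// main.js — hand the heavy work to a worker so the UI thread stays responsive
const worker = new Worker(new URL('./worker.js', import.meta.url), { type: 'module' });
worker.postMessage({ text: 'I am excited to be at Web Directions' });
worker.onmessage = (event) => {
	console.log('result from worker:', event.data);
};

// worker.js — runs in parallel to the main thread
self.onmessage = (event) => {
	// Stand-in for loading a model and running inference.
	const label = event.data.text.includes('excited') ? 'POSITIVE' : 'NEGATIVE';
	self.postMessage({ label });
};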

And as an example of that, I'll quickly show you a sentiment analysis example, and I'll just run this over here.

So I think it's ready.

So essentially it's just a simple application, and I'll just go ahead and run this now.

So once again, of course, live demos always give issues.

So let's see if I'm able to get this up and running.

I hope that it runs, but the idea really is, okay, let's wait for it to run.

And I'll go ahead and run this again.

Hopefully this works this time.

Okay, perfect.

So over here, I'm using the transformers.js library for client side, real time client side, sentiment analysis.

So let's say, I'll say I am, Okay, let's just wait for it to load and, within a few seconds, I should be able to describe my feelings over here and add like a description.

So let me just go ahead; there are some weird bugs these days with Chrome where it does not allow you to select it, but yeah, now let's say: I'm scared.

So it is able to give you like that negative response.

But if I say: I am excited to be in Sydney and present at Web Directions.

So you can see that it is able to give a real time label of positive.

And the way that we are basically working with this is as follows; I'll quickly just show you the code.

Essentially, the entire heavy lifting of the AI inferencing is being done on my worker thread.

So here I'm using the text classification model that's powered by the transformers.js.

And I'm running a classifier where whatever text is entered in that text box is basically streamed directly to my web worker.

And it is able to take that message from the message stream and then classify that text as positive or negative.

And over here, it's a simple Next.js application where I'm basically just using a useEffect.

That essentially sets up my worker thread and sends the text that I'm getting from my input box.

And I'm just using some basic state management to render the actual results, right.

So again, it's simple JavaScript logic that I'm using, but of course I just wanted to quickly highlight the capability that is offered by web workers.

Because that's an incredible tool.

All of you can leverage that as well for AI capabilities.

And of course, as I'm closing in on the presentation, if you're interested in learning more, feel free to have a look at some of these resources.

These are primarily focused on getting started with TensorFlow.js, but they also cover the wider WebML ecosystem, if you're interested.

But of course, what we are really witnessing here is WebML: machine learning meant for web technology, and of course for JavaScript developers and TypeScript developers alike.

What I'd like to close this presentation with is that I would like to end the notion that machine learning is just for Python developers. Because of the capabilities being offered by JavaScript, I feel that within the next few months, or even within the next few weeks, we will definitely see a bigger rise in AI-enabled JavaScript applications and more JavaScript developers adopting AI.

And that is the direction that we want to move in.

So I hope that everyone sitting here today is able to get inspiration from this particular talk and build AI-enabled applications with JavaScript.

And of course, before I conclude, I just wanted to quickly show one final example, which is essentially being able to run local large language models directly in the browser.

I definitely recommend all of you check this out. The idea is that you can give it any PDF document.

In this case, let's say I take this document, from my local machine.

Again, I'll have to probably, resize my Chrome because it does not seem to be working that well.

But let's say I'll choose a PDF file over here.

I'll just take one example and what you'll be able to do is that even if you run this application locally, without any internet access, you will be able to actually get summarization from your PDF.

So you can actually ask it any question that you want.

Typically you would use something like ChatGPT, where you give it a context and then ask it certain questions.

So over here, everything is actually happening locally in your browser and you can ask it any questions, regarding this particular document.

So that is the final example or demo that I wanted to quickly show.

Overall, if you're interested in looking at the slides, I have shared the QR code.

So feel free to, quickly scan this if you're interested.

And of course, you can connect with me on Twitter at @howdevelop.

I'll be more than happy to have discussions, with all of you later today.

And of course tomorrow as well, if you have any questions. But for now, I'm open to questions.

Now, thank you.

WEBML: STATE OF MACHINE LEARNING FOR FRONTEND DEVELOPERS

Shivay Lamba
Software Developer

WebML: State of Machine Learning for Frontend Developers

  • Shivay Lamba, TensorFlowJS SIG & WG Lead, TFUG Organizer
  • @howdevelop
  • #WebML

Sydney has been a lot of fun

A person standing in front of the Sydney Opera House with crossed arms, smiling, on a sunny day.

Halloween in Australia

Two illustrations side by side: on the left, two koalas next to a carved pumpkin, and on the right, a cartoon of a kangaroo startled by a bat, with a full moon and bats in the background.
Demo of an app that takes the camera feed and superimposes a koala's face and skeleton on a person's image
A screenshot showing a code editor with a dark theme, displaying JavaScript code related to initializing camera and pose detection for a web application, including functions initCamera, initPoseDetection
screenshot of the code for the app
Screenshot of code in a dark-themed integrated development environment.
A person on a stage with a digital overlay showing a kangaroo's head and a skeletal torso.

Have you worked with AI/Machine learning as a JavaScript Developer

Buzz Lightyear says "deep learning, deep learning everywhere" to Woody from Toy Story.

As a JavaScript developer you probably don’t want to learn additional language stacks like Python!

A photo of a displeased-looking cat with the caption "NO." indicating a negative response to learning additional language stacks like Python.

A slide presenting five different application platforms with corresponding logos and technologies. From left to right: 'Browser' with Chrome, Safari, and Firefox logos below; 'Server' with a Node.js logo; 'Mobile' with React Native and PWA logos; 'Desktop' with an Electron logo; 'IoT' with a Raspberry Pi and Node.js annotation.

Top 50 GenAI Web Products, By Monthly Visits

table of 50 generative AI products, from ChatGPT to Deepswap

Use external APIs like OpenAI JS SDK, Replicate to integrate in your JavaScript application!

Screenshot of a webpage promoting 'roomGPT' with a header saying 'Generating dream rooms using AI for everyone.'
Meme image featuring Morpheus from The Matrix and the caption "WHAT IF I TOLD YOU YOU CAN MOVE YOUR ML PIPELINE TO THE FRONT END"

TensorFlow.js is a library for machine learning in JavaScript

ML in the browser / client side means:

  • Lower latency
  • High privacy
  • Lower serving cost

Also supports Node.js for server side and IoT devices

Run, retrain, write

Reuse existing models, or create your own

  • Run existing models

    Pre-packaged JavaScript or
    Converted from Python or TF Lite

  • Retrain existing models

    With transfer learning

  • Write models in JS

    Train from scratch

Pre-made models

Continually optimising or expanding our collection

  • Vision
    • Image classification
    • Object detection
  • Human Body
    • Body Segmentation
    • Pose Estimation
    • Face Landmark Detection
    • Hand Pose Estimation
  • Text
    • Text Toxicity
    • Sentence Encoding
    • BERT Q&A
    • Conversation Intent Detection
  • Sound
    • Speech Command Recognition
  • Other
    • KNN Classifier

Face Mesh

  • Just 3MB in size
  • Recognize 468 facial landmarks
L'ORÉAL MODIFACE
Three images depicting face mesh technology: left and center images show two different people with their faces overlaid with a green grid illustrating the mesh, and the right image shows a woman's face with virtual makeup applied, alongside options for different shades of lipstick labelled 'Color Riche Satin 268 Garnet Rose' and a 'Demo' button.

Object Recognition

Using COCO-SSD

Trained on 90 object classes

A slide showcasing object recognition software identifying a dog in two separate images with transparent green bounding boxes and percentage confidence levels displayed above each box.

Code

<script src="//cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/@tensorflow/tfjs-automl/dist/tf-automl.min.js"></script>
<img id="daisy" crossorigin="anonymous"
	src="//storage.googleapis.com/tfjs-testing/tfjs-automl/img_classification/daisy.jpg">
<script>
	async function run() {
		// Load the AutoML image classification model and classify the image.
		const model = await tf.automl.loadImageClassification('model.json');
		const image = document.getElementById('daisy');
		const predictions = await model.classify(image);
		console.log(predictions);
	}
	run();
</script>

Installing necessary libraries


<script src="https://code.iconify.design/1/1.0.7/iconify.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"></script>

Waiting for the model to load

var model;
cocoSsd.load().then(
	function (res) {
		model = res;
		alert('Model is ready');
	},
	function () {
		console.log('model did not load');
	}
);

Uploading an image and model prediction


function upload_image() {
	const canvas = document.getElementById('canvas');
	const ctx = canvas.getContext('2d');
	var input_elem = document.querySelector('input[type=file]');
	var file = input_elem.files[0];
	const image = document.getElementById('img');
	var reader = new FileReader();
	reader.addEventListener(
		'load',
		function () {
			image.src = reader.result;
			// Scale tall images down so they fit on the canvas.
			setTimeout(function () {
				if (image.height > 500) {
					image.width = image.height * (500 / image.height);
					image.height = 500;
				}
			}, 1000);
			// Run the COCO-SSD model on the uploaded image and draw the results.
			model.detect(image).then(function (predictions) {
				draw_res(canvas, ctx, image, predictions);
			});
		},
		false
	);
	// Read the selected file so the 'load' handler above fires.
	reader.readAsDataURL(file);
}

Drawing Results


function draw_res(canvas, ctx, image, predictions) {
	canvas.height = image.height;
	const font = '16px sans-serif';
	canvas.width = image.width;
	ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
	ctx.drawImage(image, 0, 0, ctx.canvas.width, ctx.canvas.height);
	ctx.textBaseline = 'top';
	ctx.strokeStyle = '#00FFFF';
	ctx.lineWidth = 3;
	ctx.fillStyle = '#00FFFF';
	draw_box(ctx, predictions, font);
	draw_label(ctx, predictions);
}

Drawing the label

function draw_box(ctx, predictions, font) {
	console.log(predictions);
	predictions.forEach((prediction) => {
		// predictions = [{bbox: [10,20,300,50]}]
		const x = prediction.bbox[0];
		const y = prediction.bbox[1];
		const width = prediction.bbox[2];
		const height = prediction.bbox[3];
		ctx.strokeRect(x, y, width, height);
		const textWidth = ctx.measureText(prediction.class).width;
		const textHeight = parseInt(font, 10); // base 10
		ctx.fillRect(x, y, textWidth + 4, textHeight + 4);
		ctx.fillText(prediction.class, x, y + textHeight);
	});
}

function draw_label(ctx, predictions) {
	predictions.forEach((prediction) => {
		const x = prediction.bbox[0];
		const y = prediction.bbox[1];
		ctx.fillStyle = '#000000';
		ctx.fillText(prediction.class, x, y);
	});
}


TensorFlow.js COCO-SSD Object Detection for WebDirections

the Web Directions logo and below a screenshot of two men with bounding boxes labeled 'person' around them, indicating the output of a TensorFlow.js COCO-SSD Object Detection model.

Building blocks of Web ML

Flowchart showing the building blocks of Web ML with Chrome at the top, leading to JS, WebGL, WebAssembly, WebGPU, and WebNN. Arrows indicate interactions between JS and TensorFlow Lite, TensorFlow.js and MediaPipe, as well as connections to TensorFlow Hub and TensorFlow (Python).

Ermine.ai - 100% local audio recording & transcription

A text box filled with unformatted text that largely fills a browser page.
IncludeHealth
Image of a person standing on one leg performing a physical exercise in a home environment with a fireplace and furniture in the background. Two vertical lines and two colored bars at the bottom of the image suggest motion tracking or analysis technology being demonstrated.

Adobe Photoshop Web + WebML

Linkedin WebML

Screenshot showing a user interface of Adobe Photoshop with two red apples with water drops on them, indicating image editing capabilities, and an open blog post titled "How LinkedIn Personalized Performance for Millions of Members using TensorFlow.js"

5 client side super powers

Harder / impossible to achieve server side

  • Privacy
  • Lower Latency
  • Lower Cost
  • Interactivity
  • Reach and Scale

Zero to Hero TensorFlow.js course

If you are new to Machine Learning in JavaScript then this is the course for you - learn how to create next gen web apps.

Man in a purple shirt speaking into a microphone with text highlights of the course content including creating next generation web apps, using off the shelf models, and making custom models with your own data.

ONNX

  • LangChainJS
  • Transformers.js

Browser APIs for ML

Use Other Alternatives
An emoji with a smiling face and both hands showing the 'hug' gesture located at the top right corner of the slide.

dakenf/diffusers.js

diffusers implementation for node.js and browser

Onnx & WebGPU

Screenshot of a GitHub repository page for the project 'dakenf/diffusers.js', showing repository details.

If you are interested in working with LLMs/Langchain

AI.JSX – The AI Application Framework for Javascript

Screenshot of a web page about AI.JSX, emphasizing it as an AI application framework for Javascript.

Here is a simple example using AI.JSX to generate an AI response to a prompt:

import * as AI from 'ai-jsx';
import { ChatCompletion, UserMessage } from 'ai-jsx/core/completion';

const app = (
	<ChatCompletion>
		<UserMessage>Write a Shakespearean sonnet about AI models.</UserMessage>
	</ChatCompletion>
);
const renderContext = AI.createRenderContext();
const response = await renderContext.render(app);
console.log(response);
A code editor window displaying JavaScript code, including imports and components pertinent to AI.JSX with an example user message to create a Shakespearean sonnet.

Web LLM

Screenshot of the Web LLM page, with a chat demo web interface showing initialization messages

NextJS + OpenAI/Replicate APIs

Two screenshots represented: on the left, a webpage titled 'Generating dream rooms using AI for everyone' with before and after pictures of a room remodel; on the right, a thumbnail for a video titled 'GPT-3 AI APP' featuring a person with a graphic background and text overlay, with video details including views and time since posted.
import { Configuration, OpenAIApi } from 'openai-edge';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Create an OpenAI API client (that's edge friendly!)
const config = new Configuration({
	apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(config);

// Set the runtime to edge for best performance
export const runtime = 'edge';

export async function POST(req: Request) {
	const { vibe, bio } = await req.json();

	// Ask OpenAI for a streaming completion given the prompt
	const response = await openai.createChatCompletion({
		model: 'gpt-3.5-turbo',
		stream: true,
		messages: [
			{
				role: 'user',
				content: `Generate 2 ${vibe} twitter biographies with no hashtags and clearly labeled "1." and "2.". ${
					vibe === 'Funny'
						? "Make sure there is a joke in there and it's a little ridiculous."
						: ''
				} Make sure each generated biography is less than 160 characters, has short sentences that are found in Twitter bios, and base them on this context: ${bio}${
					bio.slice(-1) === '.' ? '' : '.'
				}`,
			},
		],
	});

	// Convert the response into a friendly text-stream
	const stream = OpenAIStream(response);
	return new StreamingTextResponse(stream);
}

NextJS + Web Worker Threads

Web Worker Threads offer a powerful way to handle time-consuming tasks in the background without blocking the main user interface thread for AI processing. This code runs concurrently alongside the main thread. This worker is responsible for loading and running the AI model, performing text classification, and sending progress updates and results back to the main thread.


import { pipeline, env } from "@xenova/transformers";
// Skip local model check
// @ts-ignore
env.allowLocalModels = false;
// Singleton pattern for lazy construction of the pipeline
// ... (code continues)

Example

Inside the Home component, we initialize state variables using the useState hook. These variables will be used to track the processing progress, readiness status, and classification result.

The useEffect hook is used to set up the Web Worker and its event listener. This is where communication between the main thread and the worker is established.

The onTextChange function is called whenever the user enters text in the Textarea. It sends the entered text to the Web Worker using the postMessage API.

Conditional rendering is used to display the progress bar while processing is ongoing and to show the classification result when ready.
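Putting those pieces together, a minimal version of the component might look like this sketch; the names, markup, and message shapes are illustrative, not the exact demo source.

'use client';
import { useState, useEffect, useRef, useCallback } from 'react';

export default function Home() {
	const worker = useRef(null);
	const [ready, setReady] = useState(false);
	const [result, setResult] = useState(null);

	useEffect(() => {
		// Create the worker once; bundlers resolve the URL relative to this file.
		worker.current = new Worker(new URL('./worker.js', import.meta.url), { type: 'module' });
		const onMessage = (e) => {
			if (e.data.status === 'complete') {
				setReady(true);
				setResult(e.data.result[0]);
			}
		};
		worker.current.addEventListener('message', onMessage);
		return () => worker.current.terminate();
	}, []);

	// Send every keystroke to the worker for classification.
	const onTextChange = useCallback((e) => {
		worker.current.postMessage({ text: e.target.value });
	}, []);

	return (
		<main>
			<textarea placeholder="Description" onChange={onTextChange} />
			{ready && result && <p>{result.label} ({result.score.toFixed(2)})</p>}
		</main>
	);
}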

Transformers.js Client Side Processing

A web browser displaying a user interface with a navigation bar at the top, a header reading "Transformers.js Client Side Processing," and a main content area with an input box labeled "Description" above a horizontal rule.

Browser-Side Processing

screenshot of code for the browser-side processing

Browser Side Processing

screenshot of code for the browser-side processing

Learn more

Get started fast!

Website / API: tensorflow.org/js

Models: tensorflow.org/js/models

Github Code: github.com/tensorflow/tfjs

Langchain + React/NextJS:

https://www.tome01.com/integrate-langchain-in-reactjs-streaming-chat-apps
https://vercel.com/guides/nextjs-langchain-vercel-ai

Image of the cover of a book titled "Deep Learning with JavaScript". The cover features a pastel illustration of a man holding a surveyor's level and the subtitle "Neural networks in TensorFlow.js".

Let's witness the #WebML era grow exponentially!

Thank you!