Build a local and offline-capable chatbot with WebLLM

February 14, 2025

Now that you better understand client-side AI, you’re ready to add WebLLM to a to-do list web application. You can find the code in the web-llm branch of the GitHub repository. WebLLM is a web-based runtime for LLMs provided by Machine Learning Compilation. You can try out WebLLM as a standalone application. The application is inspired by cloud-backed chat applications, such as Gemini, but the LLM inference runs on your device instead of in the cloud. Your prompts and data never leave your device, and you can be sure they aren’t used to train models.

Source: Build a local and offline-capable chatbot with WebLLM  |  web.dev
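To make the idea concrete, here is a minimal sketch of what on-device inference with WebLLM looks like, assuming the @mlc-ai/web-llm npm package; the model ID below is illustrative and would need to match an entry in WebLLM’s prebuilt model list:

```ts
// Minimal WebLLM sketch: download a model once, cache it locally,
// then run chat completions entirely in the browser.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // First run fetches and caches the weights; later sessions can
  // load from the local cache and work offline.
  // "Llama-3.1-8B-Instruct-q4f32_1-MLC" is an assumed example ID.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // OpenAI-style chat completion, executed on-device — no prompt
  // or data ever leaves the browser.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize my to-do list." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```

The OpenAI-compatible shape of the chat API is a deliberate design choice by the WebLLM team: code written against a cloud endpoint can be pointed at the local engine with minimal changes.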

You might be taken aback, as I was a few months ago, by what can be done in the browser with large language models: not calling external APIs, but models actually running in your browser.

Want to get started? This guide from Christian Liebel is a great place to do so.

Then sign up for our upcoming online conference, Inference, where we’ll go deep into the topic.