AI in the Browser: Running Real Models with Transformers.js

You don't need a backend to run AI anymore. Transformers.js lets you run Hugging Face models directly in the browser or Node.js — here's how to get started.

For most of AI's recent history, running a language model meant spinning up a server, paying for inference, and managing API keys. That's still often the right call — but it's no longer the only call. Transformers.js brings Hugging Face's model ecosystem directly to the browser, with no server required. The models run entirely on the client, which has interesting implications for privacy, latency, and cost.

What Transformers.js Actually Is

Transformers.js is a JavaScript port of Hugging Face's Python transformers library. It uses ONNX Runtime Web under the hood, which runs optimized neural network graphs in the browser via WebAssembly (with WebGPU acceleration available in supporting browsers). The API mirrors the Python library closely, so if you've used pipeline() in Python, the JS version will feel familiar.

npm install @huggingface/transformers
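
The backend choice mentioned above (WebAssembly by default, WebGPU where the browser supports it) can be made at runtime with a feature check. The sketch below is one way to do that. The `pickDevice` name is hypothetical, and passing the result to `pipeline()` as a `device` option assumes a recent (v3 or later) release of the library, so check the version you install.

```javascript
// Hypothetical helper: choose an execution backend by feature detection.
// `nav` is a navigator-like object, passed in so the function can also be
// exercised outside a browser.
function pickDevice(nav) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  // A stricter check would also await nav.gpu.requestAdapter(), which can
  // resolve to null on machines without a usable GPU.
  const hasWebGPU = Boolean(nav && nav.gpu);
  return hasWebGPU ? "webgpu" : "wasm";
}

// Assumed usage in a browser (the `device` option is an assumption about
// the installed library version):
//   const device = pickDevice(globalThis.navigator);
//   const classifier = await pipeline(task, model, { device });
console.log(pickDevice({ gpu: {} })); // "webgpu"
console.log(pickDevice({}));          // "wasm"
```

Taking the navigator as a parameter keeps the check testable in Node.js, where `navigator.gpu` is typically absent.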

Running Sentiment Analysis in the Browser

The pipeline function is the highest-level API. You give it a task and, optionally, a model, and it takes care of downloading the weights, preprocessing the input, running inference, and decoding the output:

import { pipeline } from "@huggingface/transformers";

// Models are downloaded from Hugging Face Hub on first run
// and cached in the browser (via the Cache API) for subsequent visits
const classifier = await pipeline(
  "sentiment-analysis",
  "Xenova/distilbert-base-uncased-finetuned-sst-2-english",
);

const result = await classifier("The new API is surprisingly elegant.");
// [{ label: 'POSITIVE', score: 0.9987 }]

The first time this runs, the model weights download from the Hub (this particular model is around 67MB, quantized). After that, they're cached and subsequent page loads are fast.
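
The classifier resolves to an array of `{ label, score }` objects, as in the comment above. A small helper along these lines can turn that into a single decision and treat low-confidence predictions as undecided. The name `topLabel`, the `"UNCERTAIN"` value, and the 0.8 threshold are illustrative choices, not part of the library.

```javascript
// Hypothetical helper: reduce pipeline output to a single label, or
// "UNCERTAIN" when the best score falls below a chosen threshold.
function topLabel(results, threshold = 0.8) {
  if (!Array.isArray(results) || results.length === 0) return "UNCERTAIN";
  // Pick the entry with the highest score without assuming the array is sorted.
  const best = results.reduce((a, b) => (b.score > a.score ? b : a));
  return best.score >= threshold ? best.label : "UNCERTAIN";
}

console.log(topLabel([{ label: "POSITIVE", score: 0.9987 }])); // "POSITIVE"
console.log(topLabel([{ label: "NEGATIVE", score: 0.55 }]));   // "UNCERTAIN"
```

The right threshold depends on the application; it is worth tuning against real inputs rather than copying the value used here.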

Keeping the UI Responsive with Web Workers

Model inference can block the main thread, which will freeze your UI. The correct pattern is to run the pipeline inside a Web Worker:

// worker.js
import { pipeline } from "@huggingface/transformers";

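// Cache the promise rather than the resolved pipeline, so messages that
// arrive while the model is still loading don't trigger a second load
let classifierPromise;

self.onmessage = async ({ data: { text } }) => {
  classifierPromise ??= pipeline(
    "sentiment-analysis",
    "Xenova/distilbert-base-uncased-finetuned-sst-2-english",
    {
      // Forward download progress events to the main thread
      progress_callback: (p) =>
        self.postMessage({ type: "progress", payload: p }),
    },
  );

  const classifier = await classifierPromise;
  const result = await classifier(text);
  self.postMessage({ type: "result", payload: result });
};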

// main.js
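// new URL(..., import.meta.url) lets bundlers such as Vite or webpack
// locate and bundle the worker file
const worker = new Worker(new URL("./worker.js", import.meta.url), {
  type: "module",
});

worker.onmessage = ({ data }) => {
  if (data.type === "progress") {
    // Download progress events forwarded by the worker; update a progress bar here
  } else if (data.type === "result") {
    console.log(data.payload);
  }
};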

worker.postMessage({ text: "This library is genuinely impressive." });

The progress_callback lets you show a progress bar during the initial model download, which is essential UX for larger models.
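
A model usually consists of several files, and each one reports its own progress, so a single progress bar needs some aggregation. The following is a minimal sketch of that step. It assumes each event looks like `{ status, file, loaded, total }` with `status === "progress"` during a download. That shape is an assumption worth confirming by logging a few real events, and `createProgressTracker` is a hypothetical name.

```javascript
// Hypothetical tracker: combines per-file progress events into one
// overall percentage. The event shape ({ status, file, loaded, total })
// is an assumption; log a few real events to confirm it.
function createProgressTracker() {
  const files = new Map(); // file name -> { loaded, total }

  return function update(event) {
    if (event && event.status === "progress" && event.total > 0) {
      files.set(event.file, { loaded: event.loaded, total: event.total });
    }
    let loaded = 0;
    let total = 0;
    for (const f of files.values()) {
      loaded += f.loaded;
      total += f.total;
    }
    // Whole-number percentage of bytes received so far, 0 before any data.
    return total === 0 ? 0 : Math.round((loaded / total) * 100);
  };
}

const update = createProgressTracker();
update({ status: "initiate", file: "config.json" });
update({ status: "progress", file: "config.json", loaded: 100, total: 100 });
console.log(
  update({ status: "progress", file: "model.onnx", loaded: 300, total: 900 }),
); // 40
```

Because the total grows as new files start downloading, the percentage can step backwards briefly. A UI may want to clamp the displayed value so the bar never shrinks.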

Tasks Beyond Text

Sentiment analysis is just the beginning. Transformers.js supports a wide range of tasks:

// Zero-shot image classification
// (imageUrl is the URL of an image you supply)
const visionClassifier = await pipeline(
  "zero-shot-image-classification",
  "Xenova/clip-vit-base-patch32",
);
const imageResult = await visionClassifier(imageUrl, [
  "a photo of a cat",
  "a photo of a dog",
  "a photo of a car",
]);

// Text generation (small LLMs)
const generator = await pipeline(
  "text-generation",
  "Xenova/Phi-3-mini-4k-instruct",
);

// Speech recognition
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-tiny.en",
);

Whisper running in-browser for real-time transcription, CLIP doing image search without a backend, small instruction-tuned LLMs for form assistance — these are all genuinely practical applications now.
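
To make the image-search idea concrete: CLIP maps images and text into the same vector space, so search comes down to ranking stored image embeddings by their similarity to a query embedding. The sketch below shows only that ranking step in plain JavaScript. It assumes the embeddings have already been computed elsewhere, and the function names and toy vectors are illustrative.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank { id, embedding } items by similarity to the query, best first.
function rankBySimilarity(query, items) {
  return items
    .map((item) => ({
      id: item.id,
      score: cosineSimilarity(query, item.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy 2-D vectors stand in for real embeddings, which have hundreds of dimensions.
const ranked = rankBySimilarity([1, 0], [
  { id: "car.jpg", embedding: [0, 1] },
  { id: "cat.jpg", embedding: [3, 4] },
  { id: "dog.jpg", embedding: [1, 0] },
]);
console.log(ranked.map((r) => r.id)); // [ 'dog.jpg', 'cat.jpg', 'car.jpg' ]
```

A linear scan like this is fine for a few thousand images held in memory; larger collections would need an index, which is beyond this sketch.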

The browser isn't replacing cloud inference for large-scale workloads. But for privacy-sensitive tasks, offline-capable features, and interactions that can't afford a network round trip, Transformers.js opens up a category of AI features that previously required a server.
