Gemma 3 in your browser
Full inference for Gemma 3 270M-Instruct (FP16), running entirely on-device via WebGPU. Optimized with KV-caching and JIT kernel fusion powered by jax-js.
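KV-caching keeps each past token's key/value projections in memory so a decode step only computes attention for the newest token instead of re-running the whole prefix. A minimal sketch of the idea in TypeScript (the names `KVCache`, `append`, and `attend` are illustrative, not the jax-js API, and softmax is omitted for brevity):

```typescript
type Vec = number[];

const dot = (a: Vec, b: Vec): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0);

class KVCache {
  keys: Vec[] = [];
  values: Vec[] = [];

  // Each decode step appends the new token's key/value once,
  // instead of recomputing projections for the whole prefix.
  append(k: Vec, v: Vec): void {
    this.keys.push(k);
    this.values.push(v);
  }

  // Score the new query against every cached key (raw dot
  // products; a real head would softmax and mix the values).
  attend(q: Vec): number[] {
    return this.keys.map((k) => dot(q, k));
  }
}
```

With a cache, generating N tokens costs one attention row per step rather than recomputing an N×N attention matrix each time.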
View source on GitHub

Inference
Explore the model
Run a forward pass and inspect attention weights layer by layer. Select any attention head to see how the model attends across tokens.
Open →

Chat
Talk to the model
Interact with Gemma 3 through a chat interface. Responses stream token-by-token, generated entirely on your machine.
Open →