Gemma attention visualizer (WebGPU demo)

Layer stack: embed → L0 … L17 → lm_head (18 transformer layers). Layers L5, L11, and L17 are marked GQA; the other fifteen are marked SWA, an alternating pattern of five sliding-window layers per global layer.

Select a layer to inspect its attention weights.
GQA = grouped-query attention · SWA = sliding window attention
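The layer list above alternates two mask types: SWA layers attend only to a recent window of tokens, while the GQA-labeled layers (L5, L11, L17, i.e. every sixth layer) attend to the full causal prefix. A minimal sketch of the two masks, assuming an illustrative window size of 4 (the real window size depends on the model configuration and is not shown in the demo):

```python
def causal_mask(seq_len):
    """Full causal mask: token i may attend to every token j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def sliding_window_mask(seq_len, window):
    """Causal mask restricted to the most recent `window` tokens."""
    return [[i - window < j <= i for j in range(seq_len)] for i in range(seq_len)]

def mask_for_layer(layer_idx, seq_len, window=4, period=6):
    """Pick the mask for a layer in a 5:1 SWA-to-global stack.

    In the stack shown above, layers 5, 11, and 17 (index period-1
    modulo period) use the full causal mask; all others use SWA.
    `window=4` is a hypothetical value for illustration.
    """
    if layer_idx % period == period - 1:
        return causal_mask(seq_len)
    return sliding_window_mask(seq_len, window)
```

With this sketch, the last token of an 8-token sequence can see the first token on layer 5 (global) but not on layer 0 (window of 4), which is exactly the contrast the layer selector visualizes.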
Context: "Plants create energy through a process known as" [Generate next →]