gemma4.dev · Gemma 4 Developer Hub

Model Reference

All four Gemma 4 variants — architecture, specs, and recommended use cases.

Gemma 4 ships in four weight configurations. Three are dense models (E2B, E4B, 31B) and one is a mixture-of-experts (MoE) architecture (26B A4B).

Comparison table

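| Model   | Architecture | Parameters | Active params | Context | VRAM (Q4) | Best for              |
|---------|--------------|------------|---------------|---------|-----------|-----------------------|
| E2B     | Dense        | 2B         | 2B            | 8K      | 1.4 GB    | Mobile, embedded      |
| E4B     | Dense        | 4B         | 4B            | 32K     | 3.2 GB    | Coding, chat          |
| 26B A4B | MoE          | 26B        | 4B            | 128K    | 16.4 GB   | Long context, writing |
| 31B     | Dense        | 31B        | 31B           | 256K    | 24 GB     | Research, agentic     |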
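
As a rough sanity check on the VRAM column, the sketch below computes how many bits of VRAM each variant uses per parameter. The figures are copied from the table; the helper name `implied_bits_per_param` and the convention 1 GB = 10^9 bytes are assumptions made for illustration. Every result lands above 4 bits, which suggests the Q4 figures include more than raw 4-bit weights. What that extra consists of (for example quantization scales or runtime buffers) is not stated on this page.

```python
# Figures copied from the comparison table:
# model name -> (total parameters in billions, VRAM at Q4 in GB).
TABLE = {
    "E2B": (2, 1.4),
    "E4B": (4, 3.2),
    "26B A4B": (26, 16.4),
    "31B": (31, 24.0),
}


def implied_bits_per_param(params_b: float, vram_gb: float) -> float:
    """Bits of VRAM per parameter, assuming 1 GB = 1e9 bytes (an assumption)."""
    return (vram_gb * 1e9 * 8) / (params_b * 1e9)


if __name__ == "__main__":
    for name, (params_b, vram_gb) in TABLE.items():
        bits = implied_bits_per_param(params_b, vram_gb)
        print(f"{name}: {bits:.2f} bits per parameter")
```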

Choosing a model

Gemma 4 E2B

2B dense. The smallest variant (about 1.4 GB of VRAM at Q4, 8K context). Edge devices, mobile, CPU-only inference.

Gemma 4 E4B

4B dense. The most popular choice. Great for code, chat, and everyday tasks.

Gemma 4 26B A4B

26B MoE, 4B active. 128K context. Technical writing, RAG, long documents.

Gemma 4 31B

31B dense. 256K context. Full reasoning capability. Production agentic systems.
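
The guidance above can be sketched as a small selection helper. This is an illustrative sketch, not an official tool: the `Variant` type and `pick_variant` function are hypothetical names, "K" is assumed to mean 1024 tokens, and the rule "pick the smallest variant that fits" is one possible policy. The VRAM figures are taken from the comparison table at face value.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    name: str
    context_tokens: int   # assumes "K" in the table means 1024 tokens
    vram_q4_gb: float     # VRAM (Q4) column from the comparison table


# Ordered from smallest to largest footprint.
VARIANTS = [
    Variant("E2B", 8 * 1024, 1.4),
    Variant("E4B", 32 * 1024, 3.2),
    Variant("26B A4B", 128 * 1024, 16.4),
    Variant("31B", 256 * 1024, 24.0),
]


def pick_variant(available_vram_gb: float,
                 needed_context_tokens: int) -> Optional[Variant]:
    """Return the smallest variant that fits in VRAM and covers the context.

    Returns None when no variant satisfies both constraints.
    """
    for v in VARIANTS:
        fits_vram = v.vram_q4_gb <= available_vram_gb
        fits_context = v.context_tokens >= needed_context_tokens
        if fits_vram and fits_context:
            return v
    return None


if __name__ == "__main__":
    print(pick_variant(available_vram_gb=8, needed_context_tokens=16_000))
    print(pick_variant(available_vram_gb=8, needed_context_tokens=100_000))
```

Choosing the smallest fitting variant keeps memory use low; a policy that prefers the largest variant that fits would trade memory for capability instead.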

MoE vs Dense

The 26B A4B is a Mixture of Experts model. It has 26B total parameters but routes each token through only ~4B active parameters per forward pass. This means:

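  • Per-token compute cost is similar to a 4B dense model
  • Capability is closer to a 13B+ dense model
  • All 26B weights must still be stored and loaded, so disk and memory footprint are those of a 26B model (16.4 GB of VRAM at Q4 in the comparison table)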

The 31B model is dense: all 31B parameters are active on every forward pass, giving it the highest raw capability of the four variants at the cost of the largest VRAM footprint (24 GB at Q4).
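
To make the routing idea concrete, here is a toy top-k router in plain Python. It is a minimal sketch of the general mixture-of-experts technique, not Gemma 4's actual router: the expert count (8), the number of experts used per token (2), and the function names `route_top_k` and `moe_forward` are all hypothetical, since this page does not specify those details. Only the 4B-of-26B ratio at the end comes from the comparison table.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalised so the selected experts sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum over the selected experts only; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


if __name__ == "__main__":
    # Hypothetical toy setup: 8 experts, 2 active per token.
    num_experts, top_k = 8, 2
    experts = [lambda x, s=s: x * s for s in range(1, num_experts + 1)]
    rng = random.Random(0)
    logits = [rng.gauss(0, 1) for _ in range(num_experts)]

    chosen = route_top_k(logits, top_k)
    print("experts used:", [i for i, _ in chosen])
    print("output:", moe_forward(1.0, experts, logits, top_k))

    # From the comparison table: share of weights active per token.
    print(f"26B A4B active fraction: {4 / 26:.1%}")
```

All experts must be resident so that any of them can be selected, which is why the memory footprint follows the total parameter count while per-token compute follows the active count.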
