Hardware Requirements
VRAM, RAM, and storage requirements for all Gemma 4 model variants across runtimes.
Gemma 4 runs on a wide range of hardware. Use this page to find the right model and quantization for your setup.
Quick reference
| Model | Full precision (FP16) | Q8 | Q4 | CPU-only |
|---|---|---|---|---|
| E2B | 4 GB | 2 GB | 1.4 GB | ✓ (slow) |
| E4B | 8 GB | 4.5 GB | 3.2 GB | ✓ |
| 26B A4B | 52 GB | 28 GB | 16.4 GB | Not recommended |
| 31B | 62 GB | 32 GB | 24 GB | Not recommended |
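The table can be read as a simple lookup. The following is a minimal sketch rather than an official tool: `MEMORY_GB` copies the figures above, while `pick_config` and its `headroom_gb` parameter are hypothetical names introduced here for illustration. It returns the largest model, at the highest precision, whose weights fit in a given memory budget.

```python
from typing import Optional, Tuple

# Memory needed for weights in GB, copied from the quick-reference table.
# "FP16" is the full-precision column. Models are ordered smallest to
# largest, and precisions from highest to lowest.
MEMORY_GB = {
    "E2B": {"FP16": 4.0, "Q8": 2.0, "Q4": 1.4},
    "E4B": {"FP16": 8.0, "Q8": 4.5, "Q4": 3.2},
    "26B A4B": {"FP16": 52.0, "Q8": 28.0, "Q4": 16.4},
    "31B": {"FP16": 62.0, "Q8": 32.0, "Q4": 24.0},
}


def pick_config(available_gb: float,
                headroom_gb: float = 0.0) -> Optional[Tuple[str, str]]:
    """Return (model, precision) for the largest model that fits, or None.

    headroom_gb is a hypothetical safety margin reserved for context and
    runtime overhead; the table itself assumes short contexts.
    """
    budget = available_gb - headroom_gb
    # Walk models from largest to smallest, preferring higher precision.
    for model in reversed(list(MEMORY_GB)):
        for precision, needed in MEMORY_GB[model].items():
            if needed <= budget:
                return model, precision
    return None
```

For example, `pick_config(24)` returns `("31B", "Q4")`, while `pick_config(8, headroom_gb=2)` returns `("E4B", "Q8")`. Leaving some headroom is usually wise, because the figures cover the weights at short context only.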
GPU recommendations
Apple Silicon (M1–M4)
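Apple Silicon uses unified memory shared by the CPU and GPU. Quantized E4B runs well on 8 GB machines. 26B A4B needs 24 GB or more (for example, higher-memory M3 Pro or M4 Max configurations). Use MLX for best performance.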
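NVIDIA RTX 3000/4000 (12 GB and up)
RTX 3060 (12 GB) handles E4B comfortably. RTX 4090 (24 GB) can run 31B at Q4, but the Q4 weights alone take about 24 GB, so expect little or no headroom for context. Use the CUDA backend.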
NVIDIA RTX 3000/4000 8 GB
Use E4B at Q4 or E2B at full precision. Keep the context window to about 8K tokens for stable inference.
CPU only
E2B and E4B are usable on modern CPUs via llama.cpp. Expect 2–8 tokens/sec on 8-core machines.
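To put that throughput range in perspective, the sketch below converts a response length into wall-clock time. It is illustrative arithmetic only: the function names are hypothetical, and it ignores prompt processing and model load time, which add to the total.

```python
from typing import Tuple

# Decode-rate range quoted above for 8-core machines (tokens/sec).
SLOW_TOKENS_PER_SEC = 2.0
FAST_TOKENS_PER_SEC = 8.0


def generation_time_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to generate num_tokens at a steady decode rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


def cpu_time_range(num_tokens: int) -> Tuple[float, float]:
    """Return (best_case, worst_case) seconds for a CPU-only run."""
    return (
        generation_time_seconds(num_tokens, FAST_TOKENS_PER_SEC),
        generation_time_seconds(num_tokens, SLOW_TOKENS_PER_SEC),
    )
```

A 400-token answer, for instance, takes between 50 and 200 seconds at these rates, which is workable for batch jobs but slow for interactive chat.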
Storage
Each model requires disk space for weights:
- E2B: ~1.5 GB (Q4) to ~4 GB (FP16)
- E4B: ~3 GB (Q4) to ~8 GB (FP16)
- 26B A4B: ~16 GB (Q4) to ~52 GB (FP16)
- 31B: ~24 GB (Q4) to ~62 GB (FP16)
MoE models (26B A4B) activate only about 4B parameters per token during inference, but the full weight set must still be stored and loaded. Disk space and memory requirements are therefore based on the full parameter count, not the active one.
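The FP16 figures follow from simple arithmetic: 16-bit weights take 2 bytes per parameter. The sketch below reproduces them under one stated assumption, namely that the nominal parameter counts can be read off the model names (this is an inference, not an official specification). Q8 and Q4 sizes depend on the quantization format and are not modeled here.

```python
# Bytes per parameter at FP16 (16 bits).
FP16_BYTES_PER_PARAM = 2

# Nominal parameter counts in billions, inferred from the model names.
# ASSUMPTION: these are illustrative, not official specifications.
NOMINAL_PARAMS_B = {"E2B": 2, "E4B": 4, "26B A4B": 26, "31B": 31}


def fp16_size_gb(params_billion: float) -> float:
    """Approximate FP16 weight size in GB (1 GB = 10^9 bytes)."""
    return params_billion * FP16_BYTES_PER_PARAM


for name, params in NOMINAL_PARAMS_B.items():
    print(f"{name}: ~{fp16_size_gb(params):.0f} GB at FP16")
```

Note that 26B A4B is sized from all 26B parameters (about 52 GB), not from its 4B active parameters (which would suggest only about 8 GB).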
Context window and VRAM
Longer context windows consume more VRAM during inference. The VRAM figures in the quick reference assume short contexts (≤2K tokens). Running at the full context window adds roughly the following:
| Model | Max context | Additional VRAM |
|---|---|---|
| E4B (32K) | 32,768 tokens | +2–4 GB |
| 26B A4B (128K) | 131,072 tokens | +8–12 GB |
| 31B (256K) | 262,144 tokens | +16–24 GB |
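Combining the two tables gives a rough total for a full-context run. This is a minimal sketch that simply adds the overhead range above to a short-context base figure; `full_context_vram_gb` is a hypothetical helper, and real usage varies with the runtime and its cache settings.

```python
from typing import Tuple

# Extra VRAM (GB) at the maximum context window, copied from the table above.
CONTEXT_OVERHEAD_GB = {
    "E4B": (2.0, 4.0),
    "26B A4B": (8.0, 12.0),
    "31B": (16.0, 24.0),
}


def full_context_vram_gb(model: str, base_gb: float) -> Tuple[float, float]:
    """Return the (low, high) total VRAM estimate at the maximum context.

    base_gb is the short-context figure for the chosen quantization,
    taken from the quick-reference table.
    """
    low, high = CONTEXT_OVERHEAD_GB[model]
    return base_gb + low, base_gb + high
```

For example, `full_context_vram_gb("31B", 24.0)` returns `(40.0, 48.0)`: 31B at Q4 with the full 256K context needs roughly 40 to 48 GB, well beyond a single 24 GB card.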
Platform notes
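macOS: Apple unified memory is shared between the CPU and GPU, so the OS and other apps compete with the model for it. A 16 GB M-series Mac can run E4B comfortably. 26B A4B is a stretch on 16 GB, since its Q4 weights alone are about 16.4 GB; plan for 24 GB or more.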
Windows: WSL2 is recommended for Ollama and llama.cpp. Native Windows builds of LM Studio and llama.cpp are also available.
Linux: Best performance across all runtimes. NVIDIA CUDA is recommended for models larger than E4B.