# Quickstart — Ollama

Run Gemma 4 E4B locally in three commands using Ollama: the fastest path to a running Gemma 4 model.
## Prerequisites
- macOS, Linux, or Windows (WSL2)
- 8 GB RAM minimum (16 GB recommended for E4B)
- Ollama installed
## Step 1 — Install Ollama

### Download Ollama

Visit ollama.ai and download the installer for your platform. On macOS:

```shell
brew install ollama
```

On Linux:

```shell
curl -fsSL https://ollama.ai/install.sh | sh
```

### Start the Ollama daemon

```shell
ollama serve
```

Leave this terminal open. Ollama runs a local API server on port 11434.
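Before pulling a model, you may want to confirm the daemon is actually listening on port 11434. A minimal sketch using only the standard library (the `ollama_reachable` helper is ours, not part of Ollama):

```python
import socket

def ollama_reachable(host: str = "localhost", port: int = 11434,
                     timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on the Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns `False`, make sure `ollama serve` is still running in its terminal.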
## Step 2 — Pull and run Gemma 4 E4B

```shell
ollama run gemma4:4b
```

The first run downloads ~2.5 GB of weights. Subsequent runs start in under 2 seconds.
## Step 3 — Verify it works

```shell
curl http://localhost:11434/api/generate \
  -d '{"model":"gemma4:4b","prompt":"What is Gemma 4?"}'
```

You should see a streaming JSON response.
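The stream arrives as newline-delimited JSON objects, each carrying a fragment of the generated text. A sketch of reassembling them into the full response (this assumes the stream's per-chunk `response` and `done` fields, which you can confirm against the output of the `curl` command above):

```python
import json

def collect_stream(lines):
    """Join the 'response' fragments of a newline-delimited JSON stream
    into the full generated text, stopping at the final chunk."""
    parts = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

In practice you would iterate over the HTTP response body line by line instead of a list.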
## Available Gemma 4 tags

| Tag | Model | VRAM |
|---|---|---|
| `gemma4:2b` | E2B | 1.4 GB |
| `gemma4:4b` | E4B | 3.2 GB |
| `gemma4:27b` | 26B A4B | 16 GB |
## Next steps
- Enable thinking mode for complex reasoning tasks
- Use the OpenAI-compatible API with your existing code
- Switch to a larger model once you've validated the setup
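Since Ollama also exposes an OpenAI-compatible endpoint, existing client code can be pointed at the local server. A stdlib-only sketch that builds such a request (the endpoint path assumes Ollama's OpenAI compatibility layer at `/v1/chat/completions`; `build_chat_request` is our illustrative helper):

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "gemma4:4b",
                       base_url: str = "http://localhost:11434/v1"):
    """Build a POST request for Ollama's OpenAI-compatible chat endpoint.

    Send it with urllib.request.urlopen(req) once the daemon is running.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the official OpenAI SDK, the equivalent is setting `base_url="http://localhost:11434/v1"` and any non-empty API key.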