# Quickstart — Ollama

Run Gemma 4 E4B locally in three commands using Ollama: the fastest path to a running Gemma 4 model.
## Prerequisites
- macOS, Linux, or Windows (WSL2)
- 8 GB RAM minimum (16 GB recommended for E4B)
- Ollama installed
## Step 1 — Install Ollama

### Download Ollama

Visit ollama.ai and download the installer for your platform. On macOS:

```shell
brew install ollama
```

On Linux:

```shell
curl -fsSL https://ollama.ai/install.sh | sh
```

### Start the Ollama daemon

```shell
ollama serve
```

Leave this terminal open. Ollama runs a local API server on port 11434.
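Before pulling a model, you may want to confirm the daemon is actually listening on port 11434. A minimal sketch using only the standard library (the `ollama_reachable` helper is ours, not part of Ollama):

```python
import socket

def ollama_reachable(host: str = "localhost", port: int = 11434,
                     timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on the Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns `False`, make sure `ollama serve` is still running in its terminal.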
## Step 2 — Pull and run Gemma 4 E4B

```shell
ollama run gemma4:4b
```

The first run downloads ~2.5 GB of weights. Subsequent runs start in under 2 seconds.
## Step 3 — Verify it works

```shell
curl http://localhost:11434/api/generate \
  -d '{"model":"gemma4:4b","prompt":"What is Gemma 4?"}'
```

You should see a streaming JSON response.
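The stream arrives as newline-delimited JSON objects, each carrying a fragment of the generated text. A sketch of reassembling them into the full response (this assumes the stream's per-chunk `response` and `done` fields, which you can confirm against the output of the `curl` command above):

```python
import json

def collect_stream(lines):
    """Join the 'response' fragments of a newline-delimited JSON stream
    into the full generated text, stopping at the final chunk."""
    parts = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

In practice you would iterate over the HTTP response body line by line instead of a list.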
## Available Gemma 4 tags

| Tag | Model | VRAM |
|---|---|---|
| `gemma4:2b` | E2B | 1.4 GB |
| `gemma4:4b` | E4B | 3.2 GB |
| `gemma4:27b` | 26B A4B | 16 GB |
## Next steps
- Enable thinking mode for complex reasoning tasks
- Use the OpenAI-compatible API with your existing code
- Switch to a larger model once you've validated the setup
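Since Ollama also exposes an OpenAI-compatible endpoint, existing client code can be pointed at the local server. A stdlib-only sketch that builds such a request (the endpoint path assumes Ollama's OpenAI compatibility layer at `/v1/chat/completions`; `build_chat_request` is our illustrative helper):

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "gemma4:4b",
                       base_url: str = "http://localhost:11434/v1"):
    """Build a POST request for Ollama's OpenAI-compatible chat endpoint.

    Send it with urllib.request.urlopen(req) once the daemon is running.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the official OpenAI SDK, the equivalent is setting `base_url="http://localhost:11434/v1"` and any non-empty API key.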