gemma4.dev

Fix: No module named 'mlx_vlm.models.gemma4'

How to fix the ModuleNotFoundError for mlx_vlm.models.gemma4 when running Gemma 4 with Apple's MLX framework.

Error Message

When attempting to run Gemma 4 with MLX on Apple Silicon, you see:

ModuleNotFoundError: No module named 'mlx_vlm.models.gemma4'

This error occurs at import time and prevents the model from loading entirely.

Why It Happens

The package mlx-vlm (imported as mlx_vlm) is the vision-language model package in the MLX ecosystem. It was built for multimodal models (image + text), and its model registry has no gemma4 entry, so the import fails before the model can load.

Gemma 4 text models belong in mlx-lm (mlx_lm), the standard language model package for MLX. If you followed a tutorial or copied a command that used mlx_vlm to run a Gemma 4 text model, the wrong package was referenced.
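To confirm which of the two packages is actually installed in your environment, a quick stdlib check is enough (this uses only importlib.metadata, nothing MLX-specific):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_mlx_packages():
    """Map each MLX package name to its installed version, or None if absent."""
    report = {}
    for pkg in ("mlx-vlm", "mlx-lm"):
        try:
            report[pkg] = version(pkg)
        except PackageNotFoundError:
            report[pkg] = None
    return report

# Shows at a glance whether you have the vision package but not the LM one.
print(installed_mlx_packages())
```

If mlx-lm shows None while mlx-vlm has a version, you have the mismatch described above.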

Fix 1 (Preferred): Switch to mlx_lm

Uninstall the vision package and install the correct one:

pip uninstall mlx-vlm
pip install mlx-lm

Then replace any mlx_vlm calls with mlx_lm:

# Before (wrong package):
from mlx_vlm import generate
generate(model, processor, prompt)

# After (correct package):
from mlx_lm import load, generate
model, tokenizer = load('mlx-community/gemma-4-4b-it-4bit')
response = generate(model, tokenizer, prompt='Your prompt here')

Fix 2: Upgrade mlx_vlm (Vision Models Only)

If you are specifically working with Gemma 4 vision capabilities (image understanding), upgrade to the latest mlx-vlm which may include newer model support:

pip install --upgrade mlx-vlm

Note: at the time of writing, Gemma 4's vision variant support in mlx_vlm depends on your installed version. If the error persists after upgrading, use mlx_lm with the appropriate multimodal model weights instead.
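Before re-running your script after an upgrade, you can test whether the gemma4 module now resolves without importing any heavy model code. This is a generic stdlib sketch using importlib.util.find_spec; the dotted module path is the one from the error message:

```python
import importlib.util

def module_available(dotted_name: str) -> bool:
    """Return True if a dotted module path resolves, without executing the module."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False

# Example: module_available("mlx_vlm.models.gemma4")
# False means your installed mlx_vlm still lacks the gemma4 registry entry.
```

If this returns False after upgrading, fall back to Fix 1 and use mlx_lm.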

Verify the Fix

After switching to mlx_lm, confirm generation works end to end:

mlx_lm.generate --model mlx-community/gemma-4-4b-it-4bit --prompt "Test"

You should see token output without any import errors. The first run downloads the model weights from Hugging Face if they are not already cached.

Related

  • Running Gemma 4 with MLX
© 2026 gemma4.dev. All Rights Reserved.