Run Claude Code for FREE with Local Gemma 4
Published: April 15, 2026 | Tags: claude-code, gemma-4, lm-studio, local-ai, tutorial
The Problem
Claude Code's API costs can hit $200/month, and free credits keep getting cut. That's a steep price for a coding assistant.
The solution: Run Claude Code locally with Google's Gemma 4 model. Zero API costs, completely on your machine.
Result: Free, fast, and actually works for real coding tasks.
What You Need
- Mac with 16GB RAM (or similar Linux setup)
- Claude Code installed
- LM Studio
- Google Gemma 4 E4B model (only 6GB!)
Installation Steps
Step 1: Install Claude Code
npm install -g @anthropic-ai/claude-code
Step 2: Install LM Studio
Download from lmstudio.ai and install.
Step 3: Verify LM Studio CLI
lms
If you get "command not found", close and reopen your terminal so the refreshed PATH is picked up.
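If a fresh terminal still can't find `lms`, the CLI directory may not be on your PATH. The sketch below assumes LM Studio placed the binary under `~/.lmstudio/bin`, which is where current installs put it (the exact location may vary by version):

```shell
# Add the LM Studio CLI directory to PATH for this session.
# Put this line in your shell profile to make it permanent.
export PATH="$HOME/.lmstudio/bin:$PATH"

# Confirm the directory is now on PATH.
echo "$PATH" | tr ':' '\n' | grep lmstudio
```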
Step 4: Download Gemma 4 Model
- Model: Gemma 4 E4B
- Size: ~6GB (vs 16-19GB for other models)
- Runs smoothly on 16GB RAM Macs
- Download time: a few minutes
Step 5: Start Local Server
lms server start --port 1234
You'll see a success message when started.
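Before wiring up Claude Code, you can confirm the server is actually listening by probing the OpenAI-compatible `/v1/models` endpoint LM Studio exposes (a quick sketch, assuming the default port 1234):

```shell
# Probe the local LM Studio server; print the state instead of failing hard.
if curl -sf http://localhost:1234/v1/models > /dev/null 2>&1; then
  echo "server up"
else
  echo "server down (run: lms server start --port 1234)"
fi
```

If the server is up, the endpoint also lists the models you've downloaded, which is a handy sanity check that Gemma is visible.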
Step 6: Configure Claude Code (3 lines)
export CLAUDE_API_BASE_URL=http://localhost:1234/v1
export ANTHROPIC_API_KEY=anything
# Add to ~/.zshrc for permanent setup
The API key can be any string: requests go to your own machine, and LM Studio never validates it.
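To make the configuration survive new shells, the two variables can live in your shell profile (shown for zsh, per the tutorial; adjust the file for bash):

```shell
# ~/.zshrc — point Claude Code at the local LM Studio server.
# The key is a placeholder; the local server never checks it.
export CLAUDE_API_BASE_URL=http://localhost:1234/v1
export ANTHROPIC_API_KEY=anything
```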
Step 7: Launch Claude Code
claude
Check the top-left corner - it should show Gemma 4, not Sonnet or Opus.
⚠️ Critical Setting: Context Window
The default context window is too small! This will cause Claude Code to freeze when using tools on complex tasks. The fix is simple but essential.
Increase context length before starting work:
lms unload --all
lms load --context-length 40960
When prompted, pick the Gemma model (or pass its identifier directly to lms load).
Why 40960? It leaves room for Claude Code's large system prompt, its tool definitions, and the files it reads during a task. Without the increase, tool calls overflow the context and fail, which is why sessions appear to freeze; this one setting is the difference between constant stalls and a smooth session.
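To see why the default is too small, a rough token budget helps. The numbers below are illustrative assumptions, not measurements, but they already land well above a small default context:

```shell
# Rough, illustrative token budget for one Claude Code task
# (all figures are assumptions for the sake of the arithmetic):
system_prompt=12000   # Claude Code's system prompt + tool definitions
file_context=15000    # files read into context during the task
conversation=8000     # accumulated turns and tool results
total=$((system_prompt + file_context + conversation))
echo "estimated usage: $total tokens"   # prints: estimated usage: 35000 tokens
```

Even with generous slack, 40960 comfortably covers a budget like this, while a few thousand tokens clearly cannot.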
Demo: Build a Chrome Dinosaur Game
Test run with a real project - building Chrome's offline dinosaur game from scratch:
- Enter a detailed prompt in one shot
- Let Gemma 4 generate the complete game
- Result: Fully playable game with score, collision detection, restart button
Outcome: One prompt → complete working game → zero API costs
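A one-shot prompt for this demo might look like the following. The wording is illustrative, not the original video's prompt; you would paste it as the first message of the claude session:

```shell
# Store the full spec in one prompt; small models do best when
# every requirement is stated up front.
PROMPT='Build the Chrome offline dinosaur game as a single index.html.
Requirements: canvas rendering, spacebar to jump, moving cactus obstacles,
collision detection, a running score counter, and a game-over screen
with a restart button. No external libraries.'
printf '%s\n' "$PROMPT"
```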
Pro Tips
- One-shot prompts: Small models lose track as context accumulates across turns. Put everything you need into the first prompt.
- Be specific: Include all requirements upfront - features, style, behavior.
- Context matters: Load the model with higher context before complex tasks.
- Speed vs capability: 4B models are fast but have limits. Know when to use cloud for heavy tasks.
Resources
Originally from a Chinese tutorial video, translated and summarized for an English audience.