Run Claude Opus 4.7, Gemini 3, and Codex Inside Google Antigravity — Without an API Key.
Most vibe coders are paying for API credits they donʼt need. Anthropic, OpenAI, and Google all already let you run their best frontier models inside Antigravityʼs editor and terminal — using only the subscriptions you already have. This class gives you paste-ready setup for every path, the workflow split that stretches each subscription the furthest, and the validation gates that prove every model is actually live before you commit to a sprint.
Quick Answer — Can I Run Claude Opus 4.7 in Antigravity Without an API Key?
Yes. Antigravityʼs native model picker does not yet include Claude Opus 4.7, but the Claude Code CLI runs perfectly inside Antigravityʼs integrated terminal and authenticates against your Claude.ai Max plan via OAuth — zero API charges, zero Anthropic Console billing, full 1M context auto-upgraded for Max subscribers. The same pattern works for OpenAI Codex (ChatGPT Plus/Pro auth) and Gemini CLI (personal Google account). Antigravityʼs free preview gives you Gemini 3.1 Pro, Claude Sonnet 4.6, Claude Opus 4.6 Thinking, and GPT-OSS 120B in the native picker for free. The combined stack costs $0 beyond subscriptions you already pay.
Key Takeaways — Why This Setup Beats Every Single-Model IDE
- Zero net cost: Antigravity preview is free, Claude/ChatGPT/Google subs you already have cover the rest.
- Four model paths in one IDE: Opus 4.7 (Claude Code), Gemini 3 (native + CLI), GPT-5.x (Codex), local Ollama 14B–32B (Cline/Continue).
- Subscription auth verified: Claude Max `/status` shows “Max plan” not API credits; Codex auto-creates API key via ChatGPT OAuth.
- 1M context on Opus 4.7 for Max plans: auto-upgrade, no flag, no extra config.
- Profile isolation pattern using `CLAUDE_CONFIG_DIR` lets you run Max-paid Claude Code alongside an existing Ollama Cloud or Console-paid setup with no auth leaks.
- Cross-CLI MCP reuse: Shopify Dev MCP, WordPress.com, GitHub MCP register once per client and work in Claude Code, Codex CLI, and Gemini CLI.
- Workflow split that stretches Max 5x: use free Gemini 3 native for planning, paid Opus 4.7 in terminal for implementation. Real-world burn rate cut by 60–80%.
Why Antigravity Is the Vibe Coderʼs Power Multiplier
Antigravity launched November 18, 2025 alongside Gemini 3, built by the former Windsurf team Google brought on through a $2.4B licensing deal. It is a heavily modified VS Code fork, but the architecture flips the script: the agent is first-class, the editor is a viewport into what the agent is doing. The Manager view orchestrates parallel agents across editor, integrated terminal, and an embedded Chrome browser. Thatʼs the structural advantage no other free IDE has.
The Three Things That Make This Work
One: Antigravity is a VS Code fork, so its integrated terminal is a real terminal — PowerShell, zsh, bash — with full filesystem access and PATH inheritance. Anything you can run from a normal terminal runs here, including Claude Code, Codex CLI, Gemini CLI, and local Ollama-aware tools. The agent panel and the terminal pane sit in the same window.
Two: Antigravityʼs free preview gives you Gemini 3.1 Pro (high and low variants), Gemini 3 Flash, Claude Sonnet 4.6, Claude Sonnet 4.6 Thinking, Claude Opus 4.6 Thinking, and GPT-OSS 120B in the native picker. You can use these for ambient agentic work without burning any subscription you pay for elsewhere.
Three: Subscription-authenticated CLIs (Claude Code, Codex, Gemini CLI) authenticate directly against the upstream provider — not through Antigravity. They draw against your Claude Max, ChatGPT Pro, or Google AI Pro allocation, not Antigravityʼs free quota. This is the leverage: you preserve the free tier for ambient work while running heavy implementation work on subscription-paid frontier models.
How Antigravity Stacks Against the Competition
| IDE | Cost Floor | Native Multi-Model | Free Frontier Access | Parallel Agents | Embedded Browser | Subscription CLI Compatible |
|---|---|---|---|---|---|---|
| Antigravity | Free (preview) | ✓ Gemini 3.1 Pro, Sonnet 4.6, Opus 4.6, GPT-OSS | ✓ Opus 4.6 Thinking included free | ✓ Mission Control | ✓ Native Chrome integration | ✓ Claude Code, Codex, Gemini CLI |
| Cursor 2.0 | $20–$200/mo | ✓ Claude, GPT, Gemini, Composer | – Trial only | ✓ Up to 8 | – No | ✓ Yes |
| Windsurf | $15–$60/mo | ✓ Codeium + frontier | – Limited | ✓ Yes | – No | ✓ Yes |
| VS Code | Free editor | – Via extensions only | – No native AI | – Manual | – No | ✓ Yes (manual) |
| Zed | Free | ✓ Recent | – Limited | – Sequential | – No | ✓ Yes |
The math is simple: Antigravity is the only major IDE that costs $0 at the preview tier and supports running all three subscription-authenticated frontier CLIs in parallel. Cursor charges $20+/mo just for the editor. Windsurf charges $15+/mo. VS Code is free but you have to wire everything yourself. Zed is fast but new. Antigravity gives you Mission Control plus the option to layer in Opus 4.7, Codex, and Gemini CLI on top.
Claude Opus 4.7 via Your Max Plan
Opus 4.7 shipped April 14, 2026 with same-as-4.6 pricing on the API ($5 input / $25 output per million tokens), measurable gains on long-running software engineering tasks, higher-resolution vision, and an xhigh effort level sitting between high and max. Max plan subscribers get 1M context auto-upgraded, plus Auto mode and reduced permission prompts. None of that requires API billing — just your existing Claude.ai Max subscription.
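To see what subscription auth saves, it helps to price a session at the API rates quoted above. The sketch below is plain arithmetic on the $5 / $25 per-million-token figures; the function name `opus_api_cost_usd` and the sample token counts are illustrative assumptions, not measurements.

```shell
# Sketch: estimate what a session would cost at the API rates quoted above
# ($5 per million input tokens, $25 per million output tokens).
# opus_api_cost_usd is a hypothetical helper name, not part of any CLI.
opus_api_cost_usd() {
  # $1 = input tokens, $2 = output tokens
  awk -v in_tok="$1" -v out_tok="$2" 'BEGIN {
    cost = (in_tok / 1000000) * 5 + (out_tok / 1000000) * 25
    printf "%.2f\n", cost
  }'
}

# Example: one long refactor turn with 200K input and 50K output tokens
opus_api_cost_usd 200000 50000   # prints 2.25
```

A few dozen turns of that size in a day is the kind of spend the Max-plan route keeps off a pay-as-you-go bill.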
Claude Code · Subscription Auth · Opus 4.7 · 1M Context
Run Anthropicʼs strongest model inside Antigravityʼs integrated terminal, billed entirely against the Max plan you already pay for.
1. Install or update Claude Code to v2.1.111+
Opus 4.7 requires Claude Code v2.1.111 or later. If you've never installed it, the npm path works on every OS Antigravity supports. If you have it already, run `claude update` and restart your terminal.
Install Claude Code (any OS)
```shell
# Fresh install (any OS with Node.js 18+)
npm install -g @anthropic-ai/claude-code
# If already installed, upgrade in place
claude update
# Verify version (must be 2.1.111 or higher)
claude --version
```
2. Audit existing Claude Code env vars (do not delete)
Before adding the new profile, identify any `ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, or `CLAUDE_CONFIG_DIR` values that already exist. They might belong to a working setup, such as an Ollama Cloud or third-party proxy, and should be preserved.
Audit Windows env vars (PowerShell)
```powershell
# Print existing values at all three Windows scopes
Get-Content $PROFILE
$env:ANTHROPIC_API_KEY
$env:ANTHROPIC_BASE_URL
[Environment]::GetEnvironmentVariable("ANTHROPIC_API_KEY", "User")
[Environment]::GetEnvironmentVariable("ANTHROPIC_API_KEY", "Machine")
# Note any values found. DO NOT clear them.
# The launcher in step 3 will null them inside its own scope only.
```
Audit existing env (Mac/Linux)
```bash
# Print existing values from your shell profile
grep -E "ANTHROPIC|CLAUDE_CONFIG" ~/.zshrc ~/.bashrc ~/.profile 2>/dev/null
echo "$ANTHROPIC_API_KEY"
echo "$ANTHROPIC_BASE_URL"
echo "$CLAUDE_CONFIG_DIR"
# Note what's set. The launcher will isolate, not delete.
```
3. Create the isolated `claude-max` launcher
The launcher sets `CLAUDE_CONFIG_DIR` to a fresh directory (`~/.claude-max`) and explicitly nulls any leaking `ANTHROPIC_*` env vars for the duration of the call. This is the linchpin: your default `claude` command stays bound to whatever it was before (Ollama Cloud, Console PAYG, whatever); `claude-max` runs strictly against your Max subscription with no auth bleed.
Add launcher (Windows PowerShell)
```powershell
# Create the isolated profile dir
New-Item -ItemType Directory -Path "$env:USERPROFILE\.claude-max" -Force
# Append launcher function to your $PROFILE
# (single-quoted here-string, so nothing is expanded at write time)
Add-Content $PROFILE @'
function claude-max {
  # $env: changes are process-wide in PowerShell, so save and restore them
  $names = 'CLAUDE_CONFIG_DIR', 'ANTHROPIC_BASE_URL', 'ANTHROPIC_API_KEY', 'ANTHROPIC_AUTH_TOKEN'
  $saved = @{}
  foreach ($n in $names) { $saved[$n] = [Environment]::GetEnvironmentVariable($n, 'Process') }
  try {
    $env:CLAUDE_CONFIG_DIR = "$env:USERPROFILE\.claude-max"
    $env:ANTHROPIC_BASE_URL = $null
    $env:ANTHROPIC_API_KEY = $null
    $env:ANTHROPIC_AUTH_TOKEN = $null
    claude @args
  } finally {
    foreach ($n in $names) { [Environment]::SetEnvironmentVariable($n, $saved[$n], 'Process') }
  }
}
'@
# Reload profile in current shell
. $PROFILE
# Verify launcher exists
Get-Command claude-max
```
Add launcher (Mac/Linux zsh or bash)
```bash
# Create the isolated profile dir
mkdir -p ~/.claude-max
# Append launcher function to ~/.zshrc (or ~/.bashrc)
cat >> ~/.zshrc << 'EOF'
# DDS Vibe Academy: Claude Max isolated launcher
claude-max() {
  # Prefix assignments apply to this one command only
  CLAUDE_CONFIG_DIR="$HOME/.claude-max" \
  ANTHROPIC_BASE_URL="" \
  ANTHROPIC_API_KEY="" \
  ANTHROPIC_AUTH_TOKEN="" \
  command claude "$@"
}
EOF
# Reload
source ~/.zshrc
# Verify
type claude-max
```
Why the env nulling matters: if Claude Code sees an `ANTHROPIC_API_KEY` in the environment, it bills against API credits at PAYG rates, even if you intended to use your Max plan. Nulling these vars for the launcher's own invocation makes the session fall back to subscription auth. Verify with `/status` in step 5.
4. First-time login with your Max account
Run the launcher with no arguments. If there's no cached session, Claude Code prompts for login. Use the `/login` slash command and complete the browser OAuth. Sign in with the Max account, not a Console (PAYG) account. Decline any API credit offers that appear.
Sign in via OAuth
```shell
# Launch isolated Max profile
claude-max
# Inside the session, log in
/login
# Browser opens. Sign in with claude.ai Max account.
# Decline API credit offers if prompted.
# Browser returns "Login complete"; close the tab.
```
5. Validate subscription auth and select Opus 4.7
This step is non-negotiable. If `/status` shows “API credits” instead of “Max plan,” an env var is leaking and you're about to be billed at PAYG rates. Stop and re-check step 2.
Verify auth + pick Opus 4.7
```text
# Inside claude-max session
/status
# Expected: Login method: Claude Max account
# Expected: Model: Default Opus 4.7 with 1M context
# Must NOT show: API credits, Console, Pay-as-you-go
/model
# Picker opens. Select claude-opus-4-7 if not default.
# Smoke test: confirm the session responds on the selected model
Reply with the exact string OPUS47_ACTIVE and nothing else.
# Expected response: OPUS47_ACTIVE
```
Validation matrix:
- `/status` shows “Max plan” → auth correct
- `/status` shows “API credits” → env leak, re-audit step 2
- Model field reads “Opus 4.7 with 1M context” → Max upgrade applied
- Model field reads “200K context” → not on Max tier or 1M flag disabled
- Test prompt returns `OPUS47_ACTIVE` → session is live (the model identity itself comes from the `/status` Model field)
6. Run inside Antigravity's integrated terminal
Open Antigravity. Open any project folder. Open the integrated terminal with ``Ctrl+` `` (backtick) or View → Terminal. Run `claude-max`. The session inherits your subscription auth from `~/.claude-max`.
First real-task validation
```text
# In Antigravity terminal, project root open
claude-max
# First real prompt: proves filesystem access + Opus 4.7
List the files in this directory grouped by type. Then tell me which model you are and whether you have 1M context available.
```
Cosmetic warning you can ignore: Antigravity's Claude Code extension auto-installer hits an “Error installing VS Code extension: 127: Command 'code.cmd' not found” error on Windows. This is because Antigravity ships its own CLI binary, not the standard `code.cmd`. The terminal-based Claude Code workflow is fully functional regardless. The extension only adds inline diff previews, click-to-jump-to-source, and ambient activity indicators in the editor pane, which are useful but optional. Skip it for now.
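The two failure modes the steps above guard against, a leaking `ANTHROPIC_*` variable and a Claude Code build older than 2.1.111, can be checked before launching. A minimal sketch, assuming a POSIX shell with `awk`; the helper names `anthropic_env_leaks` and `version_at_least` are hypothetical and not part of Claude Code.

```shell
# Sketch of a preflight check. The helper names are hypothetical; only the
# variable names and the 2.1.111 version floor come from the steps above.

# Print the name (never the value) of each auth-related variable that is
# exported with a non-empty value, i.e. one a child process would inherit.
anthropic_env_leaks() {
  env | awk -F= '
    $1 ~ /^ANTHROPIC_(BASE_URL|API_KEY|AUTH_TOKEN)$/ && length($0) > length($1) + 1 {
      print $1
    }'
}

# Succeed when dotted version $1 is greater than or equal to $2.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    # Compare major, minor, patch as numbers, left to right
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Example with literal values, so it runs without Claude Code installed
if version_at_least "2.1.111" "2.1.111"; then
  echo "version OK"
fi
```

Run `anthropic_env_leaks` in the shell you plan to launch from; any name it prints is one the launcher has to blank. To gate on the installed build, pass the version number reported by `claude --version` as the first argument (this assumes the output begins with the bare version number).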
Optional: Install the Claude Code Extension Inside Antigravity
Antigravity uses OpenVSX by default. The Claude Code extension (publisher Anthropic/claude-code) is on OpenVSX but lags behind the VS Code Marketplace version. If you want the extension polish, two paths:
Path A — Switch Antigravityʼs marketplace (preview-tier hack, may violate Microsoft ToS): Open Settings (Ctrl+,), go to Editor, set Marketplace Item URL to https://marketplace.visualstudio.com/items and Marketplace Gallery URL to https://marketplace.visualstudio.com/_apis/public/gallery. Restart Antigravity. Search for “Claude Code” in Extensions and install the official Anthropic version. Some users report “Failed to fetch” errors on this URL combination — if you hit it, fall back to Path B.
Path B — Manual VSIX install via Antigravityʼs CLI: Download anthropic.claude-code-X.X.X.vsix from the GitHub releases page. Then on Mac run /Applications/Antigravity.app/Contents/Resources/app/bin/antigravity --install-extension anthropic.claude-code-X.X.X.vsix. On Windows the binary is named antigravity in the Resources/app/bin folder. The extension installs into Antigravityʼs OpenVSX-managed registry without needing a marketplace switch.
Gemini 3 — Native Picker Plus CLI for Headless Work
Gemini 3 is the model Antigravity is built around. The native model picker exposes Gemini 3.1 Pro (high and low effort variants) and Gemini 3 Flash already, and itʼs free during preview — no install, no auth setup, no cost. The optional second layer is Gemini CLI, which gives you the same model in your terminal for headless scripting, automated workflows, and side-by-side use with Claude Code or Codex.
Gemini 3 Pro · Native + CLI · Free Tier or AI Pro Subscription
Use the native picker for ambient agentic work. Layer Gemini CLI for terminal scripting and parallel workflows.
1. Native picker: nothing to install
If you're already signed into Antigravity with a personal Gmail, Gemini 3.1 Pro is in the picker right now. Click the model dropdown in the agent panel (right side of the window) and select your preferred variant.
Variant guide: Gemini 3.1 Pro (High) for complex reasoning and multi-step planning. Gemini 3.1 Pro (Low) for fast iteration on simpler tasks (same model, less thinking time). Gemini 3 Flash for the highest throughput at the cost of depth. Use Flash for routine scaffolding, Pro High for architecture and difficult bugs.
2. Install Gemini CLI for terminal scripting
Gemini CLI is open-source (Apache 2.0) and ships as an npm package. Free tier OAuth gives you 60 requests per minute and 1,000 requests per day with a personal Google account, which is more than enough for most vibe-coding sessions.
Install Gemini CLI globally
```shell
# Requires Node.js 18+ (20 recommended)
node --version
# Install globally (latest stable v0.32.x as of March 2026)
npm install -g @google/gemini-cli
# Verify install
gemini --version
```
3. Authenticate with personal Google account
The first run prompts for a theme, then for auth. Choose Personal Google Account for the free OAuth tier. The CLI opens a browser for sign-in. Once authenticated, the session persists in `~/.gemini/settings.json` and you won't be asked again.
Sign in to Gemini CLI
```shell
# Run from any directory
gemini
# Step 1: pick a theme
# Step 2: select "Personal Google Account" auth
# Browser opens, sign in, accept terms
# CLI returns to terminal with prompt >
# Quick test
> What model are you?
# /exit to leave session
/exit
```
4. Boost limits with Google AI Pro (optional)
If you hit the 1,000 requests-per-day cap on heavy days, upgrade to Google AI Pro ($19.99/mo) or Ultra. The higher limits are shared between Gemini CLI and the Gemini Code Assist agent mode in IDEs, including Antigravity's native picker. So one subscription stretches across both surfaces.
Useful slash commands
```text
# Inside gemini session
/help      # list all commands
/stats     # tokens used this session (API key users only)
/mcp       # show connected MCP servers
/clear     # reset conversation history
@filename  # include file contents in next prompt
/exit      # quit
# Headless scripting (one-shot, no follow-ups)
gemini -p "What is the gcloud command to deploy to Cloud Run?"
# Structured JSON output
gemini -p "Explain this codebase architecture" --output-format json
```
5. Validation
What to check:
- `gemini --version` returns 0.32.x or later → install OK
- First-run OAuth completes → auth OK
- Test prompt streams response → model live
- `/stats` shows token counts (Pro/API users) → quota tracked
- OAuth users: `/stats` shows session token count only (no cached token info; not supported via Gemini Code Assist API for personal accounts)
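The free-tier limits quoted above (60 requests per minute, 1,000 per day) are easier to plan around as a pacing budget. A minimal sketch of that arithmetic; the helper names and the eight-hour working day are illustrative assumptions.

```shell
# Sketch: pacing arithmetic for the free-tier limits quoted above
# (60 requests/minute, 1,000 requests/day). Helper names are hypothetical.

# Average hourly budget if the daily cap is spread over a working day.
requests_per_hour() {
  # $1 = daily request cap, $2 = hours worked
  echo $(( $1 / $2 ))
}

# Minutes until the daily cap is gone when running at the per-minute limit.
minutes_to_exhaust() {
  # $1 = daily request cap, $2 = requests per minute
  awk -v cap="$1" -v rpm="$2" 'BEGIN { printf "%.1f\n", cap / rpm }'
}

requests_per_hour 1000 8     # prints 125
minutes_to_exhaust 1000 60   # prints 16.7
```

Read together, the two numbers suggest the daily cap is the limit that binds: an automated loop running at the full per-minute rate would use up the day's allowance in under 17 minutes.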
OpenAI Codex via Your ChatGPT Plus or Pro Plan
Codex is OpenAIʼs terminal-based coding agent — the peer to Claude Code in the OpenAI ecosystem. Itʼs included with ChatGPT Plus, Pro, Business, Edu, and Enterprise plans, and currently bundled with Free and Go plans for a limited time. Pro users get 2× rate limits as part of an ongoing promotion. Sign in once with ChatGPT — the CLI auto-creates a linked API key behind the scenes — and Codex runs in any terminal, including Antigravityʼs.
Codex CLI · ChatGPT Subscription Auth · GPT-5.x
Run OpenAIʼs frontier coding model on your existing ChatGPT subscription — no manual API key creation, no separate billing.
1. Install Codex CLI
Codex CLI is published as `@openai/codex` on npm. The npm package wraps a Rust binary, so installation handles platform-specific binary fetching automatically.
Install Codex CLI
```shell
# Requires Node.js 18+
npm install -g @openai/codex
# Verify install
codex --version
# Show available subcommands
codex --help
```
2. Sign in with ChatGPT (no API key needed)
Run `codex login` with no flags. The CLI opens a browser for the ChatGPT OAuth flow. After you sign in, OpenAI auto-generates an API key linked to your ChatGPT account and configures the CLI for you. Plus subscribers get a one-time $5 in promotional API credits; Pro subscribers get $50. Credits expire 30 days after grant.
OAuth sign-in flow
```shell
# Browser-based OAuth (default)
codex login
# Click "Sign in with ChatGPT"
# Select your API organization in the OpenAI dashboard
# Browser auto-creates API key, returns to CLI
# Confirm
codex login status
# Exit code 0 means logged in
# Alternative: device-code flow (no browser)
codex login --device-code
# Alternative: manual API key (for CI/CD only)
printenv OPENAI_API_KEY | codex login --with-api-key
```
Eligibility for promotional credits: Subscription must be older than 7 days, you must not have already redeemed this offer, and the API organization must have a valid default payment method. The CLI walks you through this on first sign-in.
3. Configure model + sandbox in `~/.codex/config.toml`
Codex stores configuration in TOML format (not JSON like Claude Code). Pin the model, set approval policy, and configure the sandbox mode here.
~/.codex/config.toml (recommended baseline)
```toml
# Pin model (gpt-5.3-codex is the dedicated coding variant as of Q1 2026)
model = "gpt-5.3-codex"

# Approval mode: "on-request" prompts for risky commands
# Alternatives: "untrusted", "on-failure", "never" (read full docs)
approval_policy = "on-request"

# Sandbox: workspace-write blocks network + outside-workspace writes
# Alternatives: "read-only", "danger-full-access"
sandbox_mode = "workspace-write"

# Where credentials are stored: "keyring", "file", or "auto"
# (top-level keys must appear before any [table] header)
cli_auth_credentials_store = "keyring"

# Set network_access = true if your tasks need network inside the workspace sandbox
[sandbox_workspace_write]
network_access = false
```
4. Run Codex inside Antigravity's terminal
Once authenticated, `codex` with no subcommand launches the interactive TUI. Antigravity's integrated terminal works identically to your system terminal, so nothing special is required.
Daily Codex usage
```shell
# Interactive TUI (default mode)
codex
# Headless one-shot
codex exec --full-auto "Run unit tests and fix failures"
# Read-only audit (no writes, no commands)
codex -s read-only "Audit this codebase for security issues"
# Override sandbox network on demand (e.g. for npm install)
codex -c 'sandbox_workspace_write.network_access=true' "Install dependencies"
# CI/automation only (skip approvals + sandbox)
codex --dangerously-bypass-approvals-and-sandbox "Deploy"
```
5. MCP for Codex (TOML format)
Codex supports MCP servers, but the config syntax differs from Claude Code. Use `codex mcp add` for CLI registration, or edit `~/.codex/config.toml` directly with TOML `[mcp_servers.<name>]` blocks.
Add Shopify Dev MCP to Codex
```toml
# In ~/.codex/config.toml, add this block
[mcp_servers.shopify-dev-mcp]
command = "shopify-dev-mcp"
args = []
```
```shell
# Or via CLI
codex mcp add shopify-dev-mcp shopify-dev-mcp
# List configured servers
codex mcp list
# Add an HTTP-based MCP with bearer auth (token read from an env var)
codex mcp add my-server --url https://example.com/mcp \
  --bearer-token-env-var MY_SERVER_TOKEN
```
Codex validation gates:
- `codex login status` exits 0 → auth cached
- `codex --version` returns latest → install OK
- Test prompt completes → model live
- `codex mcp list` shows registered servers → MCP layer wired
- Sandbox enforces approval prompts unless explicitly bypassed → safety on
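Several of the gates above reduce to "run a command and check its exit code". A minimal sketch of a gate runner; `gate` is a hypothetical helper, not part of Codex, and the commented usage lines assume `codex` is installed and on PATH.

```shell
# Sketch: run one validation gate and report PASS/FAIL from its exit code.
# gate is a hypothetical helper, not part of Codex or any other CLI.
gate() {
  # $1 = label, remaining args = command to run
  gate_label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $gate_label"
  else
    echo "FAIL  $gate_label"
    return 1
  fi
}

# Demo with commands every shell has
gate "always passes" true

# Usage with gates from the list above (assumes codex is on PATH):
#   gate "auth cached" codex login status
#   gate "install OK"  codex --version
```

Exit codes only cover the first two gates; whether `codex mcp list` shows the servers you expect, and whether a test prompt completes, still needs a human look at the output.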
Local Models — Ollama via Cline or Continue.dev
Antigravityʼs native model picker does not support custom OpenAI-compatible endpoints — thereʼs no “Add Custom Model” button and no localhost configuration. But because Antigravity is a VS Code fork, you can install Cline or Continue.dev via the OpenVSX extension marketplace and route through them. Both extensions natively support Ollama on http://localhost:11434/v1. This is also how a terminal-based path works: any tool that already supports OpenAI-compatible endpoints (Claude Code with Ollama Cloud env vars, Aider, etc.) runs in Antigravityʼs integrated terminal exactly as it does in any other shell.
Ollama Local · Cline or Continue.dev · Sovereign Inference
Run 14B–32B coding models on your own GPU for sensitive code, offline work, or zero-cloud sprints.
1. Install Ollama and pull a coding model
Ollama exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. For an RTX 3060 12GB or comparable hardware, `qwen2.5-coder:14b` is the sweet spot for chat, with `qwen2.5-coder:1.5b` for low-latency autocomplete.
Install Ollama + pull models
```shell
# macOS
brew install ollama
brew services start ollama
# Linux (and WSL)
curl -fsSL https://ollama.ai/install.sh | sh
# Windows: download installer from ollama.com
# After install, Ollama runs as a service on port 11434
# Pull a chat model (14B fits comfortably in 12GB VRAM)
ollama pull qwen2.5-coder:14b
# Pull a fast autocomplete model (1.5B)
ollama pull qwen2.5-coder:1.5b
# Pull embeddings for codebase context
ollama pull nomic-embed-text
# Verify endpoint is live
curl http://localhost:11434/api/version
```
Hardware reality check: 7B–8B models run on 8GB VRAM. 14B models need 12GB+ comfortably. 32B models need 24GB (RTX 4090, RTX 3090, or Apple Silicon with 48GB+ unified memory). If you're on a thin laptop, your local options are smaller models like `qwen2.5-coder:1.5b` for autocomplete-only, with cloud models doing the heavy lifting.
2. Install Cline OR Continue.dev in Antigravity
Both extensions are on OpenVSX and install directly through Antigravity's Extensions panel (Ctrl+Shift+X). Pick one based on your workflow: Cline is a single agent panel with strong tool-use and MCP support; Continue.dev exposes chat, autocomplete, and embeddings as separate roles.
Install via Antigravity Extensions panel
```text
# In Antigravity:
# 1. Press Ctrl+Shift+X (Cmd+Shift+X on Mac) to open Extensions
# 2. Search "Cline" and install the saoudrizwan publisher version
#    OR
# 2. Search "Continue" and install the Continue.dev publisher version
# 3. Reload Antigravity if prompted
# 4. Click the new extension icon in the activity bar
```
3. Configure Continue.dev for Ollama (recommended for full role split)
Continue.dev uses a YAML config at `~/.continue/config.yaml`. The config below splits chat, autocomplete, and embeddings across three different local models for optimal speed-vs-quality tradeoff.
~/.continue/config.yaml (Ollama three-role setup)
```yaml
name: DDS Local Sovereign Config
version: 1.0.0
schema: v1
models:
  # Chat + edit + apply (heavy lifting model)
  - name: Local Chat - Qwen 2.5 Coder 14B
    provider: ollama
    model: qwen2.5-coder:14b
    apiBase: http://localhost:11434
    roles:
      - chat
      - edit
      - apply
    defaultCompletionOptions:
      temperature: 0.3
      contextLength: 32768
  # Tab autocomplete (fast small model)
  - name: Local Autocomplete - Qwen 1.5B
    provider: ollama
    model: qwen2.5-coder:1.5b
    apiBase: http://localhost:11434
    roles:
      - autocomplete
    autocompleteOptions:
      debounceDelay: 250
      maxPromptTokens: 1024
      multilineCompletions: auto
  # Embeddings for @codebase
  - name: Local Embeddings
    provider: ollama
    model: nomic-embed-text
    apiBase: http://localhost:11434
    roles:
      - embed
context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: terminal
  - provider: folder
  - provider: codebase
rules:
  - Never reproduce DDS certifications inaccurately. The five are GOTS, GRS, OCS, PETA-Approved Vegan, Fair Trade.
  - Always verify Shopify schema range steps satisfy (max-min)/step >= 3 and (default-min) % step == 0.
  - Output full files, not snippets, unless explicitly asked otherwise.
```
4. Configure Cline for Ollama (alternative)
Cline uses a settings UI rather than a YAML file. Open Cline settings, choose “Ollama” as the API provider, and point at your local endpoint. Important Cline-specific tip: Cline's system prompts are very long; the default Ollama context window (32K) often isn't enough. Either enable Compact Prompt (Cline Settings → Features → Use Compact Prompt) or create a custom Modelfile with a larger `num_ctx`.
Cline + Ollama with extended context Modelfile
```shell
# Create a Modelfile with extended context
cat > qwen-cline-modelfile << 'EOF'
FROM qwen2.5-coder:14b
PARAMETER num_ctx 65536
EOF
# Build the extended-context model
ollama create qwen-cline -f qwen-cline-modelfile
# In Cline settings (UI)
# API Provider: Ollama
# Base URL: http://localhost:11434
# Model ID: qwen-cline
# Save
# For better performance, enable Compact Prompt
# Cline Settings → Features → Use Compact Prompt → ON
# Reduces system prompt size by ~90%
```
5. Tune Ollama for long agentic sessions
Two server-side tweaks make a major difference once you're running real workloads. The default 5-minute model unload aggressively recovers VRAM but costs you a cold-load every time you come back from coffee. And streaming output lets you tell the difference between “model is thinking” and “model is hung.”
Ollama performance tuning (Linux/Mac)
```bash
# Add to ~/.zshrc or ~/.bashrc to persist
export OLLAMA_KEEP_ALIVE=24h        # keep model loaded all day
export OLLAMA_FLASH_ATTENTION=1     # enable flash attention
export OLLAMA_KV_CACHE_TYPE=q8_0    # 8-bit KV cache reduces VRAM usage
export OLLAMA_CONTEXT_LENGTH=8192   # or higher if VRAM permits
# Note: shell exports only reach an `ollama serve` started from this shell.
# A brew- or systemd-managed service needs the same variables set in the
# service's own environment before the restart below has any effect.
# Restart Ollama service to apply
brew services restart ollama        # macOS
# or: systemctl --user restart ollama   # Linux systemd
```
Ollama performance tuning (Windows PowerShell)
```powershell
# Set persistent user env vars
[Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "24h", "User")
[Environment]::SetEnvironmentVariable("OLLAMA_FLASH_ATTENTION", "1", "User")
[Environment]::SetEnvironmentVariable("OLLAMA_KV_CACHE_TYPE", "q8_0", "User")
[Environment]::SetEnvironmentVariable("OLLAMA_CONTEXT_LENGTH", "8192", "User")
# Restart Ollama (reopen the Ollama app from system tray)
# Or kill and restart from CLI (open a new terminal first so it sees the new vars):
Get-Process ollama -ErrorAction SilentlyContinue | Stop-Process -Force
ollama serve
```
Local-path validation:
- `curl http://localhost:11434/api/version` returns version JSON → Ollama running
- `ollama list` shows pulled models → weights downloaded
- Cline or Continue.dev panel opens in Antigravity sidebar → extension active
- First chat returns response → route working end-to-end
- Continue: `@codebase` works → embeddings model wired
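The hardware reality check above is effectively a lookup from VRAM to model size. A minimal sketch of that lookup using only the thresholds quoted in the text; the function name `model_tier_for_vram` is hypothetical, and it takes whole gigabytes.

```shell
# Sketch: map available VRAM (whole GB) to the model size tiers quoted in
# the hardware reality check. model_tier_for_vram is a hypothetical helper.
model_tier_for_vram() {
  vram_gb=$1
  if [ "$vram_gb" -ge 24 ]; then
    echo "32B"
  elif [ "$vram_gb" -ge 12 ]; then
    echo "14B"
  elif [ "$vram_gb" -ge 8 ]; then
    echo "7B-8B"
  else
    echo "1.5B autocomplete only"
  fi
}

# Example: an RTX 3060 with 12GB
model_tier_for_vram 12   # prints 14B
```

Treat the result as an upper bound: a larger `num_ctx` or an unquantized KV cache uses VRAM too, so a model that fits at default settings may not fit with a 64K context.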
The MCP Layer — Connect Once, Use Everywhere
Model Context Protocol (MCP) is the open standard that lets AI clients talk to external tools through structured JSON-RPC. The same MCP servers work in Claude Code, Codex CLI, Gemini CLI, and Antigravityʼs Cline/Continue extensions. Configure once on each client; reuse across every model. The four most useful for vibe coders are Shopify Dev MCP, WordPress.com, GitHub MCP, and a Figma MCP.
Shopify Dev MCP — Liquid + GraphQL Validation
Shopify open-sourced their AI Toolkit on April 9, 2026, including the Dev MCP server. It exposes seven tools covering Admin API documentation search, GraphQL schema validation, Liquid template validation, and theme rule enforcement. No store auth required for docs and schema access — only store mutations need authentication.
```powershell
# Step 1: install globally to bypass PowerShell -y flag parsing issue
npm install -g @shopify/dev-mcp
# Step 2: confirm binary on PATH
Get-Command shopify-dev-mcp
# Step 3: register with Claude Code (Max profile)
claude-max mcp add --transport stdio shopify-dev-mcp shopify-dev-mcp
# Step 4: register with Codex (TOML config)
codex mcp add shopify-dev-mcp shopify-dev-mcp
# Step 5: verify in each CLI
claude-max mcp list
codex mcp list
```
The `claude mcp add ... -- npx -y @shopify/dev-mcp@latest` form fails on Windows PowerShell because the `--` separator gets consumed by the shell before reaching Claude's parser, and `-y` gets interpreted as a Claude flag. Installing globally and then registering by binary name sidesteps the issue entirely. The `add-json` alternative (passing config as a JSON string) is also unreliable because PowerShell mangles inner double quotes when passing arguments to native commands.
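For clients configured by file rather than by CLI, the same registration can be written out as a TOML block of the `[mcp_servers.<name>]` form shown in the Codex section. A minimal sketch that only prints the block; `mcp_toml_block` is a hypothetical helper and does not modify `~/.codex/config.toml`.

```shell
# Sketch: print a Codex-style [mcp_servers.<name>] block for a stdio server.
# mcp_toml_block is a hypothetical helper; it only prints text and never
# touches ~/.codex/config.toml.
mcp_toml_block() {
  # $1 = server name, $2 = command available on PATH
  printf '[mcp_servers.%s]\n' "$1"
  printf 'command = "%s"\n' "$2"
  printf 'args = []\n'
}

mcp_toml_block shopify-dev-mcp shopify-dev-mcp
```

Printing first and appending by hand keeps the config file under review; a server registered twice under the same table name would make the TOML invalid.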
WordPress.com Connector — Live Site Context
If you have a paid WordPress.com plan, you can connect your site to Claude (and downstream to Antigravity terminal sessions) through claude.aiʼs Connectors directory using OAuth 2.1. The connection routes through your Max account auth, so it appears in claude-max /mcp output automatically. Default scope is read-only; write tools (create_post, upload_media) opt in individually.
```shell
# Step 1: enable MCP on WordPress.com side
# Log in → My WP.com Account → MCP → Toggle ON
# Leave write tools DISABLED initially; read tools default ON
# Step 2: add connector in claude.ai
# claude.ai → Settings → Connectors → Browse
# Search "WordPress.com" → click +
# OAuth 2.1 flow → sign into WP.com → authorize
# Step 3: verify in Claude Code
claude-max /mcp
# Should show: claude.ai WordPress.com - Connected
# Step 4: validate with real query
# Use the WordPress.com MCP to fetch site info,
# then list the 5 most recent posts.
```
GitHub MCP — Repo + Issues + PR Context
GitHubʼs official MCP server connects Claude Code (and other MCP clients) directly to GitHub repos, issues, and pull requests. Claude Code can pull the diff for a PR, fetch issue text, search code, and (with write scope) create branches and commits. Useful when youʼre working across repos and donʼt want to copy-paste issue text.
```shell
# Anthropic ships a curated remote MCP directory in claude.ai
# claude.ai → Settings → Connectors → Browse → "GitHub"
# OAuth flow grants scoped access
# Verify in claude-max
claude-max /mcp
# Try a real query
# > Use the GitHub MCP to fetch issue #42 from my-org/my-repo
# > and propose a fix based on the description.
```
Other MCPs Worth Adding
- Figma MCP — design-to-code workflows; reads frame metadata, exports asset URLs
- Notion MCP — writes generated docs back to your workspace
- Slack MCP — posts agent status, reads channel history for context
- Postgres / Supabase MCP — read-only schema introspection for query generation
- Sentry MCP — pulls error logs for fix-up sessions
All of these are available either via Anthropicʼs claude.ai Connectors directory (preferred when available, since auth is OAuth) or via npm-installable stdio servers (use the same install-globally-first pattern as Shopify Dev MCP for Windows reliability).
The Workflow Split That Stretches Every Subscription
Owning four model paths is leverage, not a problem. The mistake is using one model for everything. The pattern that wins: plan with cheap, implement with expensive, validate with local, monitor with native. Done correctly, this single approach can stretch a Claude Max 5x subscription across two to three times the work that single-model usage delivers.
The Four-Lane Discipline
Lane 1 — Planning: Gemini 3.1 Pro (native picker, free). Antigravityʼs free preview of Gemini 3 Pro is the right tool for breaking work into tasks, drafting an architecture, generating directory structures, and outlining test plans. The output is plans, not code. You burn zero subscription tokens. Reach for Gemini 3 Pro High effort variant when the planning is complex.
Lane 2 — Implementation: Claude Opus 4.7 (Max plan, terminal). Once a plan is in hand, switch to claude-max in the Antigravity terminal. Opus 4.7 with 1M context handles the actual code: long files, refactors that touch many modules, debugging that requires holding the whole repo in context. This is where you spend your premium budget, and Opus 4.7 returns the most value per token of any frontier model on long-context coding tasks.
Lane 3 — Validation: Local Ollama (Cline or Continue, zero cost). After Opus 4.7 generates a change, route the validation pass to your local 14B coder. “Read this diff and tell me if anything looks wrong” is a perfect job for a 14B model running on your own GPU. Free. Offline. Private. Catches the obvious mistakes before you waste another Opus turn.
Lane 4 — Specialized review: Codex (ChatGPT subscription). When a task is squarely OpenAIʼs strength — CI/CD scripts, GitHub Actions, OpenAI API integrations, certain Python ecosystems — Codex returns better results than Claude or Gemini. ChatGPT Pro subscribers get 2× rate limits as a current promotion. Codex also exposes a real PR-review mode through GitHub integration that none of the others match.
Token Economy on Claude Max 5x
The Max 5x plan ($100/month) gives you 5× the usage of the Pro tier across all surfaces: claude.ai, Claude Code, the API via OAuth, and connector traffic. Practical reality: a developer running Claude Code intensively can hit Max 5x weekly limits in three to four days of unstructured all-Opus usage. The four-lane discipline above pushes that to six to seven days of equivalent work because you redirect 60–80% of the calls that don't need Opus 4.7 to free or already-paid alternatives.
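The stretch can be checked with rough arithmetic. This is a back-of-the-envelope sketch, not a measurement: it assumes the weekly limit is consumed in proportion to Opus token volume, so days of runway scale as 1 / (1 − share redirected). The 60–80% figure counts calls. The redirected calls (planning, quick reviews) are assumed here to be lighter than long-context Opus turns, so the share of token volume moved is smaller. The function name `stretch_days` is hypothetical.

```shell
# stretch_days BASE_DAYS REDIRECT_PCT
# BASE_DAYS: days until the weekly limit under all-Opus usage.
# REDIRECT_PCT: percent of Opus token volume moved to other lanes (0-99).
stretch_days() {
  awk -v d="$1" -v p="$2" 'BEGIN { printf "%.1f\n", d * 100 / (100 - p) }'
}

stretch_days 3.5 45   # prints 6.4
stretch_days 3.5 50   # prints 7.0
```

Under these assumptions, moving roughly 45–50% of token volume off Opus matches the jump from a 3.5-day baseline to six or seven days.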
| Task Type | Right Lane | Why | Approx. Token Volume |
|---|---|---|---|
| Architecture, planning, outlines | Gemini 3 Pro (native, free) | 1M context, strong planning, no subscription burn | 10–30K input, 5–10K output |
| Long-context refactors, full-repo edits | Opus 4.7 (Max, terminal) | 1M auto-upgrade, best long-context coding model | 200K–800K input, 50K–200K output |
| Single-file feature additions | Sonnet 4.6 (native or Max) | Cheaper than Opus, still excellent for bounded work | 20–80K input, 10–30K output |
| Code review of generated diffs | Local Ollama (qwen2.5-coder:14b) | Free, fast, private; catches obvious mistakes | Whatever fits in 32K–64K context |
| CI scripts, OpenAI integrations, PR review | Codex (ChatGPT Pro) | OpenAI strength domain, GitHub PR review built-in | Counts against ChatGPT plan, not Anthropic |
| Tab autocomplete | Local Ollama (qwen2.5-coder:1.5b) | Sub-300ms latency, zero cost, never hits the network | Local only, no cloud tokens |
| Embedded browser testing, agent loops | Antigravity native (Gemini 3 Flash) | Fastest variant in native picker, free preview | Native quota only |
| Quick syntax questions, docs lookups | Gemini CLI (free OAuth) | 60 RPM / 1000 RPD free, headless one-shot mode | Free tier |
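The table can be kept at hand as a small lookup in your shell profile. This is a minimal sketch: the function name `pick_lane` and the task-type keywords are made up for illustration, and the lane strings are copied from the table above.

```shell
# pick_lane TASK_TYPE
# Prints the lane the table above suggests for a task type.
# Returns 1 for a task type it does not know.
pick_lane() {
  case "$1" in
    planning)       echo "Gemini 3 Pro (native, free)" ;;
    refactor)       echo "Opus 4.7 (Max, terminal)" ;;
    single-file)    echo "Sonnet 4.6 (native or Max)" ;;
    diff-review)    echo "Local Ollama (qwen2.5-coder:14b)" ;;
    ci|pr-review)   echo "Codex (ChatGPT Pro)" ;;
    autocomplete)   echo "Local Ollama (qwen2.5-coder:1.5b)" ;;
    browser-test)   echo "Antigravity native (Gemini 3 Flash)" ;;
    quick-question) echo "Gemini CLI (free OAuth)" ;;
    *)              echo "unknown task type: $1" >&2; return 1 ;;
  esac
}

pick_lane refactor   # prints Opus 4.7 (Max, terminal)
```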
The Pre-Flight Checklist Before Any Heavy Sprint
Before you start a session that is going to burn real subscription budget, run this checklist. It takes 30 seconds and prevents the “I just used 200K tokens on Opus to do something Sonnet would have nailed for half the cost” outcome.
I am about to run an agentic coding session. Before I burn any premium tokens, analyze the task below and tell me:

1. Is this a planning task, an implementation task, a review task, or a specialized-domain task?
2. Which model lane is the right primary tool: Gemini 3 Pro (planning), Opus 4.7 (long-context implementation), Sonnet 4.6 (bounded implementation), Codex (OpenAI-domain), or local Ollama (validation)?
3. Estimate input tokens needed (count files I'll need to load).
4. Output a one-paragraph "plan of attack" I can paste into the chosen model.
5. List two specific risks I should call out to the implementing model to prevent obvious mistakes.

TASK: <paste your task here>
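One way to keep the checklist cheap is to run it through a free lane. The sketch below builds a condensed version of the prompt around a task string. The helper name `build_preflight` and the sample task are hypothetical. The Gemini call is left as a comment because it relies on the one-shot `gemini -p` mode used in the status checks in this class and needs an authenticated CLI.

```shell
# build_preflight TASK_TEXT
# Prints a condensed pre-flight prompt with the task appended.
build_preflight() {
  cat <<EOF
I am about to run an agentic coding session. Before I burn any premium tokens,
classify the task below (planning, implementation, review, or specialized-domain),
name the right model lane, estimate input tokens, give a one-paragraph plan of
attack, and list two risks to call out to the implementing model.

TASK: $1
EOF
}

prompt="$(build_preflight "Add rate limiting to the checkout API")"
printf '%s\n' "$prompt"

# With Gemini CLI installed and signed in, send the prompt in one-shot mode:
# gemini -p "$prompt"
```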
Validation Gates and the Full Rollback Matrix
Every setup path in this class has a specific failure mode. The validation gates below cover the most common ones, with the exact symptom, the most likely cause, and the rollback move. Bookmark this section. Itʼs the one you come back to at midnight when something breaks.
Per-Model /status Reference
Each CLI has its own version of the status command. These are the outputs you should see on a healthy install:
    # Claude Code (Max profile)
    claude-max /status
    # Healthy:
    #   Working Directory: <project path>
    #   Login method: Claude Max account
    #   Email: your@email.com's Organization
    #   Model: Default Opus 4.7 with 1M context
    #   Output style: Default

    # Codex CLI
    codex login status; echo $?
    # Healthy: prints "logged in as <chatgpt account>" and exits 0

    # Gemini CLI
    gemini --version
    # Healthy: 0.32.x or higher
    gemini -p "return only the string GEMINI_OK"
    # Healthy: returns GEMINI_OK

    # Ollama
    curl -s http://localhost:11434/api/version
    # Healthy: returns {"version":"0.x.x"}
    ollama list
    # Healthy: shows pulled models
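The scriptable checks can be bundled into one pass. This is a minimal sketch: `check` and `env_clean` are hypothetical helper names. The Claude Code `/status` check is left out because it is an interactive slash command. A FAIL line only means the command exited non-zero or is not installed, so it points you at where to look, not at the cause.

```shell
failures=0

# check LABEL COMMAND...
# Runs the command quietly, prints PASS or FAIL, and counts failures.
check() {
  label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    failures=$((failures + 1))
  fi
}

# Succeeds only when neither API-billing variable is set in this shell.
env_clean() {
  [ -z "${ANTHROPIC_API_KEY:-}" ] && [ -z "${ANTHROPIC_BASE_URL:-}" ]
}

check "Anthropic API env vars unset" env_clean
check "codex logged in"              codex login status
check "gemini installed"             gemini --version
check "ollama reachable"             curl -sf http://localhost:11434/api/version
check "ollama CLI responds"          ollama list

echo "$failures check(s) failed"
```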
Troubleshooting Matrix — Real Errors Captured From This Build
| Symptom | Most Likely Cause | Fix |
|---|---|---|
| /status shows “API credits” not “Max plan” | ANTHROPIC_API_KEY or ANTHROPIC_BASE_URL is set in user or system env | Re-audit env per Step 2. Confirm claude-max launcher nulls these in function scope. |
| Model field shows 200K context, not 1M | Plan is Pro not Max, or 1M beta flag disabled | Verify Max subscription billing. Run /model, confirm claude-opus-4-7 selected. |
| PowerShell: -- separator consumed before npx -y | PowerShell parses -- aggressively in native command invocations | Install MCP server globally first (npm install -g), register by binary name, not npx. |
| PowerShell: add-json rejects JSON string | PowerShell mangles inner double-quotes when passing to native commands | Use the global-install + add --transport stdio pattern instead. |
| Antigravity Claude Code extension: Error: Command 'code.cmd' not found | Antigravity ships its own CLI binary, not code.cmd; the extension auto-installer falls back to standard VS Code paths | Cosmetic only on Windows. Terminal-based claude-max still works perfectly. Ignore, or install via Antigravity binary --install-extension. |
| Antigravity marketplace switch returns “Failed to fetch” | Microsoft Marketplace URL combo unreliable on some Antigravity builds; reported in Google AI Developers Forum thread 121254 | Revert to OpenVSX defaults. Manual VSIX install via Antigravity CLI binary --install-extension. |
| Codex prompts for approval on every command | approval_policy = "on-request" in config.toml | Working as intended for safety. Use --full-auto for trusted automation, or set approval_policy = "auto" for daily dev. |
| Codex sandbox blocks npm install | Network blocked by default in sandbox_workspace_write | One-shot enable: codex -c 'sandbox_workspace_write.network_access=true'. Permanent: edit config.toml. |
| Cline returns “context length exceeded” mid-session | Default Ollama 32K context too small for Clineʼs long system prompt | Enable Compact Prompt (Cline Settings → Features), or build extended-context Modelfile with num_ctx 65536. |
| Continue.dev autocomplete delivers nothing | Conflicting completion provider (GitHub Copilot, Tabnine) intercepting | Disable other completion extensions. Verify editor.inlineSuggest.enabled: true in Antigravity settings. |
| Gemini CLI: 403 PERMISSION_DENIED | OAuth token expired, or API key missing if you switched from OAuth to key | Run gemini --reauth to force fresh OAuth, or set GEMINI_API_KEY if using key path. |
| Antigravity native picker says it is Gemini 3 Pro but the model identifies itself as Gemini 2.0 Flash | Known Vertex Model Garden routing issue, reported in Google AI Developers Forum thread 120631 (Feb 2026) | Use Gemini CLI directly for confirmed model identity, or use Claude Code in terminal for sprint work where model identity matters. |
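For the Cline “context length exceeded” row, the extended-context Modelfile can be sketched as below. `FROM` and `PARAMETER num_ctx` are standard Modelfile directives. The file name `Modelfile.64k` and the tag `qwen2.5-coder-14b-64k` are placeholders. A 64K context raises memory use, so treat the value as something to tune for your GPU. The sketch only writes the file and prints the build command, so nothing is pulled or created until you run it yourself.

```shell
# Write a Modelfile that extends the base model's context window to 64K tokens.
cat > Modelfile.64k <<'EOF'
FROM qwen2.5-coder:14b
PARAMETER num_ctx 65536
EOF

# Print the build command for review; the tag name is a placeholder.
echo "Next: ollama create qwen2.5-coder-14b-64k -f Modelfile.64k"
```

After building, point Cline at the new tag in place of the base model.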
The Universal Rollback — Wipe and Restart
If a setup goes sideways and you canʼt reproduce a clean state, this is the nuclear option for each path. Notice no path requires deleting credit cards from a billing console — subscription auth makes recovery cheap.
    # Claude Code Max profile reset
    rm -rf ~/.claude-max   # Mac/Linux
    # Windows: Remove-Item $env:USERPROFILE\.claude-max -Recurse -Force
    claude-max   # /login again

    # Codex full reset
    codex logout
    rm -rf ~/.codex
    codex login

    # Gemini CLI reset
    rm -rf ~/.gemini
    gemini   # re-prompts theme + auth

    # Ollama nuclear: remove all models (frees disk)
    ollama list | tail -n +2 | awk '{print $1}' | xargs -I {} ollama rm {}
    # Then re-pull whatever you need

    # Cline reset (in Antigravity)
    # Cline Settings → ... menu → Reset Extension State

    # Continue.dev reset
    rm ~/.continue/config.yaml   # or move it aside
    # Open Continue panel in Antigravity → regenerate from template
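If you would rather not delete a profile outright, a gentler variant moves the directory aside with a timestamp, so MCP registrations and settings can be copied back later. This is a sketch, not part of any of these CLIs. The helper name `reset_with_backup` is hypothetical, and the demo runs against a scratch directory so nothing real is touched.

```shell
# reset_with_backup DIR
# Moves DIR to DIR.bak-<timestamp> instead of deleting it.
reset_with_backup() {
  dir="$1"
  if [ ! -e "$dir" ]; then
    echo "nothing to reset: $dir"
    return 0
  fi
  backup="$dir.bak-$(date +%Y%m%d-%H%M%S)"
  mv "$dir" "$backup" && echo "moved $dir -> $backup"
}

# Demo on a scratch directory.
scratch="$(mktemp -d)"
mkdir "$scratch/.claude-max"
reset_with_backup "$scratch/.claude-max"
```

In real use you would call it as `reset_with_backup ~/.claude-max` and then launch `claude-max` to log in again.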
Six Vibe Coders, Six Reasons This Stack Wins
This setup pays off differently depending on who you are. The personas below cover the most common shapes; pick the one that looks most like you and the playbook becomes obvious.
Solo founder running a real business on top of synthetic employees
You ship products, run ops, and design the AI infrastructure that runs everything else. You think in systems, not snippets.
Best lane: Opus 4.7 in claude-max for everything that touches your master codebase. Gemini 3 Pro free tier for one-off experiments. Local Ollama as a sovereignty hedge for sensitive work. Skip Codex unless youʼre OpenAI-heavy.
Selling vibe-coded sites and apps to clients on retainer
You spin up multiple client environments per week. Profile isolation matters because client A canʼt see client Bʼs MCP credentials.
Best lane: One CLAUDE_CONFIG_DIR per client. Per-client .codex/config.toml. Antigravity workspaces per project. Native Gemini for client kickoff calls and proposal drafting. Bill the subscription cost into your retainer.
Stewarding a project, reviewing PRs, helping contributors
Your bottleneck is review throughput. Codexʼs PR-review mode pairs with your existing GitHub workflow and Claude handles deep technical writing for docs.
Best lane: Codex CLI for PR review automation. Opus 4.7 for documentation overhauls and long-form RFC writing. Gemini 3 Flash for quick triage of new issues. Local Ollama for embedding-based codebase search via Continue.dev.
First six months. You watched a YouTube video and bought Cursor
Cursor is fine but you donʼt know what youʼre missing. The free tier of Antigravity gets you into Opus 4.6 Thinking and Gemini 3 Pro at $0.
Best lane: Native Antigravity picker only at first. Add Gemini CLI free tier for terminal practice. Once you start hitting limits, upgrade to either Claude Pro or ChatGPT Plus and add the matching CLI. Avoid Max plans until you actually need them.
Building solo on a strict budget, no employer to expense subscriptions
Every dollar you spend on AI is a dollar not spent on hosting, domains, or coffee. Free tiers and your existing ChatGPT Plus go further than people think.
Best lane: Antigravity native picker (free) + Gemini CLI free OAuth (60 RPM / 1000 RPD) + local Ollama for autocomplete and validation. Add Codex via existing ChatGPT Plus if you have it. Defer Claude until your revenue justifies a Pro plan.
Working with regulated data, NDAs, or proprietary client code
Cloud is a no-go for some of your code. You need a legitimate local fallback that still feels like vibe coding, not 2017-era autocomplete.
Best lane: Local Ollama with Continue.dev as primary path. Cloud models for non-sensitive scaffolding only. Antigravityʼs browser allowlist locked down to internal docs. danger-full-access Codex mode never used.
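The one-CLAUDE_CONFIG_DIR-per-client pattern from the retainer persona can be sketched as a pair of shell functions. The `~/.claude-clients/<client>` layout and the names `client_config_dir` and `claude_for` are hypothetical conventions, and the sketch assumes the Claude Code binary is on your PATH as `claude`. The variable is set only for the single command, so it does not leak into the rest of the shell session.

```shell
# client_config_dir CLIENT
# Prints the profile directory used for that client.
client_config_dir() {
  printf '%s/.claude-clients/%s\n' "$HOME" "$1"
}

# claude_for CLIENT [ARGS...]
# Runs claude with CLAUDE_CONFIG_DIR scoped to this one invocation.
claude_for() {
  client="$1"; shift
  dir="$(client_config_dir "$client")"
  mkdir -p "$dir"
  CLAUDE_CONFIG_DIR="$dir" claude "$@"
}

# Show where the profile for a client named "acme" would live.
client_config_dir acme
```

Each client directory needs its own first-run login, which is the point: credentials and MCP registrations stay separate.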
Bottom Line — Should You Set This Up Right Now?
Yes. Antigravity is free during preview, the three subscription CLIs (Claude Code, OpenAI Codex, Gemini CLI) all authenticate against subscriptions you already have, and the four-lane workflow split delivers two-to-three times the throughput per dollar versus single-model usage. The setup takes 45 minutes total. You will recover that time on the first real sprint. The right move is to do all four paths now, validate each /status, and start using the workflow split immediately. Your future self will thank your present self for not waiting until the next time Claude Code or Antigravity ships a breaking change and the install instructions no longer match.
Twelve Questions Vibe Coders Ask Before Setting This Up
Each answer mirrors what would render as a featured snippet on Google or a voice answer on Gemini and Alexa.
No API key is required. You can run Claude Opus 4.7 inside Antigravity using your existing Claude.ai Pro or Max subscription with zero API charges. The path is the Claude Code extension or terminal binary, not Antigravity's native model picker, which does not yet include Opus 4.7. Run claude-max in Antigravity's integrated terminal, sign in with your subscription via OAuth, and select claude-opus-4-7 with /model. Antigravity's free preview also gives you Claude Sonnet 4.6 and Claude Opus 4.6 in the native picker at no cost.
Antigravityʼs native models are served via Googleʼs Vertex Model Garden, which means Anthropic releases need to ship through Googleʼs integration pipeline before they appear in the picker. As of late April 2026, Opus 4.7 has been available for over a week but Antigravity has not added it. Multiple threads on the Google AI Developers Forum are tracking the request. The workaround is to use Claude Code in Antigravityʼs integrated terminal, which uses Anthropicʼs official endpoints directly.
Zero additional cost beyond subscriptions you already have. Antigravity is free during public preview with personal Gmail. Claude Opus 4.7 runs on a Claude Max subscription ($100 to $200 per month) with no per-token charges. OpenAI Codex is included with ChatGPT Plus ($20 per month) or Pro ($200 per month) at no extra cost. Gemini CLI is free for personal Google accounts (60 requests per minute, 1000 per day). The total combined stack costs whatever you already pay for Claude and ChatGPT subscriptions, with no Anthropic Console, OpenAI API, or Google Cloud billing required.
The Claude Code OpenVSX extension provides inline diff previews, click-to-jump from terminal to source, and editor-side activity indicators. The terminal binary alone (claude-max in Antigravityʼs integrated terminal) provides full filesystem access, MCP server connections, model selection, and 1M context window on Max plans. For most workflows the terminal-only path is faster and more reliable. The extension is optional polish.
Install the CLI globally with npm install -g @openai/codex, then run codex in Antigravity's integrated terminal. The CLI prompts you to sign in. Choose Sign in with ChatGPT, complete the browser OAuth flow, and select your API organization. The CLI uses your ChatGPT Plus subscription with no manual API key creation required. Plus subscribers get a one-time $5 in promotional API credits. The flow works identically on Mac, Windows, and Linux.
Local Ollama models work in Antigravity, but not through its native model picker, which does not support custom OpenAI-compatible endpoints. The recommended path is to install the Cline or Continue.dev extension via Antigravity's OpenVSX extension marketplace. Both extensions support pointing at http://localhost:11434/v1 for Ollama. Continue.dev's config.yaml accepts a provider entry with apiBase set to your local endpoint. This path keeps Antigravity's Gemini agent for planning while routing implementation to local hardware.
For npm-based stdio MCP servers like Shopify Dev MCP, install the package globally with npm install -g @shopify/dev-mcp first, then register with claude-max mcp add --transport stdio shopify-dev-mcp shopify-dev-mcp. This bypasses PowerShell's known issues with passing the -y flag through the -- separator. For HTTP-based MCPs, use claude-max mcp add --transport http <name> <url> with optional --header for auth. WordPress.com is added through claude.ai → Settings → Connectors using OAuth 2.1 and requires a paid WordPress.com plan.
Use isolated profiles per model and per project. For Claude Code, set CLAUDE_CONFIG_DIR to a per-project directory before each session so subscription credentials, MCP server lists, and tool permissions cannot leak across projects. For Codex, configure approval policy to on-request and sandbox_mode to workspace-write in ~/.codex/config.toml so commands require approval before execution. For all models, restrict Antigravityʼs browser URL allowlist in Settings to known-safe domains, restart after changes, and use the three-tier terminal permission system rather than full bypass mode.
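The Codex side of that answer can be sketched as a config fragment. The keys `approval_policy`, `sandbox_mode`, and `sandbox_workspace_write.network_access` are the ones named in this class. The sketch writes to a scratch file (a made-up name, `codex-config.sample.toml`) so it never overwrites your real ~/.codex/config.toml. Merge by hand after checking the values against the Codex version you have installed.

```shell
# Write a sample Codex config to a scratch file for review.
cat > codex-config.sample.toml <<'EOF'
# Approval policy recommended in this class.
approval_policy = "on-request"
# Limit writes to the current workspace.
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
# Keep network access off unless a task needs it (for example npm install).
network_access = false
EOF

cat codex-config.sample.toml
```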
Antigravity is currently free during public preview and supports Gemini 3.1 Pro, Claude Sonnet 4.6, Claude Opus 4.6, and GPT-OSS 120B in its native picker. Cursor charges $20 to $200 monthly with native Claude, GPT, and Gemini support and offers up to 8 parallel agents and Plan Mode. Windsurf is paid and Codeium-backed, also a VS Code fork. For vibe coding specifically, Antigravityʼs Manager view and parallel agent orchestration give it a structural advantage at the free tier. The tradeoff is that Antigravity uses OpenVSX by default, which lags behind the official VS Code Marketplace, and rate-limit complaints have been reported through early 2026.
Claude Code and Codex CLI sessions do not count against Antigravity's quota. Running in Antigravity's integrated terminal, they authenticate directly with Anthropic and OpenAI respectively and consume your Claude Max or ChatGPT Pro allocation. Only Antigravity's native model picker draws against Antigravity's free preview limits. This is the entire reason the multi-model setup is valuable. You preserve your Antigravity allocation for Gemini-driven planning and agentic browser tasks while running implementation work against your subscription-paid Anthropic and OpenAI quotas.
Open Antigravity Settings with Ctrl+comma or Cmd+comma, navigate to Editor, and set the Marketplace Item URL to https://marketplace.visualstudio.com/items and the Marketplace Gallery URL to https://marketplace.visualstudio.com/_apis/public/gallery. Restart Antigravity. Note that this may technically violate Microsoft's Terms of Service, so use at your own discretion. An alternative is to install extensions manually via the Antigravity CLI binary using --install-extension with a downloaded VSIX file from the official VS Code Marketplace.
Antigravity itself recommends 8GB RAM minimum. For cloud-only paths (Claude Opus 4.7, Gemini 3 Pro, GPT-5.x via Codex) any modern laptop with 16GB RAM and a current OS works. For the local Ollama path, you need a GPU with at least 12GB VRAM for 14B-parameter models like qwen2.5-coder:14b at usable speed. An RTX 3060 12GB handles 14B models in Q4 quantization with 32K context. An RTX 4090 or 3090 24GB handles 32B models comfortably. Mac users with Apple Silicon and 32GB unified memory run 14B-32B models well via MLX through LM Studio.
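The VRAM guidance follows from rough arithmetic: weight size in GB is about parameters (in billions) × bits per weight ÷ 8. The sketch below is an approximation only. Real Q4 files run somewhat larger because quantized formats store scaling data and keep some layers at higher precision, and the KV cache adds more as context grows. The function name `weights_gb` is hypothetical.

```shell
# weights_gb PARAMS_BILLIONS BITS_PER_WEIGHT
# Prints the approximate size of the model weights in GB.
weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 14 4   # prints 7.0
weights_gb 32 4   # prints 16.0
```

That leaves headroom for context on a 12GB card at 14B and on a 24GB card at 32B, which is consistent with the pairings above.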
Bring This Setup Home, Then Come Back for the Next Master Class
The DDS Vibe Academy is the methodology library behind every system, page, and synthetic employee shipping at Design Delight Studio. New master classes drop monthly. The next one teaches the Sovereign Orchestrator pattern that turns this multi-model setup into an autonomous agent fleet that ships work while you sleep.