PROMPT_01
Audit My Stack vs April Releases
Audit my current LLM stack against April 2026 releases. For each model I currently use, recommend the cheapest April-released alternative that meets quality. Specifically evaluate: Claude Opus 4.7 ($5/$25, +13% on hardest tasks), Gemini 3.1 Pro ($2/$12), GPT-5.5 ($5/$30, agentic-tuned), and DeepSeek V4-Flash (~35x cheaper). Output as a migration table with estimated monthly savings.
PROMPT_02
Test the Opus 4.7 Tokenizer Cost Impact
Run my standard 100-prompt eval set on both Claude Opus 4.6 and Opus 4.7. Capture: total input tokens, total output tokens, total cost, average response quality (LLM-as-judge). The new 4.7 tokenizer may use up to 35% more tokens for the same text. Determine whether the +13% quality lift on hardest tasks offsets the per-token cost increase for my workload.
PROMPT_03
Benchmark GPT-5.5 vs Opus 4.7 on My Workload
Run my agentic coding eval set on both Claude Opus 4.7 ($5/$25) and GPT-5.5 ($5/$30). Capture quality, latency, token efficiency. GPT-5.5 is more token-efficient per-task than 5.4, but Tom's Guide reported it lost all 7 categories to Opus 4.7. Validate against my actual workload, not anecdotal benchmarks. Output recommendation with confidence interval.
PROMPT_04
Test Gemini 3.1 Pro Long-Context Pricing Trap
For my long-document Q&A workload, calculate cost on Gemini 3.1 Pro at three scenarios: (1) ≤200K context at $2/$12, (2) >200K context at $4/$18, (3) Anthropic Sonnet 4.6 flat-rate 1M at $3/$15. For prompts with average input size N, find the breakeven where Anthropic flat-rate beats Gemini's premium tier.
PROMPT_05
Drop In DeepSeek V4-Flash via Anthropic Compat
Migrate my Claude Sonnet 4.6 workload to DeepSeek V4-Flash via the Anthropic API compat layer. Keep base_url, change model to deepseek-v4-flash. Run the same eval set, capture quality + cost. V4-Flash is roughly 35x cheaper than GPT-5.5 and has 1M context with MIT license. Output the migration diff and the % cost reduction.
PROMPT_06
Self-Host DeepSeek V4-Flash on RTX 4090
Set up DeepSeek V4-Flash (284B total, 13B active params) for self-hosting on a multi-GPU rig with RTX 4090s. Use 4-bit quantization (UD-Q4_K_XL) via llama.cpp or vLLM. Configure 1M context window with the Hybrid Attention Architecture. Document the per-token inference cost vs API pricing and the tokens/sec benchmark.
PROMPT_07
Add Qwen 3.6-35B-A3B to Sovereign Orchestrator Pro
In Sovereign Orchestrator Pro V5.0, add qwen3.6-35b-a3b as an Ollama-side option for the Veritas (BRAND_INQUISITOR) and Scribe (CONTENT_GENERATOR) agents. The model has 3.5B active params from 35B total — fits comfortably on RTX 3060 12GB VRAM. Set Hybrid Defaults so local Qwen runs first, Gemini Flash 2.5 as cloud counterpart. Validate per-agent assignment.
PROMPT_08
Wire Up Kimi K2.6 via Ollama Launch
Set up Moonshot Kimi K2.6 (1T param Agent Swarm) using the native Ollama integration: 'ollama launch kimi --model kimi-k2.6:cloud'. Add to my Sovereign Orchestrator Pro stack as a multimodal alternative (text + image + video). Test thinking mode vs non-thinking mode on a representative agentic task. Confirm the 100M MAU commercial-use threshold for my deployment.
PROMPT_09
Run Mistral Medium 3.5 on European GDPR Workload
For my GDPR-bound European customer support workload, evaluate Mistral Medium 3.5 ($1.50/$7.50) against Claude Sonnet 4.6 ($3/$15). Test: response quality, latency, data residency compliance, multilingual handling (FR, DE, ES, IT, PT). Validate Mistral Workflows (Temporal-based) for the orchestration layer with human-in-the-loop approvals. Output deployment recommendation.
PROMPT_10
Compare Grok 4.3 Native Video vs Gemini 3.1 Pro
For my video analysis workflow, compare Grok 4.3 Beta (native video input, $300/mo SuperGrok Heavy) against Gemini 3.1 Pro multimodal ($2/$12 per MTok). Run 20 representative video clips through each: capture accuracy, latency, cost-per-video. Note Grok 4.3 has no persistent memory at $300/mo. Output decision matrix.
PROMPT_11
Migrate to Cursor 3 Agents Window
Upgrade my Cursor installation to Cursor 3 (April 2). Type Cmd+Shift+P → Agents Window. Configure parallel agents across 3 panes: Pane 1 refactors a module, Pane 2 writes tests, Pane 3 updates docs. Use isolated git worktrees so they merge cleanly. Set Composer 2 ($0.50/$2.50) as default in Auto mode. Test the local-to-cloud handoff for long-running tasks.
PROMPT_12
Try Cursor Design Mode for UI Iteration
Open Cursor 3's Design Mode in the Agents Window. Annotate the broken UI elements directly in the rendered browser preview. Have the agent target those specific elements for fixes. Compare the iteration speed against describing UI bugs in chat. Use Cursor 3.1's voice dictation (Ctrl+M) for feedback during the session.
PROMPT_13
Build with the Cursor SDK (TypeScript)
Using the Cursor SDK (public beta), build a TypeScript script that creates a durable agent: import { Agent } from "@cursor/sdk"; Agent.create({ apiKey, model: { id: "composer-2" }, local: { cwd: process.cwd() } }); Stream events via run.stream(). Add archive, unarchive, and permanent delete lifecycle controls. Test SSE event reconnect via Last-Event-ID.
PROMPT_14
Install Zed 1.0 and Configure ACP Agents
Install Zed 1.0 (April 29). Configure the Agent Client Protocol (ACP) to bring in Claude Agent, Codex, OpenCode, and Cursor agents — all via the same protocol. Connect MCP servers for extended tool access. Switch panel layout (User Menu → Panel Layout → Agentic) to put the agent panel on the left. Test parallel agent execution on a multi-file refactor.
PROMPT_15
Add DeepSeek V4-Pro/Flash to Zed
Configure Zed 1.0 to use DeepSeek V4-Pro and V4-Flash as agent models (added April 29). Set V4-Pro as the planning agent (1.6T total / 49B active, Think Max mode for hard problems) and V4-Flash as the executor agent (284B / 13B active). Validate the 1M context window with a large codebase ingestion test.
PROMPT_16
Use Antigravity Walkthroughs for Project Init
In Google Antigravity (April Update), open Manager Surface and start a Walkthrough for a new TypeScript + React + Tailwind project. Step through: directory scaffolding, package.json, tsconfig, ESLint config, env var setup, deployment pipeline. Compare time vs the previous "back-and-forth Gemini chat" workflow. Document any rate-limit issues from the credit system.
PROMPT_17
Test Antigravity Browser Agent Stability
In Antigravity April Update, test the Browser Agent improvements: (1) form input + submission as separate steps (inspect filled form before submit), (2) multi-tab workflow stability (GitHub Actions in Tab A, Vercel deploy in Tab B without losing state), (3) jQuery/React init detection on slow-loading SPAs. Document any regressions vs the March version.
PROMPT_18
Adopt Claude Code April Quality Patches
Upgrade to Claude Code v2.1.116 or later (April 20 patches). Verify the three quality regressions are fixed: (1) restored higher default reasoning effort, (2) caching bug repaired, (3) verbosity prompt reverted. Run my standard agentic eval set against the patched version and compare against pre-April 10 baseline. Document quality recovery in numbers.
PROMPT_19
Switch to ant CLI for Shell-First Workflows
Migrate my Claude Code shell workflows to the new ant CLI (April public beta). Versioning API resources in YAML files. Test native Claude Code integration. Compare ergonomics against the SDK for: (1) batch script generation, (2) repo-wide refactors, (3) CI/CD agent triggers. Document which workflows are better served by ant CLI vs the SDK.
PROMPT_20
Try Windsurf 2.0 Spaces + Embedded Devin
In Windsurf 2.0 (April 15), set up a Space for an 8-hour autonomous refactor task. Hand off to the embedded Devin cloud agent. Use Devin Review for PR analysis (~30% more issues caught). Compare end-to-end completion vs Cursor 3 Cloud Agents and Claude Code Routines on the same task. Output the cost + quality + completion-rate comparison.
PROMPT_21
Build an MCP App (SEP-1865)
Build an MCP App following the SEP-1865 specification. Server exposes a React-based dashboard for inventory data (charts, filters, drill-down). Host it via streamable HTTP transport with a .well-known metadata endpoint. Test rendering in Claude Desktop and ChatGPT. Document the per-host UI consistency vs writing one-off integrations.
PROMPT_22
Deploy MCP Server with Streamable HTTP
Deploy an MCP server using Streamable HTTP transport (the production-recommended path per the 2026 roadmap). Configure stateless session handling for horizontal scaling behind a load balancer. Add a .well-known metadata endpoint so registries and crawlers can discover capabilities without a live connection. Test against Claude Desktop, ChatGPT, and Zed.
PROMPT_23
Launch a Claude Managed Agent
Use Claude Managed Agents (April public beta) with the managed-agents-2026-04-01 header. Create an agent with secure sandboxing, built-in tools (bash, text editor, code execution), and SSE streaming. Configure Memory ($0.08/session-hour while running) for long-context recall across sessions. Compare ops burden against rolling your own agent harness.
PROMPT_24
Add 21st.dev MCP for Crafted UI Components
Wire up the 21st.dev MCP server to my Claude Code or Cursor setup. Use it to generate UI components inspired by the best design engineers (instead of generic Tailwind from training data). Test on a landing page hero section, a pricing table, and a feature grid. Compare visual quality and design consistency vs un-augmented Claude Sonnet 4.6 output.
PROMPT_25
Wire Up Persistent Memory + Guardrails MCP
Install the persistent memory + guardrails MCP server for Claude Code (runs locally with Ollama). Features: mistake tracking, loop detection, scope guard, hooks that block risky edits. Configure scope rules: never edit files outside /src, never delete tests, never push to main. Test by intentionally giving Claude a destructive instruction and confirming the guard blocks it.
PROMPT_26
Use Aggregator MCP to Unify 10 Servers
Install an Aggregator MCP server that unifies multiple MCP servers into one. Configure 10 source servers (filesystem, git, GitHub, Postgres, Slack, Notion, Stripe, Linear, Cloudflare, AWS). Validate that all tools surface through the single aggregator endpoint. Test latency overhead vs direct connection. Document tool name collision handling.
PROMPT_27
Build a Voice Agent on Grok STT/TTS
Build a voice agent using Grok Speech to Text + Text to Speech APIs (April 23): $0.10/hr batch, $0.20/hr streaming for STT. Pipeline: speech-in → STT → Claude Haiku 4.5 (cheap reasoning) → TTS → speech-out. Test on 100 conversational turns. Compare cost-per-minute against ElevenLabs + Deepgram + OpenAI Realtime mini. Add multilingual support across 5 of the 25+ supported languages.
PROMPT_28
Generate UI Mockups with gpt-image-2 Thinking Mode
Use gpt-image-2 Thinking mode (April 21) to generate an 8-panel UI mockup sequence with consistent characters, brand palette, and typography. Test the multilingual text rendering (English + Spanish + Japanese in same image). Compare against Gemini 3 Pro Image and Imagen 4 Ultra on the same prompt set. Output recommendation for branded marketing assets.
PROMPT_29
Use Mistral Workflows for Document Pipelines
Set up Mistral Workflows (Temporal-based, April public preview) for a document processing pipeline. Each step: (1) ingest doc, (2) Mistral Medium 3.5 extract entities, (3) human-in-the-loop approval gate, (4) downstream API call, (5) audit log. Validate durability: kill the worker mid-pipeline, restart, confirm execution resumes from last completed step. Compare against custom Lambda + Step Functions implementation.
PROMPT_30
Update Sovereign Orchestrator Pro to April Stack
In Sovereign Orchestrator Pro V5.0, update per-agent assignments to April 2026 best-of-breed: Veritas (BRAND_INQUISITOR) → qwen3.6-35b-a3b (local) + Gemini Flash 3 ($0.15/$1.00); Atlas (STRATEGY_AGENT) → DeepSeek V4-Pro Think Max (local via Ollama) + Gemini Pro 3.1 Preview ($2/$12); Scribe (CONTENT_GENERATOR) → kimi-k2.6:cloud (Ollama Cloud) + Gemini Flash 2.5; Swarm Orchestrator → qwen2.5-coder-14b-32k (local) + Flash-Lite 2.5; Tweet Generator → gemma4:latest (local) + Flash-Lite 2.5. Save and toggle Hybrid Defaults to verify each agent runs local-first.