The Vibe Build Loop: Context Engineering, Agentic Engineering, and Kno

Quick Answer

The vibe build loop is the six-phase production cycle that turns intent into shipped software: spec, context-load, generate, audit, iterate, ship. It combines the paradigm, environment, specification, and review disciplines from Foundation 01 through 04 into a single repeatable practice. The loop is the difference between a prototype that worked once and a system that ships reliably at the speed AI makes possible. Context engineering, named by Andrej Karpathy and Tobi Lutke, is the discipline that makes the loop scale. Agentic engineering, Karpathy's February 2026 reframe, is the professional name for the work this class teaches.

Section 01

The Whole Loop in One Diagram

Everything in this class fits inside one cycle. Read it once now; the rest of the class explains each phase in depth. There are six phases, not five and not seven, because each one marks a distinct change in what the architect is doing. Skipping any of them turns the loop back into a one-shot prompt, and one-shot prompts hit the 500-line wall by Tuesday.

The six-phase loop

1. Spec. Write the eight-block specification from Foundation 03. The spec is the contract the rest of the loop holds to.
2. Context-load. Curate what the agent sees: CLAUDE.md, applicable rules and skills, the spec, the relevant code slice. Not a codebase dump.
3. Generate. Run the agent. For non-trivial work, generate the failing test first, then let the agent implement until it passes.
4. Audit. Run CI, the four-pass diff review, and the adversarial-review prompt from Foundation 04.
5. Iterate. If audit finds an issue, fix the spec or the test, not the code through conversation. Regenerate from the corrected artifact.
6. Ship. Merge, deploy, then fold what you learned back into CLAUDE.md, the skills, and the rules. The loop compounds across cycles.

Read each phase carefully. The phases are not equally hard. The Spec phase is the longest and most cognitively expensive for the architect because it converts intent into an artifact. The Generate phase is the shortest in wall-clock time because the AI does the typing. The Audit phase is where most architects underinvest and where most issues hide. The Iterate phase is where the temptation to "just fix it in conversation" lives, and where the architect's discipline either holds or collapses. The Ship phase is anticlimactic on purpose; if the prior phases were done right, ship is a button.

The reason the loop produces reliable software is that each phase has a single job and a single artifact. The Spec phase produces the spec. The Context-load phase produces the prompt. The Generate phase produces the diff. The Audit phase produces the findings list. The Iterate phase produces an updated spec or test. The Ship phase produces the merged commit and the updated project memory. None of those artifacts are conversational. All of them are reviewable, durable, and improvable. That is the entire premise of professional AI coding.
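The phase-to-artifact pairing above can be written down as a small data structure. The following is a minimal sketch, not a prescribed implementation: the names `Phase` and `next_phase` are hypothetical, and the branching rule (a clean audit goes straight to Ship, Iterate loops back to Context-load) is an assumption drawn from the phase descriptions.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """The six phases in loop order; each value is the artifact it produces."""
    SPEC = "spec"
    CONTEXT_LOAD = "prompt"
    GENERATE = "diff"
    AUDIT = "findings list"
    ITERATE = "updated spec or test"
    SHIP = "merged commit and updated project memory"


def next_phase(current: Phase, audit_clean: bool = True) -> Optional[Phase]:
    """Return the phase that follows `current`, or None once the cycle ends.

    Assumption: a clean audit goes straight to SHIP, and ITERATE loops back
    to CONTEXT_LOAD so the agent regenerates from the corrected artifact.
    """
    if current is Phase.AUDIT:
        return Phase.SHIP if audit_clean else Phase.ITERATE
    if current is Phase.ITERATE:
        return Phase.CONTEXT_LOAD
    if current is Phase.SHIP:
        return None
    order = list(Phase)  # Enum members iterate in definition order
    return order[order.index(current) + 1]
```

Keeping the artifact as the enum value makes the "one job, one artifact" rule checkable: a phase with no nameable artifact has no place in the table.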

Section 01 of 13

Section 02

Why the Loop Beats the One-Shot

The intuition many beginners carry is that a smart model with a smart prompt produces the right result in one pass. For toys, sometimes. For real software, almost never. Andrew Ng documented the formal version of this finding in 2024: wrapping a less capable model in a well-designed agentic loop can outperform a more capable model used in zero-shot mode. Anthropic formalized the same observation in Building Effective Agents: the control loop around the model often matters more than the model.

The mechanism is straightforward. A one-shot prompt asks the model to do everything in a single inference: understand intent, plan, design, implement, verify. Modern models are good at this for small tasks. They are mediocre at it for non-trivial ones because the inference is monolithic. The loop decomposes the same task into phases the model can do well one at a time, with human review between phases. Each phase carries less ambiguity than the whole. The result is reliability at the cost of throughput, and the throughput cost is much smaller than the reliability gain.

The agentic loop, in Anthropic's frame

Anthropic distinguishes two patterns that often get conflated. A workflow is when the developer controls the flow through predefined code paths and the LLM fills in specific steps; the path is known, the model fills the gaps. An agent is when the LLM dynamically directs its own process and tool use; the path is emergent. Most production systems live between the two: structured enough to be reliable, flexible enough to handle variance. The vibe build loop is fundamentally a workflow with agent phases. The Generate phase is agentic; the rest of the loop is scripted by the architect.

When to use a workflow, when to use an agent

Use a workflow (scripted phases, fixed order) for the entire build cycle, because the architect's discipline is what holds quality. Use an agent (dynamic tool selection, free-form planning) inside the Generate phase, because that is where the AI's autonomy produces leverage. The mistake new architects make is letting the agent drive the whole loop, which produces unaccountable output. The mistake experienced architects sometimes make is scripting the Generate phase too tightly, which kills the leverage. The build loop draws the line in the right place by design.


Section 03

Context Engineering, the Named Discipline

In mid-2025, Andrej Karpathy and Shopify CEO Tobi Lutke independently named what production AI teams had been doing for months: curating what the model sees, not what the model is. They called it context engineering, and the name stuck because it captured the real performance frontier. Models had largely converged in capability by 2026; the difference between teams who shipped reliable AI work and teams who did not was in the discipline of the context they fed their agents.

The discipline has a precise definition. Birgitta Bockeler at Thoughtworks framed it cleanly in February 2026: context engineering is curating what the model sees so that you get a better result. Not curating the prompt (Foundation 03). Not curating the project memory (Foundation 02). Curating the combination that lands in front of the model at the moment of generation. Spec plus relevant CLAUDE.md plus applicable rules plus active skills plus retrieved code plus examples. The whole information environment of the inference.

The ACE paper and the 86 percent finding

The empirical case for context engineering as a discipline crystallized with the ACE paper (Agentic Context Engineering), published October 2025 by researchers at Stanford and SambaNova. The paper reported that incremental, structured context updates cut adaptation latency by roughly 86 percent compared with approaches that rewrite the whole context, while also limiting the gradual loss of detail that repeated rewriting causes. The practical implication is sharper than the number suggests: context, not model size, is the real performance frontier for production agents in 2026.

The ACE finding inverts the default assumption that has guided AI adoption since 2023. The default assumption was that bigger models would solve the reliability problem. They did not. Bigger models got faster and broader, but their failure modes were still failure modes of attention and instruction-following. Better context engineering closed more of the reliability gap than every model generation since GPT-4. The architect who masters context engineering today is positioned for the next generation of models, because the discipline scales with model capability rather than being obsoleted by it.

Tool-specific context surfaces

Different tools expose context engineering through different files. Claude Code uses CLAUDE.md plus Skills under .claude/skills/. Cursor uses .mdc files under Project Rules. GitHub Copilot uses copilot-instructions.md. The mechanics differ; the discipline is the same. The architect maintains a small, dense context surface in whichever file the tool reads, and curates per-task additions through the prompt. The five strategies in the next section work regardless of which file the tool prefers.


Section 04

The Five Context Strategies

Context engineering decomposes into five strategies. Each is a separate move, and each is a separate skill. The architect who masters all five outperforms the architect who relies on any one of them in isolation. Learn the names; the names make the discipline teachable to teammates and reviewable to yourself.

Strategy 1: Selection — only what matters

The first context move is choosing what to include. Most beginners over-include, dumping the entire codebase or every related file into the prompt on the theory that more context is better. The opposite is true past a small threshold. The model's attention is finite; every additional file dilutes the signal of the relevant ones. Selection is the discipline of choosing the few files, the few lines, the few examples that actually inform the task. Cutting context that does not earn its place is the single highest-leverage move available.
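Selection can be sketched as a ranking step with a size budget. This is an illustration only: `select_context` is a hypothetical helper, and keyword overlap is a deliberately crude stand-in for whatever relevance signal a real setup would use.

```python
def select_context(candidates: dict, task_terms: set, max_chars: int = 4000) -> list:
    """Pick the files most relevant to the task, stopping at a size budget.

    `candidates` maps file name to file text. Relevance is the number of
    distinct task terms that appear in the text (an assumed heuristic).
    """
    def score(text: str) -> int:
        lowered = text.lower()
        return sum(1 for term in task_terms if term.lower() in lowered)

    # Highest score first; ties broken by name so the result is deterministic.
    ranked = sorted(candidates, key=lambda name: (-score(candidates[name]), name))
    chosen, used = [], 0
    for name in ranked:
        size = len(candidates[name])
        if score(candidates[name]) == 0 or used + size > max_chars:
            continue  # zero-signal or over-budget files do not earn a place
        chosen.append(name)
        used += size
    return chosen
```

The point of the sketch is the default: a file is excluded unless it demonstrates relevance, rather than included unless someone removes it.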

Strategy 2: Compression — smaller is sharper

What you cannot omit, compress. Long documents become summaries. Verbose specifications become bullet points. Wide examples become narrow examples that cover the same variation. Compression preserves signal while reducing token cost and attention dilution. The eight-block specification from Foundation 03 is itself a compression discipline: every block earns its place, every line is a decision the AI no longer has to guess. Apply the same scrutiny to every piece of context you feed.

Strategy 3: Ordering — operative instructions last

The model attends most strongly to content near the boundaries of its context window, with the very end given the most weight. Foundation 03 named this for prompts; the same principle applies to entire context payloads. Long context blocks (a document, a codebase excerpt, retrieved chunks) go in the middle. The operative instructions go at the end. Tool definitions and reference material go near the beginning where they prime the model without dominating attention.
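The ordering rule can be expressed as a sort over labelled blocks. A minimal sketch, assuming three hypothetical role names (`reference`, `bulk`, `instruction`) that mirror the placement described above:

```python
from dataclasses import dataclass

# Assumed roles, ordered as the paragraph above describes: reference
# material first, bulky context in the middle, operative instructions last.
ROLE_ORDER = {"reference": 0, "bulk": 1, "instruction": 2}


@dataclass
class Block:
    role: str  # one of the ROLE_ORDER keys
    text: str


def order_payload(blocks: list) -> str:
    """Stable-sort blocks by role so instructions always land at the end."""
    ordered = sorted(blocks, key=lambda block: ROLE_ORDER[block.role])
    return "\n\n".join(block.text for block in ordered)
```

Because Python's sort is stable, blocks that share a role keep the order in which the architect supplied them.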

Strategy 4: Isolation — subagents for noisy work

Some work is inherently noisy: deep research, exploratory debugging, scanning a large codebase for a pattern. Doing this work in the main agent session pollutes the context with intermediate findings that are not part of the final answer. The fix is isolation: spawn a subagent with its own context, let it do the noisy work, return only the conclusion to the main session. Claude Code's subagent feature is the productized version of this strategy; the same pattern works manually by running research in a separate session and pasting only the summary back.
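The isolation pattern reduces to one rule: the noisy worker gets its own scratch context, and only its conclusion crosses back. The sketch below models that with plain lists; `run_isolated` and the worker signature are hypothetical and do not represent any tool's actual subagent API.

```python
from typing import Callable


def run_isolated(task: str, worker: Callable[[str, list], str],
                 main_context: list) -> str:
    """Run noisy work against a throwaway scratchpad; keep only the conclusion."""
    scratch: list = []                   # the subagent's private context
    conclusion = worker(task, scratch)   # worker may append freely to scratch
    main_context.append(f"[subagent summary] {conclusion}")
    return conclusion                    # scratch is dropped when we return


# Usage with a stand-in worker that generates intermediate noise.
def scan_worker(task: str, scratch: list) -> str:
    scratch.extend(["opened 40 files", "12 false leads", "match in auth.py"])
    return "Token refresh bug is in auth.py"


main = ["spec goes here"]
run_isolated("find the token bug", scan_worker, main)
```

After the call, `main` holds two entries: the original spec and the one-line summary. The intermediate findings never reach it.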

Strategy 5: Format optimization — structure beats prose

Structured context outperforms prose context of equivalent information content. Tables outperform paragraphs. XML tags outperform unmarked sections. Numbered lists outperform comma-separated phrases. The mechanism is that structure gives the model anchors to attend to and boundaries to respect. The same information delivered twice, once as prose and once as structure, produces measurably different generation quality. Spending five extra seconds to format context as structure is one of the highest hourly-rate moves an architect makes.
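Wrapping each piece of context in a named tag is the cheapest form of structure. A minimal sketch follows; the function names and tag names are illustrative, and the body is deliberately left unescaped so that code excerpts pass through unchanged.

```python
def tag_block(name: str, body: str) -> str:
    """Wrap one piece of context in a named tag."""
    return f"<{name}>\n{body.strip()}\n</{name}>"


def format_context(sections: dict) -> str:
    """Render sections as tagged blocks, skipping any that are empty."""
    return "\n\n".join(
        tag_block(name, body) for name, body in sections.items() if body.strip()
    )
```

Calling `format_context({"spec": "Add login", "rules": "No new dependencies"})` yields two tagged blocks separated by a blank line, which gives the model explicit boundaries between the spec and the rules.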

The five strategies in one sentence each

Selection: cut what does not earn its place. Compression: shrink what cannot be cut. Ordering: put the operative instructions last. Isolation: send the noisy work to a subagent. Format: structure beats prose at equivalent information content. Print these five lines and keep them visible. The discipline gets faster as the strategies become reflexive.


Section 05

Karpathy's 2026 Reframe: Agentic Engineering

On the one-year anniversary of the term he coined, in February 2026, Andrej Karpathy proposed a more precise name for what professional teams were doing. The original term, vibe coding, captured an exploratory, forgiving mode of work: describe what you want, accept what comes back, iterate by describing changes. The new term, agentic engineering, captured the deliberate, professional version: design specs, supervise plans, inspect diffs, write tests, create evaluation loops, manage permissions, isolate worktrees, preserve quality.

The two are not rivals; they are stages. Vibe coding raises the floor: anyone can prototype. Agentic engineering preserves the ceiling: the output is professional-grade. Most of the DDS Vibe Academy's teaching points to the second stage, while honoring the first as the on-ramp. The Foundation track is the path from vibe coding to agentic engineering, and this Class 05 is the destination.

What an agentic engineer actually does

Karpathy's reframe came with a job description that is worth quoting almost verbatim. The agentic engineer designs specifications. Supervises plans before they execute. Inspects diffs after they generate. Writes tests that define done. Creates evaluation loops that catch regression. Manages permissions so agents do not have more access than the work requires. Isolates worktrees so parallel agents do not collide. Preserves correctness, security, taste, and maintainability while letting the agent do the typing.

The worked example Karpathy used to illustrate the difference is worth knowing. He described a payment-system bug in a demo app called MenuGen where the user could log in with their Google account (one email) but pay with Stripe (a different email). The vibe-coding mode missed the bug because both flows worked in isolation. The agentic-engineering mode caught it because the spec explicitly stated that the same identity must persist across the auth and payment flows. The bug was real, subtle, and exactly the kind that vibe coding ships and agentic engineering catches.

One raises the floor, so anyone can prototype. The other preserves the ceiling, so the output is professional-grade. They are not rivals. They are stages. — The vibe coding to agentic engineering arc, in plain terms

Why the name matters

Naming the discipline did three things. First, it gave teams a vocabulary to distinguish the prototyping work from the production work, which had been collapsed under a single label. Second, it gave the production work a respectable name; "vibe coding" sounded unserious to executives, and that perception was slowing real adoption inside enterprises. Third, it acknowledged that the practice had evolved. The work in 2026 was not the work in early 2025, even when the same architect was doing it with the same tools. The label evolved because the practice did.


Section 06

The Workflow versus Agent Distinction (When to Script, When to Let It Drive)

Anthropic's Building Effective Agents draws a line that every architect should be able to apply on demand. A workflow is when the developer controls the flow through predefined code paths and the LLM fills in specific steps. An agent is when the LLM dynamically directs its own process and tool use. The two have different failure modes, different cost profiles, and different reliability characteristics. Choosing the right one for the work is one of the most important architectural decisions an agentic engineer makes.

Workflows are predictable, debuggable, and observable. Each step is a known path. When something goes wrong, the architect can identify which step failed and why. Workflows are best when the structure of the work is known: an order, a sequence, a set of validations, a transformation pipeline. Most of the build loop in this class is a workflow.

Agents are flexible, exploratory, and capable of handling variance the architect did not anticipate. The model picks the tools, picks the order, picks when to stop. Agents are best when the structure of the work is emergent: exploring a codebase to find a bug, researching a topic, completing a task whose steps depend on what is found along the way. The Generate phase of the build loop is agentic by design.

When the work is…	Use a workflow	Use an agent
Known sequence with variable steps	Yes	No
Exploratory research or debugging	No	Yes
High-stakes production change	Yes (audit and rollback gates)	Only inside the workflow's generate step
Time-bounded prototyping	Optional	Yes (faster)
Reproducibility required	Yes	No (path is non-deterministic)
Tool selection depends on findings	Hard to script	Yes (agent decides)

The architect's rule: workflows for the spine of the work, agents for the steps inside the workflow where autonomy produces leverage. The vibe build loop is the canonical example. The six phases are a workflow. Phase 3, Generate, is agentic. Phase 4, Audit, uses an adversarial agent as its second pass. None of this is contradictory; it is the right tool for each subtask. Knowing which subtask is which is the architectural skill.
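The split between a scripted spine and an agentic step can be sketched as a function whose control flow is fixed and whose generate step is a pluggable callable. Everything here is hypothetical: `run_build_loop`, the stand-in agent and audit functions, and the three-cycle cap are assumptions for illustration, not part of any tool.

```python
from typing import Callable


def run_build_loop(spec: str, context: str,
                   generate: Callable[[str], str],
                   audit: Callable[[str], list],
                   max_cycles: int = 3) -> dict:
    """Run the scripted spine; only `generate` is free to choose its own path."""
    log = []
    for cycle in range(1, max_cycles + 1):
        prompt = f"{context}\n\n{spec}"   # context-load: fixed step
        diff = generate(prompt)           # generate: the agentic step
        findings = audit(diff)            # audit: fixed step
        log.append({"cycle": cycle, "findings": findings})
        if not findings:
            return {"shipped": True, "diff": diff, "log": log}
        # iterate: fold findings into the spec instead of patching the diff
        spec += "\n" + "\n".join(f"- must fix: {item}" for item in findings)
    return {"shipped": False, "diff": None, "log": log}


# Usage with stand-ins; a real setup would call an agent and CI here.
def fake_agent(prompt: str) -> str:
    return "patched diff" if "must fix" in prompt else "first diff"


def fake_audit(diff: str) -> list:
    return [] if diff == "patched diff" else ["missing null check"]


result = run_build_loop("Add login", "CLAUDE.md excerpt", fake_agent, fake_audit)
```

With these stand-ins the loop ships on its second cycle: the first audit produces one finding, the finding is appended to the spec, and the regenerated diff passes.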


Section 07

Phasing Large Builds (How This Academy Was Made)

One spec can describe a small feature. No spec describes a project. The architect's move for projects larger than a few hundred lines is to decompose the project into phases, each producing a working artifact, each running the full vibe build loop. This is the same pattern the DDS Vibe Academy was built with, and the example is worth walking through because it makes the abstraction concrete.

The Academy build, by phase

Phase 1: the V2 template framework. One spec, one CLAUDE.md context-load, one Generate pass producing the CSS framework, one Audit pass at three viewports, ship. Working artifact: a reusable template that every class would inherit from.
Phase 2: the chrome strip. Same loop, different artifact.
Phase 3: Foundation Class 01 content built on the V2 template.
Phase 4: Foundation Class 02.
Phase 5: Class 03.
Phase 6: Class 04.
Phase 7: Class 05, which you are reading.

Each phase under 200KB. Each phase shippable on its own. Each phase fed forward into the next: lessons from Phase 4's build became improvements in Phase 5's context.

The reason phasing works is that short specs outperform long ones. The model holds 200 lines of specification with sharp attention. It holds 2,000 lines with degraded attention. The same architect producing a 2,000-line monolithic spec for the whole Academy would have produced worse output than the same architect producing seven sequential 200-line specs. The math is not subtle; it is the same attention budget, applied differently.

How to phase

Identify the spine. What artifacts must exist for the project to work? The Academy's spine was a template plus five classes plus a chrome strip. Seven phases.
Order by dependency. What must be built first because later phases depend on it? The template before any class. The chrome before any class. The classes in numerical order so each could reference the prior.
Spec each phase. One eight-block specification per phase. Short, sharp, with concrete done criteria.
Loop each phase. Six-phase build loop for each project phase. Audit, ship, fold lessons forward.
Compose at the end. The phases combine into the project. Composition is the last step, not the first.
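The "order by dependency" step above is a topological sort. The sketch below uses Python's standard-library `graphlib` on the Academy's seven phases; the short phase names are illustrative, and treating the template and chrome strip as independent of each other is an assumption, since the text only says both come before any class.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
deps = {
    "template": set(),
    "chrome": set(),
    "class_01": {"template", "chrome"},
}
# Classes are built in numerical order so each can reference the prior one.
for n in range(2, 6):
    deps[f"class_{n:02d}"] = {f"class_{n - 1:02d}"}

build_order = list(TopologicalSorter(deps).static_order())
```

The resulting order always places the template and chrome strip first (in either order) and then the five classes in sequence. A dependency cycle would raise `graphlib.CycleError`, which is a useful early signal that the spine has been drawn wrong.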

The lesson Class 05 learned from Classes 01-04

Class 01's build taught that the V2 template needed a single grid that never breaks. Class 02's build taught that {{ }} placeholder syntax in code blocks must be wrapped in raw tags to survive Liquid parsing. Class 03's build taught that an id targeted from CSS cannot start with a digit unless it is escaped. Class 04's build was clean because the lessons from 01, 02, and 03 had become pre-flight checks. This class is the cleanest of the five because the loop compounded the lessons. That compounding is the whole point of phasing.
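Two of these lessons are mechanical enough to automate. The following is a rough sketch of what such pre-flight checks might look like, not the Academy's actual tooling: `preflight` is a hypothetical function, and the regular expressions are simple heuristics rather than a real HTML or Liquid parser.

```python
import re

# Liquid raw blocks, with optional whitespace-control dashes.
RAW_BLOCK = re.compile(r"\{%-?\s*raw\s*-?%\}.*?\{%-?\s*endraw\s*-?%\}", re.DOTALL)
# <pre> or <code> elements and their contents.
CODE_BLOCK = re.compile(r"<(pre|code)\b[^>]*>(.*?)</\1>", re.DOTALL | re.IGNORECASE)
# id attributes whose value begins with a digit (data-id and similar are skipped).
DIGIT_ID = re.compile(r"""(?<![\w-])id\s*=\s*["'](\d[^"']*)["']""")


def preflight(html: str) -> list:
    """Return a list of problems found in a Liquid-processed HTML page."""
    problems = []
    without_raw = RAW_BLOCK.sub("", html)  # anything left is seen by Liquid
    for match in CODE_BLOCK.finditer(without_raw):
        if "{{" in match.group(2):
            problems.append("'{{' inside a code block is not wrapped in raw tags")
    for match in DIGIT_ID.finditer(html):
        problems.append(f"id starts with a digit: {match.group(1)}")
    return problems
```

Running a check like this before each Generate pass is one way a lesson from an earlier phase can become a gate in a later one.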


Section 08

The Test-First AI Loop (In Practice)

Foundation 04 introduced test-first AI development as the discipline that catches the happy-path-only failure cluster. This section puts it inside the build loop. The architect writes the failing test before the implementation. The agent implements until the test passes. The test is the architect's acceptance criteria made executable; the AI cannot fake passing it.

The full sequence inside the build loop is:

Spec phase: write the eight-block specification. The <done> block contains the criteria that will become the tests.
Context-load phase: load the spec, the CLAUDE.md, the relevant skills, the existing test setup, and the file the new test will live in.
Generate phase, part A: write the failing test from the <done> criteria, yourself or with help from an AI session separate from the one that will implement. Run the test. Confirm it fails for the right reason.
Generate phase, part B: ask the agent to implement the feature until the test passes. The agent iterates internally until the test goes green; you watch the trace.
Audit phase: the test is green, but the implementation still gets the full four-pass review and the adversarial-review prompt. Passing a test is necessary, not sufficient.
Iterate phase: if the audit finds an issue the test did not catch, add a test for that issue first, then regenerate the implementation.
Ship phase: merge. The test stays as regression protection.
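The sequence above can be made concrete with a small example. The feature here, a slug helper, and its three done criteria are invented purely for illustration. The test is written first, from the criteria; the implementation is what the agent would iterate on until the test passes.

```python
import re

# Hypothetical <done> criteria for a slug helper:
#   1. lowercases the title
#   2. replaces each run of non-alphanumeric characters with one hyphen
#   3. strips leading and trailing hyphens


def slugify(title: str) -> str:
    """Part B: the implementation the agent iterates on until the test passes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def test_slugify_meets_done_criteria() -> None:
    """Part A: written first, from the criteria, before slugify existed."""
    assert slugify("Hello World") == "hello-world"      # criteria 1 and 2
    assert slugify("  Ship --- It!  ") == "ship-it"     # criteria 2 and 3
    assert slugify("Class 05") == "class-05"            # digits survive
```

Before `slugify` exists, the test fails with a `NameError`, which is the "right reason" in part A: the feature is missing, not the test broken. After shipping, the same test stays in the suite as regression protection.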

Do not let the same agent write the test and the implementation

The test is the architect's acceptance criteria. If the same agent that writes the implementation also writes the test, the test reflects what the agent thinks the code does, not what the spec says it should do. The test passes by construction. To avoid this, have the architect write the test (with help from a separate AI session if desired), then hand the test plus the spec to the implementing agent. The split makes the test a real check, not a self-confirming mirror.


Section 09

The Adversarial Review Pass Inside the Loop

Foundation 04 introduced the adversarial-review prompt as the second pair of eyes that catches what the first pair misses. Inside the build loop, the adversarial pass is the second half of the Audit phase, after the four-pass human review. The architect runs the diff through a different AI session, ideally a different model family, with the canonical adversarial-review prompt from Foundation 04 Section 07.

What makes the adversarial pass effective inside the loop, not just as a standalone review, is that the architect now has the full context: the spec, the audit findings, the adversarial findings. Issues get traced to one of two sources. Either the spec was missing something (fix in the Iterate phase: extend the spec, regenerate) or the implementation deviated from the spec (fix in the Iterate phase: re-prompt against the same spec, not through conversation). The adversarial pass surfaces issues; the loop's structure routes them to the right fix.

This is the part of the loop where the architect's judgment matters most. Adversarial findings are not commands. They are observations to evaluate. Some are genuine. Some are false positives. Some are nitpicks. The architect reads the list, decides which to act on, documents the dismissals (briefly, in the PR checklist's adversarial-findings block from Foundation 04), and moves to the Iterate phase or to Ship.
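The routing described in this section can be sketched as a small triage function. The `Finding` fields and the three source labels are hypothetical names chosen for this example; the classification itself remains the architect's judgment, and the code only records and routes it.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    summary: str
    severity: str   # for example "critical", "high", "medium", "low"
    source: str     # "spec_gap", "deviation", or "dismissed"
    note: str = ""  # brief reason, required when the finding is dismissed


def route(finding: Finding) -> str:
    """Map a classified finding to the fix path the loop prescribes."""
    if finding.source == "spec_gap":
        return "extend the spec, then regenerate"
    if finding.source == "deviation":
        return "re-prompt against the same spec"
    if finding.source == "dismissed":
        if not finding.note.strip():
            raise ValueError("dismissals need a brief documented reason")
        return "record in the PR checklist"
    raise ValueError(f"unknown source: {finding.source}")
```

Requiring a note on every dismissal keeps the decision reviewable later, which matches the instruction to document dismissals briefly.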

Cross-family review is the multiplier

If the Generate phase used Claude, run the adversarial pass on Gemini or GPT-OSS. If Generate used Gemini, review with Claude. Failure modes are not perfectly correlated across model families, so cross-family review catches issues a same-family review would miss. If your team is locked to one provider, at minimum use a fresh session with no prior context. This is the cheapest reliability move available; the only cost is one extra inference call.


Section 10

The 79/11 Gap (Why Most Teams Cannot Ship)

Industry surveys in 2026 found that approximately 79 percent of enterprises have adopted AI agents in some form, but only about 11 percent are running them in production. The 68-point gap is the central problem of the field right now. It is not a demand problem; demand is settled. It is a skills and architecture problem: teams know they want agents and cannot reliably ship with them.

The 68 points are exactly the territory the Foundation track has been mapping. Teams hit the gap when they adopt AI generation without adopting the review discipline (Class 04). Teams hit it when they treat prompts as wishes instead of specifications (Class 03). Teams hit it when they install four tools in the first weekend without learning one (Class 02). Teams hit it when they conflate the prototype mode with the production mode (Class 01). Teams hit it when they skip the loop and rely on one-shot generation for non-trivial work (this class).

The cost of each iteration

Two industry findings make the production gap concrete. First, roughly 72 percent of AI-generated code requires manual fixes before it can be merged. Second, developers context-switch an average of 3.2 times per AI interaction. Both numbers are productivity drains that the unwary architect treats as the cost of doing business. Both numbers shrink dramatically when the architect runs the full build loop instead of one-shot prompting. The loop converts most of the manual fix work into adversarial-review findings and most of the context switching into structured iteration.

The architect's market

The 68-point gap is the labor-market reality for architects in 2026. Enterprises are buying agents. Most cannot ship with them. The architect who can run the full vibe build loop reliably, who has internalized context engineering, who knows when to refine and when to ship, is the scarcest professional in the field. The Foundation track is the credential for that role. The Development and Mastery tracks deepen it.


Section 11

Knowing When to Ship (The Hardest Skill)

The architect's last hard skill is knowing when the work is done. Specifying is teachable; reviewing is teachable; the loop is teachable. Knowing when the iterate phase should end and the ship phase should begin is harder because the temptation is always to refine once more. One more refine is rarely wrong on the merits. It is wrong on the budget.

The ship criterion is concrete: the work meets the spec's <done> block, the tests pass, the four-pass review is clean, the adversarial pass shows no critical or high findings. If those four are true, the work is shippable. Anything beyond is not refinement; it is procrastination. The discipline is to articulate, before each additional iteration, what specifically is wrong and what specifically the iteration will fix. If you cannot say both, you are not iterating; you are stalling.
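The four-part ship criterion and the "say both or you are stalling" rule are simple enough to encode directly. A minimal sketch with hypothetical names (`ShipCheck`, `may_iterate`):

```python
from dataclasses import dataclass, field

BLOCKING = {"critical", "high"}  # severities that stop a ship


@dataclass
class ShipCheck:
    done_block_met: bool
    tests_pass: bool
    four_pass_clean: bool
    adversarial_severities: list = field(default_factory=list)

    def blockers(self) -> list:
        """List every unmet condition; an empty list means shippable."""
        found = []
        if not self.done_block_met:
            found.append("spec <done> block not met")
        if not self.tests_pass:
            found.append("tests failing")
        if not self.four_pass_clean:
            found.append("four-pass review has open items")
        if BLOCKING & set(self.adversarial_severities):
            found.append("critical or high adversarial finding")
        return found


def may_iterate(what_is_wrong: str, what_it_fixes: str) -> bool:
    """Another iteration is justified only if both answers are non-empty."""
    return bool(what_is_wrong.strip() and what_it_fixes.strip())
```

Medium and low adversarial findings do not block the ship here, which follows the criterion as stated: only critical or high findings do.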

The three traps that delay shipping

Trap one: the perfection refactor. The code works, but you spot a cleaner pattern. Resist. Note the pattern, ship the working version, refactor in the next loop with a real spec entry for the refactor. Refactoring inside an Iterate phase blurs the diff and makes the audit harder.

Trap two: the speculative edge case. You think of a case that might happen, that the spec did not name, that the tests do not cover. Resist. Add it to the spec for v1.1 if it is real. Adding cases mid-loop turns a six-phase cycle into an open-ended one.

Trap three: the comparison loop. You see how the agent built the thing and think there might be a better way. Resist. The agent built it to the spec. If the spec was right, the implementation is acceptable. Comparison is for the next spec, not this one.

The ship-then-fix doctrine

A v1 shipped today and improved in v1.1 next week beats a v1 polished forever. The loop is designed to make v1.1 cheap: the spec, the tests, the CLAUDE.md, the skills all carry forward. Shipping is not the end of the work; it is the start of the next loop. The architect who internalizes this doctrine ships an order of magnitude more software than the architect who optimizes every v1 before release.


Section 12

The Architect's Ongoing Practice

Foundation classes end. The practice does not. Five durable habits define the architect's ongoing work. None are heroic. All of them compound. The architect who does all five reliably outperforms the architect who chases one expensive new tool every month.

Habit 1: Maintain a CLAUDE.md that earns every line.

Open your project's CLAUDE.md once a month. Read it as a hostile reviewer. Cut every line that the linter, the type checker, or the CI gate already enforces (the Toolchain First principle from Class 02). Add lines that capture decisions made since the last read. Keep the file under 200 lines. The discipline of pruning is what keeps the context surface sharp.
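Part of the monthly read can be mechanized. The sketch below checks the 200-line budget and flags lines that mention tools the toolchain already enforces; `review_claude_md` is a hypothetical helper, and the keyword match is a rough heuristic that only nominates candidates for cutting. The decision to cut stays with the reviewer.

```python
def review_claude_md(text: str, enforced_terms: tuple = (),
                     max_lines: int = 200) -> dict:
    """Report the line count and lines that may duplicate toolchain rules.

    `enforced_terms` is supplied by the caller, for example the names of
    the project's linter or formatter (an assumption of this sketch).
    """
    lines = text.splitlines()
    cut_candidates = [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if any(term.lower() in line.lower() for term in enforced_terms)
    ]
    return {
        "line_count": len(lines),
        "over_budget": len(lines) > max_lines,
        "cut_candidates": cut_candidates,
    }
```

In use, the text would come from reading the project's CLAUDE.md, and the terms from whatever the CI gate already runs.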

Habit 2: Keep one MCP server you use weekly. Prune the rest.

If you have not used an MCP server in a month, remove it. The 50-tool ceiling from Class 02 is a soft warning; the 3-server practical limit is the operating discipline. One filesystem server, one GitHub server, one project-specific server is the typical mature setup. Past three, the model starts choosing the wrong tool.

Habit 3: Write a Skill the third time you do a workflow.

The first time you do something, it is a task. The second time, it is a coincidence. The third time, it is a procedure that deserves a SKILL.md. The cost is fifteen minutes of writing. The benefit is that the fourth, fifth, and hundredth executions are the agent's job. Skills are the compound interest of the architect's practice.

Habit 4: Run the eight-block spec for any work above a few lines.

Foundation 03's eight-block specification is the canonical artifact. Use it for any work that is not literally a one-liner. The cost is five minutes of writing. The benefit is reliable output, reviewable artifacts, and a record of what you actually intended. Architects who skip the spec for "small" work consistently report that the small work was bigger than they thought.

Habit 5: Run the four-pass review plus adversarial pass on any AI output before shipping.

The two passes together take about ten minutes for a typical diff. The 5:1 budget from Class 04 makes the time available. The two passes catch roughly an order of magnitude more issues than skim-and-ship. The math has never been close.


Section 13

What Comes After Foundation

You have completed the Foundation track. You have the paradigm, the environment, the specification skill, the review discipline, and the loop. The complete architect's starter kit. From here the Academy splits into two tracks: Development and Mastery.

The Development track goes deep on specific tools and patterns. Building MCP servers from scratch. Sovereign retrieval-augmented generation. Shopify development at scale. Antigravity for multi-agent orchestration. Advanced prompting for specialized domains. Each Development class assumes the Foundation discipline and adds depth on a single subject. Architects who specialize in a domain (e-commerce, retrieval, infrastructure) work through Development classes for their domain.

The Mastery track builds complete systems. Agentic Shopify mastery. Multi-agent orchestration patterns. AI cost engineering at scale. Full-stack agentic engineering for production. The Mastery classes assume both Foundation and the relevant Development classes, and they produce real shippable systems as their artifacts. Architects who lead teams or ship production systems work through Mastery classes.

Both tracks are free. Both tracks use the same V2 template you see here. Both tracks compound from the discipline this Foundation track installed. You can start either next week or next month or next year; the Foundation is durable, and the rest of the Academy will be ready when you are.

The end of the beginning

The Academy promised, at the top of Class 01, to take a complete beginner from "I want to build software" to the working architect's practice. Five classes later, you have it. The work from here is the same work, repeated, with the compounding inputs you wrote yourself. Specifications get sharper. Skills get smarter. CLAUDE.md gets denser. The loop runs faster each cycle. The architect at month twelve is the architect at month one with a year of compounded context. That is the entire shape of the practice.

The Academy was built with the methods it teaches. The storefront that funds it was built with the methods it teaches. The portfolio of AI systems that runs underneath it was built with the methods it teaches. Everything you have read is the executable version of the thing it describes. There is no separate professional practice the experts use that this Academy is hiding. This is the practice. You now have it.

Section 13 of 13 · Foundation Class 05 Complete · End of Foundation Track

Look What You Can Make

Everything below was built with the loop you just learned

Same six phases. Same context engineering discipline. Same architect's practice. Shipped.

Live Storefront

A real e-commerce brand

Design Delight Studio — the storefront that funds the Academy. Phased build, looped pages, shipped.

Visit the store →

Free Academy

This entire Academy

V2 template, chrome strip, five Foundation classes — seven phases, seven loops, one composed Academy.

Explore the Academy →

Mastery Preview

A taste of what comes next

A free 8-module AI cost-engineering masterclass with 50 paste-ready prompts. Preview of the Mastery track.

See an advanced class →

Robert McCullock

Architect-CEO · Design Delight Studio

Boston-based. Built a sustainable-streetwear brand and a portfolio of AI systems using the intent-based engineering method taught in this Academy. The six-phase build loop in this class is the actual loop that produced every Foundation class, including this one.

FAQ

Frequently Asked Questions

The questions newcomers ask most about the vibe build loop. Each answer matches this page's structured data exactly, so a person reading the page and an AI engine extracting the schema receive the same canonical response.

What is the agentic build loop?

A six-phase production workflow that wraps an AI coding agent: spec the work, load the relevant context, generate the code, audit it, iterate on the spec rather than the conversation, then ship. The loop is the difference between a prototype that works once and a system that ships reliably at the speed AI generation makes possible. It combines everything taught in Foundation Classes 01 through 04 into a single repeatable cycle.
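As a rough illustration, the cycle can be sketched as a small control loop. This is a minimal sketch, not the Academy's tooling: `run_loop` and the three stub functions are hypothetical names, and the agent and the review are replaced by toy callables so the example runs on its own.

```python
from typing import Callable, List


def run_loop(
    spec: str,
    generate: Callable[[str], str],
    audit: Callable[[str], List[str]],
    revise_spec: Callable[[str, List[str]], str],
    max_cycles: int = 3,
) -> dict:
    """Run spec -> context-load -> generate -> audit, revising the spec on failure."""
    for cycle in range(1, max_cycles + 1):
        context = f"SPEC:\n{spec}"          # context-load: only what the agent needs
        output = generate(context)          # generate
        findings = audit(output)            # audit
        if not findings:                    # ship: nothing left to fix
            return {"shipped": True, "cycles": cycle, "spec": spec, "output": output}
        spec = revise_spec(spec, findings)  # iterate: fix the artifact, then regenerate
    return {"shipped": False, "cycles": max_cycles, "spec": spec, "output": None}


# Stubs standing in for a real agent and a real review (hypothetical behavior).
def fake_generate(context: str) -> str:
    return "handles empty input" if "empty input" in context else "happy path only"


def fake_audit(output: str) -> List[str]:
    return [] if "empty input" in output else ["missing edge case: empty input"]


def fake_revise(spec: str, findings: List[str]) -> str:
    return spec + "\nConstraint: must handle empty input."


result = run_loop("Parse a CSV row.", fake_generate, fake_audit, fake_revise)
print(result["shipped"], result["cycles"])
```

Note that the failed audit changes the spec, not the output. The second cycle regenerates from the corrected artifact, which is the behavior the Iterate phase asks for.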

What is context engineering?

Context engineering is the discipline of curating what an AI agent sees so it produces the right result. The term was popularized in 2025 by Andrej Karpathy and Shopify CEO Tobi Lutke and is now considered the central skill in production AI coding. The five core strategies are selection (only what matters), compression (smaller is sharper), ordering (operative instructions last), isolation (subagents for noisy work), and format optimization (structured over free-form).
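Four of the five strategies can be shown in a few lines. The sketch below is an illustration under stated assumptions: `Snippet`, `build_context`, the relevance scores, and the character limit are all hypothetical, and naive truncation stands in for real compression. Isolation (subagents) is a process decision and is not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snippet:
    name: str
    text: str
    relevance: float  # 0..1, assigned by the architect or a retriever (assumed)


def build_context(snippets: List[Snippet], instructions: str,
                  min_relevance: float = 0.5, max_chars: int = 200) -> str:
    # Selection: drop anything below the relevance bar.
    kept = [s for s in snippets if s.relevance >= min_relevance]
    parts = []
    for s in kept:
        # Compression: naive truncation stands in for real summarization.
        body = s.text if len(s.text) <= max_chars else s.text[:max_chars] + " [truncated]"
        # Format: labeled sections rather than a free-form paste.
        parts.append(f"## {s.name}\n{body}")
    # Ordering: operative instructions go last.
    parts.append(f"## Instructions\n{instructions}")
    return "\n\n".join(parts)


context = build_context(
    [
        Snippet("CLAUDE.md", "Use pnpm. Never edit generated files.", 0.9),
        Snippet("changelog", "v0.1 through v0.9 release notes", 0.1),
        Snippet("cart.py", "def total(items): ...", 0.8),
    ],
    instructions="Implement the discount rule from the spec.",
)
print(context)
```

The low-relevance changelog never reaches the agent, and the instruction block is the last thing in the context.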

What is the ACE paper and why does it matter?

ACE (Agentic Context Engineering) was published in October 2025 by researchers at Stanford and SambaNova. The paper demonstrated that incremental, structured context updates reduce agent drift and latency by up to 86 percent compared to static or regenerated prompts. The practical implication is that context, not model size, is the real performance frontier for production agents. Architects who do not engineer context cannot get the most from any model.

What did Karpathy mean by agentic engineering?

On the one-year anniversary of his "vibe coding" term, in February 2026, Andrej Karpathy proposed agentic engineering as a more precise name for professional work. Vibe coding is describing what you want and accepting what comes back, exploratory and forgiving. Agentic engineering is the professional discipline of coordinating fallible agents while preserving correctness, security, taste, and maintainability. The agentic engineer designs specs, supervises plans, inspects diffs, writes tests, creates evaluation loops, manages permissions, isolates worktrees, and preserves quality.

What is the difference between a workflow and an agent?

Anthropic distinguishes them precisely. A workflow is when the developer controls the flow through predefined code paths and the LLM fills in specific steps. An agent is when the LLM dynamically directs its own process and tool use. Most production systems sit between the two: structured enough to be reliable, flexible enough to handle variance. The architect's job is to know which parts of a build need the structure of a workflow and which benefit from the autonomy of an agent.
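The distinction is easiest to see side by side. The following is a minimal sketch with a stubbed model: `fake_llm`, `workflow`, `agent`, and `scripted_choice` are hypothetical names, and a scripted policy stands in for the model's own decision about which tool to call next.

```python
from typing import Callable, Dict, List


# A stand-in for a model call. A real system would call an LLM API here (assumption).
def fake_llm(prompt: str) -> str:
    if prompt.startswith("summarize:"):
        return "summary of " + prompt.split(":", 1)[1].strip()
    if prompt.startswith("translate:"):
        return "translated " + prompt.split(":", 1)[1].strip()
    return "done"


def workflow(text: str) -> str:
    """Workflow: the developer fixes the path; the model fills in each step."""
    summary = fake_llm(f"summarize: {text}")
    return fake_llm(f"translate: {summary}")


def agent(goal: str, tools: Dict[str, Callable[[str], str]],
          choose: Callable[[str, List[str]], str], max_steps: int = 5) -> List[str]:
    """Agent: the model (here `choose`) picks the next tool until it says 'stop'."""
    trace: List[str] = []
    state = goal
    for _ in range(max_steps):       # a step cap keeps the autonomy bounded
        tool_name = choose(state, trace)
        if tool_name == "stop":
            break
        state = tools[tool_name](state)
        trace.append(tool_name)
    return trace


def scripted_choice(state: str, trace: List[str]) -> str:
    # Hypothetical policy standing in for the model's own decision.
    if "search" not in trace:
        return "search"
    if "read" not in trace:
        return "read"
    return "stop"


tools = {"search": lambda s: s + " +results", "read": lambda s: s + " +notes"}
print(workflow("release notes"))
print(agent("find the bug", tools, scripted_choice))
```

In `workflow` the order of steps lives in the code. In `agent` the order lives in whatever `choose` returns, which is why the step cap and the fixed tool set matter.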

What is the 79/11 gap?

Industry surveys in 2026 found that approximately 79 percent of enterprises have adopted AI agents in some form, but only about 11 percent are running them in production. The 68-point gap is the architect's market: it is not a demand problem, it is a skills and architecture problem. Teams know they want agents and cannot reliably ship with them. The Foundation track teaches the discipline that closes this gap. The loop is the practical answer.

How do I phase a large project with AI?

Decompose the project into phases, each producing a working artifact. Each phase has its own eight-block spec, its own context-load, its own audit. The architect runs the loop once per phase, ships, then composes the phase outputs at the end. This is exactly how the DDS Vibe Academy itself was built: each class is one phase, the V2 template is one phase, the chrome strip is one phase. Phase specs under a few hundred lines outperform monolithic specs over a thousand lines, because the model holds short context with sharper attention.
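One way to picture phasing is as a list of small specs, each run through the loop separately. This is a hedged sketch: `Phase`, `build_in_phases`, and the 300-line limit are hypothetical, the limit is an arbitrary stand-in for "a few hundred lines", and applying it to the spec is an assumption. `run_phase` represents one full pass of the six-phase loop.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Phase:
    name: str
    spec: str  # the phase's own eight-block spec, abbreviated here


def build_in_phases(phases: List[Phase],
                    run_phase: Callable[[Phase], str],
                    max_spec_lines: int = 300) -> List[str]:
    """Run the loop once per phase; refuse specs that have grown monolithic."""
    artifacts = []
    for phase in phases:
        line_count = len(phase.spec.splitlines())
        if line_count > max_spec_lines:
            raise ValueError(f"{phase.name}: spec is {line_count} lines; split the phase")
        artifacts.append(run_phase(phase))  # spec through ship, for this phase only
    return artifacts                        # composed by the architect at the end


phases = [
    Phase("v2-template", "Goal: page template"),
    Phase("chrome-strip", "Goal: navigation strip"),
    Phase("class-01", "Goal: first class page"),
]
shipped = build_in_phases(phases, lambda p: f"{p.name}: shipped")
print(shipped)
```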

What is test-driven AI development?

The architect writes the failing test first. The AI implements until the test passes. The architect reviews the implementation. The test is the executable specification. AI cannot fake passing it. This is TDD reborn, made practical by the AI's speed at implementation. A test that takes ten minutes to write now yields an implementation in seconds and verifies it in seconds. The test has to come from the architect: tests written by the same AI that produced the code are circular and should not be trusted as coverage.
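A small worked example of the order of operations, using plain assertions so it runs without a test framework. The discount function and its rules are invented for illustration; `check_apply_discount`, `draft`, and `passes` are hypothetical names.

```python
# Step 1 (architect): write the check before any implementation exists.
def check_apply_discount(apply_discount) -> None:
    assert apply_discount(100.0, 10) == 90.0   # ordinary case
    assert apply_discount(100.0, 0) == 100.0   # no discount
    assert apply_discount(50.0, 100) == 0.0    # full discount
    try:
        apply_discount(100.0, 150)             # out-of-range must be rejected
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")


# Step 2 (agent): a first draft that misses the range constraint...
def draft(price: float, percent: float) -> float:
    return price - price * percent / 100


# ...and a regenerated version that satisfies the check.
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price - price * percent / 100


# Step 3 (architect): run the same check against each candidate.
def passes(candidate) -> bool:
    try:
        check_apply_discount(candidate)
        return True
    except AssertionError:
        return False


print(passes(draft), passes(apply_discount))
```

The draft handles every happy-path case and still fails, because the check encodes a constraint the draft never implemented.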

When should I ship versus keep refining?

Ship when the test passes, the four-pass review is clean, the adversarial-review pass shows no critical or high findings, and the work meets the spec's done criteria. Refine when you find a missing constraint or an edge case that was not in the spec, then regenerate against the updated spec. The trap is endless refinement on issues the spec never named. If you cannot articulate why you are refining, you are not refining; you are procrastinating. Ship, observe, learn, update the spec for next time.
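The ship criteria above can be written as an explicit gate, so "ship or refine" stops being a matter of mood. This is a minimal sketch: `AuditResult` and `ship_decision` are hypothetical names, and the severity labels are assumed to match whatever the adversarial-review prompt reports.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AuditResult:
    tests_pass: bool
    review_clean: bool                                      # four-pass diff review
    findings: Dict[str, int] = field(default_factory=dict)  # severity -> count
    done_criteria_met: bool = False


def ship_decision(a: AuditResult) -> List[str]:
    """Return the list of blockers; an empty list means ship."""
    blockers = []
    if not a.tests_pass:
        blockers.append("tests failing")
    if not a.review_clean:
        blockers.append("four-pass review not clean")
    for severity in ("critical", "high"):
        if a.findings.get(severity, 0) > 0:
            blockers.append(f"{severity} adversarial findings")
    if not a.done_criteria_met:
        blockers.append("done criteria not met")
    return blockers


ready = AuditResult(True, True, {"low": 2}, True)
not_ready = AuditResult(True, True, {"high": 1}, True)
print(ship_decision(ready))
print(ship_decision(not_ready))
```

Low-severity findings do not block the release in this sketch. They are candidates for the next spec rather than reasons to keep refining.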

What is the cost of each iteration?

Industry data suggests that 72 percent of AI-generated code requires manual fixes before merge, and developers context-switch an average of 3.2 times per AI interaction. Each iteration has a real time cost, which means "one more refinement" is not free. The architect's question is: will this iteration probably produce a shippable result, or is it a coin flip? Coin flips compound. Ship the version that meets the spec; fix the next thing in v1.1.

How does the loop compound over time?

Every cycle teaches you something about the spec, the constraints, the codebase, or the agent's failure modes. The architect's discipline is to fold what was learned back into the spec, the CLAUDE.md, the skills, the rules. The next cycle is faster and more accurate because the inputs are sharper. Foundation Class 02 named this: project memory is the artifact that compounds. The agent does not get smarter; the inputs to the agent do.

What does the architect's ongoing practice look like?

Five durable habits. Maintain a CLAUDE.md that earns every line. Keep one MCP server you use weekly and prune the rest. Write a Skill the third time you do a workflow. Run the eight-block spec for any work above a few lines. Run the four-pass review plus adversarial pass on any AI output before shipping. None of these are heroic; all of them compound. The architect who does all five reliably outperforms the architect who tries one expensive new tool every month.

What comes after the Foundation track?

The Development track and the Mastery track. The Development track goes deep on specific tools and patterns: building MCP servers, sovereign RAG systems, Shopify development at scale, advanced prompting for specialized domains. The Mastery track builds complete systems and engineering practices: agentic Shopify mastery, multi-agent orchestration, AI cost engineering, full-stack agentic engineering for production. Foundation is the floor; Mastery is the ceiling. You can now choose which direction to climb.

Is the Foundation track really enough to ship real software?

Yes, with discipline. The Academy itself, the DDS storefront that funds it, and the portfolio of AI systems that runs underneath it were all built with exactly the methods taught in the five Foundation classes. The architect mindset from Class 01, the environment from Class 02, the specification skill from Class 03, the review discipline from Class 04, and the loop from this class are the complete starter kit for shipping production software with AI assistance. The Development and Mastery tracks deepen the practice; they do not replace this foundation.