On the Loop - Leading AI Is an Empowerment Problem
Kief Morris's agentic flywheel, published on Martin Fowler's site, lands almost word for word on the same principles Marty Cagan and Reed Hastings have been preaching for decades. Leading AI agents isn't a new discipline. It's empowered product management with a new kind of teammate.
Kief Morris published a piece on Martin Fowler’s site at the start of March titled Humans and Agents in Software Engineering Loops. It is the best short framework I have read for where the human should stand when agents are doing the work. If you have ever tried to explain to a sceptical CTO why “I still review every line” is not a viable position in eighteen months, this is the piece to hand them.
What struck me reading it twice is not that the framework is novel. It is that the framework is almost identical to the one Marty Cagan and Reed Hastings have been preaching to product leaders for twenty years. The right way to lead an AI agent turns out to be the right way to lead a senior engineer. The hard-won lessons of empowered product teams, Netflix’s context not control, and Daniel Pink’s Drive all transfer directly. If you have them, leading AI is a change of medium, not a change of discipline. If you don’t, the AI era will expose that faster than any reorg ever did.
This is the piece I wish my vibe-coding-in-prod article from earlier this week had been paired with. Eric Schluntz gave us the engineering-individual-contributor frame: be Claude’s PM. Morris gives us the team and org frame: be on the loop, not in it. Together they describe a single shift - from doer to empowerer - that applies at every level of the organisation.
Morris’s Five Positions
Morris’s core move is to reject the binary. The debate in most teams is framed as “do we let agents ship code, or do we review every diff?” That framing is a trap. He lays out five distinct positions of human involvement, only one of which is actually productive today:
| Position | What the human does | What it feels like |
|---|---|---|
| Outside the loop | Sets the product direction, lets agents do everything downstream | Pure vibe coding; only works for throwaway or contained work |
| In the loop | Inspects each artifact the agent produces | Micromanagement; humans become the bottleneck |
| On the loop | Builds and maintains the harness - specs, quality checks, workflow rules | The productive zone today |
| Agents manage the harness | Humans direct agents that are improving the harness itself | Emerging frontier |
| The agentic flywheel | Agents analyse loop performance against richer signals (metrics, user journeys, commercial outcomes) and propose improvements, some auto-approved | Self-improving system, still engineered |
The two positions everyone defaults to - in the loop and outside the loop - are the two that don’t work. In the loop recreates the original bottleneck: the agent can generate a week of code in an hour, and the human pretending to review it is either lying or holding everything up. Outside the loop is the YOLO failure mode: you lose internal quality, which turns out to matter precisely because it compounds into external outcomes. Morris’s line is sharp: “clean codebases help agents work faster; spiralling on messy code wastes time and resources.”
The productive zone is on the loop. Stop fixing individual artifacts. Start improving the system that produces them. This is pure shift-left thinking applied to agents: instead of inspecting every PR, you invest in specifications, evaluation criteria, architectural rules, and review gates that catch issues at the source. The harness becomes the leverage point.
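To make "catch issues at the source" concrete, here is a minimal sketch of a harness gate: review judgments encoded once as automated checks, so no human re-makes them per diff. Every rule, name, and threshold below is an invented example, not anything Morris specifies.

```python
# Illustrative harness gate: architectural rules applied mechanically to a
# diff, instead of a human reading it. All rules and limits are assumptions.
import re

MAX_DIFF_LINES = 400  # assumed limit: keep agent changes machine-checkable
FORBIDDEN = [r"print\(", r"TODO", r"eval\("]  # example "non-negotiable" rules

def gate(diff_text: str) -> list[str]:
    """Return a list of violations; an empty list means the diff passes."""
    violations = []
    added = [line for line in diff_text.splitlines() if line.startswith("+")]
    if len(added) > MAX_DIFF_LINES:
        violations.append(f"diff too large: {len(added)} added lines")
    for pattern in FORBIDDEN:
        if any(re.search(pattern, line) for line in added):
            violations.append(f"forbidden pattern: {pattern}")
    return violations

print(gate("+def retry():\n+    return 1"))  # → []
print(gate("+print('debug')"))               # → ['forbidden pattern: print\\(']
```

The point of the sketch is the shape, not the rules: each check is a judgment a human made once and never has to make again.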
Why This Reads Exactly Like EMPOWERED
Here is the part that made me sit up. Substitute team for agent and manager for human, and Morris’s framework becomes a straight restatement of Marty Cagan’s EMPOWERED.
Cagan has been saying for two decades that the difference between a feature team and an empowered team is where the manager stands. A feature team manager sits in the loop: reviewing every decision, approving every solution, acting as a bottleneck and, worse, as a single point of creative failure. An empowered team manager stands on the loop: setting strategic context through vision, strategy, team objectives, and architectural guardrails, then getting out of the way while the team figures out how to solve the problem. His phrase for this, which I have been repeating in every leadership conversation I have had for years, is the same one Reed Hastings uses at Netflix: lead with context, not control.
The deep parallel with Morris’s framework is this:
- In the loop = feature-team management. You are the bottleneck, you are the single point of failure, and you produce mediocre outcomes because you are the least informed person in the room about the actual work.
- On the loop = empowered-team management. You build the conditions under which high-quality work happens without you in the critical path. You review outcomes, not artifacts. You coach and you set the context, you do not drive.
- The flywheel = a mature empowered organisation. The team doesn’t just execute the harness; they improve it. They bring data back, propose changes, and the system gets better without the manager’s constant intervention. In human terms, this is what Hastings calls high talent density with candor and context. In agentic terms, it is Morris’s flywheel.
The mechanics are identical. The hard part - for managers leading humans and for engineers leading agents - is the same one Cagan keeps hammering. You have to give up the dopamine of being the smartest person in the room. You have to trust the system you built. That trust scales. Your personal review cycles do not.
No Rules Rules Is the Operating Manual
Reed Hastings’s No Rules Rules is the book I happened to read two weeks ago and wrote about here. The line from that review to this article is embarrassingly short.
Hastings’s three levers - talent density, candor, and removing controls - are exactly the three levers that make Morris’s on-the-loop position work. Map them:
- Talent density → capable agents. Netflix fires adequate performers because one adequate teammate drags the whole team down. The AI analogue is that capable models make every other leadership decision easier. A team on Claude Sonnet 4.6 or Opus 4.7 can be trusted with larger chunks of the how loop than a team on last year’s models. Low talent density in humans forces control; low capability in agents forces micromanagement. Both are solvable, and both are prerequisites, not nice-to-haves.
- Candor → evaluation signals. Netflix’s rule is that it is disloyal to withhold disagreement. The agentic equivalent is that the harness must have honest, fast signal: unit tests, integration tests, evals, production metrics, commercial outcomes. If your evaluation system flatters the agent, you get the same result as a team where no one says what they really think. Confident-looking mediocrity that nobody challenges.
- Removing controls → shrinking the inspection surface. Netflix removed vacation tracking, expense approvals, and decision-making approvals because the controls were costing more than they saved. On the loop for agents says the same thing: stop inspecting every diff, stop gating every PR on a human read-through. Replace gates with guardrails. Replace review with design.
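The candor lever above can be sketched in code: an eval suite that grades agent output pass/fail and reports the raw rate, with no curve. The agent stub and the eval cases here are entirely invented for illustration.

```python
# Illustrative "candor" signal: hard-coded eval cases, honest pass rate.
# The agent stub and all cases are made up for the sketch.
def agent(task: str) -> str:
    # Stand-in for a real agent call; returns canned answers.
    return {"2+2": "4", "capital of France": "Paris"}.get(task, "unsure")

EVAL_CASES = [  # (task, checker) pairs: fast, unambiguous signal
    ("2+2", lambda out: out == "4"),
    ("capital of France", lambda out: "Paris" in out),
    ("refund policy edge case", lambda out: out != "unsure"),  # this one fails
]

def pass_rate() -> float:
    results = [check(agent(task)) for task, check in EVAL_CASES]
    return sum(results) / len(results)

print(f"{pass_rate():.2f}")  # → 0.67 - the flattering answer would be 1.00
```

An evaluation system that only contains cases the agent already passes is the agentic equivalent of a team where nobody disagrees.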
And then the principle that connects everything: lead with context, not control. Hastings says it about Netflix. Cagan says it about product teams. Morris says it about agents, though he doesn’t use that phrase. It is the same rule.
Drive: Autonomy, Mastery, Purpose - Applied to Agents
Daniel Pink’s Drive is the pop-psych classic in this corner of leadership thinking. His three intrinsic motivators - autonomy, mastery, purpose - are the reason context-not-control works on humans. Agents don’t have motivation in any meaningful sense, but the framework still describes what a well-designed harness gives them:
- Autonomy maps to scope and latitude. A well-scoped task with clear boundaries and explicit out-of-scope lines lets the agent make real choices within a defined space. Over-specified prompts are the agentic version of micromanaging a senior engineer into mediocrity.
- Mastery maps to progressive disclosure and skills. Agents get better at a task when they have access to the right procedural knowledge at the right time. That is what the skills framework encodes. Mastery for humans is career growth; mastery for agents is the right skill loading into context at the right moment.
- Purpose maps to clear outcomes and the why loop. Morris’s two-loop model explicitly separates the why loop (the outcome you want) from the how loop (the artifacts that deliver it). Purpose lives in the why loop. An agent that knows it is here to reduce payment-failure rate by 10% behaves differently, and makes better trade-offs, than an agent that is told to “implement the retry handler.”
The human parallel matters because it predicts where AI teams will fail. The same way human teams fail when they are given a task but not a purpose, agents fail when they are given a prompt but not an outcome. The same way human teams fail under a manager who inspects every commit, agents fail under an operator who can’t stop reading every diff. The failure modes are isomorphic, and so are the fixes.
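One way to operationalise the why/how separation is to make the outcome a first-class field in the task brief the agent receives. A minimal sketch, with every field name and example value assumed rather than drawn from any real schema:

```python
# Sketch of a task brief that carries the why loop alongside the how loop.
# Field names and values are illustrative, not a real spec format.
from dataclasses import dataclass

@dataclass
class TaskBrief:
    outcome: str             # why loop: the measurable result we want
    metric: str              # how we will know it worked
    task: str                # how loop: the artifact to produce
    out_of_scope: list[str]  # explicit boundaries = real autonomy inside them

brief = TaskBrief(
    outcome="Reduce payment-failure rate by 10%",
    metric="payment_failure_rate (weekly)",
    task="Implement the retry handler for transient gateway errors",
    out_of_scope=["changing the payment provider", "UI changes"],
)

# The prompt the agent sees leads with the outcome, not the ticket text.
prompt = f"Goal: {brief.outcome} (measured by {brief.metric}).\nTask: {brief.task}."
print(prompt)
```

The design choice worth copying is the ordering: the outcome comes before the task, so the agent can trade off implementation details against the thing that actually matters.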
What the Harness Actually Is
Morris uses harness as an abstract term. In concrete terms, the harness of an empowered agentic team is a stack of things I have been writing about for months. Each layer is one of the places the human invests instead of reviewing code:
| Harness layer | Concrete artifacts | Written up here |
|---|---|---|
| Product purpose | Vision, strategy, target outcomes, OKRs, problem statements | Insights-driven product strategy, Set expectations right |
| Architectural context | CLAUDE.md, project docs, conventions, non-negotiables | The perfect CLAUDE.md |
| Procedural discipline | Skills, slash commands, phase gates, anti-rationalisation | The skills framework |
| Specifications | Specs, task breakdowns, plan artifacts | PLAID in practice |
| Verification | Tests, evals, stress tests, metric dashboards, user journeys | Implied across the stack |
| Workflow infra | Worktrees, agent loops, headless runs, CI gates | Agentic development patterns, The overnight agent factory |
Notice that exactly none of these things are “read the agent’s diff.” That is the point. The scarce human asset is the quality of the harness, not the quantity of review. Every hour invested in a better CLAUDE.md, a sharper spec, a tighter test suite, or a more honest metric pays for itself across every agent run afterwards. Every hour spent reading a diff pays for itself once, maybe.
This is exactly the “stop touching code, start building the system that writes good code” message that senior engineering leaders have been trying to internalise for thirty years. The AI era simply makes the cost of being in the loop visible in a way it never was before.
The Flywheel Is OKRs for Agents
Morris’s most interesting move comes at the end, where he describes the agentic flywheel. Agents analyse the performance of the loop against richer and richer signals - not just unit tests, but pipeline metrics, user journeys, commercial outcomes - and propose harness improvements with risk, cost, and benefit scores. High-confidence recommendations can be auto-approved. The system improves itself, while remaining engineered.
Strip the AI vocabulary and this is the product operating model every empowered company aspires to run on. You set outcomes. You instrument the product. You review the data. You propose changes scored by expected impact. You ship the high-confidence ones, experiment with the medium ones, and discuss the risky ones in a room with the people who will own them.
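The triage described above can be sketched as a routing function over scored proposals. Morris describes the idea, not a schema; the score fields, thresholds, and example proposals here are all assumptions.

```python
# Illustrative triage for flywheel proposals: auto-approve high-confidence,
# low-risk changes; experiment with the middle; escalate the risky ones.
# Fields and thresholds are invented for the sketch.
def triage(proposal: dict) -> str:
    """Route a harness-improvement proposal by confidence and risk scores."""
    confidence, risk = proposal["confidence"], proposal["risk"]
    if confidence >= 0.9 and risk <= 0.2:
        return "auto-approve"   # ship it; the harness absorbs it
    if confidence >= 0.6 and risk <= 0.5:
        return "experiment"     # trial behind a flag, watch the metrics
    return "discuss"            # humans in the room with the owners

proposals = [
    {"id": "tighten-lint-rule", "confidence": 0.95, "risk": 0.1},
    {"id": "rewrite-test-suite", "confidence": 0.7, "risk": 0.4},
    {"id": "drop-review-gate", "confidence": 0.8, "risk": 0.8},
]
for p in proposals:
    print(p["id"], "->", triage(p))
# tighten-lint-rule -> auto-approve
# rewrite-test-suite -> experiment
# drop-review-gate -> discuss
```

The notable property is that human attention is spent only where the system is uncertain or the blast radius is large, which is exactly how a good product review cadence works.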
What Morris is describing is OKRs for agents: a flywheel where the harness evolves from production signal the way a good product team evolves strategy from insights. This is why executives reading his piece should not think of it as an AI infrastructure piece. They should think of it as the opportunity to apply, at last, the same discipline to code production that we have spent two decades trying to apply to product decisions.
The teams that will win the next two years are the ones that already run this way on the product side. Empowered teams with honest metrics will not find the agentic flywheel foreign. They will find it trivial, because the muscle is already there.
The CPTO Mandate Is Changing, Not Disappearing
I have been sitting with what this means for my own role, given I am about to join a fast-growing London startup as a CPTO. The short answer is that the job is changing in the same direction every good CPTO role should have been pointing anyway. Further from the artifacts, closer to the system.
In concrete terms:
- Your product vision and strategy matter more, not less. Agents with a bad spec produce bad code faster than humans ever could. If you cannot articulate the outcome in a way the harness can transmit to the agent, you are not a leader. You are a bottleneck.
- Your engineering standards matter more, not less. CLAUDE.md is the new architecture document. Skills are the new engineering handbook. Evals are the new code review. If these are sloppy, the agents are sloppy. If these are rigorous, the agents are rigorous.
- Your team’s talent density matters more, not less. Not because agents replace engineers, but because the agentic flywheel amplifies whatever level of taste and rigor the team starts with. High talent density with agents is a cheat code. Low talent density with agents is a confident-sounding disaster at three times the speed.
The CPTOs who struggle will be the ones who try to migrate their existing micromanagement habits into the AI era. “I’ll just review every agent PR like I used to review every senior engineer’s PR.” That works for a month. It falls apart the moment the team scales past a single human-in-the-loop, which in an agentic company is about eight weeks in.
The CPTOs who thrive will be the ones who treated empowered as an operating principle, not a slogan, before any of this. For them the transition is mechanical: the same harness they were already building for their humans now extends to the agents the humans orchestrate.
Three Moves I Am Making Tomorrow
Writing this article changed three things in my own planning for the next engagement. Concrete, not abstract:
- Invest in the harness before the first feature. My first two weeks at the new company will be CLAUDE.md, skills, evals, and architectural guardrails, before anyone writes a production feature with agents. Building the harness is day-zero work, the way hiring the first senior engineer used to be day-zero work.
- Treat evaluations like metrics, not tests. Move eval data into the same dashboards as product metrics. If the team sees “agent success rate on payment flow tasks” next to “payment failure rate,” they will naturally invest in the first because it moves the second. That is the flywheel.
- Stop reading diffs where a skill would do the job. Every time I catch myself about to review something, I am going to ask “is this review reproducible as a rule, a test, or a skill?” If yes, I encode it once and never do the review again. If no, it was genuinely a judgment call and deserves my attention. The ratio of the former to the latter is ten to one.
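The third move above, "encode it once and never do the review again," can be sketched as a single rule. The rule itself is an invented example of a judgment a reviewer might otherwise re-make on every diff:

```python
# One review judgment ("retries must back off") encoded once as a rule,
# then applied to every future diff. The rule is an invented example.
import re

def check_retries_back_off(source: str) -> bool:
    """If the code retries, it must mention a backoff; judged once, run forever."""
    retries = re.search(r"\bretry\b", source, re.IGNORECASE)
    backoff = re.search(r"\bbackoff\b", source, re.IGNORECASE)
    return (not retries) or bool(backoff)

assert check_retries_back_off("def save(): pass")                # no retry: fine
assert check_retries_back_off("retry with exponential backoff")  # retry + backoff
assert not check_retries_back_off("retry immediately in a loop") # caught forever
```

A real version would live in CI or in a skill, but the economics are the same: the judgment call is made once and amortised across every subsequent agent run.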
The Takeaway
The instinct that leading AI is a new discipline is wrong. The managers who were good at the old one - setting context, empowering teams, leading with outcomes, removing their own bottleneck from the critical path - are the ones who will be good at this one. The ones who hid behind artifact review as their value-add are about to find out they were never adding the value they thought.
Kief Morris’s framework is a rigorous, engineering-specific version of the empowerment playbook Cagan, Hastings, and Pink have been writing for twenty years. The fact that it arrives in March 2026 as “the agentic flywheel” rather than “empowered engineering teams” is a branding accident. The substance is the same.
Read Morris’s piece. Then re-read EMPOWERED and No Rules Rules with agents in mind. If you are a CPTO, VP Eng, or founder making calls about how your team uses AI this year, those three sources contain most of what you need. The rest is just doing it.
References
- Humans and Agents in Software Engineering Loops - Kief Morris - The source article on Martin Fowler’s site
- EMPOWERED - Marty Cagan (book notes) - Empowered product teams, context-not-control, coaching
- No Rules Rules - Reed Hastings (book review) - Talent density, candor, and the operating manual for high-autonomy teams
- Vibe Coding in Prod Is a Management Problem - The individual-contributor frame that pairs with this one
- The Skills Framework - The procedural discipline layer inside the harness
- The Perfect CLAUDE.md - The architectural context layer inside the harness
- Agentic Development Patterns - The daily workflow that sits on top
- The Overnight Agent Factory - The compounding mechanism
- Product Operating Model - What empowered teams look like when the operating model is right