AI-Augmented Delivery

When the agent does the work, who signs?

AI now drafts the report, flags the risk, and moves the ticket. None of that changes who owns the outcome. The accountability boundary is the one design decision that determines whether AI makes your program safer — or just faster on the way to the cliff.

Loy O'Kelley · Program Director · 7 min read · Published June 10, 2026

Every AI conversation on a program eventually arrives at the same question, usually asked sideways: if the agent produced it, who's responsible for it? The sponsor asks it about the status report. Legal asks it about the requirements traceability. The team asks it, quietly, about the estimate the model generated that everyone now treats as a commitment.

The answer hasn't changed in the entire history of delegated work, and it didn't change when the delegate stopped being human: the person who signs owns it. The Army taught me this long before AI did. A platoon leader doesn't get to tell the battalion commander that the bad grid coordinate came from a private. The work travels up; the accountability doesn't travel down. An agent is the most junior member of your team — tireless, fast, occasionally brilliant, and completely indifferent to consequences. You can delegate the labor. You cannot delegate the signature.

Where programs get this wrong

The failure mode isn't dramatic. Nobody announces that accountability has gone missing. It erodes in three quiet steps.

1. The unread draft becomes the record

Week one, the AI-drafted status goes out after a careful read. Week six, it goes out after a skim. Week twelve, it goes out. The draft was never wrong enough to force attention, so attention stopped being paid — and the program now has an official record that no accountable human has actually verified. When something in it is eventually wrong, the discovery happens in front of the steering committee.

2. The model's confidence becomes the team's confidence

Language models produce fluent, structured, assured prose whether they're right or wrong. On a program, fluency reads as diligence. An estimate with three confident paragraphs behind it gets less scrutiny than a number scrawled by a nervous engineer — and deserves more. I've written before about using AI to surface risk earlier; the same machinery, unowned, manufactures false assurance at scale.

3. Decision rights blur into automation

It starts with the agent moving tickets. Then re-sequencing the sprint. Then deprioritizing a dependency that, it turns out, a regulator cared about. No one decided to hand over that decision. It was absorbed, one harmless automation at a time. Gartner's projection that AI will absorb 80% of project-management tasks is plausible precisely because tasks transfer easily. Decisions shouldn't.

The principle

Draw the line explicitly: agents produce inputs to decisions, humans make decisions, and every artifact that leaves the program carries a named human owner who can defend it without mentioning the tool.

The accountability boundary, in practice

On the programs I run, the boundary is written down, the way decision rights always should have been. It fits on one page. Every recurring artifact — status, risk log, estimate, release note — has a named owner, and the owner's name appears on the artifact, not the tool's. Anything an agent produces that will be seen outside the team passes through its owner, and "I reviewed it" means the owner can answer questions about it cold. Agents get write access to drafts and read access to everything; they get no unilateral authority over scope, sequence, commitments, or communications to anyone who can cancel the program. And when an agent-produced artifact turns out wrong, the retrospective question is never "why did the model err" — models err; that's the operating assumption — but "why didn't the review catch it."
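For teams that want the one-page boundary to be checkable rather than only written down, it can be encoded as data. What follows is a minimal sketch under stated assumptions, not a description of any particular tool: the names `Artifact`, `agent_may`, and `release_check`, and the example owner, are all hypothetical and exist only to illustrate the rules in the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Artifact:
    """Hypothetical record for one recurring artifact."""
    kind: str                       # e.g. "status", "risk_log", "estimate", "release_note"
    drafted_by_agent: bool          # True if an agent produced the draft
    owner: Optional[str] = None     # named human owner; never the tool's name
    owner_reviewed: bool = False    # owner can answer questions about it cold
    external: bool = False          # will be seen outside the team


def agent_may(action: str, target: str) -> bool:
    """Agents read everything and write drafts; every other request is denied."""
    if action == "read":
        return True
    if action == "write":
        return target == "draft"
    # Scope, sequence, commitments, and communications stay with humans.
    return False


def release_check(artifact: Artifact) -> Tuple[bool, str]:
    """Decide whether an artifact may go out, and say why not if it may not."""
    if not artifact.owner:
        return False, "no named human owner"
    if artifact.external and artifact.drafted_by_agent and not artifact.owner_reviewed:
        return False, "agent draft has not passed through its owner"
    return True, "ok"


# Illustrative use: an agent-drafted status report bound for the sponsor.
status = Artifact(kind="status", drafted_by_agent=True,
                  owner="Jordan Rivera", external=True)  # owner name is made up
print(release_check(status))   # blocked until the owner has reviewed it
status.owner_reviewed = True
print(release_check(status))   # now allowed to leave the team
```

The design choice worth noting is the default: anything not explicitly granted to the agent is denied, so a new automation has to be added to the page on purpose rather than absorbed by accident.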

Notice what this isn't. It isn't a brake on adoption, and it isn't ceremony. The teams that draw this line move faster, because review is cheap when it's designed in and ruinous when it's retrofitted after an incident. The discipline is the thing that makes the speed safe — the same reason automated reporting only works when it's wired to tell the truth.

How to actually do this

Put a named human owner on every artifact that leaves the team. The tool's name never appears where the owner's should.
Write the decision-rights line down: agents draft and detect; humans decide scope, sequence, commitments, and external communications.
Make review real. If the owner can't defend the artifact without citing the tool, it wasn't reviewed.
Audit the drift quarterly. List every decision agents have quietly absorbed since the last check, and reclaim the ones that matter.
Run retrospectives on the review, not the model. The model erring is weather; the review missing it is process.
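
The quarterly drift audit in the list above can be sketched as a filter over whatever action log the program already keeps. This is an illustration under assumptions, not a prescribed implementation: the `LoggedAction` record, the category labels, and the dates are hypothetical, and a real log would need a human to classify what each action actually amounted to.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List

# Decision categories the boundary reserves for humans.
HUMAN_ONLY = frozenset({"scope", "sequence", "commitments", "external_communications"})


@dataclass(frozen=True)
class LoggedAction:
    """Hypothetical log entry; field names are assumptions for this sketch."""
    when: date
    actor: str       # "agent" or a person's name
    category: str    # what kind of decision the action amounted to
    summary: str


def absorbed_decisions(log: Iterable[LoggedAction], since: date) -> List[LoggedAction]:
    """List agent actions since the last audit that fall in human-only categories."""
    return [
        entry for entry in log
        if entry.actor == "agent"
        and entry.when >= since
        and entry.category in HUMAN_ONLY
    ]


# Illustrative log with made-up dates, echoing the three steps of decision drift.
log = [
    LoggedAction(date(2026, 4, 2), "agent", "ticket_status", "moved ticket to done"),
    LoggedAction(date(2026, 4, 20), "agent", "sequence", "re-sequenced the sprint"),
    LoggedAction(date(2026, 5, 11), "agent", "scope", "deprioritized a dependency"),
    LoggedAction(date(2026, 5, 12), "Jordan Rivera", "scope", "deferred a feature"),
]

for entry in absorbed_decisions(log, since=date(2026, 4, 1)):
    print(entry.when, entry.category, entry.summary)
```

The output is a list for a human to read, not an enforcement mechanism. Deciding which absorbed decisions to reclaim remains the owner's call.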

The bottom line

AI hasn't changed what accountability is. It has changed how quickly you can lose track of it. The programs that get durable value from agents aren't the ones that automate the most; they're the ones where, for any artifact you point at, someone with a name and a stake will say, "That one's mine." Build that boundary before the first incident, and AI makes your program genuinely safer. Skip it, and you've built a machine for producing unowned mistakes at unprecedented speed.

Projection from Gartner via ITPro. Views are my own.

