AI-Augmented Delivery

The delivery stack behind an AI-augmented program

Not a tool list. A working philosophy: the model thinks, the automation moves, and a human owns the boundary between them. Here's what I actually run, and why.

Loy O'Kelley · Program Director · 8 min read · Published May 31, 2026

People ask me which AI tool to buy. It's the wrong question, and the wrong question costs them six months. A stack isn't a tool — it's a division of labor. On a program, I think about three layers: the layer that reasons, the layer that moves work between systems, and the human layer that decides what's allowed to happen without asking. Get the layers right and the specific products are almost interchangeable. Get them wrong and the best model in the world just helps you make mistakes faster.

Layer one: the reasoning layer

This is the language model — for me, primarily Claude, with Microsoft Copilot where the work lives inside the Microsoft 365 estate a client already runs on. This layer does the thinking that used to eat a delivery lead's week: reading the full RAID log and telling me what changed, drafting the three-audience status from the system of record, turning a messy stakeholder thread into a decision and an owner, stress-testing a plan by arguing the other side.

What it does not do is touch anything that matters on its own. The reasoning layer proposes. It never disposes. That distinction is the whole safety model, and most failed AI rollouts are a failure to hold it.
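A minimal sketch of that boundary, assuming Python and entirely hypothetical names (`Proposal`, `approve`, `dispose`, and the action string are illustrations, not any product's API). The idea is that the reasoning layer can only produce a proposal object, and the one function that acts refuses anything without a recorded human approval.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    """Something the reasoning layer suggests. It carries no ability to act."""
    action: str                        # hypothetical action name
    payload: str                       # the drafted content
    approved_by: Optional[str] = None  # set only by a human decision


def approve(proposal: Proposal, human: str) -> Proposal:
    """Record the human decision and return a new, approved copy."""
    return replace(proposal, approved_by=human)


def dispose(proposal: Proposal) -> str:
    """The only path that acts. Refuses anything a human has not approved."""
    if proposal.approved_by is None:
        raise PermissionError(f"{proposal.action}: no human approval recorded")
    return f"executed {proposal.action} (approved by {proposal.approved_by})"
```

Because `Proposal` is frozen, approval produces a new object rather than mutating the draft, so the unapproved original can never be quietly upgraded in place.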

Layer two: the automation layer

Reasoning is useless if a human still has to copy the output into five systems by hand. The automation layer is the plumbing that moves work: Power Automate inside the Microsoft estate, and Make or n8n when I need flexible, low-friction orchestration across tools that don't natively talk — Jira to Confluence to email to a dashboard. n8n in particular earns its place when a client wants the automation self-hosted for data-control reasons, which in government and financial-services work is most of the time.

This is where the 20–40% reductions in coordination overhead actually come from. Not from the model being clever — from the work moving without a human shuttling it. The model decides the weekly status is ready; the automation routes it, files it, and posts it to the right channel at the right altitude.
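To make "the automation moves" concrete, here is a rough sketch of a deterministic routing step in Python. The routing table, destination strings, and detail levels are all assumptions for illustration; in practice this logic would live in a Power Automate, Make, or n8n flow rather than a script.

```python
from typing import Dict, List, Tuple

# Hypothetical routing table: audience -> (destination, level of detail).
ROUTES: Dict[str, Tuple[str, str]] = {
    "executive": ("email:steering-committee", "summary"),
    "delivery": ("chat:delivery-leads", "detail"),
    "team": ("wiki:weekly-status", "full"),
}


def route_status(versions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each drafted version with its destination. No judgment calls here."""
    deliveries = []
    for audience, (destination, level) in ROUTES.items():
        if level not in versions:
            # Fail loudly instead of sending the wrong altitude to an audience.
            raise KeyError(f"no '{level}' version drafted for {audience}")
        deliveries.append((destination, versions[level]))
    return deliveries
```

The routing itself contains no reasoning: given the same drafted versions it always produces the same deliveries, which is what makes it safe to automate first.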

The division of labor

The model thinks. The automation moves. The human owns the boundary. Every tool slots into one of those three jobs. If a tool is doing two of them unsupervised, that's not a stack — that's an incident waiting to happen.

Layer three: the layer everyone skips

The third layer isn't a product. It's the governance that decides what the first two layers are allowed to do without a human in the loop, and skipping it is a big part of why Gartner expects more than 40% of agentic AI projects to be cancelled by the end of 2027. The teams that fail didn't pick the wrong model. They wired a confident automation into a regulated process with no decision rights, no RACI for exceptions, and no audit trail, and then acted surprised when it did exactly what they told it to.

Before an agent touches anything that's regulated or revenue-impacting, three things exist in writing: who holds the human decision right, who owns the exception when the agent is wrong, and where the audit trail lives. In a Medicaid platform overhaul or a financial-services migration, that scaffolding isn't bureaucracy — it's the difference between automation and an unowned liability. I learned this discipline in the Army, where putting an autonomous capability into the field without defined control was never a productivity question. It was a safety one. Delivery is the same; the stakes are just denominated in dollars and compliance findings instead.
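One way to make the three written controls checkable is a small gate like the sketch below. It assumes Python, and the record type, field names, and function names are hypothetical. It only verifies that each control has been filled in; whether the controls have actually been tested is still a human call.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class GovernanceRecord:
    process: str                # the process the agent would touch
    decision_right_holder: str  # who holds the human decision right
    exception_owner: str        # who owns the exception when the agent is wrong
    audit_trail_location: str   # where the audit trail lives


def missing_controls(record: GovernanceRecord) -> List[str]:
    """Names of any controls left blank, in declaration order."""
    return [
        f.name
        for f in fields(record)
        if f.name != "process" and not getattr(record, f.name).strip()
    ]


def may_run_unsupervised(record: GovernanceRecord) -> bool:
    """True only when all three controls are written down."""
    return not missing_controls(record)
```

A blank or whitespace-only entry counts as missing, so a placeholder row in a spreadsheet does not pass the gate.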

How to actually build it

The mistake is buying the whole stack on day one. I build it the way I'd clear a route — one confirmed step at a time:

Start with the reasoning layer, read-only

Point the model at your systems of record with no write access. Let it draft status, summarize risk, surface deltas. Zero blast radius, immediate time savings. You learn where it's reliable before you give it any reach.
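As an illustration of "read-only" and "surface deltas", the sketch below compares two snapshots of a RAID log through a view that cannot be written to. It assumes Python and a deliberately simplified log shape (item ID mapped to a status string); the function names and IDs are hypothetical.

```python
from types import MappingProxyType
from typing import Dict, List, Mapping


def read_only(snapshot: Dict[str, str]) -> Mapping[str, str]:
    """Wrap a copy of the snapshot in a view that rejects writes."""
    return MappingProxyType(dict(snapshot))


def surface_deltas(
    previous: Mapping[str, str], current: Mapping[str, str]
) -> Dict[str, List[str]]:
    """List item IDs that were added, closed, or changed between snapshots."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "closed": sorted(previous.keys() - current.keys()),
        "changed": sorted(k for k in shared if previous[k] != current[k]),
    }
```

Any attempt to assign into the wrapped snapshot raises a `TypeError`, so code built on top of it can summarize and compare but has no way to alter the record.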

Add automation only where the path is proven

Automate the boring, deterministic moves first — routing a report, updating a dashboard, filing a summary. Keep the model's output in front of a human until the pattern has earned trust.

Govern before you go autonomous

Only let the stack act without a human after the decision rights, exception ownership, and audit trail are written down and tested. Get data quality right before any of that: an AI-ready foundation is the first thing the 2026 PMO playbooks tell you to fix, because agents inherit the quality of their inputs.

The stack, in one screen

Reasoning: Claude, Microsoft Copilot — reads, drafts, analyzes. Proposes, never disposes.
Automation: Power Automate, Make, n8n — moves work between systems so no human shuttles it by hand.
Governance: human decision rights, RACI for exceptions, audit trail — written before anything runs unsupervised.
Sequence: read-only reasoning first, proven automation second, autonomy last and only with controls.

The point

The tools will change. Two years from now half the products in this article will have better replacements, and it won't matter, because the architecture holds: something reasons, something moves, and a human owns the line between proposing and doing. That's not an AI strategy. It's how you run a disciplined program that happens to use AI. The discipline is the part clients are actually paying for. The tools are just the instruments I run it on.

Figures from TechPlusTrends, Epicflow, and Cora Systems. Tool references reflect my own working stack; views are my own.
