Who Holds the Pen? Why AI Agents Need a Manager, Not Just a Manual
Today, in our rush to kill backlogs and modernize services, we are liquidating the human loop. We see a win when New Jersey uses AI to sign up 100,000 kids for summer benefits, or when an agency uses a chatbot to deflect thousands of calls. These are genuinely good outcomes. One hundred thousand children got access to food assistance they were already eligible for but couldn't easily claim. That matters.
But good outcomes alone do not constitute architecture.
We are grafting AI agents onto forty-year-old policy frameworks at the exact moment those frameworks are under enormous pressure to change. With H.R. 1 bringing massive new complexities to benefits administration—shifting work requirements, frequent verifications, and evolving eligibility definitions—the stakes are compounding.
When eligibility rules change quarterly, systems that can't show their work become actively dangerous. When we used paper forms, we could blow the whole thing up and iterate because the friction was visible.
The Problem of Invisible Discretion
Consider credit scores. Whether you're approved for a loan involves data collection, scoring, review, and disclosure. Under the Fair Credit Reporting Act (FCRA), if you are denied credit, you are entitled to understand why. Legal frameworks exist because we decided as a society that people deserve explanations when decisions affect their lives.
In the agentic era, none of this exists by default. Academic researchers and experts are raising awareness, standards bodies like NIST are doing admirable work, and advocacy groups like AI Now have raised alarms about algorithmic harm for years. But inside government agencies, on the vendor side, and among policy teams, incentives pull in different directions. Everyone believes they are acting in good faith. This is precisely how blind spots form.
Almost daily, there are stories of software deployed to clear a backlog, only for someone to discover an assumption buried in the stack that causes harm at scale. Late last year, about 325,000 Californians needed new Real IDs because of a software error in residency verification. That's one example we caught. What happens when the error is in how eligibility rules for Medicaid or SNAP are interpreted under new H.R. 1 provisions? How long before we notice, and who gets hurt before it's discovered?
The "Hot Coffee" Moment for AI
At some point, there will be an AI-era equivalent of Liebeck v. McDonald's Restaurants, the infamous "Hot Coffee" case.
McDonald's had received over 700 reports of burns and had paid more than $500,000 to settle earlier claims, yet its quality control managers testified that the number of injuries was insufficient to change its practices. The "Hot Coffee" moment happened because an institution ignored a documented pattern of harm in favor of operational efficiency.
We are building that same scenario in AI by prioritizing deflection and throughput while ignoring the 700 "burns" happening in the edge cases.
What does burn #247 look like? A benefits application was denied because the AI interpreted "household size" using census definitions rather than program-specific rules, and the applicant never knew why. What does burn #498 look like? A Medicaid renewal was flagged as fraudulent because the system couldn't reconcile two legitimate addresses during a custody arrangement. What does burn #698 look like? A veteran's disability claim was routed to the wrong review queue because the AI extracted service dates incorrectly from a scanned DD-214, adding six months to an already backlogged process.
The errors compound silently until someone with resources pushes back, and by then, thousands of people have already been affected. Consequential systems need designed authority, enforced escalation, and receipts that prove who held the pen.
What's missing in all of these cases isn't better intent or smarter models. It's the absence of shared infrastructure for how authority works once decisions are delegated to machines. We have names for access control, orchestration, and compliance. We don't have names for how discretion, escalation, and signing authority are supposed to function when an AI system proposes an action instead of a person. I propose decision engineering, judgment routers, and decision receipts as frameworks for addressing these gaps.
Decision Engineering: Making Authority Explicit
Decision Engineering is the practice of mapping institutional authority and discretion into explicit, maintainable structures. When a clerk reviewed forms and passed edge cases to a supervisor, that was decision engineering. It was informal, human, and slow, but it was visible and traceable. We need that same clarity when AI systems process requests: documented, inspectable, and authorized.
Decision engineering is a practice that improves how existing roles interact with AI deployment. Procurement officers learn to write technical requirements that specify authority boundaries. Designers learn to surface judgment points, make authority legible in interfaces, and shape how escalation and review are experienced by humans. Product teams learn to map policy intent into system constraints. Engineers learn to build systems that degrade gracefully when they hit the edge of their delegation.
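To make "explicit, maintainable structures" concrete, here is a minimal sketch of what an authority map could look like in code. The decision types, roles, dollar thresholds, and citations below are hypothetical placeholders, not real program rules.

```python
# Hypothetical authority map: decision types, roles, thresholds, and
# citations are placeholders for illustration, not real policy.
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationRule:
    decision_type: str        # e.g. "eligibility_determination"
    autonomous: bool          # may the system act without a human?
    max_dollar_impact: float  # ceiling on downstream financial impact
    escalate_to: str          # role that holds the pen beyond that ceiling
    source_of_authority: str  # where this delegation actually comes from


AUTHORITY_MAP = [
    DelegationRule("document_classification", True, 0.0,
                   "eligibility_worker", "agency SOP (placeholder)"),
    DelegationRule("eligibility_determination", False, 2_000.0,
                   "supervisor", "program regulation (placeholder)"),
    DelegationRule("benefit_termination", False, 0.0,
                   "program_director", "terminations always require sign-off"),
]


def find_rule(decision_type: str) -> DelegationRule:
    """Look up who is allowed to decide what; fail loudly if unmapped."""
    for rule in AUTHORITY_MAP:
        if rule.decision_type == decision_type:
            return rule
    raise LookupError(f"No delegation rule for '{decision_type}'; escalate to a human.")
```

The point is not these particular fields. It is that delegation lives in an inspectable artifact that policy, product, and engineering can all read, rather than in an engineer's head.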
Judgment Routers: Infrastructure for Authorized Agency
Judgment routers sit between autonomous agents and human authority, routing decisions rather than tasks. Unlike traditional orchestrators (which route tasks) or access control (which grants permissions), judgment routers route decisions with full audit trails.
How it works:
The Router evaluates the Agent's proposed action across four core signals:
UNCERTAINTY: Data ambiguity or conflicting requirement logic.
STAKES: Budget impact or downstream risk of being wrong.
AUTHORITY: Whether this action requires sign-off above current delegation.
NOVELTY: A familiar pattern vs. a first-of-its-kind scenario.
Based on these signals, the system determines the path:
FAST PATH: Low risk and uncertainty. Execute immediately.
HUMAN ESCALATION: High stakes or authority gap. Queue for human review via a Decision Package.
When an agent hits its limit, the system degrades gracefully from autonomous execution back to human triage. It reverts to the queue rather than hallucinating compliance or failing silently. The NIST AI RMF calls for maintaining human oversight of AI systems; the judgment router is one practical way to enforce that requirement in production.
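Here is a minimal sketch of what that routing logic might look like. The thresholds, signal names, and data shapes are illustrative assumptions; a production router would calibrate them against real policy and delegation rules.

```python
# Illustrative judgment-router sketch: thresholds and data shapes are
# assumptions, not a reference implementation.
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class RoutingSignals:
    uncertainty: float   # 0-1: data ambiguity or conflicting requirement logic
    stakes: float        # 0-1: budget impact or downstream risk of being wrong
    authority_gap: bool  # does the action exceed the agent's current delegation?
    novelty: float       # 0-1: familiar pattern vs. first-of-its-kind scenario


@dataclass
class DecisionPackage:
    proposed_action: str
    signals: RoutingSignals
    reason: str  # why the router escalated, in reviewer-readable terms


Route = Literal["FAST_PATH", "HUMAN_ESCALATION"]


def route(action: str, signals: RoutingSignals,
          uncertainty_max: float = 0.3, stakes_max: float = 0.3,
          novelty_max: float = 0.5) -> tuple[Route, Optional[DecisionPackage]]:
    """Decide whether a proposed action executes now or queues for human review."""
    if signals.authority_gap:
        return "HUMAN_ESCALATION", DecisionPackage(
            action, signals, "Action exceeds current delegation; sign-off required.")
    if (signals.uncertainty > uncertainty_max
            or signals.stakes > stakes_max
            or signals.novelty > novelty_max):
        return "HUMAN_ESCALATION", DecisionPackage(
            action, signals, "Signal above threshold; degrade to human triage.")
    return "FAST_PATH", None  # low risk and uncertainty: execute immediately
```

The design choice that matters is the default: when any signal crosses its threshold or the authority check fails, the router escalates rather than executes.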
Right now, compliance in AI deployment is mostly narrative. You write documents explaining your governance approach, you check boxes, you submit for ATO. But when the system is live and making thousands of decisions daily, how do you prove ongoing compliance? Decision receipts give you continuous evidence rather than point-in-time documentation.
Decision Receipts: The Audit Trail and the Feedback Loop
The final output is a Decision Receipt, a structured artifact that links human authority directly to agent action.
This receipt provides institutional memory. Six months later, when someone asks why an applicant was approved or denied, you have a trail that shows which rules were applied, who authorized those rules, and what signals the system evaluated.
But the receipt does more than create a forensic record. When a novelty score triggers escalation, that decision package goes to a human reviewer. The reviewer's decision—approve, deny, or modify—feeds back into the authority model. If the same pattern appears again, the system knows how to handle it. The receipt becomes instructional, not just archival.
This feedback loop solves the learning problem. AI systems improve through exposure to edge cases, but without a mechanism to capture human judgment on those cases, the system can't learn what the organization actually wants. Decision receipts close that loop.
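As a sketch of what such a receipt could contain, the structure below captures authority, signals, route, and the reviewer's eventual judgment in a single record. The field names and the deliberately simplistic authority model are assumptions for illustration, not a standard.

```python
# Illustrative decision-receipt structure: field names are assumptions,
# chosen to show what one inspectable record per consequential action holds.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DecisionReceipt:
    receipt_id: str
    timestamp: datetime
    proposed_action: str
    rules_applied: list[str]        # which policy rules the agent relied on
    authorized_by: str              # the delegation rule or human behind the action
    signals: dict[str, float]       # uncertainty, stakes, authority, novelty
    route: str                      # "FAST_PATH" or "HUMAN_ESCALATION"
    reviewer: Optional[str] = None            # filled in when a human held the pen
    reviewer_decision: Optional[str] = None   # approve / deny / modify


def record_review(receipt: DecisionReceipt, reviewer: str, decision: str,
                  authority_model: dict[str, str]) -> None:
    """Capture the human judgment on an escalated case and close the loop.

    The next time the same pattern appears, the authority model already
    knows how the organization wants it handled.
    """
    receipt.reviewer = reviewer
    receipt.reviewer_decision = decision
    authority_model[receipt.proposed_action] = decision
```

record_review shows the feedback loop in miniature: the reviewer's outcome is stored on the receipt and pushed back into the authority model, so the receipt is instructional as well as archival.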
What Procurement Should Demand
If judgment routing is parallel infrastructure that makes compliance enforceable, then procurement teams need to know what to ask for. Right now, agencies buy AI tools from vendors who promise compliance but deliver black boxes.
A judgment-router-compatible system should provide:
Explicit delegation boundaries: The vendor's system must document what decisions it can make autonomously and what requires escalation. This should be configurable, not hardcoded.
Routing signal exposure: The system must expose uncertainty, stakes, authority, and novelty scores for every decision. These can't be proprietary black box calculations.
Decision receipt generation: Every consequential action must produce a structured artifact linking the decision to institutional authority. The receipt format should be standardizable across vendors.
Human review integration: When the system escalates a decision, it must package the context in a format human reviewers can actually evaluate. No "the AI flagged this as risky" without showing the underlying signals.
Feedback loop support: The system must allow human decisions on escalated cases to update the authority model. If a supervisor approves something the AI thought required director-level sign-off, that judgment should refine future routing.
An RFP for benefits eligibility automation should require these capabilities as technical specifications, not nice-to-haves. Vendors who can't demonstrate judgment routing infrastructure can't credibly claim their systems maintain human oversight.
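One way to make those specifications concrete is to express them as an interface the vendor's system must implement. The sketch below is a hypothetical contract mapping to the five capabilities above; the method names and return shapes are assumptions, not an existing standard.

```python
# Hypothetical procurement contract: an interface a vendor system would
# implement. Method names and return shapes are assumptions, not a standard.
from typing import Protocol


class JudgmentRouterCompatible(Protocol):
    def delegation_boundaries(self) -> dict[str, str]:
        """Document which decisions run autonomously and which must escalate."""
        ...

    def routing_signals(self, decision_id: str) -> dict[str, float]:
        """Expose uncertainty, stakes, authority, and novelty for any decision."""
        ...

    def decision_receipt(self, decision_id: str) -> dict:
        """Produce the structured artifact linking the action to authority."""
        ...

    def escalation_package(self, decision_id: str) -> dict:
        """Package the context so a human reviewer can actually evaluate it."""
        ...

    def apply_review(self, decision_id: str, reviewer: str, outcome: str) -> None:
        """Feed the human decision on an escalated case back into the authority model."""
        ...
```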
Every assumption baked into these systems becomes harder to find, question, and fix. The hot coffee moment is coming. Whether we will have the infrastructure to understand what happened and fix it, or spend years reverse engineering decisions from systems we can no longer interrogate, will be decided by what we do at this critical moment.