Supervising Autonomous Agents: Approval Workflows that Scale
Agentic AI
The hardest part of running autonomous agents in production is not building them. It is deciding which actions need a human, when to auto-pause, and how escalations flow. Patterns from three deployments, two of which earned their autonomy.
By Arjun Raghavan, Security & Systems Lead, BIPI · May 26, 2024 · 7 min read
A logistics client wanted full autonomy for their dispatch agent. After a six-week pilot, they pulled back to human-on-the-loop on three classes of decisions. Six months later, the agent earned back full autonomy on two of those three. The third still requires approval, and probably always will. Supervision is not a binary toggle. It is a per-action calibration that evolves.
Three supervision modes, plus when to pick each
The vocabulary that travels well across teams.
- Human-in-the-loop: the agent stops and waits for explicit approval before each action of a class.
- Human-on-the-loop: the agent acts, then a human reviews. Reversible actions only.
- Fully autonomous: the agent acts and the supervision is statistical, via aggregate dashboards and sampled audits.
The picking criteria are the cost of error and the reversibility of the action. High cost, low reversibility (deleting accounts, sending money, closing tickets that lose customer state) starts in human-in-the-loop and moves down the ladder over time as evidence accumulates. Low cost, high reversibility (drafting a reply, scoring a lead) starts at human-on-the-loop and may go fully autonomous quickly.
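The picking criteria fit in a few lines of code. A minimal sketch, assuming a simplified two-axis model (the `Mode` enum, `starting_mode`, and the string-valued `error_cost` are illustrative names, not a real API):

```python
from enum import Enum

class Mode(Enum):
    HUMAN_IN_THE_LOOP = "human-in-the-loop"
    HUMAN_ON_THE_LOOP = "human-on-the-loop"
    AUTONOMOUS = "fully-autonomous"

def starting_mode(error_cost: str, reversible: bool) -> Mode:
    """Pick the starting supervision mode for an action class.

    error_cost: 'high' or 'low' -- the cost of a wrong action.
    reversible: whether the action can be cheaply undone.
    """
    if error_cost == "high" and not reversible:
        # e.g. deleting accounts, sending money: approve each action
        return Mode.HUMAN_IN_THE_LOOP
    # e.g. drafting a reply, scoring a lead: act, then review
    return Mode.HUMAN_ON_THE_LOOP
```

Note that nothing starts at `AUTONOMOUS`: in this model full autonomy is only ever earned through the graduation process, never assigned on day one.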
Approval workflows that do not become a bottleneck
The risk with human-in-the-loop is that humans become the rate limiter. If approval takes 4 hours and the agent generates 200 approval requests a day, the system stalls. Three patterns make approval scale.
- Tiered approval: small, low-risk variants of the action are pre-approved up to a threshold. Approval is required only when the action exceeds the threshold (dollar amount, scope, blast radius).
- Batch approval: queue requests, present in batches with shared context, let one approver clear 20 in 10 minutes rather than 1 every 30 seconds.
- Confidence-gated approval: the agent self-rates its confidence per action. High confidence flows through; low confidence escalates. Calibration matters; we measure agent confidence against actual outcomes quarterly.
In the logistics dispatch case, tiered approval cut approval volume by 78 percent without changing the risk profile. The dispatcher saw only decisions above 5000 USD of impact. Below that, the agent acted and the dispatcher reviewed weekly summaries.
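Tiered and confidence-gated approval compose naturally into one routing function. A sketch under assumed thresholds (the `Action` shape, the queue names, and the specific numbers are hypothetical; real values come from calibration data):

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str
    impact_usd: float
    confidence: float  # agent's self-rated confidence in [0, 1]

# Illustrative thresholds -- set these from your own override and outcome data.
TIER_THRESHOLD_USD = 5000.0
MIN_CONFIDENCE = 0.85

def route(action: Action) -> str:
    """Decide whether an action flows through or queues for a human."""
    if action.impact_usd >= TIER_THRESHOLD_USD:
        return "approval_queue"   # tiered: large blast radius needs a human
    if action.confidence < MIN_CONFIDENCE:
        return "approval_queue"   # confidence-gated: the agent is unsure
    return "auto_approved"        # pre-approved tier, logged for weekly review
```

Batch approval then operates on whatever lands in `approval_queue`: group by action kind, present with shared context, clear in one sitting.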
Auto-pause on anomaly
Even fully autonomous agents need a kill switch. Auto-pause triggers we have shipped.
- Action rate doubles compared to a 7-day baseline.
- Tool error rate exceeds a threshold within a sliding window.
- Same action targets the same entity more than N times in M minutes.
- Cost per task exceeds the historical p99.
- An external system (CRM, payments, fulfillment) raises a signal indicating downstream confusion.
When auto-pause fires, the agent stops acting and pages the on-call supervisor. The supervisor either resumes or kills. We have never had a false-positive auto-pause cause a production issue. We have had three cases where auto-pause caught a real regression before customers did. Lopsided risk math: ship the auto-pause.
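The triggers above can be expressed as a single check over recent telemetry. A minimal sketch, assuming flat metric dictionaries with made-up key names (wire these to your own monitoring; the external-signal trigger is omitted because its shape is deployment-specific):

```python
def should_auto_pause(metrics: dict, baseline: dict) -> list[str]:
    """Return the tripped auto-pause triggers; empty list means keep running."""
    tripped = []
    if metrics["actions_per_hour"] > 2 * baseline["actions_per_hour_7d"]:
        tripped.append("action rate doubled vs 7-day baseline")
    if metrics["tool_error_rate"] > baseline["tool_error_rate_max"]:
        tripped.append("tool error rate over threshold")
    if metrics["repeats_on_entity"] > baseline["max_repeats"]:
        tripped.append("same action on same entity too many times")
    if metrics["cost_per_task"] > baseline["cost_per_task_p99"]:
        tripped.append("cost per task above historical p99")
    return tripped
```

Returning the list of reasons, rather than a bare boolean, is deliberate: the page to the on-call supervisor should say which trigger fired.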
Escalation as a feature
The agent should have an explicit escalate tool. Calling it should be cheap, not penalised. Good escalation tools record the agent's reasoning, the relevant context, and the proposed action, and route to the right human or queue based on the action type. We have seen agents under-escalate because the framework made escalation feel like failure. Make it a normal step, not a last resort.
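What "cheap and first-class" looks like as a tool: record reasoning, context, and the proposed action, and route by action type. A sketch with a hypothetical routing table and queue names:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical routing table: action type -> human queue.
ROUTES = {"refund": "finance-queue", "dispatch": "ops-queue"}

@dataclass
class Escalation:
    action_type: str
    reasoning: str          # the agent's own account of why it is unsure
    context: dict           # whatever the reviewer needs to decide quickly
    proposed_action: dict   # so approval is one click, not a re-derivation
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def escalate(action_type: str, reasoning: str,
             context: dict, proposed_action: dict) -> dict:
    """Record everything and route to the right queue. Always succeeds."""
    record = Escalation(action_type, reasoning, context, proposed_action)
    return {"queue": ROUTES.get(action_type, "default-queue"), "record": record}
```

The tool never raises and never scores against the agent; that is what keeps escalation a normal step rather than a punished last resort.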
Earning autonomy back
The logistics agent earned back autonomy on two action classes through a structured graduation. Three months at human-in-the-loop with statistical review of overrides. If override rate stayed below a threshold and no high-severity incidents fired, move to human-on-the-loop. Three more months. Same criteria, lower threshold. If it holds, move to fully autonomous with continuous statistical monitoring. The graduation is reversible at any time. We have triggered a downgrade twice across all our engagements; both times the regression was caught early because the dashboards were already in place.
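The graduation ladder is a small, reversible state machine. A sketch with invented override ceilings (the article's point is the shape of the process, not the specific numbers):

```python
MODES = ["human-in-the-loop", "human-on-the-loop", "fully-autonomous"]

# Hypothetical override ceilings per mode; calibrate against your own data.
OVERRIDE_CEILING = {"human-in-the-loop": 0.05, "human-on-the-loop": 0.02}

def next_mode(mode: str, months_in_mode: int,
              override_rate: float, high_sev_incidents: int) -> str:
    """One review step of the graduation ladder; downgrades always allowed."""
    if high_sev_incidents > 0:
        i = MODES.index(mode)
        return MODES[max(0, i - 1)]          # reversible: drop back a rung
    ceiling = OVERRIDE_CEILING.get(mode)
    if ceiling is not None and months_in_mode >= 3 and override_rate < ceiling:
        return MODES[MODES.index(mode) + 1]  # earned the next rung
    return mode                              # hold, keep collecting evidence
```

Promotion needs both time in mode and a low override rate; a single high-severity incident downgrades immediately, which mirrors the asymmetry of the risk.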
Supervision is not a bolt-on. It is the operating model for an autonomous system. Build it deliberately, calibrate it on data, and let the agent earn trust the way you would expect any system to: by behaving well, in public, for long enough that the numbers convince the skeptics.
Read more field notes, explore our services, or get in touch at info@bipi.in.