Human-in-the-Loop Approvals for AI Agents: When and How to Use Them

Human-in-the-loop (HIL) approvals are explicit human sign-offs that an AI agent must obtain before it can complete a defined high-risk action. The runtime triggers an elicitation protocol, blocks the tool call until a designated human approves or denies it, captures the response as a signed attestation, and writes the decision to a tamper-evident audit log. Human-in-the-loop is one of the most effective controls in an agent governance program. It is also one of the easiest to misuse, either by approving everything (which trains humans to rubber-stamp) or by approving nothing (which slows the agent to a stop).

This guide explains when HIL is the right control, when it is not, how to implement it through tool-level policy, and how to capture approvals in a format that auditors and regulators will accept as evidence.

What is human-in-the-loop for AI agents?

Human-in-the-loop for AI agents is a control that pauses an agent's execution before a defined tool call completes, surfaces the action to a designated human for review, and only proceeds after explicit approval. The approval is captured as a signed attestation tied to the agent identity, the artifact the agent is running, the policy version in effect, and the corresponding audit log entry.

HIL is not the same as an "are you sure?" prompt to the user who started the agent. It is a structured approval, often by a different human than the requester, on actions defined ahead of time by clearly written policy.

When to require human-in-the-loop approval

HIL is a powerful but expensive control: every approval adds latency and consumes human attention. The right rule of thumb is to require HIL where the cost of an unrecoverable mistake is much higher than the cost of a five-minute review.

Use HIL for:

Money movement above a threshold. Wire transfers, refunds, payments, vendor approvals, or any transaction with material financial impact.
Destructive operations. Database deletes, schema changes, repository force-pushes, infrastructure teardowns, mass record updates.
External communications on behalf of a person. Emails sent from an executive's account, customer outreach, regulatory filings.
Access to highly sensitive data. PHI lookups, source code retrieval from restricted repos, classified or CUI access.
Procurement and supplier decisions. Contract signing, purchase order approval above a threshold, vendor onboarding.
Production deployments. Pushing a model to production, rolling back a service, modifying a load balancer or feature flag for a large user base.
Anything explicitly listed in a compliance framework that requires human oversight. EU AI Act human oversight provisions, FDA review for clinical decision support, regulatory disclosures.

Do not use HIL for:

Routine, low-risk actions the agent was hired to do. Approving every read query trains the approver to stop reading.
Actions where the human approver has no real information to make a decision. If the approver always says yes, HIL is theater.
Cases where the right control is denial, not approval. If an action should never happen, write a policy that denies it. Do not punt the decision to a human under time pressure.

Why HIL matters in 2026

Three forces are pushing organizations toward formal HIL for agents.

Agents are taking actions with consequences. It is no longer hypothetical that an agent might wire money, send an email from an executive, or delete a database. Real deployments have produced real near-misses.

Regulators expect human oversight. The EU AI Act explicitly requires human oversight for high-risk AI. NIST AI RMF and HIPAA expect demonstrable approval workflows for actions affecting individuals' rights or data. CMMC Level 2/3 expects approval evidence for sensitive actions in defense contexts.

Documented incidents include consent failures. A procurement agent was manipulated to approve $5M in false purchase orders. A GitHub MCP server prompt injection caused private repo content to leak into public PRs. In each case, an explicit human checkpoint would have changed the outcome.

HIL turns "the agent decided" into "a named human decided, with evidence."

How HIL fits into tool-level access control

HIL is a property of tool policy. A working tool policy for a high-risk tool looks something like this:

Tool	Argument condition	Default action	Approval required from
`payments.transfer`	amount ≥ $10K	Block until approved	Finance approver group
`payments.transfer`	amount < $10K	Allow with rate limiting	None
`database.delete`	table in production schema	Block until approved	DBA group
`email.send`	from = executive mailbox	Block until approved	Executive ops
`repos.force_push`	branch in protected list	Block until approved	Release engineering

When the agent attempts the tool call, the runtime evaluates the policy, finds an approval requirement, triggers the elicitation protocol, and pauses execution. A notification goes to the approver group (Slack, email, dedicated app, paging system, depending on urgency). The approver reviews context (agent identity, prompt that led to the call, arguments, business justification supplied by the agent or upstream system) and approves or denies.
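The evaluation step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular runtime implements it: the `Rule` type, the `evaluate` function, the argument keys (`amount`, `schema`, `from_mailbox`, `branch`), the group names, and the deny-when-nothing-matches default are all assumptions made for the example. A real deployment would more likely express the conditions in a policy language such as YAML + CEL.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Rule:
    tool: str
    condition: Callable[[dict[str, Any]], bool]
    action: str  # "require_approval", "allow", or "deny"
    approver_group: Optional[str] = None


# Hypothetical values; the real lists would come from versioned policy.
PROTECTED_BRANCHES = {"main", "release"}
EXECUTIVE_MAILBOXES = {"ceo@example.com"}

# Rules mirroring the table above. Argument keys and group names are illustrative.
RULES = [
    Rule("payments.transfer", lambda a: a["amount"] >= 10_000,
         "require_approval", "finance-approvers"),
    Rule("payments.transfer", lambda a: a["amount"] < 10_000, "allow"),
    Rule("database.delete", lambda a: a["schema"] == "production",
         "require_approval", "dba"),
    Rule("email.send", lambda a: a["from_mailbox"] in EXECUTIVE_MAILBOXES,
         "require_approval", "executive-ops"),
    Rule("repos.force_push", lambda a: a["branch"] in PROTECTED_BRANCHES,
         "require_approval", "release-engineering"),
]


def evaluate(tool: str, args: dict[str, Any]) -> Rule:
    """Return the first matching rule, or a deny rule if none applies."""
    deny = Rule(tool, lambda _args: True, "deny")
    for rule in RULES:
        if rule.tool != tool:
            continue
        try:
            matched = rule.condition(args)
        except (KeyError, TypeError):
            # Missing or malformed arguments: fail closed (an assumption here).
            return deny
        if matched:
            return rule
    return deny


decision = evaluate("payments.transfer", {"amount": 25_000})
print(decision.action, decision.approver_group)
```

When the returned action is `require_approval`, the runtime would pause the call and notify the named group. Treating unmatched calls and evaluation errors as denials is one possible default, chosen here so that a policy gap never silently allows a high-risk action.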

The approval (or denial) is captured as a signed attestation that includes:

The approver's identity
The exact tool call and arguments approved
The agent identity and artifact digest
The policy version in effect
A timestamp
A reference to the audit log entry

The runtime then either completes the call or surfaces the denial back to the agent (which can adjust, escalate, or stop).
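As a rough illustration of what such an attestation might look like, the sketch below signs a record containing the fields listed above. The field names, the sample values, and the use of HMAC-SHA256 with a shared key are assumptions for the example. A production system would more plausibly use asymmetric signatures so that parties who can verify an attestation cannot also forge one.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Demo key only; real keys would live in a KMS or HSM (assumption).
KEY = b"demo-key-not-for-production"


def canonical(payload: dict) -> bytes:
    """Serialize deterministically so the same payload always signs the same way."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_attestation(payload: dict, key: bytes) -> dict:
    signature = hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_attestation(attestation: dict, key: bytes) -> bool:
    expected = hmac.new(key, canonical(attestation["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, attestation["signature"])


# All values below are hypothetical placeholders.
attestation = sign_attestation(
    {
        "approver": "alice@example.com",
        "decision": "approved",
        "tool": "payments.transfer",
        "arguments": {"amount": 25_000, "currency": "USD"},
        "agent_id": "agent-procurement-01",
        "artifact_digest": "sha256:<digest of the running artifact>",
        "policy_version": "v12",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "audit_log_ref": "log-entry-000123",
    },
    KEY,
)

print(verify_attestation(attestation, KEY))
```

Because the signature covers the exact arguments, changing the approved amount after the fact makes verification fail, which is what lets the attestation stand as evidence of what was actually approved.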

Step-by-step: how to implement HIL for AI agents

List your high-risk tools. Most teams find their list has 5 to 15 tools that warrant HIL. Start there.
Define approval thresholds for each tool. Not every tool call needs review. Identify the arguments that flip an action from routine to high-risk (amount, table, recipient, environment).
Choose approver groups by tool type. Finance approves money movement. DBAs approve destructive database operations. Release engineers approve production changes. Avoid one universal approver group; it dilutes attention.
Encode the rules in tool policy. Use a shared policy language (YAML + CEL is common) so the same rules can be versioned, signed, and distributed.
Pick approval channels by urgency. Most approvals can flow through Slack or email. Truly urgent approvals (during an active incident) need a paging path with documented SLAs.
Wire approvals into the audit log. Every approval and denial must be captured as a signed attestation, chained into the tamper-evident log, and queryable by compliance.
Set a fallback policy. If no approver responds inside the SLA window, the runtime must do something defined: deny by default, escalate, or notify. Defaulting to allow is not acceptable for high-risk actions.
Review HIL data on a regular cadence. Look at approval rates, time-to-approval, denial reasons, and which tools generate the most reviews. Use that data to refine policy: where approvers always say yes, narrow the approval condition so requests carry real signal; where approvers always say no, replace the review with an outright deny.
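The review in the last step can start from a very small report. The sketch below computes approval rate and median time-to-decision per tool from a list of decision records. The record shape, the sample data, and the `summarize` and `always_approved` helpers are hypothetical; in practice the records would be queried from the audit log.

```python
from collections import defaultdict
from statistics import median

# Hypothetical decision records; real ones would come from the audit log.
DECISIONS = [
    {"tool": "payments.transfer", "approved": True, "seconds_to_decision": 120},
    {"tool": "payments.transfer", "approved": True, "seconds_to_decision": 300},
    {"tool": "payments.transfer", "approved": False, "seconds_to_decision": 60},
    {"tool": "database.delete", "approved": True, "seconds_to_decision": 30},
    {"tool": "database.delete", "approved": True, "seconds_to_decision": 90},
]


def summarize(decisions: list) -> dict:
    """Group decisions by tool and compute basic review metrics."""
    by_tool = defaultdict(list)
    for decision in decisions:
        by_tool[decision["tool"]].append(decision)

    report = {}
    for tool, rows in by_tool.items():
        approved = sum(1 for row in rows if row["approved"])
        report[tool] = {
            "requests": len(rows),
            "approval_rate": approved / len(rows),
            "median_seconds": median(row["seconds_to_decision"] for row in rows),
        }
    return report


def always_approved(report: dict, min_requests: int = 2) -> list:
    """Tools whose requests were approved every time: candidates for policy tuning."""
    return sorted(
        tool for tool, stats in report.items()
        if stats["requests"] >= min_requests and stats["approval_rate"] == 1.0
    )


report = summarize(DECISIONS)
print(always_approved(report))
```

A tool that shows up in `always_approved` over a meaningful sample is a signal that its approval condition may be too broad. The `min_requests` floor is an arbitrary choice here to avoid drawing conclusions from a single request.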

Common HIL mistakes

Approving everything. If approvers say yes 100% of the time, the control has become a rubber stamp. Tune the policy so approval requests carry real signal.
One universal approver group. Asking the same five people to approve finance, database, deployment, and customer communications actions guarantees they will stop paying attention.
No SLA and no fallback. Approval requests that sit indefinitely either block legitimate work or train approvers to act in haste. Define an SLA and a clear fallback action.
Approvals captured in chat, not as attestations. A Slack thread is not audit evidence. The approval must be signed, chained, and queryable.
Letting the agent self-approve. Some implementations let the agent generate its own justification and pass it forward. Without a separate human, this is not HIL.
HIL where denial is the right answer. If a tool should never be called under certain conditions, encode that as a deny, not as a human review. HIL is for ambiguous high-risk cases, not for cases the policy already knows the answer to.
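The self-approval mistake in particular is easy to guard against in code. The sketch below checks that an approver is a human, belongs to the designated group, and (by default) is not the person who made the request. The `Principal` type, the `may_approve` function, and the `allow_requester` switch are assumptions for illustration, not part of any specific product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    id: str
    kind: str  # "human" or "agent"


def may_approve(
    approver: Principal,
    requester_id: str,
    group_members: set,
    allow_requester: bool = False,
) -> bool:
    """Return True only if this principal is eligible to approve the request."""
    if approver.kind != "human":
        return False  # an agent can never approve, including its own calls
    if approver.id not in group_members:
        return False  # must belong to the designated approver group
    if approver.id == requester_id and not allow_requester:
        return False  # separation of duties unless policy explicitly relaxes it
    return True


finance = {"alice@example.com", "bob@example.com"}
agent = Principal("agent-procurement-01", "agent")
alice = Principal("alice@example.com", "human")

print(may_approve(agent, "bob@example.com", finance))
print(may_approve(alice, "bob@example.com", finance))
```

Whether the requester may ever approve their own request is a policy decision. The default here is the stricter one, with the relaxation made explicit rather than implicit.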

How HIL maps to compliance

Framework	What HIL supports
EU AI Act	Explicit human oversight requirement for high-risk AI systems
NIST AI RMF	"Manage" function: documented controls for actions with significant consequences
HIPAA	Authorization workflows for access to PHI
CMMC Level 2/3	Approval evidence for sensitive operations in defense environments
SR 11-7	Documented promotion and operational approvals for model-driven actions
21 CFR Part 11 / GxP	Cryptographic record integrity for actions affecting regulated data
SOX	Approval workflows for actions with material financial impact

Captured as a signed attestation in a tamper-evident chain, HIL becomes audit evidence rather than a Slack thread someone has to dig up later.
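To make "tamper-evident chain" concrete, here is a minimal hash-chain sketch: each log entry stores the hash of the previous entry, so editing any earlier record breaks every hash after it. The entry layout and the function names are assumptions for the example. A chain like this only detects tampering if the latest hash is also signed or anchored somewhere an attacker cannot rewrite, which this sketch leaves out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record's canonical JSON."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev_hash": prev, "hash": entry_hash(prev, record)})


def verify_chain(log: list) -> bool:
    """Recompute every hash from the start; any edited record breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


audit_log = []
append(audit_log, {"decision": "approved", "tool": "payments.transfer"})
append(audit_log, {"decision": "denied", "tool": "database.delete"})
print(verify_chain(audit_log))
```

In a fuller design, each `record` would be a signed attestation, so the chain protects ordering and completeness while the signature protects each individual decision.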

How Jozu Agent Guard handles HIL

Jozu Agent Guard makes HIL a first-class part of tool policy.

ToolPolicy defines which tool invocations trigger an elicitation. Conditions can reference any argument, the calling agent's identity, the artifact digest, the policy version, or external context.
Elicitation protocol pauses agent execution at the moment of the tool call, surfaces the request to the designated approver group, and waits for an explicit response.
Signed attestation captures the approver, the exact tool call, the agent identity, the artifact digest, the policy version, and a timestamp. The attestation is cryptographically chained into the audit log.
Fail-closed defaults. If approval times out or evaluation errors, Agent Guard denies the call rather than allowing it.
Works offline. In air-gapped or DDIL environments, approvals can be captured locally and synced when connectivity is restored, with no remote dependency for the enforcement decision itself.

The result is HIL that is both operationally usable and audit-grade.

Explore Jozu Agent Guard →
Book a demo →

Frequently asked questions

What is the difference between human-in-the-loop and a confirmation prompt?
A confirmation prompt asks the user who started the agent to confirm. HIL routes the request to a designated approver group, often a different human, on a defined policy, and captures the response as a signed attestation. The first is a UI pattern; the second is a governance control.

Should every agent action require HIL?
No. Requiring approval on every action floods the approver group, trains them to ignore requests, and undercuts the productivity gain agents provide. HIL is for actions where the cost of an unrecoverable mistake is much higher than the cost of a review.

Who should be the approver?
The approver should be a human with the authority and context to evaluate the specific action. Finance for money movement. DBAs for destructive database operations. Executive ops for outbound communications from a leader's mailbox. One universal approver group is an anti-pattern.

How fast can approvals happen?
Routine HIL flows through Slack or email and takes seconds to minutes. Time-sensitive HIL (during incidents, for example) needs a paging path with documented SLAs and a defined fallback action if no approver responds.

Does HIL slow agents down?
Only for the actions HIL covers, which should be a small fraction of total agent activity. Tuning policy to require approval only where it adds value keeps the latency cost low and the safety value high.

Can HIL be captured in Slack or email instead of a signed attestation?
For audit purposes, no. A Slack message is not tamper-evident, not cryptographically chained, and not directly tied to the artifact and policy version in effect at the time. Use Slack or email as the human interface, but capture the resulting decision as a signed attestation in the audit log.

What happens if no one approves?
The runtime must do something defined by policy: deny the call (typical for genuinely high-risk actions), escalate to a fallback approver, or notify and hold. Defaulting to allow under timeout is not acceptable for actions that warranted HIL in the first place.
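One way to make that rule hard to get wrong is to leave "allow" out of the set of possible fallbacks entirely. The sketch below is a hypothetical illustration: the `Fallback` enum, the `resolve` function, and the returned strings are assumptions, and a real runtime would drive this from its own timers and policy configuration.

```python
from enum import Enum
from typing import Optional


class Fallback(Enum):
    # Deliberately no ALLOW member: a timeout can never turn into an approval.
    DENY = "deny"
    ESCALATE = "escalate"
    NOTIFY_AND_HOLD = "notify_and_hold"


def resolve(
    response: Optional[str],
    waited_seconds: float,
    sla_seconds: float,
    fallback: Fallback,
) -> str:
    """Decide what the runtime does with a pending approval request."""
    if response in ("approve", "deny"):
        return response  # an explicit human decision always wins
    if waited_seconds < sla_seconds:
        return "pending"  # still inside the SLA window: keep waiting
    return fallback.value  # SLA expired with no answer: apply the policy fallback


print(resolve(None, waited_seconds=301, sla_seconds=300, fallback=Fallback.DENY))
```

Unrecognized responses are treated the same as no response, so a malformed reply cannot be mistaken for an approval.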

Does HIL work in air-gapped environments?
Yes, when the runtime supports it. Approvals are captured locally inside the disconnected environment, written to the local tamper-evident log, and synced upstream when connectivity is restored. The enforcement decision should never depend on a remote service.
