BIPI

Agent Tool Use Security: Sandboxing, Allow-Lists, and Action Audits

Agentic AI

When an agent gets tools, it gets agency. We cover the three controls that keep tool-using agents in their lane: sandboxed execution, explicit allow-lists scoped to identity and context, and action audits that an incident responder can actually replay.

By Arjun Raghavan, Security & Systems Lead, BIPI · July 8, 2023 · 10 min read

#agents #tool-use #sandbox #audit #langchain

An LLM with a chat box is a product. An LLM with tools is a system. The threat model changes the moment the model can call code, and most teams do not adjust their controls in time.

The agentic loop and where it leaks

Plan, call tool, observe result, replan. Each step is an injection opportunity, an authorization decision, and an audit point. Treat the loop as a state machine with security checkpoints, not as a free-flowing conversation.
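To make that concrete, here is a minimal sketch of the loop with the checkpoints made explicit. The llm, tools, policy, and audit objects are hypothetical interfaces standing in for your own components, not any particular framework's API:

```python
# A minimal sketch of the agentic loop as a state machine with checkpoints.
def run_agent_task(task, llm, tools, policy, audit):
    context = [task.prompt]
    for step in range(policy.max_steps):                     # hard cap: no unbounded loops
        plan = llm.plan(context)                             # PLAN: model proposes next action
        if plan.is_final:
            return plan.answer
        policy.authorize(task.caller, plan.tool, plan.args)  # CHECKPOINT: allow-list
        audit.record(task.id, step, "call", plan)            # CHECKPOINT: log before execution
        result = tools[plan.tool].invoke(plan.args)          # CALL: runs inside the sandbox
        clean = policy.validate_output(plan.tool, result)    # CHECKPOINT: output validation
        audit.record(task.id, step, "result", clean)
        context.append(clean)                                # OBSERVE: result re-enters the prompt
    raise RuntimeError("step budget exhausted without a final answer")
```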

Sandboxing the execution environment

Tool calls that run code, fetch URLs, or query databases need isolation. We default to short lived containers with no outbound network except a narrow proxy, ephemeral file systems, and resource limits on CPU and memory. The agent gets a fresh sandbox per task.
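One way to get those properties is a throwaway Docker container per task, as in the sketch below. The flags are standard Docker isolation options; the agent-runtime image name is a placeholder. With --network=none the container has no network at all, so URL fetches would go through a separate proxy sidecar:

```python
import subprocess
import uuid

def run_in_sandbox(code: str, timeout_s: int = 30) -> str:
    """Run untrusted agent code in a throwaway container and return stdout."""
    name = f"agent-sandbox-{uuid.uuid4().hex[:8]}"
    cmd = [
        "docker", "run", "--rm", "--name", name,
        "--network=none",                # no network; fetches go through a proxy sidecar
        "--read-only",                   # immutable root filesystem
        "--tmpfs", "/tmp:size=64m",      # ephemeral scratch space only
        "--memory=512m", "--cpus=1",     # resource ceilings
        "--pids-limit=128",              # bound process fan-out
        "--cap-drop=ALL",                # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "agent-runtime:latest",          # placeholder image name
        "python", "-c", code,
    ]
    done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    return done.stdout
```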

Allow-lists that mean something

An allow-list is not a list of tool names; it is a mapping from caller identity and task context to permitted tool invocations with permitted arguments. The same tool, search_customers, may be allowed for a support agent and forbidden for a marketing agent, and may be limited to the caller's own tenant. A useful entry carries at least the following fields, as in the sketch after this list:

  • Tool name and version
  • Argument schema with value constraints
  • Caller identity and role
  • Task purpose tag
  • Rate and concurrency limits
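A minimal sketch of an entry carrying those fields, plus the check that enforces it. The field names, the caller object, and the tenant convention are illustrative, not a standard:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolPolicy:
    tool: str                  # tool name
    version: str               # pinned tool version
    arg_constraints: dict      # value constraints per argument
    roles: frozenset           # caller roles allowed to invoke
    purposes: frozenset        # permitted task purpose tags
    rate_per_minute: int = 10  # rate limit
    max_concurrency: int = 2   # concurrency limit

# search_customers allowed for support, scoped to the caller's own tenant.
SUPPORT_SEARCH = ToolPolicy(
    tool="search_customers",
    version="1.4",
    arg_constraints={"tenant_id": "caller.tenant"},  # must equal the caller's tenant
    roles=frozenset({"support_agent"}),              # marketing agents are simply not listed
    purposes=frozenset({"ticket_triage"}),
)

def is_allowed(policy: ToolPolicy, caller, purpose: str, args: dict) -> bool:
    """caller is a hypothetical object carrying role and tenant."""
    return (
        caller.role in policy.roles
        and purpose in policy.purposes
        and args.get("tenant_id") == caller.tenant   # enforce tenant scoping
    )
```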

Action audits an incident responder can use

Log the plan, the tool call, the arguments, the result, and the model's interpretation. Capture enough to replay the agent's reasoning end to end. If you cannot reconstruct why the agent did the thing, you cannot defend the thing.
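A sketch of one record per loop step that would satisfy that bar. The field names are illustrative, and the print is a stand-in for whatever log pipeline you run:

```python
import hashlib
import json
import time

def audit_record(trace_id: str, step: int, plan: str, tool: str,
                 args: dict, result, interpretation: str) -> dict:
    record = {
        "trace_id": trace_id,              # ties every step of one task together
        "step": step,
        "ts": time.time(),
        "plan": plan,                      # the model's stated plan before the call
        "tool": tool,
        "args": args,                      # full arguments (redact secrets upstream)
        "result": result,                  # or a pointer to blob storage for large payloads
        "result_sha256": hashlib.sha256(
            json.dumps(result, sort_keys=True, default=str).encode()
        ).hexdigest(),                     # tamper-evidence for the stored result
        "interpretation": interpretation,  # the model's reading of the result
    }
    print(json.dumps(record, default=str))  # stand-in for your log pipeline
    return record
```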

Output validation between steps

The result of one tool becomes context for the next, which means it is a prompt injection vector. Validate tool outputs against the schema the agent expects, strip instruction-style content, and consider a guardrail pass before the result re-enters the prompt.
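A sketch of that pass using Pydantic for the schema check. The expected schema and the injection patterns here are illustrative and deliberately incomplete; a real guardrail pass goes beyond regexes:

```python
import re
from pydantic import BaseModel, ConfigDict

class SearchResult(BaseModel):
    """The shape the agent expects back; extra fields are rejected."""
    model_config = ConfigDict(extra="forbid")
    customer_id: str
    name: str
    plan: str

SUSPECT = re.compile(r"ignore (all|previous) instructions|you are now|system prompt", re.I)

def sanitize_tool_output(raw: dict) -> SearchResult:
    result = SearchResult(**raw)                      # schema check: shape and types
    for field_name, value in result.model_dump().items():
        if isinstance(value, str) and SUSPECT.search(value):
            raise ValueError(f"instruction-style content in field {field_name!r}")
    return result
```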

Identity for the agent itself

The agent calls tools as someone. That someone should be a service account scoped to the agent's job, with credentials that rotate frequently and that never include the caller's full privileges. Delegate, do not impersonate.
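One shape delegation can take: the agent's service account mints a short-lived token scoped to the intersection of its own grant and what the task needs. The token_service interface and the scope names are hypothetical:

```python
from datetime import datetime, timedelta, timezone

AGENT_SCOPES = {"customers:read", "tickets:write"}   # the agent's whole job

def mint_task_token(token_service, caller, needed_scopes: set, ttl_minutes: int = 15):
    scopes = AGENT_SCOPES & needed_scopes            # never wider than the agent's own grant
    return token_service.issue(                      # hypothetical token service API
        subject="svc-support-agent",                 # the agent's service account, not the caller
        on_behalf_of=caller.id,                      # delegation is recorded, privileges are not copied
        scopes=sorted(scopes),
        expires_at=datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes),
    )
```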

An agent without a sandbox, an allow-list, and an audit trail is a remote code execution primitive with good manners.

Patterns from the field

  1. Wrap every tool with a uniform policy decorator that enforces auth, logging, and rate limits (sketched after this list)
  2. Maintain a tool registry with owners, risk tier, and last review date
  3. Run red team prompts against the agent on every release
  4. Replay production traces in staging weekly to catch drift
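A sketch of pattern 1. The authorize, rate limiter, and audit hooks are stubbed here; in practice they would call the allow-list, limiter, and audit pipeline described above:

```python
import functools
import time

def authorize(caller, tool, args): ...      # allow-list check, raises on deny (stub)
def audit(*fields): ...                     # structured audit write (stub)

class _Limiter:
    def acquire(self, caller, tool): ...    # rate and concurrency gate (stub)

rate_limiter = _Limiter()

def policed_tool(name: str, risk_tier: str):
    """Uniform wrapper: every registered tool gets auth, limits, and logging."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(caller, **kwargs):
            authorize(caller, name, kwargs)
            rate_limiter.acquire(caller, name)
            start = time.monotonic()
            try:
                result = fn(**kwargs)
                audit(caller, name, kwargs, result, risk_tier, time.monotonic() - start)
                return result
            except Exception as exc:
                audit(caller, name, kwargs, exc, risk_tier, time.monotonic() - start)
                raise
        return inner
    return wrap

@policed_tool(name="search_customers", risk_tier="medium")
def search_customers(tenant_id: str, query: str) -> list:
    ...  # the actual tool body
```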

Tooling notes

LangChain tool decorators and structured tools give you a place to put policy. Anthropic's tool use guidance and OpenAI's function calling docs both recommend strict schemas, which makes argument validation easier. NeMo Guardrails can sit between the model and the tool layer.
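For example, LangChain's @tool decorator accepts a Pydantic model as a strict argument schema, which is a natural place to hang the policy wrapper above. A minimal sketch, assuming langchain-core is installed; the schema and tool body are placeholders:

```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class SearchCustomersInput(BaseModel):
    tenant_id: str = Field(description="The caller's own tenant only")
    query: str = Field(max_length=200)

@tool("search_customers", args_schema=SearchCustomersInput)
def search_customers(tenant_id: str, query: str) -> str:
    """Search customers within a single tenant."""
    return f"results for {query!r} in tenant {tenant_id}"  # placeholder body
```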

Closing

Tool-using agents are powerful, and powerful systems need boring controls. Sandbox, allow-list, audit. The interesting work happens after those three are in place.

Read more field notes, explore our services, or get in touch at info@bipi.in.