OWASP LLM Top 10: A Field Guide for Builders
AI Security
The OWASP LLM Top 10 is not a checklist; it is a map of the failure modes teams keep rediscovering. We translate LLM01 through LLM10 into concrete controls, code patterns, and review questions that builders can apply to their stack before the first user prompt lands.
By Arjun Raghavan, Security & Systems Lead, BIPI · July 2, 2023 · 11 min read
Every LLM application we audit fails in a way that is already documented. The OWASP LLM Top 10 captures these recurring patterns, and treating it as a field guide rather than a compliance artifact changes how teams ship.
Why a Top 10 for LLMs at all
Classic web AppSec assumes a clear boundary between code and data. LLM apps blur that line. Instructions, context, and user input share the same channel, so traditional input validation does not map cleanly. OWASP LLM01 through LLM10 name the new failure surface.
LLM01 Prompt Injection
The model cannot distinguish a system instruction from a user payload once both are in the context window. Defense in depth means treating every external string as untrusted, segmenting tool calls behind allow lists, and using output filters that reject responses outside the expected schema.
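A minimal sketch of the last control in Python, standard library only; the expected fields are hypothetical, and a production filter would use a real JSON Schema validator:

```python
import json

# Hypothetical contract: the assistant must return a JSON object with
# exactly these keys and types. Anything else is rejected outright.
EXPECTED_FIELDS = {"action": str, "target": str, "confidence": float}

def accept_response(raw: str) -> dict:
    """Parse model output and reject anything outside the expected schema."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("response is not valid JSON")
    if not isinstance(payload, dict) or set(payload) != set(EXPECTED_FIELDS):
        raise ValueError("response does not match the expected shape")
    for key, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(payload[key], expected_type):
            raise ValueError(f"{key} has the wrong type")
    return payload
```

Rejecting on shape, not content, keeps the filter cheap and hard to talk around: an injected instruction that changes the response format fails closed.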
LLM02 Insecure Output Handling
Treat model output like user input. If the response feeds a shell, a SQL query, or a browser, encode and sandbox accordingly. We have seen teams render markdown from a chatbot directly into the DOM, opening a stored XSS path through the assistant.
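The same discipline in a minimal Python sketch; the table name and query are hypothetical stand-ins for your own data layer:

```python
import html
import sqlite3

def safe_render(model_output: str) -> str:
    # Escape before the string touches the DOM. Render markdown only
    # through a sanitizer, never by assigning the raw response to innerHTML.
    return html.escape(model_output)

def safe_query(conn: sqlite3.Connection, model_output: str) -> list:
    # Model output is bound as a parameter, never interpolated into SQL.
    return conn.execute(
        "SELECT id, title FROM documents WHERE title = ?", (model_output,)
    ).fetchall()
```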
LLM03 Training Data Poisoning
Fine-tuning corpora and RAG indexes are both training data in practice. Pin sources, sign artifacts, and review pull requests to the knowledge base with the same rigor as production code. A hashing sketch follows the list below.
- Hash and sign every dataset shard before training
- Reject documents that fail provenance checks at index time
- Track contributor identity for every chunk in the vector store
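A minimal sketch of that first check in Python, assuming a manifest.json that maps shard paths to pinned SHA-256 digests; the manifest format is an assumption, and verifying the signature over the manifest itself is elided:

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_shards(manifest_path: Path) -> None:
    """Refuse to train on any shard whose hash drifts from the manifest."""
    manifest = json.loads(manifest_path.read_text())
    for shard, pinned in manifest["shards"].items():
        actual = sha256_of(manifest_path.parent / shard)
        if actual != pinned:
            raise RuntimeError(f"poisoning check failed for {shard}")
```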
LLM04 through LLM06: Denial, Supply Chain, Disclosure
Token floods, malicious model weights pulled from public hubs, and sensitive data leaking through completions all fall here. Rate limit per identity, verify model hashes via Sigstore, and run a DLP pass on responses before they leave the boundary.
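A minimal sketch of the first control, a token bucket keyed on caller identity; the rates are illustrative, not recommendations:

```python
import time
from collections import defaultdict

class TokenBudget:
    """Per-identity token bucket: each caller refills at `rate` tokens per
    second up to `burst`, and a request is refused once the budget is spent."""

    def __init__(self, rate: float = 1000.0, burst: float = 8000.0):
        self.rate, self.burst = rate, burst
        self._state = defaultdict(lambda: (burst, time.monotonic()))

    def allow(self, identity: str, tokens_requested: int) -> bool:
        level, last = self._state[identity]
        now = time.monotonic()
        level = min(self.burst, level + (now - last) * self.rate)
        if tokens_requested > level:
            self._state[identity] = (level, now)
            return False  # flood: shed load before it reaches the model
        self._state[identity] = (level - tokens_requested, now)
        return True
```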
LLM07 through LLM10: Plugins, Agency, Overreliance, Theft
Excessive agency is the failure we see most in agentic systems. A model that can read email, send email, and approve refunds will eventually do all three in one turn. Scope tools narrowly, require human approval for irreversible actions, and log every call.
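A minimal sketch of that pattern: a registry that marks irreversible tools and refuses to run them without a human in the loop. The tool names mirror the example above; the registry shape is an assumption:

```python
import json
import logging
import time

logger = logging.getLogger("tool_audit")

# Hypothetical tool registry. Each entry declares whether the action is
# irreversible and therefore whether it may run without human approval.
# The lambdas are placeholder implementations.
TOOLS = {
    "read_email":     {"fn": lambda args: ..., "irreversible": False},
    "send_email":     {"fn": lambda args: ..., "irreversible": True},
    "approve_refund": {"fn": lambda args: ..., "irreversible": True},
}

def invoke(tool_name: str, args: dict, human_approved: bool = False):
    tool = TOOLS.get(tool_name)
    if tool is None:
        raise PermissionError(f"{tool_name} is not on the allow list")
    if tool["irreversible"] and not human_approved:
        raise PermissionError(f"{tool_name} requires human approval")
    # Log every call with enough detail to replay an incident end to end.
    logger.info(json.dumps({"ts": time.time(), "tool": tool_name, "args": args}))
    return tool["fn"](args)
```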
A review checklist that fits on one page
- Which inputs cross trust boundaries, and where do they merge with system prompts
- What tools can the model invoke, and what is the blast radius of each
- How is model output validated before it reaches a user, a database, or another service
- Where do model weights and embeddings come from, and are they signed
- What logs would let us replay an incident end to end
Tooling that helps
NeMo Guardrails, LangChain output parsers, and LlamaIndex node postprocessors give you somewhere to hang the controls. Weights & Biases or MLflow track the evaluation runs that prove a control still works after a prompt change.
If the OWASP LLM Top 10 reads like a list of incidents you have already had, you are normal. The goal is to stop having them twice.
Where teams get stuck
Most stalls happen at LLM08, Excessive Agency. Product wants the agent to do more, security wants the agent to do less, and neither side has a shared vocabulary for what the agent is allowed to do. Start there, write the allow list, and the rest of the program organizes itself.
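One way to build that shared vocabulary is to make the allow list itself a reviewed artifact in the repo. A hypothetical sketch; the tools, scopes, and approval levels are placeholders for your own:

```python
# agent_allowlist.py -- reviewed like production code. Every entry names
# the tool, its scope, and who must sign off. (Hypothetical example.)
ALLOW_LIST = [
    {"tool": "search_docs",    "scope": "read-only", "approval": "none"},
    {"tool": "draft_email",    "scope": "no-send",   "approval": "none"},
    {"tool": "send_email",     "scope": "internal",  "approval": "human"},
    {"tool": "approve_refund", "scope": "<= $50",    "approval": "human"},
]
```

Product argues about rows, security argues about columns, and the diff history records who agreed to what.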
Closing
The Top 10 is not a ceiling. Treat it as the minimum vocabulary your team needs before the next prompt change ships.
Read more field notes, explore our services, or get in touch at info@bipi.in.