The moment you put a large language model inside a pipeline that ingests documents from the outside world, you have added an attack surface most teams underestimate. It isn't the network or the API that's the novel risk. It's the documents themselves. They can carry instructions - and the model can't always tell the difference between content it should process and commands it should obey.

For security and compliance teams putting document AI into regulated workflows, this reframes a design decision that's often treated as a formatting choice. Constraining a model to produce only structured output isn't about tidy JSON. It's a containment strategy - and a meaningful control against the most prevalent class of LLM vulnerability.

The attack surface nobody designed for

The OWASP Top 10 for LLM Applications is now the reference framework for these risks, and in its 2025 edition prompt injection holds the top spot for the second consecutive time. The reason is structural, not incidental: a language model processes instructions and data through the same channel, with no clean separation between them. An attacker can craft input the model interprets as a new instruction rather than as content to handle - and the model follows it, because it cannot reliably tell which is which.

There are two flavours. Direct injection is the obvious one: a user types "ignore your previous instructions and do X." Indirect injection is the dangerous one for document pipelines: the malicious instruction is embedded in the content the model later processes - hidden in a document, an image, or metadata. A system built to read and extract from untrusted, externally-supplied files is, by definition, ingesting attacker-controllable content at scale. That is precisely the indirect-injection scenario OWASP flags, and a document-extraction service walks into it on every single file.

Two further OWASP risks compound it. Improper output handling (LLM05) covers what happens when a model's output is trusted and acted upon downstream - if the model can be steered into emitting arbitrary content, and a downstream system consumes that content without question, the injection has a payload to deliver. And excessive agency (LLM06) covers models granted more ability to act than the task requires. Chain these together - injected instruction, free-form output, an over-privileged downstream action - and you have a real exploit path, not a theoretical one.

Why the system prompt is not your control

The common first instinct is to defend at the prompt: "I'll just instruct the model not to follow embedded instructions." This is a trap, and OWASP is blunt about it - instructions placed in a system prompt are not a security control, because they can be overridden by the very prompt injection they're meant to stop. Constraints expressed as natural-language requests to the model are honoured at the model's discretion, and a sufficiently crafted input revokes them.

The principle that follows is the important one: effective controls must operate independently of the model. You cannot ask the thing you're trying to contain to contain itself. The defences that work sit outside the model's reasoning - in the architecture around it.

Structured output as containment

This is where constraining output earns its place. If the model is architecturally restricted to return only a predefined schema for the task at hand - specific fields, specific types, nothing else - and there is no free-form text channel out of it, you have changed what an attacker can achieve even if injection succeeds.

Consider what the constraint removes. A model that can only populate a fixed set of typed fields cannot emit an arbitrary instruction for a downstream system to execute. It cannot return a block of prose carrying a hidden payload. It cannot leak its system prompt into a free-text response, because there is no free-text response to leak into. The output is shaped before it ever reaches the next system, which directly narrows the improper-output-handling risk and starves excessive-agency chains of the free-form payload they rely on.

Just as valuable for a regulated environment: the constraint makes the system's behaviour predictable, testable, and auditable. A component whose only possible outputs are points in a defined schema can be tested exhaustively against that schema. You can assert what it is and isn't capable of emitting. You can log every output in a structured, reviewable form. That testability is itself a security property - and an audit one.

A caveat that matters for credibility: structured output is a strong control, not a complete one. OWASP is clear that prompt injection exploits the design of LLMs themselves and cannot be fully patched away; the model could still place a wrong or manipulated value inside a perfectly valid schema. So output constraints are one layer, not the answer.

Defence in depth: the layers that work together

OWASP's recommended posture against injection is defence in depth, and a well-built document pipeline assembles several independent layers:

Schema-constrained output removes the free-form channel and bounds what the model can emit.

A deterministic validation layer - plain, testable logic with no AI in it - inspects every field the model produces against format rules, business conventions, and cross-field consistency, regardless of the model's confidence. This is the independent guardrail that catches a manipulated-but-well-formed value, and it's the same layer that defends accuracy, not just security.

Least privilege keeps the model's downstream reach minimal, so a compromised output has nowhere damaging to go - the direct counter to excessive agency.

Human-in-the-loop review for anything anomalous or below threshold provides the final independent check on outputs headed for sensitive use.

No single layer is sufficient. Together, they mean a successful injection has to defeat several controls that don't depend on the model's good behaviour - which is the whole point.

Governing the AI, not just deploying it

For regulated buyers, the technical controls need a governance wrapper, and increasingly this is a procurement expectation rather than a nicety. Before any sensitive data flows through an AI component, that component should clear a formal approval gate that documents what it is and how it's contained. In practice that means: a clear model-selection rationale, end-to-end data-flow documentation, confirmation of where processing physically happens (data residency), supply-chain provenance checks on any models and components used - OWASP's supply-chain and data-poisoning risks live here - and a risk register that maps the deployment explicitly against the OWASP Top 10 for LLM, with the mitigation for each entry stated and evidenced.

The value of that register is that it converts "we take AI security seriously" into an auditable artefact. A reviewer can see, line by line, which risk is addressed by which control, and what evidence backs it. That is the language regulated buyers and their assurance teams actually speak.

What security teams should ask

If you're evaluating a document-AI capability, a few questions separate genuine containment from good intentions:

Can the model emit free-form output, or is it constrained to a defined schema with no other channel?

Are the security controls independent of the model, or do they rely on instructing the model to behave?

Is there a deterministic layer that inspects every output before any downstream system trusts it?

How is the model's ability to act downstream limited - what's the least-privilege story?

Is there a documented risk register mapped to the OWASP Top 10 for LLM, with evidence per item?

Where, geographically, is data processed, and how is model and component provenance verified?

The takeaway

Putting an LLM into a document pipeline imports a security problem that didn't exist before: untrusted content that can masquerade as instructions. You can't fully patch it, and you can't ask the model to police itself. What you can do is contain it - and constraining the model to structured output is one of the most effective architectural moves available, precisely because it shrinks what an attacker can achieve and makes the system testable and auditable in the bargain. Combined with an independent validation layer, least privilege, human oversight, and a governed approval process, it turns an open-ended risk into a bounded, evidenced one.

This is the posture behind PromptX, VE3's intelligent document processing platform: the model is constrained to structured output with no free-form channel, every output passes an independent deterministic check before it's trusted, and the whole AI capability is run under a documented governance process mapped to the OWASP Top 10 for LLM. In regulated document work, security isn't a layer you add to AI afterwards - it's the architecture you choose before the first document is ever processed.