Treating agent output as untrusted input.
Why agent-driven CLIs should validate with the same rigor as a public API, and why design taste must be replaced with machine-readable contracts.
The agent is not a trusted operator
Most agent tooling assumes the agent knows what it's doing. It executes whatever the agent sends and surfaces errors after the fact. Wrong model.
Agents hallucinate. They invent parameter names, pass strings where numbers are expected, reference slide indices that don't exist. They embed control characters that break OOXML (Office Open XML) and construct file paths that traverse directories. These aren't edge cases. They happen regularly, especially when the agent is working near the boundaries of its context.
The right model is to treat agent output as untrusted input, with the same rigor you'd apply to a public API request from an unknown client. In agent-slides, this means:
- Typed schemas with `extra="forbid"`, so a typo like `fon_size` is caught at deserialization, not three operations later when the render fails
- Transactional execution: if operation #7 fails, #1-6 are rolled back
- Path traversal checks and control character rejection
- Structured error responses with the operation index, error code, and a suggested fix
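The checks above can be sketched in plain Python. This is an illustrative stand-in, not agent-slides' actual schema layer (which would use typed models); the field names `slide_index`, `text`, `font_size`, and `image_path` are hypothetical, and rejecting unknown keys here plays the role of Pydantic's `extra="forbid"`:

```python
import unicodedata
from pathlib import PurePosixPath

ALLOWED_KEYS = {"slide_index", "text", "font_size", "image_path"}

def validate_op(op: dict, deck_len: int) -> list[dict]:
    """Return structured errors for one operation; an empty list means valid."""
    errors = []
    # Unknown keys are rejected outright, so a typo like fon_size fails fast.
    for key in sorted(set(op) - ALLOWED_KEYS):
        errors.append({"code": "UNKNOWN_FIELD",
                       "detail": f"unknown field {key!r}",
                       "suggestion": f"allowed fields: {sorted(ALLOWED_KEYS)}"})
    # Range-check slide references against the current deck.
    idx = op.get("slide_index")
    if isinstance(idx, int) and not 0 <= idx < deck_len:
        errors.append({"code": "SLIDE_INDEX_OUT_OF_RANGE",
                       "detail": f"slide_index={idx} but deck has {deck_len} slides",
                       "suggestion": f"use slide_index between 0 and {deck_len - 1}"})
    # Control characters (other than newline/tab) are invalid in OOXML text.
    text = op.get("text", "")
    if any(unicodedata.category(c) == "Cc" and c not in "\n\t" for c in text):
        errors.append({"code": "CONTROL_CHARACTER",
                       "detail": "text contains control characters that break OOXML",
                       "suggestion": "strip non-printable characters from text"})
    # Refuse absolute paths and any '..' component.
    path = op.get("image_path")
    if path is not None:
        p = PurePosixPath(path)
        if p.is_absolute() or ".." in p.parts:
            errors.append({"code": "PATH_TRAVERSAL",
                           "detail": f"image_path={path!r} escapes the asset directory",
                           "suggestion": "use a relative path inside the asset directory"})
    return errors
```

Note that every rejection carries a code and a suggestion, which matters for the next point.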
The error responses matter more than you'd think. A Python traceback tells the agent "something went wrong." A response like `{"detail": "SLIDE_INDEX_OUT_OF_RANGE: slide_index=5 but deck has 3 slides", "suggestion": "Use slide_index between 0 and 2"}` tells the agent what happened and how to fix it. The agent can parse the error, adjust its payload, and retry. Validation failures become feedback, not dead ends.
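A minimal sketch of that feedback loop, with `submit_ops` as a hypothetical stand-in for the real server (it returns either `{"ok": True}` or structured errors in the shape described above); a real agent would re-plan rather than clamp, but the point is that the error is machine-actionable:

```python
def submit_ops(ops, deck_len=3):
    """Stand-in server: reject out-of-range slide indices with a structured error."""
    for i, op in enumerate(ops):
        idx = op.get("slide_index", 0)
        if not 0 <= idx < deck_len:
            return {"ok": False, "errors": [{
                "op_index": i,
                "code": "SLIDE_INDEX_OUT_OF_RANGE",
                "detail": f"slide_index={idx} but deck has {deck_len} slides",
                "suggestion": f"use slide_index between 0 and {deck_len - 1}",
            }]}
    return {"ok": True}

def submit_with_repair(ops, max_retries=2):
    """Parse structured errors, adjust the payload, and retry."""
    result = submit_ops(ops)
    for _ in range(max_retries):
        if result["ok"]:
            break
        for err in result["errors"]:
            if err["code"] == "SLIDE_INDEX_OUT_OF_RANGE":
                # Naive repair for illustration: retarget the offending op.
                ops[err["op_index"]]["slide_index"] = 0
        result = submit_ops(ops)
    return result
```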
There's also a dry-run mode. The agent can validate every operation against the current deck state without writing to disk. If the dry run passes, the real run will pass too. If it doesn't, the agent gets the same structured errors without wasting a render cycle.
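Transactional execution and dry-run fall out of the same structure: apply every operation to a copy of the deck state, and only swap the copy in if all of them succeed. A toy version, assuming a deck modeled as a list of dicts and a single title-setting operation (both hypothetical simplifications):

```python
import copy

def apply_ops(deck: list, ops: list, dry_run: bool = False) -> list:
    """All-or-nothing: mutate a copy, commit only if every operation succeeds."""
    working = copy.deepcopy(deck)
    for i, op in enumerate(ops):
        try:
            working[op["slide_index"]]["title"] = op["text"]
        except (KeyError, IndexError) as exc:
            # Operation #i failed, so #0..i-1 are discarded with the copy.
            raise ValueError(f"operation #{i} failed: {exc!r}; nothing was written") from None
    if not dry_run:
        deck[:] = working  # commit: swap the validated state into the caller's deck
    return working
```

With `dry_run=True` the validation runs against current state but the caller's deck is untouched, which is what lets the agent check a batch without spending a render cycle.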
Design contracts, not design taste
A human presentation author has taste. They know that 8pt body text is unreadable, that six fonts on one slide looks chaotic, that white text on a pale background is invisible. They absorbed these rules unconsciously over years of seeing good and bad presentations.
Agents don't have taste. They optimize for whatever the prompt asks for. If the prompt says "fit this content on one slide," the agent will shrink text to 7pt, overlap shapes, and use any color that's technically valid. Nothing in the model's training makes it viscerally dislike bad typography.
You can put rules in the prompt, but prompt rules are suggestions. The agent follows them most of the time, but under pressure (tight content, complex layouts, long context), it cuts corners. Silently.
A design profile solves this by externalizing the rules into a machine-readable contract. Minimum and maximum font sizes per element type. Allowed colors. Minimum contrast ratios. Maximum overlap thresholds. The lint engine checks every shape in the rendered deck against these constraints and reports violations as structured JSON.
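To make the idea concrete, here is a sketch of one lint pass, not agent-slides' actual engine: the profile shape, shape fields, and thresholds are illustrative, and the contrast check uses the standard WCAG relative-luminance formula as its example metric:

```python
# Hypothetical profile: minimum font sizes per element role, minimum contrast.
PROFILE = {
    "min_font_pt": {"body": 14, "title": 24},
    "min_contrast": 4.5,  # WCAG AA-style threshold, used here as an example
}

def rel_luminance(rgb):
    """sRGB relative luminance per the WCAG definition."""
    def chan(c):
        c /= 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (chan(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast(fg, bg):
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    hi, lo = sorted((rel_luminance(fg), rel_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

def lint_shape(shape, profile=PROFILE):
    """Check one rendered shape against the profile; return structured violations."""
    violations = []
    floor = profile["min_font_pt"].get(shape["role"], 0)
    if shape["font_pt"] < floor:
        violations.append({"code": "FONT_TOO_SMALL",
                           "detail": f"{shape['font_pt']}pt < minimum {floor}pt for {shape['role']}"})
    ratio = contrast(shape["fg"], shape["bg"])
    if ratio < profile["min_contrast"]:
        violations.append({"code": "LOW_CONTRAST",
                           "detail": f"contrast {ratio:.2f} < minimum {profile['min_contrast']}"})
    return violations
```

A 7pt white-on-pale body shape fails both checks with machine-readable codes, which is exactly the "contract, not taste" point: the agent doesn't need to dislike bad typography, it just needs to fail the lint.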
Think of it like ESLint for slides. ESLint doesn't prevent you from writing bad JavaScript, but it catches patterns that experienced developers know to avoid. A design profile encodes what "professional-looking slide" means in a contract that agents are held to, not asked to follow.