UI in the Loop: When to Type, When to Render

If the output fits a card or a form, it should be one. Let the agent return a tidy block I can act on or scan quickly; keep the conversation for context and the why. Small components, strict props, no surprises.

September 16, 2025

Here’s how I want it to work. Ask me questions in chat, then hand me a small UI once the shape is clear, whether I need to act on it or just want to see it laid out cleanly. That switch makes the product feel fast.

The point

Chat is great for fuzzy questions. It is bad for predictable tasks. If the result already looks like a card, a table, a form, a picker, or a confirmation, use UI. “UI in the loop” is the moment the agent switches from text to interface. Same conversation, less friction.

What I mean by “UI in the loop”

The agent plans as usual. Before sending, it decides: text, UI, or hybrid.
If it picks UI, it emits a declarative spec: components and props from a small, boring palette.
The host app validates the spec, renders, and routes events (click, change, submit) back to the agent.
If anything is unsafe or malformed, the host falls back to plain text.

I am not asking the model to write React. It fills a schema that maps to a fixed set of components.
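
Here is a minimal sketch of that loop on the host side, under a few assumptions: the palette is cut down to five components, the prop names are made up for illustration, and the agent always ships a `fallbackText` alongside its spec. None of these names come from a real library.

```typescript
// Hypothetical palette and spec shapes; every name here is illustrative.
type ComponentName = "Card" | "List" | "Table" | "Button" | "Form";
type PropValue = string | number | boolean;

interface UiNode {
  component: ComponentName;
  props: Record<string, PropValue>;
  children?: UiNode[];
}

type AgentReply =
  | { kind: "text"; text: string }
  | { kind: "ui"; spec: unknown; fallbackText: string };

type Rendered =
  | { mode: "text"; text: string }
  | { mode: "ui"; root: UiNode };

// Strict allow list: component name -> permitted prop names (assumed).
const ALLOWED_PROPS: Record<ComponentName, readonly string[]> = {
  Card: ["title", "subtitle"],
  List: ["ordered"],
  Table: ["caption"],
  Button: ["label", "action"],
  Form: ["submitLabel"],
};

// Returns true only if the node and all of its children use known
// components, known props, and primitive prop values.
function isValidNode(node: unknown): node is UiNode {
  if (typeof node !== "object" || node === null) return false;
  const n = node as Record<string, unknown>;
  const name = n.component;
  if (typeof name !== "string") return false;
  if (!Object.prototype.hasOwnProperty.call(ALLOWED_PROPS, name)) return false;
  const allowed = ALLOWED_PROPS[name as ComponentName];

  const props = n.props;
  if (typeof props !== "object" || props === null || Array.isArray(props)) return false;
  for (const key of Object.keys(props)) {
    if (allowed.indexOf(key) === -1) return false;
    const t = typeof (props as Record<string, unknown>)[key];
    if (t !== "string" && t !== "number" && t !== "boolean") return false;
  }

  if (n.children === undefined) return true;
  return Array.isArray(n.children) && n.children.every(isValidNode);
}

// The host decides what actually reaches the screen.
// A malformed or unknown spec degrades to the agent's plain-text fallback.
function resolveReply(reply: AgentReply): Rendered {
  if (reply.kind === "text") return { mode: "text", text: reply.text };
  return isValidNode(reply.spec)
    ? { mode: "ui", root: reply.spec }
    : { mode: "text", text: reply.fallbackText };
}
```

A spec that names a component outside the palette, say `Script`, never reaches the renderer; the user just sees the fallback text. Event routing (click, change, submit) is left out here to keep the sketch short.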

The napkin sketch

The agent with a UI router. Use a few blunt rules to decide if UI cuts steps: the answer has a stable visual shape; we’re listing comparable items; we need structured input; or we’re asking for confirmation. If any apply, prefer UI. If the user says “explain”, switch to text.
The palette. Keep a tiny allow list that covers most flows: Card, List, Table, Image, Button, Tabs, Dialog, Form, Select, Range, Rating, Pagination, Sheet. Strict props. No custom code. No CSS from the model. Theme comes from the host.
The host renderer. Validate the spec against a schema, map entries to real components, and wire events back to the agent. Handle attribution, domains, and safe links. If validation fails, show a graceful text fallback.

That is it. A switch, a palette, a renderer.
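
The switch itself can stay very small. The sketch below assumes the agent has already computed a handful of boolean signals about its draft answer; the `AnswerSignals` shape and its field names are hypothetical, and the hybrid mode is left out because the rules above do not say when to pick it.

```typescript
// Hypothetical signals the agent derives from its draft answer.
interface AnswerSignals {
  hasStableShape: boolean;       // the answer already looks like a card, table, form...
  listsComparableItems: boolean; // products, flights, jobs
  needsStructuredInput: boolean; // fields, dates, choices
  needsConfirmation: boolean;    // "go ahead?" moments
  userAskedToExplain: boolean;   // the user said "explain"
}

type OutputMode = "text" | "ui";

// Blunt rules: an explicit "explain" wins; otherwise any one signal is enough for UI.
function routeOutput(s: AnswerSignals): OutputMode {
  if (s.userAskedToExplain) return "text";
  const uiCutsSteps =
    s.hasStableShape ||
    s.listsComparableItems ||
    s.needsStructuredInput ||
    s.needsConfirmation;
  return uiCutsSteps ? "ui" : "text";
}
```

How those booleans get set is the interesting part and is up to the agent; the router only has to be predictable.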

Where UI beats text for agents

Lists with actions. Products, posts, flights, jobs. Sort and filter in the UI instead of rewriting a paragraph.
Short forms and wizards. Collect two fields and a date. Show a summary. Confirm.
Comparisons. A table with highlighted differences is faster than pages of prose.
Scheduling. Time slots, constraints, and a confirm button.
Triage. Email or ticket queues with quick actions.

If the user asks “why” or “how”, return to text. The switch is cheap.

Trust and branding without chaos: a small manifest file

If an agent shows content from a site, that site should have a say in how it appears. The web already has patterns like robots.txt, security.txt, and oEmbed discovery. We can use the same idea for agent UIs.

Call it /.well-known/agent-ui.json.

What it does:

Lets a site publish brand tokens and display rules for how its content should look inside agent UI blocks.
Tells clients which components and props are acceptable for that domain and which actions are allowed, for example View, Compare, or Save. Buying or posting on behalf of the user should require an explicit policy.
Provides image and link hygiene: allowed CDNs, max image size, link requirements, attribution.

The client app stays in control. Users can opt in or out per site or globally. If the manifest is missing or blocked, render with a neutral theme.
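
To make that concrete, here is one possible shape for the manifest and the checks a client might run against it. No such format exists today, so every field name below (`theme`, `allowedActions`, `onBehalfOfUserPolicyUrl`, `images.allowedHosts`, and so on) is an assumption, as are the neutral theme values and the HTTPS-only image rule.

```typescript
// Hypothetical manifest shape for /.well-known/agent-ui.json.
type SiteAction = "View" | "Compare" | "Save" | "Buy" | "Post";

interface ThemeTokens {
  primaryColor: string;
  fontFamily: string;
}

interface AgentUiManifest {
  theme: ThemeTokens;                // brand tokens
  allowedComponents: string[];       // subset of the host palette
  allowedActions: SiteAction[];
  onBehalfOfUserPolicyUrl?: string;  // assumed requirement before Buy or Post
  images: { allowedHosts: string[]; maxBytes: number };
  requireAttribution: boolean;
}

// Used when the manifest is missing, blocked, or the user opted out.
const NEUTRAL_THEME: ThemeTokens = { primaryColor: "#444444", fontFamily: "system-ui" };

function manifestUrl(siteOrigin: string): string {
  return new URL("/.well-known/agent-ui.json", siteOrigin).toString();
}

// The client stays in control: site branding applies only with user opt-in.
function resolveTheme(manifest: AgentUiManifest | null, userOptedIn: boolean): ThemeTokens {
  return manifest !== null && userOptedIn ? manifest.theme : NEUTRAL_THEME;
}

// An action must be listed, and acting on the user's behalf also needs an explicit policy.
function isActionAllowed(manifest: AgentUiManifest, action: SiteAction): boolean {
  if (manifest.allowedActions.indexOf(action) === -1) return false;
  const actsForUser = action === "Buy" || action === "Post";
  return !actsForUser || manifest.onBehalfOfUserPolicyUrl !== undefined;
}

// Image hygiene: HTTPS only (an assumption), and only from hosts the site listed.
function isImageAllowed(manifest: AgentUiManifest, imageUrl: string): boolean {
  try {
    const url = new URL(imageUrl);
    return url.protocol === "https:" && manifest.images.allowedHosts.indexOf(url.hostname) !== -1;
  } catch {
    return false; // not a parseable absolute URL
  }
}
```

The design choice worth keeping, whatever the final field names are, is that the manifest only narrows what the client would do anyway. It can restrict components, actions, and image sources, but it cannot add anything the host palette does not already have.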

How the decision gets made

Rules I like:

If the answer is a set of similar items, return a List or Table.
If the user needs to pick or confirm, return a Form or Dialog.
If the user wants to learn or reason, return text.
If the user flips between explore and explain, let them switch the mode.
Keep UI blocks small and scoped. One job at a time.
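
Those rules translate almost line for line into a lookup. In this sketch the `TurnContext` shape is hypothetical, and the tie-breakers are my own assumptions: Table once each item has more than two comparable fields, Form when the user has to enter values, Dialog for a plain yes or no.

```typescript
// Hypothetical description of what the user needs from this turn.
interface TurnContext {
  need: "similar-items" | "pick-or-confirm" | "learn-or-reason";
  fieldsPerItem?: number;            // similar items: comparable fields per item
  needsInput?: boolean;              // pick-or-confirm: must the user enter values?
  userMode?: "explore" | "explain";  // explicit user switch, if any
}

type BlockChoice =
  | { mode: "text" }
  | { mode: "ui"; component: "List" | "Table" | "Form" | "Dialog" };

// Picks exactly one small block per turn: one job at a time.
function chooseBlock(ctx: TurnContext): BlockChoice {
  // Learning and reasoning stay in text, and an explicit "explain" always wins.
  if (ctx.userMode === "explain" || ctx.need === "learn-or-reason") {
    return { mode: "text" };
  }
  if (ctx.need === "similar-items") {
    // Assumption: more than two comparable fields reads better as a Table.
    return { mode: "ui", component: (ctx.fieldsPerItem ?? 1) > 2 ? "Table" : "List" };
  }
  // Remaining case is pick-or-confirm.
  // Assumption: Form when values are entered, Dialog for a simple confirmation.
  return { mode: "ui", component: ctx.needsInput ? "Form" : "Dialog" };
}
```

Because `userMode` is checked first, the same answer can be re-rendered as text the moment the user flips from explore to explain.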

Anti-patterns

Making the model write UI code. It will be brittle and unsafe.
Dumping a dashboard into chat. Keep it tight.
Letting unknown domains inject theme or scripts.
UI that traps the user. Always offer a text path.

Why now

Agent chat has matured enough that the friction points are obvious. People expect to click, not copy. Design systems make small palettes cheap. Well-known discovery patterns exist. We can add UI to the loop without inventing a new web.

For now, keep the rule simple: if the answer has a shape or needs input, render UI. If not, talk.