A2UI: What Becomes Possible When Agents Can Orchestrate Interfaces
We’re at an inflection point in how agents interact with the world. For years, we’ve treated UI as something agents produce—a byproduct, really. They generate text, we render it. They output JSON, we parse it into buttons and forms. But what if we inverted this? What if agents didn’t just describe what they needed, but actually orchestrated the interface itself?
That’s what A2UI does. And more importantly, that’s what it enables.
Google’s A2UI isn’t just a protocol. It’s a capability multiplier. It’s the difference between an agent that can tell you what it needs and an agent that can actually show you what it needs—in real time, across devices, through trust boundaries, while maintaining security. The implications are profound, and they’re not always obvious at first glance.
The Capability Gap We’ve Been Living With
Before A2UI, agents operated under a fundamental constraint: they could describe UIs, but they couldn’t control them. This created friction at every scale.
In single-turn interactions, it meant verbose back-and-forths. You ask an agent to book a restaurant. It responds with a text paragraph asking for your date, time, and party size. You respond. It asks for dietary restrictions. You respond again. What should be a three-second interaction becomes a five-turn conversation. The agent has the capability to help you, but it lacks the interface to express that capability efficiently.
In multi-agent systems, it got worse. Imagine an orchestrator agent in your company delegating work to a specialized agent in another organization. The remote agent can’t touch your DOM. It can’t inject JavaScript. It can’t even guarantee visual consistency. So historically, we’ve either:
- Embedded iframes (heavy, slow, security nightmare)
- Passed HTML (risky, inconsistent, not portable)
- Sent text (back to the verbose problem)
None of these solutions actually solved the problem. They were workarounds. A2UI is the actual solution.
What Becomes Possible: Four Fundamental Capabilities
A2UI unlocks four fundamental capabilities that reshape how agents operate:
1. Real-Time Task Orchestration Without Conversation Overhead
Agents can now present structured interfaces that evolve as they gather information. A single JSON payload describes not just static components, but a data model that the agent can update incrementally. The user sees the interface assemble in real time—progressive rendering—while the agent continues computing.
This transforms task complexity from O(n) conversations to O(1) interface presentations. The agent doesn’t need to ask permission to refine its interface. It just sends updates.
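To make this concrete, here is a minimal sketch of the incremental pattern. The field names (`components`, `dataModel`, `dataModelUpdate`, `bind`) are illustrative, not the official A2UI schema: the agent sends the component structure once, then streams small data-model patches as it computes.

```python
# Illustrative sketch of incremental agent-to-client updates.
# Field names here are hypothetical, not the official A2UI schema.

def initial_payload():
    """First message: the component structure plus a starting data model."""
    return {
        "components": [
            {"id": "status", "type": "Text", "bind": "status"},
            {"id": "findings", "type": "List", "bind": "findings"},
        ],
        "dataModel": {"status": "Analyzing...", "findings": []},
    }

def data_update(path, value):
    """Follow-up message: patch a single path in the data model."""
    return {"dataModelUpdate": {"path": path, "value": value}}

# The agent streams these in order; the interface assembles as they arrive.
messages = [
    initial_payload(),
    data_update("status", "Done"),
    data_update("findings", ["Revenue down 12% YoY"]),
]
```

The key design point: follow-up messages carry only what changed, so the agent never pays the cost of regenerating (or retransmitting) the full interface.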
2. Cross-Organizational Agent Meshes Without Trust Compromise
A2UI is declarative data, not executable code. The client maintains a whitelist of allowed components. The remote agent can only reference types in that catalog. This means:
- No code injection risk. The agent can’t execute arbitrary JavaScript from model output.
- Brand preservation. The host application renders components in its own style, maintaining visual consistency.
- Trust boundaries that actually work. Organizations can delegate to external agents without security nightmares.
For the first time, you can build truly distributed multi-agent systems where agents from different organizations collaborate without requiring deep trust relationships.
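The enforcement mechanism is simple enough to sketch in a few lines. This is a hypothetical client-side validator, not A2UI's actual implementation: the client registers a catalog of component types it knows how to render, and any payload referencing something outside that catalog is rejected before it ever reaches the renderer.

```python
# Minimal sketch of client-side catalog enforcement. The catalog contents
# and payload shape are illustrative, not the official A2UI schema.

ALLOWED_COMPONENTS = {"Form", "TextField", "Select", "Button", "Text"}

def validate_payload(payload):
    """Accept a payload only if every component type is in the catalog.

    The remote agent can reference components; it can never execute code.
    """
    for component in payload.get("components", []):
        if component.get("type") not in ALLOWED_COMPONENTS:
            raise ValueError(f"Unknown component type: {component.get('type')!r}")
    return payload

quote_form = {"components": [
    {"type": "TextField", "label": "Qty"},
    {"type": "Button", "label": "Submit"},
]}
validate_payload(quote_form)  # passes: every type is in the catalog
```

Because the check runs on declarative data rather than code, the trust decision is made entirely by the host: the remote agent's payload is inert until the client chooses to render it.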
3. Progressive Interface Refinement During Conversation
Because A2UI uses a flat, updateable component representation (optimized for LLMs), agents can modify interfaces incrementally. They don’t regenerate a full nested JSON tree. They send delta updates.
This enables a new interaction pattern: interface morphing. An agent starts with a simple form. As you provide information, it reveals new fields, hides irrelevant ones, or completely restructures the layout based on what it learns about your intent. The interface adapts to the conversation, not the other way around.
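A flat representation makes those delta updates cheap to express. The sketch below assumes a hypothetical structure (components keyed by id, referencing children by id) to show why: revealing a new field and rewiring the layout is a two-entry patch, not a regenerated tree.

```python
# Sketch of a flat, updateable component store. The structure is
# illustrative of the flat-representation idea, not the exact A2UI format.

surface = {
    "root":   {"type": "Column", "children": ["name", "submit"]},
    "name":   {"type": "TextField", "label": "Name"},
    "submit": {"type": "Button", "label": "Submit"},
}

def apply_delta(surface, delta):
    """Merge per-component patches into the flat store in place."""
    for component_id, patch in delta.items():
        surface.setdefault(component_id, {}).update(patch)
    return surface

# The agent reveals a new field and restructures the layout in one delta:
apply_delta(surface, {
    "email": {"type": "TextField", "label": "Email"},
    "root":  {"children": ["name", "email", "submit"]},
})
```

Flat ids are also friendlier to LLM generation than deep nesting: the model emits small, independent records instead of maintaining a correctly balanced JSON tree across a long output.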
4. Framework-Agnostic Agent Deployment
Write your agent logic once. Deploy it everywhere. A2UI payloads render on web (React, Angular, Lit), mobile (Flutter, SwiftUI), and desktop (Qt, Electron). The agent doesn’t care about the rendering substrate. It describes the logical structure, and each client maps it to native components.
This is genuinely new. It means agent capabilities aren’t bound to specific platforms. You can build sophisticated agentic applications that work seamlessly across the entire device ecosystem without platform-specific code.
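The mapping step can be sketched as a per-platform renderer registry. Everything here is illustrative (the widget names are invented): the point is that the payload names logical types, and each client resolves them independently.

```python
# Sketch of framework-agnostic rendering: one logical payload, one renderer
# registry per platform. All widget names below are hypothetical.

WEB_RENDERERS = {"Map": "ReactMapView", "RouteList": "ReactTable"}
MOBILE_RENDERERS = {"Map": "FlutterCompactMap", "RouteList": "FlutterCardList"}

def render(payload, renderers):
    """Resolve each logical component type to a platform widget name."""
    return [renderers[c["type"]] for c in payload["components"]]

payload = {"components": [{"type": "Map"}, {"type": "RouteList"}]}
web_ui = render(payload, WEB_RENDERERS)      # web client picks React views
mobile_ui = render(payload, MOBILE_RENDERERS)  # mobile picks Flutter widgets
```

The agent authored `payload` once; the divergence happens entirely on the client side, which is also where platform-appropriate layout decisions (compact vs. expanded, touch vs. pointer) belong.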
Six Usage Scenarios: What This Actually Looks Like
Let’s move from abstraction to reality. Here are six scenarios that illustrate what A2UI makes possible—capabilities that simply don’t exist in today’s agent-driven systems.
Scenario 1: The Live Data Analysis Dashboard
An analyst asks an AI agent: “Show me why our Q4 revenue dipped and what we should do about it.”
Without A2UI: The agent generates a long text report with recommendations. The analyst reads it. If they want to explore a hypothesis, they ask a follow-up question. Another text response. The feedback loop is slow.
With A2UI: The agent streams a dashboard interface in real time:
┌─────────────────────────────────────────────────────┐
│ Revenue Analysis Dashboard │
├─────────────────────────────────────────────────────┤
│ │
│ [Loading: Analyzing Q4 revenue patterns...] │
│ │
│ Key Finding: │
│ ├─ Revenue down 12% YoY │
│ ├─ Primary driver: Customer churn in EMEA (28%) │
│ └─ Secondary: Reduced contract values (8%) │
│ │
│ [Chart: Revenue by region - rendering...] │
│ │
│ Drill-down options: │
│ ├─ [View EMEA churn analysis] │
│ ├─ [Compare to historical patterns] │
│ └─ [Simulate scenarios] │
│ │
│ Agent is computing: Scenario modeling... │
│ │
└─────────────────────────────────────────────────────┘

As the agent analyzes deeper, it updates the interface. New charts appear. Drill-down options become available. The analyst can click to explore specific hypotheses. Each click sends an event back to the agent, which refines its analysis and updates the interface again.
The entire interaction is interface-first, not text-first. The agent orchestrates the experience, not narrates it.
Scenario 2: Multi-Organization Procurement Workflow
A purchasing agent at Company A needs to request a quote from a specialized vendor (Company B). The vendor’s system is managed by a remote agent.
Without A2UI: Company A’s agent sends a request via API. Company B’s agent responds with JSON. Company A’s agent parses it, converts it to HTML, embeds it in an iframe. Visual mismatch. Potential security concerns. The user sees a clunky embedded window.
With A2UI: Company B’s agent sends an A2UI payload describing a quote request form. Company A’s client renders it using Company A’s design system. The form looks native. It feels integrated. From the user’s perspective, it’s seamless.
User Flow:
┌──────────────┐
│ Company A │
│ Orchestrator │
│ Agent │
└──────┬───────┘
│ "I need a quote for 500 units"
▼
┌──────────────────────────────────┐
│ Company B Vendor Agent │
│ (Remote, untrusted) │
└──────┬───────────────────────────┘
│ A2UI Payload:
│ {
│ components: [
│ { type: "Form" },
│ { type: "TextField", label: "Qty" },
│ { type: "Select", label: "Delivery" },
│ { type: "Button", label: "Submit" }
│ ]
│ }
▼
┌──────────────────────────────────┐
│ Company A Client │
│ (Renders with native components) │
│ │
│ ┌────────────────────────────┐ │
│ │ Quote Request Form │ │
│ ├────────────────────────────┤ │
│ │ Quantity: [500 ] │ │
│ │ Delivery: [Standard ▼] │ │
│ │ [Submit Quote Request] │ │
│ └────────────────────────────┘ │
│ │
└──────────────────────────────────┘

The form is rendered by Company A’s UI framework (React, Flutter, whatever). It’s styled consistently with Company A’s brand. It’s secure—Company B’s agent can’t inject code. And it’s seamless.
Scenario 3: Real-Time Customer Support with Intelligent Form Morphing
A customer contacts support: “I’m having issues with my payment method.”
Without A2UI: The support agent (AI or human) asks a series of clarifying questions in text. What payment method? What’s the error message? Have you tried X? Each response requires the customer to read and type.
With A2UI: The agent presents a support form that morphs based on the customer’s responses.
Initial State:
┌──────────────────────────────┐
│ Support Request │
├──────────────────────────────┤
│ Issue Type: │
│ ☐ Billing │
│ ☐ Technical │
│ ☐ Account │
│ ☐ Other │
│ │
│ [Submit] │
└──────────────────────────────┘
After selecting "Billing":
┌──────────────────────────────┐
│ Support Request │
├──────────────────────────────┤
│ Issue Type: Billing │
│ │
│ Payment Method: │
│ ☐ Credit Card │
│ ☐ Bank Account │
│ ☐ PayPal │
│ │
│ [Submit] │
└──────────────────────────────┘
After selecting "Credit Card":
┌──────────────────────────────┐
│ Support Request │
├──────────────────────────────┤
│ Issue Type: Billing │
│ Payment: Credit Card │
│ │
│ Error Message: │
│ [Text input field...] │
│ │
│ Card Last 4 Digits: │
│ [____] │
│ │
│ [Submit] │
└──────────────────────────────┘

The agent doesn’t ask questions in text. It updates the form structure in real time. Each response triggers a new A2UI payload that refines the interface. The customer fills out a form that’s always asking the right question next, not a generic form.
The support agent can see the customer’s responses streaming in and adjust the form accordingly. If the agent detects a known issue pattern, it adds a “Quick Fix” button. If it needs more information, it adds fields. The interface is alive.
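The agent-side logic behind this morphing is a function from accumulated answers to the next set of components. This is a hypothetical sketch (field and bind names invented), not a real support flow:

```python
# Sketch of server-side form morphing: each user event produces the next
# form state. Component fields and bind names are illustrative.

def next_form(state):
    """Return the components to show given what the customer has answered."""
    if "issue_type" not in state:
        return [{"type": "Select", "bind": "issue_type",
                 "options": ["Billing", "Technical", "Account", "Other"]}]
    if state["issue_type"] == "Billing" and "payment_method" not in state:
        return [{"type": "Select", "bind": "payment_method",
                 "options": ["Credit Card", "Bank Account", "PayPal"]}]
    # Enough context gathered: ask for the specific error.
    return [{"type": "TextField", "bind": "error_message",
             "label": "Error Message"}]

step1 = next_form({})                          # asks for the issue type
step2 = next_form({"issue_type": "Billing"})   # asks for the payment method
```

In a real system this function would also consult known-issue patterns (adding the “Quick Fix” button mentioned above), but the core loop is the same: event in, refined interface out.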
Scenario 4: AI-Powered Financial Advisor with Progressive Disclosure
A user asks an AI financial advisor: “Should I refinance my mortgage?”
Without A2UI: The agent generates a long text analysis. The user reads it. If they want to explore “what if” scenarios, they ask a follow-up question.
With A2UI: The agent presents a progressive interface that reveals complexity as needed.
Stage 1 (Immediate):
┌──────────────────────────────────┐
│ Refinance Analysis │
├──────────────────────────────────┤
│ │
│ Quick Answer: YES, refinance │
│ Potential savings: ~$120/month │
│ │
│ [Show detailed analysis] │
│ [Run scenarios] │
│ [Compare lenders] │
│ │
└──────────────────────────────────┘
Stage 2 (After clicking "Show detailed analysis"):
┌──────────────────────────────────┐
│ Detailed Refinance Analysis │
├──────────────────────────────────┤
│ Current Rate: 6.5% │
│ Current Balance: $450,000 │
│ Remaining Term: 20 years │
│ │
│ Recommended Rate: 5.8% │
│ Estimated Savings: $120/month │
│ Break-even: 18 months │
│ │
│ ┌──────────────────────────────┐ │
│ │ Savings Over Time (chart) │ │
│ │ [Graph rendering...] │ │
│ └──────────────────────────────┘ │
│ │
│ [Adjust assumptions] │
│ [View lender options] │
│ │
└──────────────────────────────────┘
Stage 3 (After clicking "Adjust assumptions"):
┌──────────────────────────────────┐
│ Scenario Builder │
├──────────────────────────────────┤
│ Interest Rate: [5.8%____] │
│ Term: [20 years ▼] │
│ Closing Costs: [$3,500___] │
│ │
│ ┌──────────────────────────────┐ │
│ │ Updated Savings: $98/month │ │
│ │ Break-even: 22 months │ │
│ └──────────────────────────────┘ │
│ │
│ [Reset] [Save scenario] │
│ │
└──────────────────────────────────┘

The agent doesn’t dump all information at once. It presents a simple answer first, then progressively reveals complexity. The user controls the depth. Each interaction updates the interface with new data, new options, new visualizations. The agent is orchestrating a guided experience, not writing a report.
Scenario 5: Cross-Platform Agent Deployment
A logistics company builds an AI agent to optimize delivery routes. They deploy it once, and it works everywhere.
Same Agent Logic, Different Renderings:
┌─────────────────────────────────────────┐
│ Web (React) │
├─────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────┐ │
│ │ Delivery Route Optimizer │ │
│ ├──────────────────────────────────┤ │
│ │ [Interactive map with pins] │ │
│ │ [Route details sidebar] │ │
│ │ [Performance metrics dashboard] │ │
│ └──────────────────────────────────┘ │
│ │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Mobile (Flutter) │
├─────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────┐ │
│ │ Delivery Route Optimizer │ │
│ ├──────────────────────────────────┤ │
│ │ [Compact map view] │ │
│ │ [Swipeable route cards] │ │
│ │ [Quick action buttons] │ │
│ └──────────────────────────────────┘ │
│ │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Desktop (SwiftUI) │
├─────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────┐ │
│ │ Delivery Route Optimizer │ │
│ ├──────────────────────────────────┤ │
│ │ [Large map with layers] │ │
│ │ [Detailed route inspector] │ │
│ │ [Advanced analytics] │ │
│ └──────────────────────────────────┘ │
│ │
└─────────────────────────────────────────┘
All from the same A2UI payload:
{
"components": [
{ "type": "Map", "data": "routes" },
{ "type": "RouteList", "data": "deliveries" },
{ "type": "MetricsPanel", "data": "performance" }
]
}The agent sends one A2UI payload. Each client renders it according to its platform’s native conventions. The web version uses React components. The mobile version uses Flutter widgets. The desktop version uses SwiftUI views. All orchestrated by the same agent logic.
This is genuinely transformative for organizations building agentic applications. You don’t need platform-specific agents. You don’t need to maintain three separate codebases. You build the agent once, and it works everywhere.
Scenario 6: Nested Agent Orchestration with UI Composition
An orchestrator agent delegates work to multiple specialized agents. Each returns an A2UI payload. The orchestrator composes them into a unified interface.
Orchestrator Agent
│
├─► Data Analyst Agent ──► A2UI Payload: Charts
├─► Recommendation Agent ──► A2UI Payload: Action Items
└─► Risk Agent ──► A2UI Payload: Risk Assessment
Orchestrator composes these into:
┌──────────────────────────────────────────┐
│ Business Decision Dashboard │
├──────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────┐ │
│ │ Data Analysis │ │
│ │ [Charts from Analyst Agent] │ │
│ └────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Recommended Actions │ │
│ │ [Items from Recommendation Agent] │ │
│ └────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Risk Assessment │ │
│ │ [Analysis from Risk Agent] │ │
│ └────────────────────────────────────┘ │
│ │
│ [Approve] [Request Changes] [Reject] │
│ │
└──────────────────────────────────────────┘

This is where things get really interesting. Agents don’t just send data—they send interface descriptions. An orchestrator can weave together outputs from multiple agents into a coherent user experience. Each agent contributes its piece of the UI. The orchestrator decides how to compose them.
This enables genuinely complex multi-agent workflows where the interface is a first-class concern, not an afterthought.
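The composition step itself can be sketched simply. This assumes hypothetical `Section` and `ButtonRow` component types (not confirmed A2UI catalog entries): the orchestrator wraps each specialist agent's payload in a titled section of one combined surface.

```python
# Sketch of UI composition by an orchestrator agent. Component types
# ("Section", "ButtonRow") and the payload shape are illustrative.

def compose(sections):
    """Merge sub-agent payloads into one dashboard surface."""
    components = []
    for title, payload in sections:
        components.append({"type": "Section", "title": title,
                           "children": payload["components"]})
    # The orchestrator owns the decision controls, not any sub-agent.
    components.append({"type": "ButtonRow",
                       "buttons": ["Approve", "Request Changes", "Reject"]})
    return {"components": components}

dashboard = compose([
    ("Data Analysis",        {"components": [{"type": "Chart"}]}),
    ("Recommended Actions",  {"components": [{"type": "List"}]}),
    ("Risk Assessment",      {"components": [{"type": "Text"}]}),
])
```

Note the division of authority this implies: each sub-agent describes only its fragment, while the orchestrator controls layout and the actions that commit a decision.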
The Architectural Shift
What A2UI enables is a fundamental inversion in how we think about agent-driven systems. Instead of:
Agent → Output → Parse → Render → User
We get:
Agent → Orchestrate UI → Stream Components → Render Native → User
The agent is no longer just computing outputs. It’s orchestrating the entire user experience. It decides what to show, when to show it, how to structure it, and how to refine it based on user interaction.
This is a new design paradigm. It requires thinking about agents differently:
- Agents as UI Orchestrators, not just computation engines
- Interfaces as First-Class Outputs, not byproducts
- Progressive Revelation, not information dumps
- Cross-Platform Consistency, not platform-specific implementations
- Composable Agent Outputs, enabling true multi-agent systems
What This Unlocks for the Future
A2UI is still at v0.8. It’s early. But even at this stage, it’s already being used in production systems (Opal, Gemini Enterprise, Flutter GenUI). As it matures, we’ll likely see:
- Agentic UI Patterns Emerging - Just like web frameworks developed common patterns (MVC, component-based architecture), agentic systems will develop standard UI orchestration patterns.
- Agent-Driven Form Builders - Agents that don’t just fill forms but actively construct and refine them in real time based on user intent.
- True Multi-Agent UX - Not just agents passing data, but agents collaborating to build coherent user experiences.
- Platform-Agnostic AI Applications - AI apps that work seamlessly from smartwatch to desktop without platform-specific code.
- Emergent UI Capabilities - As LLMs get better at understanding and generating declarative UI descriptions, we’ll see increasingly sophisticated interface orchestration that we haven’t even imagined yet.
The Bigger Picture
A2UI matters not because it’s a clever technical solution (though it is). It matters because it removes a fundamental constraint from agentic systems.
For years, we’ve treated the agent-to-user boundary as a narrow pipe: text in, text out. A2UI widens that pipe. It lets agents express themselves through rich, interactive, evolving interfaces. It lets them orchestrate experiences instead of narrating them.
That’s a capability shift. And in AI, capability shifts are what enable new applications, new use cases, and new possibilities.
The protocol is simple. The implications are profound.
Getting Started
If you’re building agentic applications, A2UI is worth exploring now:
- GitHub: Check out the open-source repository for reference implementations
- Documentation: The technical details are well-documented for developers
- Quickstart Samples: Multiple examples across different frameworks (React, Flutter, etc.)
- Community: Early adopters are already building with it
The future of agent-driven interfaces isn’t text-based. It’s declarative, composable, and orchestrated. A2UI is the protocol that makes that future possible.