In recent months, there's been significant buzz around Generative UI - the concept of using AI to generate entire user interfaces from natural language descriptions. While this approach shows promise for certain use cases, it may not be the optimal solution for productivity tools and applications where users need consistent, familiar interfaces they can return to daily.
Enter AI Augmented UI - a hybrid approach that maintains the benefits of traditional UI design while incorporating the power of large language models to enhance user interaction.
The Problem with Pure Generative UI
Generative UI is fascinating. The ability to describe an interface in natural language and have it materialize before your eyes feels like science fiction becoming reality. However, this approach has some fundamental limitations:
- Lack of Consistency: Generated UIs can vary between sessions, making it difficult for users to build muscle memory
- Learning Curve: Users need to learn how to effectively prompt for the UI they want
- Limited Complex Interactions: Some sophisticated UI patterns are difficult to describe and generate reliably
- Productivity Impact: For frequently used tools, having to regenerate or describe the UI each time can slow users down
Introducing AI Augmented UI
Instead of generating entire interfaces from scratch, what if we could enhance existing, well-designed UIs with natural language controls? This is the core idea behind AI Augmented UI - a pattern that allows users to interact with familiar interfaces using natural language, while maintaining the structure and predictability of traditional UI design.
The concept is simple:
- Start with a well-designed, traditional UI component
- Wrap it with an AI layer that understands the component's capabilities
- Allow users to control the component through natural language
- Maintain the original UI's interactive elements for traditional use
Here's a prototype of this idea that I built in React while building out our Agent Cloud product.
Implementation: Building an AI-Enhanced Component
Naming things is hard, and when prototyping, you just roll with something. So I called it LLMEnhanced for now. It's a wrapper component that adds a natural language interface while preserving all of the original functionality.
Using the Component
Here's a real-world example of enhancing a data table with this AI Augmented UI pattern:
<LLMEnhanced
  schema={{
    properties: {
      filters: {
        status: ['failed', 'success', 'pending'],
        agent: ['agent-1', 'agent-2', 'agent-3'],
      },
      search: {
        type: 'string',
        description: 'Search text to filter sessions',
      },
    },
  }}
>
  <DataTable
    // props here...
  />
</LLMEnhanced>
In this example, filters and search are props that already exist on the DataTable component. The LLMEnhanced wrapper can manipulate these props based on the user's input (a sketch of what the API might return follows this list), so users can:
- Filter by status: "Show me failed sessions"
- Filter by agent: "Display sessions from agent-1"
- Search: "Find sessions containing '123'"
- Mix and match: "Is agent-1 still running?"
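Under the hood, each of these requests just resolves to a props object. For instance, "Show me failed sessions" might come back from the /api/enhance route as something like this (a sketch only; the exact shape depends on how you define your schema):

// Hypothetical API response for "Show me failed sessions".
// The keys mirror the schema passed to <LLMEnhanced>, so they can be
// spread straight onto the DataTable as props.
const enhancedProps = {
  filters: {
    status: ['failed'],
  },
};

// Conceptually, the wrapper then renders:
// <DataTable {...children.props} {...enhancedProps} />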
The LLMEnhanced Component
Here's the AI augmentation wrapper:
import { ReactElement, useState, cloneElement } from 'react';
import { Popover, PopoverContent, PopoverTrigger } from './popover';
import { Sparkles } from 'lucide-react';

// Define types for the component props
interface LLMEnhancedProps {
  children: ReactElement;
  schema: Record<string, unknown>;
}

export function LLMEnhanced({ children, schema }: LLMEnhancedProps) {
  const [enhancedProps, setEnhancedProps] = useState({});
  const [isLoading, setIsLoading] = useState(false);
  const [isOpen, setIsOpen] = useState(false);

  const handleNaturalLanguageInput = async (input: string) => {
    try {
      setIsLoading(true);

      const response = await fetch('/api/enhance', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          prompt: input,
          schema,
        }),
      });

      if (!response.ok) throw new Error('Failed to fetch');

      const data = await response.json();
      setEnhancedProps(data);
      setIsOpen(false); // Close popover after successful request
    } catch (error) {
      console.error('Error processing LLM response:', error);
    } finally {
      setIsLoading(false);
    }
  };

  return (
    <div className="relative">
      <Popover open={isOpen} onOpenChange={setIsOpen}>
        <PopoverTrigger asChild>
          <button
            className="absolute right-2 top-2 z-10 rounded-full bg-white/10 p-1.5 text-blue-500 backdrop-blur-sm hover:bg-white/20 dark:bg-gray-800/10"
            aria-label="AI Enhancement"
          >
            <Sparkles className="size-4 text-black dark:text-white" />
          </button>
        </PopoverTrigger>
        <PopoverContent className="w-80">
          <div className="relative">
            <input
              className="w-full rounded-md border border-gray-300 px-4 py-2 pr-10 focus:border-blue-500 focus:outline-none dark:border-gray-600 dark:bg-gray-800"
              type="text"
              placeholder="Ask anything about the data..."
              onKeyDown={(e) => {
                if (e.key === 'Enter') {
                  handleNaturalLanguageInput(e.currentTarget.value);
                }
              }}
              disabled={isLoading}
            />
            {isLoading && (
              <div className="absolute right-3 top-2.5">
                <div className="h-5 w-5 animate-spin rounded-full border-2 border-gray-300 border-t-blue-500" />
              </div>
            )}
          </div>
        </PopoverContent>
      </Popover>
      {cloneElement(children, {
        ...children.props,
        ...enhancedProps,
      })}
    </div>
  );
}
Obviously this can be improved greatly, but as a proof of concept it works, and it really is this simple. By the way, this uses Shadcn's UI components, so if you want to copy/paste it, you'll need those components in your project.
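For context, the DataTable in the example isn't doing anything special. The child can be any component, as long as its props line up with the keys in the schema. Roughly something like this (a hypothetical interface, not the actual Agent Cloud component):

// Hypothetical shape of the props that LLMEnhanced injects into DataTable.
interface DataTableProps {
  rows: Array<Record<string, unknown>>; // the session data to render
  filters?: {
    status?: string[]; // e.g. ['failed']
    agent?: string[];  // e.g. ['agent-1']
  };
  search?: string;     // free-text filter, e.g. '123'
}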
The API Route
I'm using Next.js and the app/api directory. This is a simplified version of the API route:
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';

export const runtime = 'edge';

export async function POST(req: Request) {
  const { prompt, schema } = await req.json();

  try {
    const results = await generateObject({
      model: openai('gpt-4o'),
      // TODO: Need to figure out how to build a dynamic Zod schema from the component schema
      output: 'no-schema',
      prompt: `You are a helpful assistant that converts natural language UI requests into component props.
        Return a JSON object with properties that match the provided schema.
        Only include properties that are relevant to the user's request.

        Given this component schema: ${JSON.stringify(schema, null, 2)}

        Convert this request into props: "${prompt}"`,
    });

    return results.toJsonResponse();
  } catch (error) {
    console.error('Error generating object:', error);
    return Response.json(
      { error: 'Failed to process request' },
      { status: 500 }
    );
  }
}
That's it! One thing worth calling out: the Vercel AI SDK has a schema validator (used with Zod) to validate what comes out of the LLM. In the case above, that schema would need to be dynamic, based on the props we want to control on the component. However, since this is a proof of concept, I'm using that as an excuse to be lazy and skip it.
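For what it's worth, here's a rough sketch of how a dynamic Zod schema could be derived from the component schema in the example above. This is an untested illustration, not what the prototype does; it assumes the filters entries are arrays of allowed values and search is a free-text string, as in the DataTable example.

import { z } from 'zod';

// Hypothetical helper: turn the { filters, search } schema from the
// <LLMEnhanced> example into a Zod schema for validating the LLM output.
function buildZodSchema(schema: {
  properties: {
    filters?: Record<string, string[]>;
    search?: { type: string; description?: string };
  };
}) {
  const filterShape: Record<string, z.ZodTypeAny> = {};

  for (const [key, values] of Object.entries(schema.properties.filters ?? {})) {
    // Each filter becomes an optional array of its allowed values,
    // e.g. status: ['failed'] for "Show me failed sessions".
    filterShape[key] = z
      .array(z.enum(values as [string, ...string[]]))
      .optional();
  }

  return z.object({
    filters: z.object(filterShape).optional(),
    search: z.string().optional(),
  });
}

A schema built this way could then be passed to generateObject via its schema option instead of output: 'no-schema', so responses that don't match the component's props get rejected.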
Benefits of this Approach
- Preserves UI Consistency: The base UI remains stable and familiar, with AI as an enhancement rather than a replacement
- Progressive Enhancement: Users can still interact with the UI traditionally when preferred
- Scoped Intelligence: The AI understands specific component capabilities, leading to more reliable results
- Flexible Implementation: Can be applied to any existing component with minimal changes
What's Next?
The current implementation is just the beginning. Here are some fun possibilities I'm thinking about for our web app:
- Whole Screen Enhancement: Imagine wrapping entire screen components, allowing AI to orchestrate multiple UI elements simultaneously
- Dynamic Schema Generation: Instead of manually defining component capabilities, AI could analyze components to understand their properties automatically
- Contextual Understanding: Enhanced AI that understands not just component props, but user intent and application state. Imagine hooking this up to a user's behavior and session history, so the UI consistently reflects what the user needs from screen to screen, without them even having to ask or state what they're doing
- Collaborative AI: Multiple AI-enhanced components working together to fulfill complex user requests
What else?
I was going back and forth on what this pattern should be called. Some other ideas:
- AI Augmented UI
- AI Enhanced UI
- Agent Augmented UI
- LLM Enhanced UI
What else? I'd love to hear your ideas! Hit us up on: