Ever felt that pause... that moment of waiting while your AI agent processes a complex request? Yeah,
it's annoying! So of course we've added streaming to Agentuity.
## What is Streaming, and Why Should You Care?
Instead of waiting for the entire AI response to be generated before sending it back, streaming delivers the response piece by piece, often word by word, as it's created (see the sketch after this list). This unlocks several key benefits:
- **Vastly Improved User Experience:** Users see activity immediately, making the interaction feel much more dynamic and responsive, like a real conversation.
- **Enhanced Perceived Performance:** Even if the total generation time is the same, seeing results instantly makes the agent feel significantly faster.
- **Better Handling of Long Outputs:** For agents generating lengthy text, code, or reports, users can start reading or reviewing the beginning of the response while the rest is still being generated.
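To make the difference concrete, here's a conceptual sketch in plain TypeScript (no Agentuity API involved) contrasting a buffered response with a streamed one:

```typescript
// Conceptual sketch: the same generator, consumed two ways.
async function* generate() {
  for (const word of ["The", " sky", " is", " blue."]) {
    await new Promise((r) => setTimeout(r, 300)); // simulate generation time
    yield word;
  }
}

// Buffered: the user sees nothing until every word exists (~1.2s here).
// const full = (await Array.fromAsync(generate())).join("");

// Streamed: each word is visible the moment it is generated.
for await (const word of generate()) {
  process.stdout.write(word);
}
```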
## Why Would My Agent Need to Stream?
Streaming is particularly powerful for:
- **Chatbots & Conversational AI:** Makes dialogue flow naturally.
- **Live Code Generation/Explanation:** Users watch code appear and get explained in real time.
- **Real-time Data Summarization/Analysis:** Get insights as they are processed.
- **Interactive Storytelling or Content Creation:** Build suspense or see creative text unfold live.
## Getting Started with Streaming on Agentuity: Simplicity is Key
We designed our streaming support to be incredibly easy to integrate. Our `AgentResponse` object now features a `resp.stream()` method that accepts any standard `ReadableStream`.
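In other words, anything you can wrap in a `ReadableStream` can be streamed back to the caller. Here's a minimal sketch (the chunks are hard-coded placeholders, not real model output):

```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  // The chunks below are hard-coded placeholders, not real model output.
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      for (const word of ["Streaming", " is", " live", "!"]) {
        controller.enqueue(encoder.encode(word));
      }
      controller.close();
    },
  });
  return resp.stream(stream);
}
```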
How simple can it be? If you're using popular libraries like the Vercel AI SDK, enabling streaming is often trivial:
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  const res = streamText({
    model: anthropic("claude-3-5-haiku-latest"),
    system: "You are a friendly assistant!",
    prompt: (await req.data.text()) ?? "Why is the sky blue?",
  });
  return resp.stream(res.textStream);
}
```
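On the other end of the wire, the consumer sees tokens as they arrive. Here's a hedged sketch of reading that response with `fetch` (the URL is hypothetical; it depends on how your agent is exposed):

```typescript
// Hedged sketch of a client consuming the streamed response over HTTP.
// The URL is hypothetical; substitute however your agent is actually exposed.
const res = await fetch("https://example.com/my-agent", {
  method: "POST",
  body: "Why is the sky blue?",
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk is usable the moment it arrives.
  process.stdout.write(decoder.decode(value, { stream: true }));
}
```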
Need more control, or are you using an LLM provider's SDK directly? Agentuity gives you that flexibility. You can manually handle the provider's stream and pipe it into a `ReadableStream` for Agentuity, as in this example using the Anthropic SDK:
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  const prompt = (await req.data.text()) || "Why is the sky blue?";

  const anthropicStream = anthropic.messages.stream({
    messages: [{ role: 'user', content: prompt }],
    model: 'claude-3-5-sonnet-latest',
    max_tokens: 1024,
    system: "You are a friendly assistant!",
  });

  // Create a new ReadableStream to manually push text chunks
  const encoder = new TextEncoder();
  const readableStream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of anthropicStream) {
          // We only care about 'content_block_delta' chunks with type 'text_delta'.
          if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
            controller.enqueue(encoder.encode(chunk.delta.text));
          }
          // Other chunk types could be handled here:
          // else if (chunk.type === 'message_stop') {
          //   console.log('Anthropic stream finished.');
          // }
        }
        // Close our stream once done.
        controller.close();
      } catch (error) {
        ctx.logger.error("Error processing Anthropic stream: %s", error);
        // Signal an error to the stream consumer
        controller.error(error);
      }
    },
  });

  return resp.stream(readableStream);
}
```
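A small design note: if you'd rather not manage a `TextEncoder` yourself, an equivalent pattern using standard web streams (a sketch, not an Agentuity-specific API) is to enqueue plain strings and pipe them through `TextEncoderStream`:

```typescript
// Sketch: same filtering as above, but enqueue strings and let
// TextEncoderStream handle the byte encoding at the end of the pipeline.
const textStream = new ReadableStream<string>({
  async start(controller) {
    for await (const chunk of anthropicStream) {
      if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
        controller.enqueue(chunk.delta.text);
      }
    }
    controller.close();
  },
});
return resp.stream(textStream.pipeThrough(new TextEncoderStream()));
```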
You can even stream the output from one agent directly to another:
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  // Look up the agent we want to hand off to
  const agent = await ctx.getAgent({
    name: 'HistoryAgent',
  });

  // Invoke the agent
  const agentResponse = await agent.run({
    data: 'What engine did a P-51D Mustang use?',
  });

  // Get the stream from the agent
  const stream = await agentResponse.data.stream();

  // Return the stream to the client
  return resp.stream(stream);
}
```
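And because these are standard web streams, you can transform the data in flight before forwarding it. A sketch, assuming the upstream agent streams UTF-8 text (swap the final `return` above for something like this):

```typescript
// Sketch: decode the upstream agent's bytes, transform the text, and
// re-encode before forwarding. Assumes the stream carries UTF-8 text.
const shouting = stream
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(
    new TransformStream<string, string>({
      transform(chunk, controller) {
        controller.enqueue(chunk.toUpperCase());
      },
    }),
  )
  .pipeThrough(new TextEncoderStream());
return resp.stream(shouting);
```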
Streaming isn't just a feature; it's a fundamental improvement to how users interact with AI. By enabling streaming responses, you make your agents faster, more engaging, and ultimately more useful.