Agentuity v1 Beta

Update: Agentuity v1 is now live! Read the launch announcement →

We're excited to announce that Agentuity v1 has reached beta. Since launching the public preview in early December, the team has been heads-down building, and this release marks a major step toward our official launch. It includes:

Sandbox infrastructure for isolated code execution in any language
SSH access to running agents for live debugging
Framework-agnostic frontend with type-safe RPC
Built-in authentication powered by BetterAuth
First-class evaluations to test agent quality
And much more!

Here's a look at some of the highlights.

Sandbox Infrastructure

Sandboxes give your agents isolated Linux containers with their own filesystem, network, and configurable resources. Your agents are written in TypeScript, but sandboxes can run any language: Python, Node.js, shell scripts, or anything available via apt install.

Use sandboxes for code execution agents, AI coding assistants, automated testing, or validating generated code before returning it to users.

# List your sandboxes
agentuity cloud sandbox list

# SSH into a sandbox
agentuity cloud ssh --sandbox

On the SDK side, sandboxes integrate naturally with your agent code. Each sandbox gets dedicated memory, CPU, and disk, all visible from the CLI.

Type-Safe Frontend and RPC

We introduced the @agentuity/frontend package and type-safe API calling in our December 19 update. Since then, we've added type-safe path and query parameters with automatic extraction from route definitions, added ergonomic positional arguments for path params, and fixed optional-parameter handling in the useAPI hook.

import { createAPIClient } from '@agentuity/react';

const api = createAPIClient();

// Type-safe path params with positional arguments
const user = await api.users.userId.get('123');
const member = await api.organizations.orgId.members.memberId.get('org-456', 'user-789');

// Type-safe query params
const results = await api.search.get({ query: { q: 'agents', limit: '10' } });

For the full API, check out the RPC client docs.
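
To give a feel for how path params can be extracted from a route definition at the type level, here is a minimal sketch. It is not the @agentuity/frontend implementation: PositionalArgs and buildPath are hypothetical names invented for this illustration, and the `:param` route syntax is an assumption.

```typescript
// Hypothetical sketch: these names are illustrative and not part of the Agentuity SDK.

// Map a route pattern to a tuple with one string slot per ":param" segment.
type PositionalArgs<Route extends string> =
  Route extends `${string}:${string}/${infer Rest}`
    ? [string, ...PositionalArgs<`/${Rest}`>]
    : Route extends `${string}:${string}`
      ? [string]
      : [];

// Fill in path params from positional arguments, in the order they appear.
function buildPath<Route extends string>(
  route: Route,
  ...args: PositionalArgs<Route>
): string {
  const values = [...(args as unknown as string[])];
  return route.replace(/:([A-Za-z0-9_]+)/g, () =>
    encodeURIComponent(values.shift() ?? ''),
  );
}

const userPath = buildPath('/users/:userId', '123');
const memberPath = buildPath(
  '/organizations/:orgId/members/:memberId',
  'org-456',
  'user-789',
);

console.log(userPath);   // /users/123
console.log(memberPath); // /organizations/org-456/members/user-789
```

Because the argument tuple is derived from the route string, passing too few or too many path params is a compile-time error in this sketch rather than a runtime surprise.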

Built-in Authentication

The @agentuity/auth package provides first-class authentication powered by BetterAuth. It includes API key support, organization management, JWT tokens, and CLI commands for setup:

# Initialize auth in an existing project
agentuity project auth init

# Manage API keys
agentuity auth apikey

On the server side, protect your routes with middleware:

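import { createSessionMiddleware } from '@agentuity/auth';

// `router` and `auth` are assumed to be set up elsewhere in your project.
// The session middleware runs before the route handler.
router.get('/api/profile', createSessionMiddleware(auth), async (c) => {
  // The handler reads the user from the request context.
  const user = c.var.user;
  return c.json({ id: user.id });
});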

Workbench

We covered Workbench setup in our December 19 update. Since then, we've added execution metadata (token counts and duration) in thread messages, URL deep linking so you can share workbench URLs with pre-selected agents, and minification-safe schema detection for production builds. You can now also run Workbench against live deployed agents directly from the web app. Run agentuity dev and open /workbench to try it locally.

[Image: Workbench interface]

Built-in Evals Suite

New to the beta: a first-class evaluations system for testing your agents. Evals are automated tests that run after your agent completes, validating output quality and monitoring performance without blocking responses.

The @agentuity/evals package ships with 12 preset evaluations, including politeness, safety, PII detection, conciseness, and adversarial prompt detection. You can also write custom evals for your specific use cases.

Evals come in two types:

Binary (pass/fail) for yes/no criteria like safety checks
Score (0-1) for quality gradients like relevance or helpfulness

Results appear in the web app, so you can track agent quality over time. No external tooling required.
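
To make the two eval types concrete, here is a small sketch of one binary check and one score check. The types and functions (EvalResult, noEmailAddress, conciseness, summarize) are hypothetical and written for this illustration. They are not the @agentuity/evals API, and the scoring rule is an assumption chosen for simplicity.

```typescript
// Hypothetical types for illustration; they are not the @agentuity/evals API.
type BinaryResult = { kind: 'binary'; passed: boolean; reason: string };
type ScoreResult = { kind: 'score'; score: number; reason: string };
type EvalResult = BinaryResult | ScoreResult;

// Binary eval: fails when the output contains something shaped like an email address.
function noEmailAddress(output: string): BinaryResult {
  const hasEmail = /[^\s@]+@[^\s@]+\.[^\s@]+/.test(output);
  return {
    kind: 'binary',
    passed: !hasEmail,
    reason: hasEmail ? 'output contains an email address' : 'no email address found',
  };
}

// Score eval: 1 at or under the word budget, falling linearly to 0 at twice the budget.
function conciseness(output: string, maxWords = 50): ScoreResult {
  const words = output.trim().split(/\s+/).filter(Boolean).length;
  const overshoot = Math.max(0, words - maxWords) / maxWords;
  const score = Math.max(0, 1 - overshoot);
  return {
    kind: 'score',
    score,
    reason: `${words} words against a budget of ${maxWords}`,
  };
}

// Render either result type as a one-line summary.
function summarize(name: string, result: EvalResult): string {
  return result.kind === 'binary'
    ? `${name}: ${result.passed ? 'pass' : 'fail'}`
    : `${name}: ${result.score.toFixed(2)}`;
}

console.log(summarize('pii', noEmailAddress('Reach me at a@b.co')));
console.log(summarize('conciseness', conciseness('one two three', 2)));
```

A binary result suits checks where any failure matters, while a score leaves room to track gradual drift in quality over time.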

[Image: Evals results in the web app]

SSH into Running Agents

Need to debug a live agent? You can now SSH directly into running cloud agents and sandboxes:

# SSH into a running deployment
agentuity cloud ssh

# Copy files from a running agent
agentuity cloud scp download /path/to/file ./local-file

This makes debugging production issues much easier. No more guessing what's happening inside your agent. For setup instructions, see the debugging docs.

Explore the SDK (and More)

The new SDK Explorer is an interactive playground for learning the Agentuity platform. Browse live examples covering services (KV, Vector, Object Storage, AI Gateway), I/O patterns (streaming, SSE, durable streams), and more. Each example includes plain-language explanations of the concepts involved, plus reference code you can copy and drop into your own projects.

[Image: SDK Explorer]

Also Shipping

Beyond the highlights above, here's what else we've been building:

VS Code Extension

Agent, Data, and Deployment explorers in the sidebar
AI chat participant for workspace-aware assistance
Dev server lifecycle management from the IDE
8 new language model tools for managing agents and deployments

GitHub Integration

agentuity git link and git unlink commands
Connect repos for automatic deployments
agentuity integration github connect/disconnect

Configuration System

Next.js-style agentuity.config.ts with Vite HMR and plugin support
Custom Bun plugins, Tailwind CSS, and compile-time constants
All build phases respect custom configuration

New Default Template

The agentuity create command now scaffolds a translation agent demo
Includes AI Gateway integration, React frontend with Tailwind, thread state, and evals
A hands-on starting point for exploring the platform

Developer Experience

Rust-style TypeScript error formatting
Async lazy-loaded thread state (eliminates 100-150ms latency)
Ergonomic positional arguments for RPC path params
Improved DNS developer experience for custom domains
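
For readers curious about the lazy-loading idea behind the thread state change, here is a generic sketch of the pattern. It assumes state is fetched only on first access and the in-flight promise is reused afterwards. LazyState and the loader are hypothetical and do not reflect how the SDK implements it.

```typescript
// Hypothetical sketch: LazyState and the loader are illustrative, not Agentuity APIs.
class LazyState<T> {
  private readonly load: () => Promise<T>;
  private pending: Promise<T> | undefined;

  constructor(load: () => Promise<T>) {
    this.load = load;
  }

  // Kick off the load on first access; later calls reuse the same promise.
  get(): Promise<T> {
    if (this.pending === undefined) {
      this.pending = this.load();
    }
    return this.pending;
  }
}

let loads = 0;
const threadState = new LazyState(async () => {
  loads += 1; // stands in for a round trip to a state store
  return { messages: [] as string[] };
});

// Until get() is called, no load happens, so a request that never reads
// state skips the cost entirely.
const first = threadState.get();
const second = threadState.get();

console.log(first === second); // true: both callers share one in-flight load
console.log(loads);            // 1
```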

Stability

52 bug fixes covering dev server stability, hot reload, type inference, and S3 compatibility
Better process cleanup and stale process detection
Terminal cursor restoration and CI-friendly output

All changes are backward compatible.

Upgrade Now

Ready to try the beta? Run:

agentuity upgrade

Or if you're starting fresh:

curl -fsSL https://agentuity.sh | sh

Once you're on the beta, try out a few of the new features:

Launch Workbench with agentuity dev and open /workbench
List your sandboxes with agentuity cloud sandbox list
Add evals to your agents using the @agentuity/evals package

Resources

We'd love your feedback as we work toward 1.0. Happy building!
