
Agentuity v1 Reaches Beta

January 7, 2026 by The Agentuity Team


We're excited to announce that Agentuity v1 has reached beta. Since launching the public preview in early December, the team has been heads-down building, and this release marks a major step toward our official launch. The beta includes:

  • Sandbox infrastructure for isolated code execution in any language
  • SSH access to running agents for live debugging
  • Framework-agnostic frontend with type-safe RPC
  • Built-in authentication powered by BetterAuth
  • First-class evaluations to test agent quality
  • And much more!

Here's a look at some of the highlights.

Sandbox Infrastructure

Sandboxes give your agents isolated Linux containers with their own filesystem, network, and configurable resources. Your agents are written in TypeScript, but sandboxes can run code in any language or runtime: Python, Node.js, shell scripts, or anything available via apt install.

Use sandboxes for code execution agents, AI coding assistants, automated testing, or validating generated code before returning it to users.

# List your sandboxes
agentuity cloud sandbox list

# SSH into a sandbox
agentuity cloud ssh --sandbox

On the SDK side, sandboxes integrate naturally with your agent code. Each sandbox gets dedicated memory, CPU, and disk, all visible from the CLI.

Type-Safe Frontend and RPC

We introduced the @agentuity/frontend package and type-safe API calling in our December 19 update. Since then, we've added type-safe path and query parameters that are extracted automatically from route definitions, added ergonomic positional arguments for path params, and fixed optional-parameter handling in the useAPI hook.

import { createAPIClient } from '@agentuity/react';

const api = createAPIClient();

// Type-safe path params with positional arguments
const user = await api.users.userId.get('123');
const member = await api.organizations.orgId.members.memberId.get('org-456', 'user-789');

// Type-safe query params
const results = await api.search.get({ query: { q: 'agents', limit: '10' } });

For the full API, check out the RPC client docs.
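
To illustrate how positional path arguments can be checked at compile time, here is a minimal sketch built on TypeScript template literal types. It is not the Agentuity client: PathArgs and buildPath are hypothetical names introduced for this example, and the real package may derive its types differently.

```typescript
// Hypothetical helper type (not part of any Agentuity package):
// yields one `string` tuple slot per `:param` segment in the route template.
type PathArgs<T extends string> =
  T extends `${string}:${string}/${infer Rest}`
    ? [string, ...PathArgs<Rest>]
    : T extends `${string}:${string}`
      ? [string]
      : [];

// Hypothetical helper: fills `:param` segments with positional arguments, in order.
function buildPath<T extends string>(template: T, ...args: PathArgs<T>): string {
  // The tuple type guarantees one value per parameter; widen it for indexing.
  const values = args as unknown as readonly string[];
  let next = 0;
  return template.replace(/:[A-Za-z_]\w*/g, () =>
    encodeURIComponent(values[next++] ?? ''),
  );
}

const userPath = buildPath('/users/:userId', '123');
// userPath === '/users/123'

const memberPath = buildPath(
  '/organizations/:orgId/members/:memberId',
  'org-456',
  'user-789',
);
// memberPath === '/organizations/org-456/members/user-789'

// buildPath('/users/:userId');            // compile-time error: missing argument
// buildPath('/users/:userId', 'a', 'b');  // compile-time error: too many arguments
```

Because the argument tuple is derived from the route string itself, adding or removing a path parameter in the route changes the call signature, so mismatched calls fail type checking rather than at runtime.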

Built-in Authentication

The @agentuity/auth package provides first-class authentication powered by BetterAuth. It includes API key support, organization management, JWT tokens, and CLI commands for setup:

# Initialize auth in an existing project
agentuity project auth init

# Manage API keys
agentuity auth apikey

On the server side, protect your routes with middleware:

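import { createSessionMiddleware } from '@agentuity/auth';

// `router` and `auth` are assumed to be set up elsewhere in your app.
// The session middleware runs before the handler, which reads the user from context.
router.get('/api/profile', createSessionMiddleware(auth), async (c) => {
  const user = c.var.user;
  return c.json({ id: user.id });
});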

Workbench

We covered Workbench setup in our December 19 update. Since then, we've added execution metadata (token counts and duration) in thread messages, URL deep linking so you can share workbench URLs with pre-selected agents, and minification-safe schema detection for production builds. You can now also run Workbench against live deployed agents directly from the web app. Run agentuity dev and open /workbench to try it locally.

Workbench interface

Built-in Evals Suite

New to the beta: a first-class evaluations system for testing your agents. Evals are automated tests that run after your agent completes, validating output quality and monitoring performance without blocking responses.

The @agentuity/evals package ships with 12 preset evaluations including politeness, safety, PII detection, conciseness, adversarial prompt detection, and more. You can also write custom evals for your specific use cases.

Evals come in two types:

  • Binary (pass/fail) for yes/no criteria like safety checks
  • Score (0-1) for quality gradients like relevance or helpfulness

Results appear in the web app, so you can track agent quality over time. No external tooling required.
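
To make the two eval types concrete, here is a small sketch of what a binary check and a score check might look like. The result shapes and function names below are hypothetical and chosen for illustration only; they are not the @agentuity/evals API.

```typescript
// Hypothetical result shapes; the real @agentuity/evals types may differ.
type BinaryResult = { kind: 'binary'; passed: boolean; reason: string };
type ScoreResult = { kind: 'score'; score: number; reason: string };
type EvalResult = BinaryResult | ScoreResult;

// Binary eval: fails when the output appears to contain an email address.
function noEmailLeak(output: string): BinaryResult {
  const hasEmail = /[^\s@]+@[^\s@]+\.[^\s@]+/.test(output);
  return {
    kind: 'binary',
    passed: !hasEmail,
    reason: hasEmail ? 'email address found' : 'no email address found',
  };
}

// Score eval: 1 at or under the target length, falling toward 0 as output grows.
function conciseness(output: string, targetWords = 50): ScoreResult {
  const words = output.trim().split(/\s+/).filter(Boolean).length;
  const score = words <= targetWords ? 1 : targetWords / words;
  return { kind: 'score', score, reason: `${words} words` };
}

// Run every eval against one agent output and collect the results.
function runEvals(output: string): EvalResult[] {
  return [noEmailLeak(output), conciseness(output)];
}

const results = runEvals('Contact bob@example.com for details.');
// results[0] is a failed binary check; results[1] is a score of 1 (5 words)
```

A binary result suits hard requirements where any failure matters, while a score lets you watch a quality measure drift over time.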

Evals results in the web app

SSH into Running Agents

Need to debug a live agent? You can now SSH directly into running cloud agents and sandboxes:

# SSH into a running deployment
agentuity cloud ssh

# Copy files from a running agent
agentuity cloud scp download /path/to/file ./local-file

This makes debugging production issues much easier. No more guessing what's happening inside your agent. For setup instructions, see the debugging docs.

Explore the SDK (and More)

The new SDK Explorer is an interactive playground for learning the Agentuity platform. Browse live examples covering services (KV, Vector, Object Storage, AI Gateway), I/O patterns (streaming, SSE, durable streams), and more. Each example includes plain-language explanations of the concepts involved, plus reference code you can copy and drop into your own projects.

SDK Explorer

Also Shipping

Beyond the highlights above, here's what else we've been building:

VS Code Extension

  • Agent, Data, and Deployment explorers in the sidebar
  • AI chat participant for workspace-aware assistance
  • Dev server lifecycle management from the IDE
  • 8 new language model tools for managing agents and deployments

GitHub Integration

  • agentuity git link and git unlink commands
  • Connect repos for automatic deployments
  • agentuity integration github connect/disconnect

Configuration System

  • Next.js-style agentuity.config.ts with Vite HMR and plugin support
  • Custom Bun plugins, Tailwind CSS, and compile-time constants
  • All build phases respect custom configuration

New Default Template

  • The agentuity create command now scaffolds a translation agent demo
  • Includes AI Gateway integration, React frontend with Tailwind, thread state, and evals
  • A hands-on starting point for exploring the platform

Developer Experience

  • Rust-style TypeScript error formatting
  • Async lazy-loaded thread state (eliminates 100-150ms latency)
  • Ergonomic positional arguments for RPC path params
  • Improved DNS developer experience for custom domains
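
As a rough illustration of the lazy-loading idea behind the thread state change, the sketch below defers an async load until the first access and reuses the same promise afterward. The lazy helper and getThreadState are hypothetical names for this example and do not reflect the SDK's actual implementation.

```typescript
// Hypothetical helper: wraps an async loader so it runs at most once,
// and only when the value is first requested.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load(); // first access starts the load
    }
    return pending; // later accesses share the same promise
  };
}

let loads = 0;

// Stand-in for fetching thread state from storage.
const getThreadState = lazy(async () => {
  loads += 1;
  return { messages: [] as string[] };
});

// Nothing has been loaded yet: requests that never touch thread state pay no cost.
const first = getThreadState();
const second = getThreadState();
// Both calls return the same promise, and the loader ran exactly once.
```

One trade-off of this simple version is that a failed load stays cached as a rejected promise; a production helper would likely reset on failure so a later access can retry.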

Stability

  • 52 bug fixes including dev server stability, hot reload, type inference, and S3 compatibility
  • Better process cleanup and stale process detection
  • Terminal cursor restoration and CI-friendly output

All changes are backward compatible.

Upgrade Now

Ready to try the beta? Run:

agentuity upgrade

Or if you're starting fresh:

curl -sSL https://v1.agentuity.sh | sh

Once you're on the beta, try out a few of the new features:

  • Launch Workbench with agentuity dev and open /workbench
  • List your sandboxes with agentuity cloud sandbox list
  • Add evals to your agents using the @agentuity/evals package

Resources

We'd love your feedback as we work toward 1.0. Happy building!