Summer Contributions - LLM as a Judge

Community Contributions

First, a little about Joel:

Joel (@joel13samuel) is a student at the University of Florida who is working with us this summer.

He is a rising junior at the University of Florida studying Computer Science. Joel is currently involved with Florida Community Innovation (FCI), a small nonprofit that connects financially struggling individuals with local resources. At FCI, Joel contributed to front-end development. Outside of programming, Joel loves to play volleyball and is on the men's indoor volleyball club team at school. He also loves going on hikes or chilling with friends at the springs.

Also, he's been with us at a few conferences and let me tell you: he's got the Agentuity groove going on pretty well!


He just committed a common and important pattern in the agentic world: LLM as a Judge.

Check it out here

In Joel's own words, here are the details:

What It Does

This project uses Agentuity to create a multi-agent system where one AI agent (ContentWriter) creates blog posts, and another AI agent (Jury) evaluates them on multiple criteria using different AI models.

How It Works

The ContentWriter agent receives a topic and uses OpenAI to generate a high-quality blog post
The Jury agent receives the blog post and evaluates it using multiple specialized "judge" LLMs
Each judge evaluates the blog post on specific criteria and provides scores out of 10
The Jury agent combines all evaluations and returns a comprehensive assessment
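The four steps above boil down to a fan-out and combine. Here is a minimal sketch of that shape, not Joel's actual code: the names `writePost`, `runJury`, `stubJudge`, `Judge`, and `Verdict` are all hypothetical, the judges are stubs that return fixed scores instead of calling a model, and averaging is just one assumed way to combine evaluations.

```typescript
// Hypothetical types for illustration; the real project's agents and model wiring will differ.
type Criterion = "clarity" | "structure" | "engagement" | "technicalAccuracy";
type Scores = Record<Criterion, number>; // each score is out of 10

interface Judge {
  name: string;
  // In a real system this would prompt an LLM; here it is a stub.
  evaluate(post: string): Promise<Scores>;
}

interface Verdict {
  perJudge: { judge: string; scores: Scores }[];
  averages: Scores;
  overall: number;
}

const CRITERIA: Criterion[] = ["clarity", "structure", "engagement", "technicalAccuracy"];

// Stand-in for the ContentWriter agent: returns a canned post instead of calling a model.
async function writePost(topic: string): Promise<string> {
  return `${topic}\n\nAn introduction, a body with subheadings, and a conclusion.`;
}

// Stand-in for the Jury agent: ask every judge in parallel, then combine the results.
async function runJury(post: string, judges: Judge[]): Promise<Verdict> {
  if (judges.length === 0) {
    throw new Error("At least one judge is required");
  }
  const perJudge = await Promise.all(
    judges.map(async (j) => ({ judge: j.name, scores: await j.evaluate(post) })),
  );
  // Assumed combination rule: a plain average per criterion.
  const averages = {} as Scores;
  for (const c of CRITERIA) {
    const total = perJudge.reduce((sum, r) => sum + r.scores[c], 0);
    averages[c] = total / perJudge.length;
  }
  const overall = CRITERIA.reduce((sum, c) => sum + averages[c], 0) / CRITERIA.length;
  return { perJudge, averages, overall };
}

// Helper that builds a judge returning fixed scores, handy for trying the flow offline.
function stubJudge(name: string, scores: Scores): Judge {
  return { name, evaluate: async () => scores };
}
```

To try it, build a couple of judges with `stubJudge`, pass the result of `writePost` to `runJury`, and inspect `averages` and `overall`. Swapping a stub for a real model call only changes the body of `evaluate`.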

Agent Details

ContentWriter

Uses the Mastra framework with OpenAI's gpt-4o-mini model to generate blog posts with:

Engaging titles
Clear introductions
Well-organized body paragraphs with subheadings
Strong conclusions

Jury

A multi-model evaluation system that provides balanced assessment using:

Default Models:

GPT-4o Mini: Precise and thorough evaluator
GPT-4o: Critical and detailed evaluator focused on technical merits
Claude: Pretty cool model I can't lie

Evaluation Criteria:

Clarity
Structure
Engagement
Technical accuracy
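Since each judge scores these four criteria out of 10, it helps to validate what a model sends back before trusting it. The sketch below assumes each judge is asked to reply with a JSON object keyed by criterion; that reply format, the key names, and the `parseScorecard` helper are assumptions for illustration and may not match the repo.

```typescript
// The four criteria from the post, as hypothetical JSON keys.
const CRITERIA = ["clarity", "structure", "engagement", "technicalAccuracy"] as const;
type Criterion = (typeof CRITERIA)[number];
type Scorecard = Record<Criterion, number>;

// Parse a judge's raw reply and reject anything that is not a score from 0 to 10.
function parseScorecard(raw: string): Scorecard {
  const data: unknown = JSON.parse(raw); // throws on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const card = {} as Scorecard;
  for (const c of CRITERIA) {
    const value = record[c];
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0 || value > 10) {
      throw new Error(`Invalid score for ${c}: ${String(value)}`);
    }
    card[c] = value;
  }
  return card;
}
```

Failing loudly on a missing or out-of-range score keeps one misbehaving judge from quietly skewing the combined assessment.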

How to Use It

Check the README for more details and to make it your own on Agentuity. It's as simple as:

Clone the repo
agentuity project import
agentuity deploy

Using the agent

Open the DevMode URL provided when you start agentuity dev
Generate Content: Select ContentWriter agent → Enter a topic → Get blog post
Evaluate Content: Select Jury agent → Paste blog post → Get detailed evaluation

Via the test CLI client

# Generate a blog post on a topic
bun run index.ts ContentWriter "artificial intelligence"

# Evaluate a blog post
bun run index.ts Jury "Your blog post content here..."

# Run the full workflow (ContentWriter -> Jury)
bun run index.ts workflow "technology trends"

Community Spotlight

Want to contribute to our summer series? Share your Agentuity projects with us on Discord or tag us on social media.