# Evals Quick Start Guide
This guide will walk you through setting up your first evaluation pipeline with VoltAgent and Viteval. In just a few minutes, you'll have a working eval system that can measure your agent's performance.
## Prerequisites
Before starting, make sure you have:
- A VoltAgent project set up with `@voltagent/core`
- Node.js 22+ installed
- An AI provider configured (OpenAI, Anthropic, etc.)
## Installation
Install Viteval as a development dependency:
```bash
# npm
npm install viteval --save-dev

# yarn
yarn add viteval --dev

# pnpm
pnpm add viteval --save-dev
```
## Quick Setup
### 1. Initialize Viteval

```bash
viteval init
```

This creates a `viteval.config.ts` and a `viteval.setup.ts` file in your project root.
### 2. Viteval Setup File

Uncomment the setup file's contents to load environment variables, or delete the file if you don't need it:

```ts
// viteval.setup.ts
import dotenv from "dotenv";

dotenv.config({ path: "./.env", quiet: true });
```
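For example, if you keep your provider key in a local `.env` file (assuming OpenAI here; use whichever provider your agent is configured for), the setup file above loads it before your evals run:

```bash
# .env - example only; substitute the key your provider requires
OPENAI_API_KEY=sk-...
```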
### 3. Configure Viteval (Optional)

Update the Viteval configuration file as needed. The `include` pattern tells Viteval which files to run as evals; the default below matches the `*.eval.ts` file you'll create in step 6:

```ts
// viteval.config.ts
import { defineConfig } from "viteval/config";

export default defineConfig({
  reporter: "console",
  eval: {
    include: ["src/**/*.eval.ts"],
    setupFiles: ["./viteval.setup.ts"],
  },
});
```
### 4. Create Your Agent

First, create your VoltAgent agent:

```ts
// src/agents/support.ts
import { Agent } from "@voltagent/core";
import { VercelAIProvider } from "@voltagent/vercel-ai";
import { openai } from "@ai-sdk/openai";

export const supportAgent = new Agent({
  name: "Customer Support",
  instructions:
    "You are a helpful customer support agent. Provide accurate and friendly assistance.",
  llm: new VercelAIProvider(),
  model: openai("gpt-4o-mini"),
});
```
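Before wiring up evals, you can sanity-check the agent with a one-off call. This is an optional sketch; the `generateText` call is the same one the eval task uses in step 6:

```ts
// scratch.ts - a hypothetical one-off smoke test, not part of the eval pipeline
import { supportAgent } from "./src/agents/support";

const result = await supportAgent.generateText("Where is my order?");
console.log(result.text);
```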
### 5. Create Test Dataset

Define your test cases in a dataset file:

```ts
// src/agents/support.dataset.ts
import { defineDataset } from "viteval/dataset";

export default defineDataset({
  name: "support",
  data: async () => [
    {
      input: "What is your refund policy?",
      expected:
        "Our refund policy allows returns within 30 days of purchase with a valid receipt.",
    },
    {
      input: "How long does shipping take?",
      expected:
        "Standard shipping takes 3-5 business days, express shipping takes 1-2 days.",
    },
    {
      input: "Hello, I need help with my order",
      expected:
        "Hello! I'd be happy to help you with your order. What specific assistance do you need?",
    },
  ],
});
```
You can also use an LLM to generate the dataset dynamically; see the Viteval Example for a complete version.
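As a rough sketch, a generated dataset might look like the following. This assumes the Vercel AI SDK's `generateObject` helper and `zod`, neither of which Viteval requires; treat the prompt and schema as illustrative:

```ts
// src/agents/support.dataset.ts - illustrative LLM-generated variant
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import { defineDataset } from "viteval/dataset";

export default defineDataset({
  name: "support",
  data: async () => {
    // Draft question/answer pairs with a model instead of hard-coding them.
    const { object } = await generateObject({
      model: openai("gpt-4o-mini"),
      schema: z.object({
        cases: z.array(z.object({ input: z.string(), expected: z.string() })),
      }),
      prompt:
        "Generate 5 realistic customer support questions and ideal answers for an e-commerce store.",
    });
    return object.cases;
  },
});
```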
### 6. Create Evaluation File

Create the evaluation logic:

```ts
// src/agents/support.eval.ts
import { evaluate, scorers } from "viteval";
import { supportAgent } from "./support";
import supportDataset from "./support.dataset";

evaluate("Customer Support Agent", {
  description: "Evaluates customer support agent capabilities",
  data: supportDataset,
  task: async ({ input }) => {
    const result = await supportAgent.generateText(input);
    return result.text;
  },
  scorers: [scorers.answerCorrectness, scorers.answerRelevancy, scorers.moderation],
  threshold: 0.7,
});
```
Scorers return values between 0 and 1, and the eval passes when scores meet the `threshold` (0.7 here). You can learn more about the available scorers in the Viteval Scorers documentation.
### 7. Add NPM Script

Add a script to your `package.json`:

```json
{
  "scripts": {
    "eval": "viteval"
  }
}
```
### 8. Run Your First Evaluation

```bash
npm run eval
```

You'll see output like:

```
✓ Customer Support Agent (3/3 passed)
  ✓ answerCorrectness: 0.85
  ✓ answerRelevancy: 0.82
  ✓ moderation: 0.98

Overall: 0.883 (threshold: 0.7) ✓
```
## Next Steps
- View Available Scorers: Understanding evaluation metrics
- CI Integration: Integrate Viteval into your CI pipeline