Build Your Own Retriever

Want to connect your AI agent to your own database, API, or files? Here's how to build a custom retriever in 5 minutes.

The Pattern (Copy & Paste)

Every retriever follows the same simple pattern:

import { BaseRetriever } from "@voltagent/core";

class MyRetriever extends BaseRetriever {
  async retrieve(input, options) {
    // 1. Get the user's question
    const question = typeof input === "string" ? input : input[input.length - 1].content;

    // 2. Search your data source
    const results = await this.searchMyData(question);

    // 3. Return formatted results
    return results.join("\n\n");
  }

  async searchMyData(query) {
    // Replace this with your actual search logic
    return ["Sample result 1", "Sample result 2"];
  }
}
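
Step 1 of the pattern assumes the last message's content is a plain string. As a minimal sketch, assuming messages may also carry multi-part content shaped like `{ type: "text", text: "..." }`, a small helper can normalize both cases. The name `extractQuery` and the part shape are illustrative assumptions, not part of the `@voltagent/core` API:

```javascript
// Hypothetical helper (not from @voltagent/core): get the query text
// from either a plain string or an array of chat messages.
function extractQuery(input) {
  if (typeof input === "string") return input;
  if (!Array.isArray(input) || input.length === 0) return "";

  const content = input[input.length - 1]?.content;
  if (typeof content === "string") return content;

  if (Array.isArray(content)) {
    // Assumption: multi-part content uses { type: "text", text: "..." } parts.
    return content
      .filter((part) => part && part.type === "text" && typeof part.text === "string")
      .map((part) => part.text)
      .join(" ");
  }
  return "";
}
```

Inside `retrieve`, `const question = extractQuery(input);` would then replace the inline ternary.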

Real Examples

Search Local Files

import { BaseRetriever } from "@voltagent/core";
import fs from "fs";
import path from "path";

class FileRetriever extends BaseRetriever {
  constructor(docsPath = "./docs") {
    super({
      toolName: "search_files",
      toolDescription: "Search through local documentation files",
    });
    this.docsPath = docsPath;
  }

  async retrieve(input, options) {
    const query = typeof input === "string" ? input : input[input.length - 1].content;

    // Read all .md files
    const files = fs.readdirSync(this.docsPath).filter((file) => file.endsWith(".md"));

    const results = [];
    for (const file of files) {
      const content = fs.readFileSync(path.join(this.docsPath, file), "utf8");
      if (content.toLowerCase().includes(query.toLowerCase())) {
        results.push(`File: ${file}\n${content.slice(0, 500)}...`);
      }
    }

    return results.length > 0 ? results.join("\n\n---\n\n") : "No relevant files found.";
  }
}
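
The retriever above only matches when the whole query appears verbatim in a file, so a question like "How do I deploy?" will rarely hit. One possible refinement, sketched here with hypothetical helpers `scoreDocument` and `rankDocuments`, is to count how many query terms each file contains and keep the best matches:

```javascript
// Hypothetical helper: count how many distinct query terms (3+ characters)
// occur in the document text. Uses substring matching, so "how" also
// matches "show".
function scoreDocument(query, content) {
  const terms = [...new Set(query.toLowerCase().match(/[a-z0-9]+/g) ?? [])].filter(
    (term) => term.length > 2
  );
  const text = content.toLowerCase();
  return terms.filter((term) => text.includes(term)).length;
}

// Hypothetical helper: keep the highest-scoring documents, best first.
function rankDocuments(query, docs, limit = 3) {
  return docs
    .map((doc) => ({ ...doc, score: scoreDocument(query, doc.content) }))
    .filter((doc) => doc.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

In `FileRetriever`, the loop could collect `{ file, content }` objects and pass them through `rankDocuments` before formatting.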

Search PostgreSQL Database

import { BaseRetriever } from "@voltagent/core";
import { Pool } from "pg";

class PostgreSQLRetriever extends BaseRetriever {
  constructor(connectionString) {
    super({
      toolName: "search_database",
      toolDescription: "Search the company knowledge database",
    });
    this.pool = new Pool({ connectionString });
  }

  async retrieve(input, options) {
    const query = typeof input === "string" ? input : input[input.length - 1].content;

    // Search using PostgreSQL full-text search
    const result = await this.pool.query(
      `
      SELECT title, content, ts_rank(search_vector, plainto_tsquery($1)) as rank
      FROM documents 
      WHERE search_vector @@ plainto_tsquery($1)
      ORDER BY rank DESC
      LIMIT 5
    `,
      [query]
    );

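    // Tell the model explicitly when nothing matched instead of returning an empty string
    if (result.rows.length === 0) {
      return "No matching documents found.";
    }

    return result.rows.map((row) => `${row.title}: ${row.content}`).join("\n\n---\n\n");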
  }
}
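
The query above expects a `documents` table with a `search_vector` column of type `tsvector`. If your table does not have one yet, a one-time setup along these lines may work. This is a sketch under assumptions: PostgreSQL 12 or newer (for generated columns), the `english` text search configuration, and a hypothetical index name `documents_search_idx`. If your database's default text search configuration is not English, pass the same configuration to `plainto_tsquery` in the retriever.

```javascript
// Assumed schema setup; adjust the configuration and column names to your data.
const SETUP_STATEMENTS = [
  `ALTER TABLE documents
     ADD COLUMN IF NOT EXISTS search_vector tsvector
     GENERATED ALWAYS AS (
       to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
     ) STORED`,
  `CREATE INDEX IF NOT EXISTS documents_search_idx
     ON documents USING GIN (search_vector)`,
];

// Hypothetical helper: run each statement in order on a pg Pool (or any
// object with a compatible query(text) method).
async function setupSearch(pool) {
  for (const statement of SETUP_STATEMENTS) {
    await pool.query(statement);
  }
}
```

Run `setupSearch(pool)` once from a migration or setup script rather than on every request.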

Call External API

import { BaseRetriever } from "@voltagent/core";

class APIRetriever extends BaseRetriever {
  constructor(apiKey) {
    super({
      toolName: "search_api",
      toolDescription: "Search external knowledge API",
    });
    this.apiKey = apiKey;
  }

  async retrieve(input, options) {
    const query = typeof input === "string" ? input : input[input.length - 1].content;

    const response = await fetch("https://api.example.com/search", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, limit: 5 }),
    });

    const data = await response.json();

    return data.results.map((item) => `${item.title}: ${item.summary}`).join("\n\n---\n\n");
  }
}
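
The example above parses the response without checking the HTTP status, so an error page or a missing `results` field would throw inside `retrieve`. A more defensive variant might look like the sketch below. The helper name `searchApi`, the injectable `fetchImpl` parameter, and the 5-second limit are assumptions for illustration; the endpoint and response shape are the placeholders from the example above.

```javascript
// Hypothetical helper: call the search endpoint with a status check,
// a request timeout, and a fallback for empty results.
async function searchApi(query, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.example.com/search", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query, limit: 5 }),
    signal: AbortSignal.timeout(5000), // abort if the API takes longer than 5 seconds
  });

  if (!response.ok) {
    throw new Error(`Search API returned HTTP ${response.status}`);
  }

  const data = await response.json();
  const results = Array.isArray(data?.results) ? data.results : [];
  if (results.length === 0) return "No results found.";

  return results.map((item) => `${item.title}: ${item.summary}`).join("\n\n---\n\n");
}
```

Passing `fetchImpl` makes the helper easy to exercise with a stub in tests; in production the default global `fetch` is used.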

Track Sources (Optional)

Want to show users where the information came from? Use userContext to track sources:

class SourceTrackingRetriever extends BaseRetriever {
  async retrieve(input, options) {
    const query = typeof input === "string" ? input : input[input.length - 1].content;

    // Your search logic here
    const results = await this.searchData(query);

    // Save sources for later reference
    if (options?.userContext && results.length > 0) {
      const sources = results.map((r) => ({
        title: r.title,
        url: r.url,
        score: r.score,
      }));

      options.userContext.set("sources", sources);
    }

    return results.map((r) => r.content).join("\n\n");
  }
}

// Use it
const response = await agent.generateText("How do I deploy?");
console.log("Answer:", response.text);

// Check what sources were used
const sources = response.userContext?.get("sources");
sources?.forEach((s) => console.log(`Source: ${s.title} (${s.url})`));

Why track sources?

Show users where info came from
Debug what the retriever found
Compliance and audit trails
Better user experience
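
To actually show sources to users, the stored array can be turned into a short citation list. A minimal sketch, assuming each source has the `title` and `url` fields saved above; `formatSources` is a hypothetical helper name:

```javascript
// Hypothetical helper: build a numbered citation list, skipping entries
// without a URL and de-duplicating by URL.
function formatSources(sources = []) {
  const seen = new Set();
  const lines = [];
  for (const source of sources) {
    if (!source?.url || seen.has(source.url)) continue;
    seen.add(source.url);
    lines.push(`[${lines.length + 1}] ${source.title ?? "Untitled"} (${source.url})`);
  }
  return lines.join("\n");
}
```

For example, `formatSources(response.userContext?.get("sources"))` could be appended below the answer text in your UI.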

How to Use Your Retriever

You can use your retriever in two ways:

Option 1: Always Search

import { Agent } from "@voltagent/core";

const agent = new Agent({
  name: "Support Bot",
  retriever: new MyRetriever(), // Searches before every response
  // ... other config
});

When to use: Support bots, Q&A systems where you always want context

Option 2: Search When Needed

const retriever = new MyRetriever({
  toolName: "search_docs",
  toolDescription: "Search company documentation",
});

const agent = new Agent({
  name: "Smart Assistant",
  tools: [retriever.tool], // LLM decides when to search
  // ... other config
});

When to use: General assistants where you want the LLM to decide when to search

Quick Decision Guide

Use Case	Method	Why
Support bot	`agent.retriever`	Always needs context
Q&A system	`agent.retriever`	Every question needs search
General assistant	`agent.tools`	Let LLM decide when to search
Multi-tool agent	`agent.tools`	Mix with other tools

Pro Tips

Format your results clearly:

// ❌ Hard to parse
return results.join(" ");

// ✅ Easy to parse
return results.map((r) => `Source: ${r.title}\n${r.content}`).join("\n\n---\n\n");

Handle errors gracefully:

async retrieve(input, options) {
  try {
    const results = await this.searchData(input);
    return results.length > 0 ? results.join('\n\n') : "No results found.";
  } catch (error) {
    console.error('Search failed:', error);
    return "Search temporarily unavailable.";
  }
}

Make it fast:

// Add timeouts and limits
const results = await Promise.race([
  this.searchData(query),
  new Promise((_, reject) => setTimeout(() => reject(new Error("Timeout")), 5000)),
]);
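
The `Promise.race` snippet above leaves its timer running even when the search finishes first. One way to tidy that up, sketched with a hypothetical `withTimeout` helper, is to clear the timer once either side settles:

```javascript
// Hypothetical helper: reject with an Error if `promise` does not settle
// within `ms` milliseconds, and always clear the timer afterwards.
function withTimeout(promise, ms, message = "Timeout") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Usage inside `retrieve` would be `const results = await withTimeout(this.searchData(query), 5000);`. Note that this only stops waiting; it does not cancel the underlying search.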

Good tool descriptions:

// ❌ Vague
toolDescription: "Searches stuff",

// ✅ Specific
toolDescription: "Search company documentation, policies, and FAQ. Use when user asks about company procedures, benefits, or policies.",

Learn More

RAG Overview: Complete guide to Retrieval-Augmented Generation
Chroma Integration: Working example with Chroma vector database
Examples: See retriever implementations in action
