Agentic RAG with Reranking

This example combines three retrieval techniques:

Agentic RAG: Agent decides when to search the knowledge base
Hybrid search: Combines vector similarity with keyword matching
Reranking: Reorders results using a dedicated ranking model

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.cohere import CohereEmbedder
from agno.knowledge.reranker.cohere import CohereReranker
from agno.models.anthropic import Claude
from agno.vectordb.lancedb import LanceDb, SearchType

knowledge = Knowledge(
    vector_db=LanceDb(
        uri="tmp/lancedb",
        table_name="docs",
        search_type=SearchType.hybrid,
        embedder=CohereEmbedder(id="embed-v4.0"),
        reranker=CohereReranker(model="rerank-v3.5"),
    ),
)

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    knowledge=knowledge,
    search_knowledge=True,
)
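To make the "agent decides when to search" idea concrete, here is a toy sketch in plain Python. It is not how Agno implements `search_knowledge`; the names `DOCS`, `search_knowledge_base`, `toy_policy`, and `answer` are hypothetical, and the decision rule is a hard-coded stand-in for what a real model would decide through tool calling.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a knowledge base: two short notes keyed by topic.
DOCS = {
    "agents": "Agents are programs that use a model, tools, and instructions.",
    "reranking": "Reranking reorders retrieved documents by relevance.",
}


def search_knowledge_base(query: str) -> list[str]:
    """Toy search tool: return every note whose topic word appears in the query."""
    q = query.lower()
    return [text for topic, text in DOCS.items() if topic in q]


def toy_policy(question: str) -> Optional[str]:
    """Stand-in for the model's decision: a search query, or None to skip searching."""
    small_talk = {"hi", "hello", "thanks"}
    if question.lower().strip("!. ") in small_talk:
        return None  # nothing to look up
    return question  # a real agent could also reformulate the query here


def answer(
    question: str,
    search: Callable[[str], list[str]] = search_knowledge_base,
) -> dict:
    """Run one turn: search only if the policy asks for it."""
    query = toy_policy(question)
    if query is None:
        return {"searched": False, "context": []}
    return {"searched": True, "context": search(query)}
```

Calling `answer("hello")` skips the search entirely, while `answer("What are Agents?")` retrieves the note about agents before any answer would be written.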

Why Combine These Techniques

Technique	What It Does
Agentic RAG	Agent searches only when needed, can reformulate queries
Hybrid search	Catches both semantic matches and exact terms
Reranking	Uses a dedicated model to reorder results by relevance

Each technique covers a gap the others leave, so the combination generally retrieves more accurately than any one of them alone.
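One way to picture how hybrid search merges a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is an illustration under that assumption only: this page does not say which fusion method LanceDb uses, and the function name, the document ids, and the constant `k = 60` are chosen for the example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents found by both searches accumulate a higher score.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two searches.
vector_hits = ["doc_a", "doc_b", "doc_c"]   # semantic matches
keyword_hits = ["doc_d", "doc_b", "doc_a"]  # exact-term matches

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above documents found by only one search, which is the behavior the table describes for hybrid search.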

How Reranking Works

After hybrid search returns initial results, the reranker:

Takes the query and candidate documents
Scores each document for relevance using a cross-encoder model
Reorders results so the most relevant appear first

Cohere’s rerank-v3.5 is trained specifically for this task; applying it after hybrid search typically moves the most relevant documents closer to the top.
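The three steps above can be sketched in plain Python. This is a minimal illustration, not Cohere's or Agno's implementation: `rerank` and `overlap_score` are hypothetical names, and the word-overlap scorer is only a placeholder for the cross-encoder model, which would score each query and document pair jointly.

```python
from typing import Callable, Optional


def overlap_score(query: str, doc: str) -> float:
    """Placeholder scorer: fraction of query words that also occur in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: list[str],
    score: Callable[[str, str], float] = overlap_score,
    top_n: Optional[int] = None,
) -> list[str]:
    """Score every candidate against the query and return them best first."""
    scored = [(score(query, doc), doc) for doc in candidates]
    # The sort is stable, so tied documents keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [doc for _, doc in scored]
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical candidates, in the order an initial search returned them.
candidates = [
    "Reranking reorders search results",
    "Agents are programs that call tools",
    "Hybrid search mixes vectors and keywords",
]

top = rerank("what are agents", candidates, top_n=2)
print(top)
```

Swapping `overlap_score` for a call to a real ranking model changes the quality of the scores but not the shape of the pipeline: score each pair, sort, then keep the top results.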

Example

agentic_rag.py

import asyncio

from agno.agent import Agent
from agno.knowledge.embedder.cohere import CohereEmbedder
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reranker.cohere import CohereReranker
from agno.models.anthropic import Claude
from agno.vectordb.lancedb import LanceDb, SearchType

# Create knowledge base with hybrid search and reranking
knowledge = Knowledge(
    vector_db=LanceDb(
        uri="tmp/lancedb",
        table_name="agno_docs",
        search_type=SearchType.hybrid,
        embedder=CohereEmbedder(id="embed-v4.0"),
        reranker=CohereReranker(model="rerank-v3.5"),
    ),
)

# Load content
asyncio.run(
    knowledge.ainsert(url="https://docs.agno.com/introduction/agents.md")
)

# Create agent with knowledge
agent = Agent(
    model=Claude(id="claude-sonnet-4-20250514"),
    knowledge=knowledge,
    search_knowledge=True,
    instructions=[
        "Search your knowledge before answering.",
        "Include sources in your response.",
    ],
    markdown=True,
)

agent.print_response("What are Agents?", stream=True)

Usage

Set up your virtual environment

uv venv --python 3.12
source .venv/bin/activate

Install dependencies

uv pip install -U agno anthropic cohere lancedb tantivy sqlalchemy

Export your API keys

export ANTHROPIC_API_KEY=your_anthropic_api_key_here
export CO_API_KEY=your_cohere_api_key_here
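Before running the agent, it can help to confirm that both keys are visible to Python. The helper below is a small optional sketch; `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and the variable names are the two exported above.

```python
import os
from typing import Mapping

# The two environment variables exported in this step.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "CO_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required keys that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_keys()` in the shell session where the keys were exported should return an empty list; any name it returns still needs to be set.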

Run Agent

python agentic_rag.py

Configuration Options

Different Rerankers

# Cohere
from agno.knowledge.reranker.cohere import CohereReranker
from agno.vectordb.lancedb import LanceDb, SearchType

reranker = CohereReranker(model="rerank-v3.5")

# Attach the reranker to the vector database
vector_db = LanceDb(
    uri="tmp/lancedb",
    table_name="docs",
    search_type=SearchType.hybrid,
    reranker=reranker,
)

Adjusting Results

knowledge = Knowledge(
    vector_db=vector_db,
    max_results=10,  # Number of results to return after reranking
)

Next Steps

Hybrid Search

Embedders
