This example combines three techniques for optimal retrieval:
  1. Agentic RAG: Agent decides when to search the knowledge base
  2. Hybrid search: Combines vector similarity with keyword matching
  3. Reranking: Reorders results using a dedicated ranking model
from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.cohere import CohereEmbedder
from agno.knowledge.reranker.cohere import CohereReranker
from agno.models.anthropic import Claude
from agno.vectordb.lancedb import LanceDb, SearchType

knowledge = Knowledge(
    vector_db=LanceDb(
        uri="tmp/lancedb",
        table_name="docs",
        search_type=SearchType.hybrid,
        embedder=CohereEmbedder(id="embed-v4.0"),
        reranker=CohereReranker(model="rerank-v3.5"),
    ),
)

agent = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    knowledge=knowledge,
    search_knowledge=True,
)

Why Combine These Techniques

| Technique | What It Does |
| --- | --- |
| Agentic RAG | Agent searches only when needed, can reformulate queries |
| Hybrid search | Catches both semantic matches and exact terms |
| Reranking | Uses a dedicated model to reorder results by relevance |
Together, these provide better retrieval accuracy than any single technique alone.
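
To make the hybrid search row concrete, here is a minimal sketch of one common way to merge a vector-similarity ranking with a keyword ranking: reciprocal rank fusion. This is an illustrative assumption, not a description of what LanceDb does internally when `SearchType.hybrid` is set; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document ids are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both searches rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two searches (best match first)
vector_hits = ["doc_a", "doc_b", "doc_c"]    # semantic matches
keyword_hits = ["doc_c", "doc_a", "doc_d"]   # exact-term matches

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_c`) end up ahead of documents found by only one search, which is the behavior the table describes.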

How Reranking Works

After hybrid search returns initial results, the reranker:
  1. Takes the query and candidate documents
  2. Scores each document for relevance using a cross-encoder model
  3. Reorders results so the most relevant appear first
Cohere’s rerank-v3.5 is trained specifically for this task, so it typically places the most relevant passages higher than the raw hybrid-search ordering does.
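
The three steps above can be sketched in plain Python. This is a toy illustration only: `overlap_score` is a hypothetical stand-in for a cross-encoder (it just counts shared words), and `rerank` is not the `CohereReranker` API. A real reranker would send the query and documents to a model and use its relevance scores instead.

```python
from typing import Callable, List, Optional, Tuple


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words that occur in the document.

    Stand-in for a cross-encoder, which would score the (query, document)
    pair jointly with a trained model.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: Optional[int] = None,
) -> List[Tuple[str, float]]:
    # Steps 1 and 2: take the query and candidates, score each document
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Step 3: reorder so the most relevant documents come first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_n is None else scored[:top_n]


# Hypothetical candidates, in the order an initial search returned them
candidates = [
    "LanceDB stores vectors on disk",
    "Agents are programs that use tools",
    "What agents are and how agents work",
]

results = rerank("what are agents", candidates, top_n=2)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

The `top_n` argument plays a role similar to limiting the number of results kept after reranking: the initial search can return a wider candidate set, and only the best-scoring documents are passed on to the agent.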

Example

agentic_rag.py
import asyncio

from agno.agent import Agent
from agno.knowledge.embedder.cohere import CohereEmbedder
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reranker.cohere import CohereReranker
from agno.models.anthropic import Claude
from agno.vectordb.lancedb import LanceDb, SearchType

# Create knowledge base with hybrid search and reranking
knowledge = Knowledge(
    vector_db=LanceDb(
        uri="tmp/lancedb",
        table_name="agno_docs",
        search_type=SearchType.hybrid,
        embedder=CohereEmbedder(id="embed-v4.0"),
        reranker=CohereReranker(model="rerank-v3.5"),
    ),
)

# Load content
asyncio.run(
    knowledge.ainsert(url="https://docs.agno.com/introduction/agents.md")
)

# Create agent with knowledge
agent = Agent(
    model=Claude(id="claude-sonnet-4-20250514"),
    knowledge=knowledge,
    search_knowledge=True,
    instructions=[
        "Search your knowledge before answering.",
        "Include sources in your response.",
    ],
    markdown=True,
)

agent.print_response("What are Agents?", stream=True)

Usage

  1. Set up your virtual environment

     uv venv --python 3.12
     source .venv/bin/activate

  2. Install dependencies

     uv pip install -U agno anthropic cohere lancedb tantivy sqlalchemy

  3. Export your API keys

     export ANTHROPIC_API_KEY=your_anthropic_api_key_here
     export CO_API_KEY=your_cohere_api_key_here

  4. Run Agent

     python agentic_rag.py

Configuration Options

Different Rerankers

# Cohere
from agno.knowledge.reranker.cohere import CohereReranker
from agno.vectordb.lancedb import LanceDb, SearchType

reranker = CohereReranker(model="rerank-v3.5")

# Add to vector database
vector_db = LanceDb(
    uri="tmp/lancedb",
    table_name="docs",
    search_type=SearchType.hybrid,
    reranker=reranker,
)

Adjusting Results

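from agno.knowledge.knowledge import Knowledge

# Reuses the vector_db defined in the previous snippet
knowledge = Knowledge(
    vector_db=vector_db,
    max_results=10,  # Number of results to return after reranking
)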

Next Steps