Agno’s defaults work well for most use cases. But if you’re seeing slow searches, memory issues, or poor results, a few strategic changes might help.

Quick Wins

1. Choose the Right Vector Database

Database choice has the biggest impact at scale:
Database           Use Case
LanceDB/ChromaDB   Development, testing (zero setup)
PgVector           Production up to 1M docs, need SQL
Pinecone           Managed service, auto-scaling
from agno.vectordb.lancedb import LanceDb
from agno.vectordb.pgvector import PgVector

# Development
dev_db = LanceDb(table_name="docs", uri="./local_db")

# Production
prod_db = PgVector(table_name="docs", db_url=db_url)

2. Skip Already-Processed Files

The biggest speed-up when re-running ingestion:
knowledge.insert(
    path="documents/",
    skip_if_exists=True,  # Don't reprocess existing files
)

# Batch loading with filters
knowledge.insert_many(
    paths=["docs/", "policies/"],
    skip_if_exists=True,
    include=["*.pdf", "*.md"],
    exclude=["*temp*", "*draft*"]
)

3. Use Metadata Filters

Narrow the search space with metadata filters before running the search:
# Slow: search everything
results = knowledge.search("deployment process")

# Fast: filter first, then search
results = knowledge.search(
    query="deployment process",
    filters={"department": "engineering", "type": "procedure"}
)

# Validate filters to catch typos
valid_filters, invalid_keys = knowledge.validate_filters({
    "department": "engineering",
    "invalid_key": "value"  # This gets flagged
})

4. Match Chunking to Content

Strategy     Speed    Quality  Best For
Fixed Size   Fast     Good     Uniform content
Semantic     Slower   Best     Complex documents
Recursive    Fast     Good     Structured docs
from agno.knowledge.chunking.fixed_size_chunking import FixedSizeChunking
from agno.knowledge.chunking.semantic_chunking import SemanticChunking

# Fast processing
FixedSizeChunking(chunk_size=5000, overlap=200)

# Better quality (slower)
SemanticChunking(similarity_threshold=0.5)
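To make the chunk_size / overlap trade-off concrete, here is a minimal fixed-size chunker in plain Python — an illustration of the strategy, not Agno's implementation:

```python
def fixed_size_chunks(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each chunk
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = fixed_size_chunks("The quick brown fox jumps over the lazy dog.",
                           chunk_size=20, overlap=5)
# Each chunk's last 5 characters repeat at the start of the next chunk,
# so sentences cut at a boundary are still partially visible to both chunks.
```

Larger overlap improves retrieval across chunk boundaries at the cost of index size; the same knobs apply to FixedSizeChunking above.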

5. Use Async for Batch Operations

Process multiple sources concurrently:
import asyncio

async def load_knowledge():
    await asyncio.gather(
        knowledge.ainsert(path="docs/hr/"),
        knowledge.ainsert(path="docs/engineering/"),
        knowledge.ainsert(url="https://company.com/api-docs"),
    )

asyncio.run(load_knowledge())

Common Issues

Irrelevant Search Results

Causes: Chunks too large/small, wrong chunking strategy. Fixes:
  • Try semantic chunking for better context
  • Increase max_results to check if relevant results are ranked lower
  • Add metadata filters to narrow scope
# Debug search quality
results = knowledge.search("your query", max_results=10)
for doc in results:
    print(doc.content[:200])

Slow Content Loading

Causes: Reprocessing existing files, semantic chunking on large datasets. Fixes:
  • Use skip_if_exists=True
  • Switch to fixed-size chunking
  • Process in batches
# Only process new PDFs
knowledge.insert(
    path="documents/",
    include=["*.pdf"],
    exclude=["*draft*", "*backup*"],
    skip_if_exists=True,
)

Memory Issues

Causes: Loading too many large files at once, chunk sizes too large. Fixes:
  • Process in smaller batches
  • Reduce chunk size
  • Use include/exclude patterns
  • Clear outdated content with knowledge.remove_content_by_id(content_id)
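One way to process in smaller batches is to split the file list yourself and insert one batch at a time. The helper below is a generic sketch; the commented usage assumes a `pdf_paths` list you have collected and reuses the insert_many API shown earlier in this guide:

```python
def batched(items: list, batch_size: int):
    """Yield successive batches of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Hypothetical usage with the Agno API shown above:
# for batch in batched(pdf_paths, 50):
#     knowledge.insert_many(paths=batch, skip_if_exists=True)
```

Keeping batches small bounds peak memory, and skip_if_exists=True makes the loop safe to resume after a failure.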

Advanced Optimizations

Hybrid Search

Combine vector and keyword search:
from agno.vectordb.pgvector import PgVector, SearchType

vector_db = PgVector(
    table_name="docs",
    db_url=db_url,
    search_type=SearchType.hybrid,
)
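Hybrid search merges two ranked lists (vector similarity and keyword match) into one. A common fusion method is reciprocal rank fusion, sketched below as an illustration of the idea — not necessarily what PgVector does internally:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores sum of 1 / (k + rank) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]     # ranked by embedding similarity
keyword_hits = ["doc_c", "doc_a", "doc_d"]    # ranked by keyword match
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that rank well in both lists (doc_a, doc_c) float to the top, which is why hybrid search often beats either method alone.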

Reranking

Improve result ordering:
from agno.knowledge.reranker.cohere import CohereReranker

vector_db = PgVector(
    table_name="docs",
    db_url=db_url,
    reranker=CohereReranker(model="rerank-v3.5", top_n=10),
)

Smaller Embedding Dimensions

Trade slight quality for faster search:
from agno.knowledge.embedder.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(
    id="text-embedding-3-large",
    dimensions=1024,  # Instead of 3072
)
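A back-of-envelope estimate shows why fewer dimensions speed things up, assuming raw float32 storage (4 bytes per value) and ignoring index overhead:

```python
def index_size_mb(num_vectors: int, dimensions: int, bytes_per_float: int = 4) -> float:
    """Approximate raw storage for float32 embeddings, in megabytes."""
    return num_vectors * dimensions * bytes_per_float / 1_000_000

full = index_size_mb(1_000_000, 3072)     # text-embedding-3-large default
reduced = index_size_mb(1_000_000, 1024)  # truncated to 1024 dimensions
```

Dropping from 3072 to 1024 dimensions cuts both storage and per-query distance computation to a third.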

Monitoring

import time

# Time searches
start = time.time()
results = knowledge.search("test query", max_results=5)
print(f"Search: {time.time() - start:.2f}s")

# Check failed content
content_list, total = knowledge.get_content()
for content in content_list:
    if content.status == "failed":
        status, message = knowledge.get_content_status(content.id)
        print(f"{content.name}: {message}")

Next Steps