Reranking
Reranking reorders vector search results by relevance before passing them to the LLM. Initial retrieval returns the top-k most similar chunks; reranking refines their order so the most relevant chunks come first. This improves RAG answer quality by ensuring the LLM receives the best available context.
Reranking Options in DocLD
| Type | Description | Trade-off |
|---|---|---|
| Heuristic | Keyword and phrase boosts; scoring based on query-term overlap | Fast, low cost; good for most cases |
| LLM | Model scores each chunk for relevance to the query | Higher accuracy; higher cost and latency |
| Hybrid | Combines heuristic and LLM scoring | Balance of speed, cost, and quality |
Default heuristic reranking works well for most cases. Consider LLM reranking when answers miss key information or irrelevant chunks appear in context.
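As a rough illustration of what heuristic reranking does, here is a minimal sketch of query-term-overlap scoring with simple whitespace tokenization. The function name `heuristic_rerank` and the scoring formula are illustrative, not DocLD's actual implementation, which also applies keyword and phrase boosts.

```python
def heuristic_rerank(query, chunks):
    """Reorder chunks by fraction of query terms each chunk contains.

    A stand-in for heuristic scoring: no model call, just set overlap.
    """
    query_terms = set(query.lower().split())

    def score(chunk):
        chunk_terms = set(chunk.lower().split())
        if not query_terms:
            return 0.0
        return len(query_terms & chunk_terms) / len(query_terms)

    # Highest-overlap chunks first; Python's sort is stable, so ties
    # keep their original (vector-similarity) order.
    return sorted(chunks, key=score, reverse=True)
```

Because this is pure set arithmetic, it adds effectively no latency on top of retrieval, which is why it is the default.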
When to Use Each Type
| Scenario | Recommended Type |
|---|---|
| General Q&A | Heuristic |
| Domain-specific or nuanced queries | LLM or hybrid |
| High-stakes answers (legal, financial) | LLM |
| Low latency required | Heuristic |
| Mixed document types | Hybrid |
Reranking affects latency and cost. Heuristic reranking adds minimal latency; LLM reranking adds a model call per chunk. DocLD lets you configure the strategy per knowledge base or chat session.
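The hybrid trade-off can be sketched as a cheap heuristic pass that narrows the candidate set, followed by model scoring of only the survivors. Everything here is an assumption for illustration: `llm_score` is a placeholder for a real model call, and `prefilter` and `weight` are hypothetical parameters, not DocLD settings.

```python
def hybrid_rerank(query, chunks, llm_score, prefilter=10, weight=0.7):
    """Blend heuristic overlap with model scores on a prefiltered set.

    llm_score(query, chunk) -> float in [0, 1] stands in for a model call;
    prefiltering bounds the number of such calls at `prefilter`.
    """
    query_terms = set(query.lower().split())

    def overlap(chunk):
        terms = set(chunk.lower().split())
        if not query_terms:
            return 0.0
        return len(query_terms & terms) / len(query_terms)

    # Cheap heuristic pass first: only the top `prefilter` chunks
    # ever reach the (slow, costly) LLM scorer.
    candidates = sorted(chunks, key=overlap, reverse=True)[:prefilter]
    blended = [
        (weight * llm_score(query, c) + (1 - weight) * overlap(c), c)
        for c in candidates
    ]
    blended.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in blended]
```

The prefilter is what keeps hybrid cost closer to heuristic than to full LLM reranking: model calls scale with `prefilter`, not with the retrieved top-k.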
How Reranking Improves RAG
- Retrieve — Vector search returns top-k chunks (e.g., 20)
- Rerank — The configured strategy reorders these chunks by relevance
- Select — Top N chunks (e.g., 5–10) are passed to the LLM
- Generate — The LLM receives the best context for RAG answer generation
By retrieving more chunks and reranking, you improve recall while ensuring the LLM gets the most relevant subset. This is especially useful when vector search returns some near-misses that heuristic or LLM scoring can demote.
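The four steps above reduce to a short pipeline. This is a generic sketch, not DocLD's internals: `vector_search` and `rerank` are placeholder callables, and the defaults mirror the example numbers (top-k of 20, top-N of 5).

```python
def build_context(query, vector_search, rerank, top_k=20, top_n=5):
    """Retrieve -> rerank -> select: the steps before LLM generation."""
    candidates = vector_search(query, top_k)  # 1. Retrieve top-k chunks
    ordered = rerank(query, candidates)       # 2. Rerank by relevance
    return ordered[:top_n]                    # 3. Select top-N for the prompt
```

Note that `top_k > top_n` is the point: retrieving generously improves recall, and the reranker decides which subset actually reaches the model.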
Configuration
Configure reranking per knowledge base or chat session:
- Strategy — Heuristic, LLM, or hybrid
- Top-N — How many chunks to pass to the LLM after reranking
- Thresholds — Optional minimum relevance score to exclude low-quality chunks
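To make the three settings concrete, here is a hypothetical shape for such a configuration and the selection step it drives. The names (`RerankConfig`, `min_score`, `select_chunks`) are invented for illustration and do not describe DocLD's actual configuration API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RerankConfig:
    strategy: str = "heuristic"        # "heuristic", "llm", or "hybrid"
    top_n: int = 5                     # chunks passed to the LLM
    min_score: Optional[float] = None  # optional relevance floor

def select_chunks(scored, config):
    """Apply threshold and top-N cut to reranked (chunk, score) pairs.

    `scored` is assumed already sorted by the reranker, best first.
    """
    if config.min_score is not None:
        # Drop low-quality chunks entirely rather than pad the context.
        scored = [(c, s) for c, s in scored if s >= config.min_score]
    return [c for c, _ in scored[:config.top_n]]
```

A threshold can return fewer than `top_n` chunks; that is usually preferable to filling the prompt with context the reranker scored as irrelevant.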
Related Concepts
Reranking refines vector search results. Top-k controls how many chunks are retrieved before reranking. RAG uses reranked chunks as context for the LLM.