Context Window
The context window is the maximum amount of input, usually measured in tokens, that an LLM can process in a single request. It limits how much retrieved content, conversation history, and instructions can be sent to the model.
Impact on RAG
For RAG, the context window constrains:
- Retrieved chunks — Top-k × chunk size must fit
- Conversation history — Sessions may compress older messages
- Instructions — System prompts and schema instructions consume tokens
DocLD manages context by chunking documents, limiting top-k, and using reranking to select the most relevant chunks before generation.
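The budget described above can be sketched as simple arithmetic. The numbers below (window size, reserved output, the `fits` and `max_top_k` helpers) are hypothetical illustrations, not DocLD's actual implementation or any specific model's limits:

```python
# Hypothetical context-budget check. Window size and reserved-output
# figures are assumptions for illustration; real limits depend on the model.
CONTEXT_WINDOW = 8192      # assumed model input limit, in tokens
RESERVED_OUTPUT = 1024     # tokens kept free for the model's answer

def fits(top_k: int, chunk_tokens: int, history_tokens: int,
         system_tokens: int) -> bool:
    """True if retrieved chunks + history + instructions fit the window."""
    used = top_k * chunk_tokens + history_tokens + system_tokens
    return used + RESERVED_OUTPUT <= CONTEXT_WINDOW

def max_top_k(chunk_tokens: int, history_tokens: int,
              system_tokens: int) -> int:
    """Largest top-k that still fits after history and instructions."""
    free = CONTEXT_WINDOW - RESERVED_OUTPUT - history_tokens - system_tokens
    return max(free // chunk_tokens, 0)
```

With 500-token chunks, 1,000 tokens of history, and a 200-token system prompt, this budget admits 11 chunks; requesting more would force compressing history or shrinking chunks.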
Related Concepts
The context window constrains both RAG and extraction. Chunking and top-k selection keep the assembled context within bounds, and tokenization determines how many tokens a given text consumes.
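Because budgets are counted in tokens rather than characters, a rough estimate is often used before calling the model's real tokenizer. The helper below is a common rule-of-thumb approximation (roughly four characters per token for English text), not an actual tokenizer:

```python
# Rough token estimate. Assumption: ~4 characters per token, a common
# heuristic for English; real counts come from the model's tokenizer
# (e.g. a BPE vocabulary) and can differ substantially.
def estimate_tokens(text: str) -> int:
    return max(len(text) // 4, 1)

def trim_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep chunks in order until the estimated budget is exhausted."""
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Such estimates are only a guardrail; final prompt assembly should count tokens with the same tokenizer the target model uses.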