Context Window
The context window is the maximum amount of input, usually measured in tokens, that an LLM can process in a single request. It limits how much retrieved content, conversation history, and instructions can be sent to the model.
Impact on RAG
For RAG, the context window constrains:
- Retrieved chunks — Top-k × chunk size must fit
- Conversation history — Sessions may compress older messages
- Instructions — System prompts and schema instructions consume tokens
DocLD manages context by chunking documents, limiting top-k, and using reranking to select the most relevant chunks before generation.
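The budget described above can be sketched as simple arithmetic. The numbers below (window size, reserved output, the `fits` and `max_top_k` helpers) are hypothetical illustrations, not DocLD's actual implementation or any specific model's limits:

```python
# Hypothetical context-budget check. Window size and reserved-output
# figures are assumptions for illustration; real limits depend on the model.
CONTEXT_WINDOW = 8192      # assumed model input limit, in tokens
RESERVED_OUTPUT = 1024     # tokens kept free for the model's answer

def fits(top_k: int, chunk_tokens: int, history_tokens: int,
         system_tokens: int) -> bool:
    """True if retrieved chunks + history + instructions fit the window."""
    used = top_k * chunk_tokens + history_tokens + system_tokens
    return used + RESERVED_OUTPUT <= CONTEXT_WINDOW

def max_top_k(chunk_tokens: int, history_tokens: int,
              system_tokens: int) -> int:
    """Largest top-k that still fits after history and instructions."""
    free = CONTEXT_WINDOW - RESERVED_OUTPUT - history_tokens - system_tokens
    return max(free // chunk_tokens, 0)
```

With 500-token chunks, 1,000 tokens of history, and a 200-token system prompt, this budget admits 11 chunks; requesting more would force compressing history or shrinking chunks.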
Related Concepts
The context window constrains both RAG and extraction. Chunking and top-k selection keep the assembled context within bounds, and tokenization determines how many tokens a given text consumes.
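Because budgets are counted in tokens rather than characters, a rough estimate is often used before calling the model's real tokenizer. The helper below is a common rule-of-thumb approximation (roughly four characters per token for English text), not an actual tokenizer:

```python
# Rough token estimate. Assumption: ~4 characters per token, a common
# heuristic for English; real counts come from the model's tokenizer
# (e.g. a BPE vocabulary) and can differ substantially.
def estimate_tokens(text: str) -> int:
    return max(len(text) // 4, 1)

def trim_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep chunks in order until the estimated budget is exhausted."""
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Such estimates are only a guardrail; final prompt assembly should count tokens with the same tokenizer the target model uses.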