LLM
An LLM (Large Language Model) is an AI model trained on vast amounts of text to understand and generate language. DocLD uses LLMs for RAG chat (generating answers from retrieved context) and for extraction (pulling structured data from documents). The LLM is guided by prompts and receives document chunks or full documents as context.
How LLMs Are Used in DocLD
| Use Case | LLM Role | Context |
|---|---|---|
| RAG chat | Generate answers | Retrieved chunks from vector search |
| Extraction | Extract field values | Document content + schema instructions |
| Reranking (optional) | Score chunk relevance | Candidate chunks from vector search |
The LLM is instructed to answer only from the provided context and to cite its sources. Citations show where each statement came from, reducing hallucination.
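A minimal sketch of how such a grounded prompt might be assembled. The function name, chunk shape (`{"source": ..., "text": ...}` dicts), and instruction wording are all illustrative assumptions, not DocLD's actual implementation:

```python
# Sketch: assembling a grounded RAG prompt with numbered sources so the
# LLM can cite them as [1], [2], ... (hypothetical helper, not DocLD code).
def build_rag_prompt(question: str, chunks: list[dict]) -> str:
    # Number each retrieved chunk; the number becomes the citation marker.
    context = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, 1)
    )
    return (
        "Answer using ONLY the sources below. Cite each claim as [n].\n"
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

chunks = [
    {"source": "report.pdf", "text": "Revenue grew 12% in 2023."},
    {"source": "notes.md", "text": "Growth was driven by new regions."},
]
print(build_rag_prompt("Why did revenue grow?", chunks))
```

Numbering the sources is what makes citations checkable: a reader can map each `[n]` in the answer back to a specific chunk.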
LLM Behavior
- Context window — The LLM has a finite context window; chunking and top-k keep retrieved content within limits
- Instructions — System prompts and schema instructions guide behavior
- Temperature — Controls randomness; DocLD uses lower temperatures for factual tasks
For RAG, the LLM receives only the retrieved chunks, not your entire document library. This keeps answers grounded and reduces hallucination.
Why LLMs Matter for Document Intelligence
- Understanding — LLMs interpret document content semantically, not just keyword matching
- Flexibility — Zero-shot extraction works without training on your documents
- Generative — Chat produces natural-language answers with citations
- Adaptable — Schema instructions and prompts adapt to different use cases
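The zero-shot extraction idea above can be illustrated by turning a field schema into prompt instructions. The schema shape (`{field_name: description}`) and the prompt wording are hypothetical, shown only to make the mechanism concrete:

```python
# Sketch: converting a field schema into zero-shot extraction
# instructions (hypothetical helper, not DocLD's actual prompt).
def build_extraction_prompt(document: str, schema: dict[str, str]) -> str:
    # Each schema entry becomes one instruction line for the LLM.
    fields = "\n".join(f"- {name}: {desc}" for name, desc in schema.items())
    return (
        "Extract the following fields from the document. "
        "Return JSON with exactly these keys; use null when a value is absent.\n\n"
        f"Fields:\n{fields}\n\nDocument:\n{document}\n\nJSON:"
    )

schema = {"invoice_number": "The invoice ID", "total": "Total amount due"}
print(build_extraction_prompt("Invoice #42, total $99.00", schema))
```

Nothing here is trained on the user's documents: changing the schema dict is enough to extract different fields, which is what "zero-shot" means in this context.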
Related Concepts
LLMs power RAG and extraction. Hallucination is a risk; citations and grounding in retrieved context reduce it. Chunking and vector search determine what context the LLM receives.