Completion
A completion is the text (or structured output) that an LLM generates in response to an input prompt. In DocLD, RAG chat produces a completion grounded in retrieved chunks; extraction produces a completion shaped by the schema (e.g., JSON).
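Grounding a completion in retrieved chunks typically means prepending them to the prompt with instructions to answer only from that context. A minimal sketch (the function name and prompt wording are illustrative, not DocLD's actual implementation):

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt: the completion should draw only on
    the retrieved chunks, not the model's general knowledge."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The numbered chunk markers (`[1]`, `[2]`, …) make it easy to ask the model to cite which chunk supports each claim in its completion.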
Parameters
Temperature and top-p (nucleus sampling) control the randomness and diversity of completions. Lower values make output more deterministic; higher values increase variety. Inference is the process of computing the completion from the prompt and the model.
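The mechanics behind these two parameters can be sketched with a toy token sampler: temperature rescales the logits before the softmax, and top-p keeps only the smallest set of tokens whose cumulative probability reaches the threshold. This is a simplified illustration, not any particular inference engine's code:

```python
import math
import random


def sample_token(logits: list[float], temperature: float = 0.7,
                 top_p: float = 0.9, rng=random) -> int:
    """Sample a token index from logits using temperature and top-p."""
    # Temperature scaling: lower temperature sharpens the distribution
    # (more deterministic); higher temperature flattens it (more variety).
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p (nucleus) filtering: keep the most probable tokens until
    # their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Sample proportionally from the kept tokens.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a very low temperature the highest-logit token dominates and is chosen every time; raising the temperature spreads probability mass across more tokens, and raising top-p lets more of those tokens into the sampling pool.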
Related Concepts
A completion is the output of inference on an LLM prompt. In RAG, the completion is constrained by the retrieved context; in extraction, it is shaped by the schema.