Top-p
Top-p (nucleus sampling) is a sampling parameter for LLM text generation. At each step, the model considers only the smallest set of next-token candidates whose cumulative probability is at least p (e.g., 0.9), renormalizes their probabilities, and samples from that set. This trims the low-probability tail, which can make output more focused while still allowing some diversity.
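As a rough illustration of the cutoff described above, here is a minimal sketch in plain Python. The function name `top_p_filter` and the toy probabilities are made up for this example; real inference stacks work on logit tensors rather than dictionaries.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability is >= p.

    `probs` maps token -> probability (hypothetical toy input).
    Returns the kept tokens with probabilities renormalized to sum to 1.
    """
    # Rank candidates from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # the set is now just large enough to reach p

    # Renormalize so the surviving candidates form a valid distribution.
    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}


# Toy next-token distribution (illustrative values only).
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.04, "qux": 0.01}
nucleus = top_p_filter(probs, p=0.9)
```

With these toy values, the first two tokens cover only 0.8, so the third is needed to reach 0.9; the two least likely tokens are dropped before sampling.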
vs Temperature
| Parameter | Effect |
|---|---|
| Temperature | Scales logits; lower = sharper distribution, more deterministic |
| Top-p | Caps the candidate set by cumulative probability; can cut off long tails |
Both can be used together: typically temperature reshapes the distribution first, then top-p truncates it. DocLD applies appropriate sampling for extraction and RAG to balance consistency and naturalness. Top-k is a related idea: limit sampling to the k most probable tokens.
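To make the interaction between the two parameters concrete, the sketch below applies temperature to the logits and then samples from the top-p nucleus. It assumes the common temperature-then-top-p ordering; the function names and logit values are hypothetical, and this is not a description of DocLD's internal settings.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_p(logits, temperature=1.0, top_p=0.9, rng=random):
    """Sample one token index: temperature first, then nucleus truncation."""
    probs = softmax_with_temperature(logits, temperature)

    # Visit token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # random.choices treats weights as relative, so no explicit renormalization is needed.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # illustrative values only
sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more uniform
token_id = sample_top_p(logits, temperature=0.7, top_p=0.9, rng=random.Random(0))
```

Lowering the temperature concentrates probability on the top token, so the same top-p value tends to keep fewer candidates; raising it spreads probability out and the nucleus grows.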
Related Concepts
Like temperature, top-p is set at inference time and shapes completions. Top-k has two meanings: in retrieval it is the number of results returned, and in sampling it is the number of token candidates considered.
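The two uses of top-k share the same operation: keep the k highest-scoring items. The sketch below shows this with a single hypothetical helper, `top_k`, applied once to made-up retrieval scores and once to made-up token probabilities.

```python
import heapq


def top_k(scores, k):
    """Return the k highest-scoring (item, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])


# Retrieval sense: keep the k best-matching results (toy similarity scores).
doc_scores = {"doc_a": 0.91, "doc_b": 0.42, "doc_c": 0.77, "doc_d": 0.10}
retrieved = top_k(doc_scores, k=2)

# Sampling sense: keep the k most probable next tokens (toy probabilities).
token_probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.04, "qux": 0.01}
candidates = top_k(token_probs, k=3)
```

Unlike top-p, the size of the kept set here is fixed at k regardless of how the scores are distributed.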