Query
A query is the user's question or search text. In DocLD, queries drive RAG chat and semantic search: the query is embedded with the same model used for the document chunks, then sent to vector search to retrieve the most similar chunks. Those chunks become the context from which the LLM generates an answer.
How Queries Flow Through DocLD
- User asks — A question is submitted (e.g., "What is the total amount on the invoice?")
- Embed — The query is embedded using the same model as document chunks
- Search — Vector search returns the top-k most similar chunks
- Rerank — Optional reranking refines the order
- Generate — The LLM uses the chunks as context and produces an answer with citations
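The five steps above can be sketched end to end. Everything in this sketch is illustrative rather than DocLD's actual API: the toy character-count embedding, the in-memory chunk list, and all function names are assumptions standing in for the real embedding model, vector store, and LLM call.

```python
# Toy sketch of the query flow: embed -> search -> generate.
# All names and the bag-of-characters embedding are hypothetical.
import math

CHUNKS = [
    "Invoice total: $4,200 due on 2024-09-30.",
    "Shipping terms: FOB destination.",
    "The invoice total for Q3 was $4,200.",
]

def embed(text: str) -> list[float]:
    # Toy letter-frequency embedding; a real system calls the same
    # embedding model that was used to index the document chunks.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], k: int = 2) -> list[str]:
    # Vector search: score every chunk, keep the top-k most similar.
    scored = [(cosine(query_vec, embed(c)), c) for c in CHUNKS]
    scored.sort(reverse=True)
    return [c for _, c in scored[:k]]

def answer(question: str) -> str:
    hits = search(embed(question))
    # A real system would pass `hits` to an LLM as context;
    # here we simply return the best-matching chunk.
    return hits[0]

print(answer("What is the invoice total for Q3?"))
```

Because the query and the indexed chunks share one embedding space, a specific question lands near the chunk that answers it.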
Query quality affects retrieval quality. Clear, specific questions tend to retrieve more relevant chunks than vague or overly broad ones.
Query Best Practices
| Tip | Description |
|---|---|
| Be specific | "What is the invoice total for Q3?" retrieves better than "totals" |
| Use natural language | Vector search matches meaning, not just keywords |
| Use follow-ups | In a chat session, follow-up questions can build on prior turns |
| Scope the search | Selecting a knowledge base limits which documents are searched |
Query Types
- Chat query — A question in a RAG chat; returns an answer with citations
- Search query — A semantic search; returns ranked chunks without generation
- Extraction query — Not a query per se; extraction uses documents + schema, not user questions
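The difference between the first two query types can be sketched as one entry point that either stops after retrieval or continues to generation. The function names, field names, and stubbed retrieval/LLM calls below are hypothetical, not DocLD's API:

```python
# Hypothetical sketch: a search query stops after retrieval; a chat query
# passes the retrieved chunks to an LLM and returns an answer with citations.

def retrieve(text: str) -> list[str]:
    # Stand-in for embedding + vector search + optional reranking.
    return ["chunk about invoice totals", "chunk about payment terms"]

def run_query(text: str, mode: str = "chat"):
    chunks = retrieve(text)
    if mode == "search":
        return chunks  # semantic search: ranked chunks, no generation
    # Stand-in for an LLM call that uses `chunks` as context.
    return {"answer": f"Answer grounded in {len(chunks)} chunks",
            "citations": chunks}

print(run_query("invoice total?", mode="search"))
print(run_query("invoice total?")["answer"])
```

The citations in a chat response point back to the same chunks a search query would return, which is why the two types share one retrieval path.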
Related Concepts
Queries drive vector search and RAG. Top-k controls how many chunks are retrieved. Embedding converts the query to a vector. Reranking improves relevance of retrieved chunks.
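As one concrete illustration of the reranking step, a second scoring pass can reorder the candidates that vector search returned. A real reranker would use a cross-encoder model; the word-overlap heuristic here is only an assumption for the sketch:

```python
# Minimal reranking sketch: reorder vector-search candidates by a second
# score, here the count of words shared with the query (illustrative only).

def rerank(query: str, candidates: list[str]) -> list[str]:
    q_words = set(query.lower().split())

    def overlap(chunk: str) -> int:
        # Number of query words that also appear in the chunk.
        return len(q_words & set(chunk.lower().split()))

    return sorted(candidates, key=overlap, reverse=True)

hits = ["shipping terms for the order",
        "the invoice total for q3 was $4,200"]
print(rerank("what is the invoice total for q3", hits))
```

The chunk sharing five words with the query moves ahead of the one sharing two, which is the effect reranking has on borderline vector-search orderings.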