RAG
RAG (Retrieval-Augmented Generation) combines document retrieval with AI generation. Instead of relying only on the model's training data, RAG finds relevant passages in your documents and uses them as context to generate accurate, citation-backed answers.
How RAG Works in DocLD
Question → Search (Pinecone) → Retrieve chunks → Generate answer → Respond with citations
- Question — The user asks a question in natural language
- Search — Query is embedded and sent to Pinecone; vector search returns the most similar chunks
- Rerank — An optional reranking step re-scores the retrieved chunks to improve relevance ordering
- Generate — LLM creates an answer using only the retrieved chunks as context
- Cite — Response includes citations to source passages
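The steps above can be sketched end to end. DocLD embeds the query with a real model and searches Pinecone; the sketch below substitutes a toy bag-of-words "embedding", an in-memory chunk list, and cosine similarity so it runs standalone, and it returns the retrieved context plus citation IDs in place of calling an LLM. All names here (`VOCAB`, `CHUNKS`, `retrieve`, `answer`) are illustrative, not DocLD APIs.

```python
import math

# Toy "embedding": map text to a small bag-of-words vector.
# A real pipeline would call an embedding model instead.
VOCAB = ["refund", "policy", "shipping", "days", "account"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Chunks as they would come back from the vector index, with IDs for citations.
CHUNKS = [
    {"id": "doc1#0", "text": "Our refund policy allows returns within 30 days."},
    {"id": "doc2#0", "text": "Shipping takes 5 business days."},
]

def retrieve(question: str, top_k: int = 1) -> list[dict]:
    # Search step: embed the query, score every chunk, keep the best matches.
    qvec = embed(question)
    scored = [(cosine(qvec, embed(c["text"])), c) for c in CHUNKS]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:top_k]]

def answer(question: str) -> dict:
    # Generate step (stubbed): a real system would pass the retrieved
    # chunks to an LLM as context; here we return context and citations.
    chunks = retrieve(question)
    return {
        "context": " ".join(c["text"] for c in chunks),
        "citations": [c["id"] for c in chunks],
    }

result = answer("what is the refund policy")
```

The key property to notice is that the answer step sees only the retrieved chunks, which is what keeps responses grounded and citable.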
Benefits
- Accurate — Answers grounded in your documents, not model guesswork
- Transparent — Citations show where information came from
- Up-to-date — Reflects your latest content without retraining
Related Concepts
RAG depends on embedding, chunking, and vector search. Documents are organized in knowledge bases. Chat uses sessions for conversation history.
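Of the concepts above, chunking is the easiest to make concrete. The following is a minimal sketch of fixed-size chunking with overlap; it is illustrative only and not DocLD's chunker, which, like most production chunkers, would split on sentence or token boundaries rather than raw characters. Overlap keeps a passage that straddles a boundary retrievable from at least one chunk.

```python
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    # Slide a fixed-size window across the text; consecutive windows
    # share `overlap` characters so boundary-spanning content survives.
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]

chunks = chunk("a" * 100, size=40, overlap=10)
```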