Conversational AI over your documents.

Chat uses your knowledge bases to answer questions with citations. RAG retrieval finds relevant chunks and the model responds using your content. Integrate via API or use the hosted UI — answers stay grounded and traceable.

Start building now

Chat

Conversational AI over your documents.

Chat uses your knowledge bases to answer questions with citations. RAG retrieval finds relevant chunks and the model responds using your content, so answers stay grounded and traceable.

Integrate chat via API or use the hosted UI. Multi-document and multi-session conversations make it easy to add document Q&A to your product.

Read the full guide

Knowledge bases & retrieval

Add documents, embeddings, and hybrid search.

Add processed documents to a knowledge base via upload or pipeline. DocLD embeds chunks (e.g. llama-text-embed-v2) and indexes them in Pinecone. When you send a question, the API runs similarity search and retrieves top-k relevant chunks to ground the answer.

Each chat session is tied to a knowledge_base_id so responses stay scoped to your data. Use the same knowledge base across many sessions or create separate bases for different document sets.

Read the full guide

Response modes

Quick, deep, or code — tune for the question.

Choose a response mode per request: quick for fast, focused answers with minimal context; deep for comprehensive analysis with more sources; code for code-related questions in your docs. Each mode adjusts retrieval depth and generation style.

Pass mode in the request body when calling POST /api/v1/chat. Use quick for simple Q&A and deep when you need thorough, multi-source answers.

Read the full guide

Sessions & API

Create a session, then send messages; stream-only response.

Create a session with POST /api/v1/chat/sessions and knowledge_base_id; you get a session_id. Send messages with POST /api/v1/chat using that session_id and the same knowledge_base_id. The API expects a single message (AI SDK UIMessage format) and returns a streamed UIMessage response (SSE).

Get conversation history with GET /api/v1/chat?session_id=.... List, get, patch, and delete sessions via the sessions endpoints. All chat responses are stream-only for real-time UX.

Read the full guide

Citations & tools

Source citations, spreadsheet Q&A, charts, and more.

Every answer can include source citations so users see which passages support the response. The assistant can also query spreadsheets (e.g. "top 10 by price"), generate bar/line/pie charts from data, summarize a cited chunk on request, and search only within documents from a previous answer (scoped search).

Shareable links let you expose a read-only view of a conversation. Tools such as search_documents, query_spreadsheet, and share_link appear as tool-output-available chunks in the stream.

Read the full guide

How it works

Chat follows a RAG pipeline: your question is used to search the knowledge base, relevant chunks are retrieved, and the model generates an answer grounded in those sources. Citations link the response back to your documents.

Step	Description
Knowledge base	Add processed documents to a knowledge base; chunks are embedded and indexed.
Query	User asks a question in natural language.
Search	Query is embedded and similarity search runs (e.g. Pinecone).
Retrieve	Top-k relevant chunks are fetched (count depends on mode).
Generate	LLM produces an answer using the retrieved context.
Cite	Sources are linked to the response so users can verify.

Response modes

Choose a mode per request to balance speed and depth: quick for simple questions, deep for comprehensive analysis, code for code-related content in your docs.

Mode	Description	Best for
quick	Fast, focused responses with minimal context	Simple questions, quick facts
deep	Comprehensive analysis with multiple sources	Complex questions, detailed explanations
code	Optimized for finding and explaining code	Code in docs, APIs, snippets

API overview

Create a session with a knowledge base, then send messages. Chat responses are stream-only (AI SDK UIMessage over SSE). Get history and manage sessions via the endpoints below.

Endpoint	Description
POST /api/v1/chat/sessions	Create session (pass knowledge_base_id); returns session_id
POST /api/v1/chat	Send message (message, session_id, knowledge_base_id); streamed UIMessage response
GET /api/v1/chat?session_id=...	Get conversation history for a session
GET /api/v1/chat/sessions	List sessions (optional knowledge_base_id filter)
GET /api/v1/chat/sessions/:id	Get session details
PATCH /api/v1/chat/sessions/:id	Update session (e.g. title)
DELETE /api/v1/chat/sessions/:id	Delete session

Chat API reference

Share & export

Share a conversation with a link (optional approval flow) or export the session for backup, compliance, or integration.

Action	Endpoint
Share session	POST /api/v1/chat/sessions/:id/share
Revoke share	DELETE /api/v1/chat/sessions/:id/share
Export (JSON)	GET /api/v1/chat/sessions/:id/export?format=json
Export (Markdown)	GET /api/v1/chat/sessions/:id/export?format=markdown
Export (HTML)	GET /api/v1/chat/sessions/:id/export?format=html

Chat: Questions & Answers

Ready to add chat to your documents?

Get started with a knowledge base and the Chat API in minutes. Sign up for free or read the full API reference for sessions, streaming, and tools.

Get started free Chat API reference

How it works

Step	Description
Knowledge base	Add processed documents to a knowledge base; chunks are embedded and indexed.
Query	User asks a question in natural language.
Search	Query is embedded and similarity search runs (e.g. Pinecone).
Retrieve	Top-k relevant chunks are fetched (count depends on mode).
Generate	LLM produces an answer using the retrieved context.
Cite	Sources are linked to the response so users can verify.

Response modes

Choose a mode per request to balance speed and depth: quick for simple questions, deep for comprehensive analysis, code for code-related content in your docs.

Mode	Description	Best for
quick	Fast, focused responses with minimal context	Simple questions, quick facts
deep	Comprehensive analysis with multiple sources	Complex questions, detailed explanations
code	Optimized for finding and explaining code	Code in docs, APIs, snippets

API overview

Create a session with a knowledge base, then send messages. Chat responses are stream-only (AI SDK UIMessage over SSE). Get history and manage sessions via the endpoints below.

Endpoint	Description
POST /api/v1/chat/sessions	Create session (pass knowledge_base_id); returns session_id
POST /api/v1/chat	Send message (message, session_id, knowledge_base_id); streamed UIMessage response
GET /api/v1/chat?session_id=...	Get conversation history for a session
GET /api/v1/chat/sessions	List sessions (optional knowledge_base_id filter)
GET /api/v1/chat/sessions/:id	Get session details
PATCH /api/v1/chat/sessions/:id	Update session (e.g. title)
DELETE /api/v1/chat/sessions/:id	Delete session

Chat API reference