RAG Chat
Chat with your documents using retrieval-augmented generation (RAG).
How It Works
Question → Search (Pinecone) → Retrieve → Generate → Respond with Citations
- Knowledge Base — Add processed documents to a knowledge base
- Query — Ask a question in natural language
- Search — Query is sent to Pinecone; embeddings and similarity search happen server-side (llama-text-embed-v2)
- Retrieve — Get top-k relevant chunks
- Generate — LLM creates answer using context
- Cite — Sources are linked to the response
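The Search and Retrieve steps above run inside Pinecone, so you never implement them yourself. As a rough illustration of what a top-k similarity search does, here is a minimal in-memory sketch. The function names (`cosineSimilarity`, `topK`), the toy three-dimensional vectors, and the chunk shape are assumptions made for this example and are not part of the API.

```javascript
// Illustrative only: real embeddings come from llama-text-embed-v2 on the
// server and have far more dimensions than these toy vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector and keep the k best matches.
function topK(queryVector, chunks, k) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical chunks with made-up vectors.
const chunks = [
  { id: 'vacation', vector: [0.9, 0.1, 0.0] },
  { id: 'payroll', vector: [0.1, 0.9, 0.0] },
  { id: 'sick-leave', vector: [0.7, 0.3, 0.1] },
];
const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((r) => r.id));
```

With these toy vectors the two chunks closest to the query direction are `vacation` and `sick-leave`; the `payroll` chunk is dropped because it points in a different direction.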
The assistant can also query spreadsheets (e.g. “top 10 by price”), summarize cited chunks, search only within documents from a previous answer, create charts (bar, line, pie), and generate shareable links for the conversation.
Features
| Feature | Description |
|---|---|
| Context-aware | Answers grounded in your documents |
| Source citations | See which passages support each answer |
| Confidence scores | Know how reliable the answer is |
| Session history | Continue conversations across messages |
| Streaming | Real-time response streaming |
| Multi-language | Ask questions in any language |
| Export | Download chat sessions |
| Sharing | Share sessions with others |
| Spreadsheet Q&A | Query spreadsheets with sort/limit (e.g. cheapest, top N by column) |
| Charts | Display bar, line, or pie charts from data |
| Summarize citation | Summarize a cited document or chunk on request |
| Scoped search | Search only within documents from a previous answer |
| Share link | Generate a shareable link for the conversation (with optional approval) |
Getting Started
1. Create a Knowledge Base
curl -X POST "/api/knowledge-bases" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"name": "Company Policies"}'
2. Add Documents
curl -X POST "/api/upload" \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@handbook.pdf" \
-F "knowledge_base_id=kb-uuid"
3. Chat
Create a session first (POST /api/chat/sessions with knowledge_base_id), then send a message. The API expects a single message object with role and parts (AI SDK UIMessage format):
curl -X POST "/api/chat" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"message": { "role": "user", "parts": [{ "type": "text", "text": "What is the vacation policy?" }] },
"session_id": "session-uuid",
"knowledge_base_id": "kb-uuid"
}'
See the Chat API for the full request shape and streaming response.
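If you send chat requests from JavaScript, a small builder can keep the UIMessage shape in one place. The sketch below is a hypothetical helper, not part of any SDK: `buildChatBody` and `ALLOWED_MODES` are names invented for this example, and treating the three documented modes as the only valid values is an assumption. The optional `mode` and `language` fields are the ones described later on this page.

```javascript
// Hypothetical helper: builds the JSON body for POST /api/chat.
// Assumption: quick, deep, and code are the only accepted modes.
const ALLOWED_MODES = ['quick', 'deep', 'code'];

function buildChatBody(text, sessionId, knowledgeBaseId, options = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('text must be a non-empty string');
  }
  if (options.mode !== undefined && !ALLOWED_MODES.includes(options.mode)) {
    throw new Error(`mode must be one of: ${ALLOWED_MODES.join(', ')}`);
  }
  const body = {
    // A single message object with role and parts (AI SDK UIMessage format)
    message: { role: 'user', parts: [{ type: 'text', text }] },
    session_id: sessionId,
    knowledge_base_id: knowledgeBaseId,
  };
  // Only include optional fields when the caller sets them
  if (options.mode !== undefined) body.mode = options.mode;
  if (options.language !== undefined) body.language = options.language;
  return body;
}

const chatBody = buildChatBody(
  'What is the vacation policy?',
  'session-uuid',
  'kb-uuid',
  { mode: 'deep' }
);
console.log(JSON.stringify(chatBody, null, 2));
```

Pass the result through `JSON.stringify` as the `body` of a `fetch` call, together with the `Authorization` and `Content-Type` headers.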
Response Format
The chat endpoint is stream-only. The response is an AI SDK UIMessage stream (SSE): you will see text-delta chunks for the assistant’s reply and tool-output-available chunks when tools (e.g. search_documents, query_spreadsheet) return context and citations. For a non-streaming style response, use the Embed API with action: 'chat'; see the Chat API for full stream and tool details.
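Once the SSE lines have been parsed into JSON chunks, the client still has to fold them into one message. The following sketch shows one way to do that. The chunk field names (`delta` on text-delta, `toolCallId` and `output` on tool-output-available) are assumptions based on the AI SDK UIMessage stream, and `applyChunk` and the sample chunks are invented for this example; check the Chat API for the authoritative shape.

```javascript
// Assumed chunk fields: `delta` on text-delta chunks,
// `toolCallId` and `output` on tool-output-available chunks.
function applyChunk(state, chunk) {
  switch (chunk.type) {
    case 'text-delta':
      // Append the next piece of the assistant's reply
      return { ...state, text: state.text + (chunk.delta ?? '') };
    case 'tool-output-available':
      // Keep tool results (context and citations) alongside the text
      return {
        ...state,
        toolOutputs: [
          ...state.toolOutputs,
          { toolCallId: chunk.toolCallId, output: chunk.output },
        ],
      };
    default:
      // Ignore chunk types this sketch does not handle
      return state;
  }
}

// Made-up sample chunks, in the order a stream might deliver them.
const sampleChunks = [
  { type: 'tool-output-available', toolCallId: 'call-1', output: { citations: ['handbook.pdf'] } },
  { type: 'text-delta', delta: 'Hello, ' },
  { type: 'text-delta', delta: 'world.' },
  { type: 'finish' },
];
const finalState = sampleChunks.reduce(applyChunk, { text: '', toolOutputs: [] });
console.log(finalState.text);
console.log(finalState.toolOutputs.length);
```

Returning a new state object from each call keeps the function easy to test and to plug into UI state containers that expect immutable updates.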
Response Modes
| Mode | Description | Best For |
|---|---|---|
| quick | Fast response, fewer chunks | Simple questions |
| deep | Deeper search, more context | Complex questions |
| code | Optimized for code-related questions | Code in docs |
curl -X POST "/api/chat" \
-d '{
"message": { "role": "user", "parts": [{ "type": "text", "text": "Explain all the benefit options in detail" }] },
"session_id": "session-uuid",
"knowledge_base_id": "kb-uuid",
"mode": "deep"
}'
Sessions
Session Management
Conversations are organized into sessions:
# List sessions
curl -X GET "/api/chat/sessions?knowledge_base_id=kb-uuid"
# Get session details
curl -X GET "/api/chat/sessions/{session_id}"
# Update session title
curl -X PATCH "/api/chat/sessions/{session_id}" \
-d '{"title": "Benefits Discussion"}'
# Delete session
curl -X DELETE "/api/chat/sessions/{session_id}"
Continuing Conversations
Pass session_id to continue a conversation:
curl -X POST "/api/chat" \
-d '{
"message": { "role": "user", "parts": [{ "type": "text", "text": "What about sick leave?" }] },
"session_id": "existing-session-uuid",
"knowledge_base_id": "kb-uuid"
}'
Sharing
Enable Sharing
curl -X POST "/api/chat/sessions/{id}/share"
Response includes a shareable URL:
{
  "share_url": "https://your-domain.com/chat/share/token123"
}
Disable Sharing
curl -X DELETE "/api/chat/sessions/{id}/share"
Export
Export sessions as JSON, Markdown, or HTML:
# JSON
curl -X GET "/api/chat/sessions/{id}/export?format=json"
# Markdown
curl -X GET "/api/chat/sessions/{id}/export?format=markdown"
# HTML
curl -X GET "/api/chat/sessions/{id}/export?format=html"
Multi-Language Support
Ask questions in any language. Use the same UIMessage shape and include session_id and knowledge_base_id; optionally set language (e.g. fr) for response language:
curl -X POST "/api/chat" \
-d '{
"message": { "role": "user", "parts": [{ "type": "text", "text": "Quelle est la politique de vacances?" }] },
"session_id": "session-uuid",
"knowledge_base_id": "kb-uuid",
"language": "fr"
}'
The response will be in the specified language when supported.
Streaming Example (JavaScript)
The endpoint is stream-only and returns an AI SDK UIMessage stream (e.g. text-delta, tool-output-available). Create a session first, then send a message with the UIMessage shape:
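// 1. Create session
const sessionRes = await fetch('/api/chat/sessions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ knowledge_base_id: 'kb-uuid' })
});
const { session_id } = await sessionRes.json();
// 2. Send message and read stream
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    message: { role: 'user', parts: [{ type: 'text', text: 'What are the payment terms?' }] },
    session_id,
    knowledge_base_id: 'kb-uuid'
  })
});
if (!response.ok) throw new Error(`Chat request failed: ${response.status}`);
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let answer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Process complete lines; keep any partial trailing line in the buffer
  const lines = buffer.split('\n');
  buffer = lines.pop();
  for (const line of lines) {
    if (!line.startsWith('data:')) continue;
    const payload = line.slice(5).trim();
    if (payload === '' || payload === '[DONE]') continue;
    const chunk = JSON.parse(payload);
    // Field names `delta` and `output` are assumptions; confirm them in the Chat API
    if (chunk.type === 'text-delta') answer += chunk.delta;
    if (chunk.type === 'tool-output-available') console.log('Tool output:', chunk.output);
  }
}
console.log(answer);
Best Practices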
- Curate knowledge bases — Include only relevant, high-quality documents
- Be specific — Clear questions get better answers
- Check citations — Verify answers against sources
- Use sessions — Continue conversations for follow-ups
- Organize by topic — Separate knowledge bases by domain
- Review confidence — Answers with low confidence scores may need human review
Credit Usage
| Operation | Credits |
|---|---|
| Chat query | 0.1 per query |
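For budgeting, the per-query rate above translates directly into a usage estimate. The sketch below is a hypothetical helper (`estimateChatCredits` is not part of the API) and assumes chat queries are the only operation being counted. It works in tenths of a credit because repeated floating-point addition of 0.1 drifts in JavaScript.

```javascript
// 0.1 credits per chat query, expressed as an integer number of tenths
// so the arithmetic stays exact until the final division.
const TENTHS_PER_QUERY = 1;

function estimateChatCredits(queryCount) {
  if (!Number.isInteger(queryCount) || queryCount < 0) {
    throw new Error('queryCount must be a non-negative integer');
  }
  return (queryCount * TENTHS_PER_QUERY) / 10;
}

console.log(estimateChatCredits(250)); // 25
```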
API Reference
See the Chat API for complete endpoint documentation.