Upload API
Upload documents to DocLD for processing. Supports direct upload and presigned URLs for large files. After upload, use the Parse API to process documents or the Documents API to manage them. For a single request/response flow, use the Embed API.
Upload Document
POST /api/upload
Upload a document for processing. The document is stored and automatically queued for parsing, chunking, and vectorization.
Request
Content-Type: multipart/form-data
| Field | Type | Required | Description |
|---|---|---|---|
| file | File | Yes | The document file (max 100MB) |
| parsing_config | JSON string | No | Parsing configuration |
| knowledge_base_id | string | No | Add to knowledge base after processing |
| organization_id | string | No | Organization to associate document with |
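The fields in the table above can be assembled into a multipart/form-data body with the Python standard library alone. This is a minimal sketch, not an official client: the function name `encode_upload_form` and the default `application/octet-stream` part type are assumptions made for illustration. The one point taken from the table is that `parsing_config` is sent as a JSON string.

```python
import json
import uuid


def encode_upload_form(filename, file_bytes, parsing_config=None,
                       knowledge_base_id=None, organization_id=None,
                       mime_type="application/octet-stream"):
    """Return (content_type_header, body_bytes) for POST /api/upload.

    Hypothetical helper: the field names come from the request table,
    everything else is an illustrative assumption.
    """
    boundary = uuid.uuid4().hex
    text_fields = {}
    if parsing_config is not None:
        # parsing_config is documented as a JSON string, so serialize it.
        text_fields["parsing_config"] = json.dumps(parsing_config)
    if knowledge_base_id is not None:
        text_fields["knowledge_base_id"] = knowledge_base_id
    if organization_id is not None:
        text_fields["organization_id"] = organization_id

    parts = []
    for name, value in text_fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The required "file" part. Filenames containing quotes are not escaped here.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {mime_type}\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)
```

The returned header value goes in `Content-Type`, and the body is sent as-is alongside the `Authorization: Bearer` header.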
Parsing Configuration
{
  "chunking": {
    "strategy": "semantic",
    "max_chunk_size": 1000,
    "overlap": 100
  },
  "ocr": {
    "enabled": true,
    "language": "auto"
  }
}
Response
{
  "file_id": "docld://abc123-def456",
  "document": {
    "id": "abc123-def456",
    "name": "invoice.pdf",
    "status": "pending",
    "file_type": "pdf",
    "file_format": "pdf",
    "parsing_config": {}
  },
  "filename": "invoice.pdf",
  "size": 125000,
  "mime_type": "application/pdf",
  "queue": {
    "queued": true,
    "error": null
  },
  "usage": {
    "num_pages": 0,
    "credits": 0
  }
}
Example
curl -X POST "https://your-domain.com/api/upload" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "knowledge_base_id=kb-123"
Supported Formats
| Category | Formats |
|---|---|
| Documents | .pdf, .docx, .doc, .txt, .html, .rtf |
| Images | .png, .jpg, .jpeg, .gif, .bmp, .tiff |
| Spreadsheets | .csv, .xlsx, .xls, .xlsm, .xltx, .xltm, .qpw |
| Presentations | .pptx, .ppt (legacy .ppt uses OCR for text) |
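A client can check a filename against the table above before sending anything. The sketch below copies the extensions from the table. The names `SUPPORTED_FORMATS` and `format_category` are hypothetical, and the case-insensitive match is an assumption about how the server treats extensions.

```python
from pathlib import Path

# Extensions copied from the Supported Formats table.
SUPPORTED_FORMATS = {
    "Documents": {".pdf", ".docx", ".doc", ".txt", ".html", ".rtf"},
    "Images": {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff"},
    "Spreadsheets": {".csv", ".xlsx", ".xls", ".xlsm", ".xltx", ".xltm", ".qpw"},
    "Presentations": {".pptx", ".ppt"},
}


def format_category(filename):
    """Return the table category for a filename, or None if unsupported."""
    # Assumption: extensions are matched case-insensitively.
    ext = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if ext in extensions:
            return category
    return None
```

A `None` result means the extension is not listed, so the upload can be rejected locally instead of waiting for a server error.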
Presigned Upload
POST /api/upload/presigned
Get a presigned URL for uploading directly to storage. Recommended for files larger than 10MB; supports files up to 5GB.
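The size guidance can be turned into a small routing check. In this sketch the function name `choose_upload_endpoint` is hypothetical, and the byte thresholds assume decimal units (1MB = 10^6 bytes), which this page does not specify. Because the 10MB recommendation is lower than the 100MB direct-upload limit, the 10MB threshold alone decides the route.

```python
MB = 10**6  # assumption: decimal megabytes; the page does not say MB vs MiB
GB = 10**9

PRESIGNED_RECOMMENDED_BYTES = 10 * MB  # presigned upload recommended above this
PRESIGNED_MAX_BYTES = 5 * GB           # documented presigned upload maximum


def choose_upload_endpoint(size_bytes):
    """Pick an endpoint path for a file of the given size (hypothetical helper)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > PRESIGNED_MAX_BYTES:
        raise ValueError("file exceeds the 5GB presigned upload limit")
    if size_bytes > PRESIGNED_RECOMMENDED_BYTES:
        return "/api/upload/presigned"
    return "/api/upload"
```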
Request Body
{
  "filename": "large-document.pdf",
  "content_type": "application/pdf",
  "file_size": 500000000
}
| Field | Type | Required | Description |
|---|---|---|---|
| filename | string | Yes | Original filename |
| content_type | string | No | MIME type (auto-detected if not provided) |
| file_size | number | No | File size in bytes (max 5GB) |
Response
{
  "upload_url": "https://storage.example.com/presigned-url...",
  "file_id": "docld://abc123-def456",
  "expires_at": "2024-01-15T11:30:00Z",
  "method": "PUT",
  "headers": {
    "Content-Type": "application/pdf"
  }
}
Usage Flow
- Request presigned URL
- Upload file directly to the returned URL using PUT
- Use the file_id for subsequent operations (parse, extract)
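The flow above can be sketched in Python with `urllib.request`. This is an illustration under stated assumptions rather than an official client: `BASE_URL` is the placeholder host used in this page's examples, and the helper names (`presign_request`, `storage_request`, `upload_large_file`) are hypothetical. The storage request reuses the `method` and `headers` returned by the presign call rather than hard-coding them.

```python
import json
import urllib.request

BASE_URL = "https://your-domain.com"  # placeholder host, as in the examples


def presign_request(api_key, filename, content_type=None, file_size=None):
    """Build the POST /api/upload/presigned request (step 1)."""
    payload = {"filename": filename}
    if content_type is not None:
        payload["content_type"] = content_type
    if file_size is not None:
        payload["file_size"] = file_size
    return urllib.request.Request(
        f"{BASE_URL}/api/upload/presigned",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def storage_request(presigned, file_bytes):
    """Build the direct-to-storage request (step 2) from the presign response."""
    return urllib.request.Request(
        presigned["upload_url"],
        data=file_bytes,
        headers=presigned.get("headers", {}),
        method=presigned.get("method", "PUT"),
    )


def upload_large_file(api_key, path, content_type):
    """Run steps 1 and 2 and return the file_id for later calls (step 3)."""
    # Reads the whole file into memory for brevity; a multi-gigabyte
    # upload would need to stream the file instead.
    with open(path, "rb") as fh:
        file_bytes = fh.read()
    first = presign_request(api_key, path, content_type, len(file_bytes))
    with urllib.request.urlopen(first) as resp:
        presigned = json.load(resp)
    with urllib.request.urlopen(storage_request(presigned, file_bytes)):
        pass  # a non-2xx status raises urllib.error.HTTPError
    return presigned["file_id"]
```

Since the response carries an `expires_at` timestamp, the storage request should be sent promptly after the presign call.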
Example
Step 1: Get presigned URL
curl -X POST "https://your-domain.com/api/upload/presigned" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "large-doc.pdf", "content_type": "application/pdf"}'
Step 2: Upload to presigned URL
curl -X PUT "https://storage.example.com/presigned-url..." \
  -H "Content-Type: application/pdf" \
  --data-binary @large-doc.pdf
Step 3: Parse the uploaded document
curl -X POST "https://your-domain.com/api/parse" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "docld://abc123-def456"}'
File References
After uploading, documents can be referenced using the docld:// protocol:
| Reference | Description |
|---|---|
| docld://{document_id} | Reference a document by ID |
These references can be used in:
- Parse API — Parse a previously uploaded document
- Extract API — Extract data from a document
- Split API — Split a document into sections
- Edit API — Edit/fill a document
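Converting between a document ID and its `docld://` reference takes only a few lines. The helper names below (`to_reference`, `document_id_from`, `request_body`) are hypothetical. The only facts taken from this page are the `docld://{document_id}` format and the `input` key used in request bodies.

```python
DOCLD_PREFIX = "docld://"


def to_reference(document_id):
    """Turn a document ID into a docld:// reference."""
    if not document_id:
        raise ValueError("document_id must be a non-empty string")
    return DOCLD_PREFIX + document_id


def document_id_from(reference):
    """Extract the document ID from a docld:// reference."""
    if not reference.startswith(DOCLD_PREFIX):
        raise ValueError(f"not a docld:// reference: {reference!r}")
    document_id = reference[len(DOCLD_PREFIX):]
    if not document_id:
        raise ValueError("reference has no document ID")
    return document_id


def request_body(reference):
    """Build a JSON-serializable body that passes the reference as input."""
    document_id_from(reference)  # validates the reference before use
    return {"input": reference}
```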
Example
{
  "input": "docld://abc123-def456"
}