Read documents with human-like accuracy.
Extract text, tables, and structure from PDFs, images, spreadsheets, presentations, and Word documents. Layout-aware parsing, OCR in 50+ languages, sync and async API, and optional agentic mode for complex layouts — no local dependencies.



Parse
DocLD parses PDFs, images, spreadsheets, presentations, and Word documents with layout-aware extraction and optional OCR for scanned content. Text, tables, and structure are preserved so you can build search, extraction, and chat on a solid foundation.
Supported formats include PDF, PNG, JPG, XLSX, PPTX, DOCX, and more. Parsing runs in the cloud with results returned via API — no local dependencies required.
Supported formats
One API for the formats you use every day.
Parse handles PDF (native text and OCR for scanned), images (PNG, JPG, TIFF, etc.) with VLM-based OCR, spreadsheets (CSV, XLSX, XLS), presentations (PPTX, PPT), and documents (DOCX, TXT, HTML, RTF). Each format uses the right extractor so you get accurate text and structure without format-specific integrations.
Upload files up to 100MB, pass a URL, or reference a stored DocLD document to re-parse without re-uploading.






OCR & layout
50+ languages, tables, and layout preserved.
For images and scanned pages, Parse uses VLM-based OCR with 50+ languages, auto-detection, handwriting support, and table extraction. Layout and bounding boxes are preserved so downstream tools get structured, page-aware content.
Enable agentic mode for better table reconstruction, figure and chart analysis, and complex form parsing. It trades slightly slower processing for higher accuracy.
Chunking
Output ready for RAG and LLM context.
Parse outputs chunks tuned for retrieval and LLM context. Use semantic chunking (default) to split by meaning and natural boundaries, fixed-size for character-based control, or page-based for one chunk per page. Configure max chunk size and overlap to tune RAG quality.
Chunks include content, page range, and metadata so you can index them in your knowledge base or feed them directly to extraction and chat.






Sync & async API
Immediate results or webhooks for batch jobs.
Use synchronous parse for small documents and immediate results in the response body. Use asynchronous parse for large files or batch processing — you get a job ID and can poll the status URL or receive a webhook when done.
Save configurations as parse presets (user or organization scope) and reference them in requests. No need to repeat chunking or table-format options on every call.
Output & tools
Tables in markdown, HTML, or JSON. CLI and public endpoints.
Choose table output format: markdown, HTML, or JSON. Parse returns structured chunks with content, page, and metadata. Public endpoints (pdf-to-text, pdf-to-markdown, pdf-to-json) offer rate-limited, anonymous-friendly conversion; authenticated users get full documents.
Use the DocLD CLI to parse from the command line: docld parse <file> with optional --agentic and -o for output directory. Credits: 1.5 per page standard, 3.0 per page for agentic OCR.



Supported formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Native parsing; OCR for scanned PDFs |
| Images | .png, .jpg, .jpeg, .gif, .bmp, .tiff | VLM-based OCR |
| Spreadsheets | .csv, .xlsx, .xls, .xlsm, .xltx, .xltm, .qpw | Structured extraction |
| Presentations | .pptx, .ppt | Slide content. Legacy .ppt has no native text; OCR or converter recommended. |
| Documents | .docx, .doc, .txt, .html, .rtf | Direct text extraction |
Input options
| Input type | Format | Description |
|---|---|---|
| File upload | multipart/form-data | Upload file directly (max 100MB) |
| URL | {"input": "https://..."} | Fetch document from URL |
| DocLD reference | {"input": "docld://..."} | Parse previously uploaded document |
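
As a rough sketch of the two JSON input forms in the table above, the helper below builds a request body for a URL or a `docld://` reference. Only the `input` key and the two schemes come from the table; the function name `build_parse_body`, the scheme check, and the example URL and reference ID are assumptions for illustration. Nothing is sent over the network.

```python
import json
from urllib.parse import urlparse

# Schemes taken from the input-options table. The validation itself is an
# illustrative assumption, not documented DocLD behaviour.
ALLOWED_SCHEMES = {"https", "docld"}


def build_parse_body(source: str) -> str:
    """Return a JSON request body of the form {"input": "<source>"}."""
    scheme = urlparse(source).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported input scheme: {scheme!r}")
    return json.dumps({"input": source})


# Hypothetical URL and document reference, for illustration only.
url_body = build_parse_body("https://example.com/report.pdf")
ref_body = build_parse_body("docld://doc_123")
print(url_body)
print(ref_body)
```

File uploads use multipart/form-data instead of a JSON body, so they are not covered by this helper.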
OCR capabilities
| Feature | Description |
|---|---|
| Multi-language | 50+ languages supported |
| Auto-detection | Automatically detects document language |
| Handwriting | Recognizes handwritten text |
| Table extraction | Preserves table structure |
| Layout preservation | Maintains document layout |
| Bounding boxes | Returns coordinates for each text block |
Chunking strategies
| Strategy | Description |
|---|---|
| semantic | Splits by meaning and natural boundaries; respects paragraphs and headings (default). |
| fixed | Splits by character count with configurable max_chunk_size and overlap. |
| page | One chunk per page. |
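
To make the `fixed` strategy concrete, here is a minimal local sketch of character-based splitting with overlap. It illustrates the general technique only and is not DocLD's implementation; the function name `fixed_chunks` and the choice to advance each window by `max_chunk_size - overlap` characters are assumptions.

```python
def fixed_chunks(text: str, max_chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into character windows that share `overlap` characters."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be in [0, max_chunk_size)")

    step = max_chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chunk_size])
        if start + max_chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
parts = fixed_chunks(sample, max_chunk_size=10, overlap=3)
print(parts)
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is the role overlap plays in keeping context across chunk boundaries.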
Configuration options
config object: table output (markdown, html, json), chunking strategy, chunk size, and overlap. Pipeline processing supports OCR and agentic options.

| Option | Default | Description |
|---|---|---|
| formatting.table_output_format | markdown | Table format: markdown, html, json |
| chunking.strategy | semantic | semantic, fixed, or page |
| chunking.max_chunk_size | 1000 | Maximum characters per chunk |
| chunking.overlap | 100 | Character overlap between chunks |
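
The options above could be assembled into a config object along these lines. The option names, allowed values, and defaults are taken from the table; the nested dictionary layout (dotted names mapped to nested keys), the helper name `build_config`, and the rule that overlap must be smaller than the chunk size are assumptions, so check the API reference for the exact request shape.

```python
import json

# Allowed values copied from the options table.
TABLE_FORMATS = {"markdown", "html", "json"}
STRATEGIES = {"semantic", "fixed", "page"}


def build_config(table_output_format: str = "markdown",
                 strategy: str = "semantic",
                 max_chunk_size: int = 1000,
                 overlap: int = 100) -> dict:
    """Assemble a parse config dict, rejecting values the table does not list."""
    if table_output_format not in TABLE_FORMATS:
        raise ValueError(f"unknown table format: {table_output_format!r}")
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown chunking strategy: {strategy!r}")
    # Assumed sanity rule: overlap must be non-negative and below the chunk size.
    if max_chunk_size <= 0 or not 0 <= overlap < max_chunk_size:
        raise ValueError("need max_chunk_size > 0 and 0 <= overlap < max_chunk_size")
    # Assumed layout: "chunking.strategy" becomes config["chunking"]["strategy"].
    return {
        "formatting": {"table_output_format": table_output_format},
        "chunking": {
            "strategy": strategy,
            "max_chunk_size": max_chunk_size,
            "overlap": overlap,
        },
    }


config = build_config(table_output_format="html", strategy="fixed",
                      max_chunk_size=500, overlap=50)
print(json.dumps(config, indent=2))
```

Calling `build_config()` with no arguments reproduces the defaults listed in the table.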
Sync vs async API
| Aspect | Sync (POST /api/parse) | Async (POST /api/parse/async) |
|---|---|---|
| Use case | Small docs, immediate results | Large docs, batch processing |
| Response | Result in response body | job_id, status_url; result via webhook or poll |
| Webhooks | — | Optional webhook_url for completion callback |
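
For the polling path, a loop like the following is one way to wait on the status URL. It is a sketch under assumptions: the source only says the async response carries `job_id` and `status_url`, so the `status` field and the values `processing`, `completed`, and `failed` are hypothetical, as is the helper name `poll_job`. The HTTP call is passed in as a function, so the example runs offline with a stand-in.

```python
import time
from typing import Callable

# Hypothetical terminal status values; not confirmed by the source.
TERMINAL_STATES = {"completed", "failed"}


def poll_job(fetch_status: Callable[[str], dict], status_url: str,
             interval: float = 2.0, max_attempts: int = 30,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status(status_url) until the job reaches a terminal state."""
    for attempt in range(max_attempts):
        job = fetch_status(status_url)
        if job.get("status") in TERMINAL_STATES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"job at {status_url} did not finish after {max_attempts} polls")


# Stand-in for an HTTP GET: the fake job reports "processing" twice,
# then "completed". The URL and job ID are placeholders.
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "job_id": "job_123"},
])
result = poll_job(lambda url: next(responses),
                  "https://example.com/status/job_123",
                  sleep=lambda seconds: None)
print(result["status"])
```

In a real client, `fetch_status` would perform an authenticated GET on the `status_url` returned by POST /api/parse/async. Supplying a `webhook_url` avoids polling altogether.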
Credit usage
| Operation | Credits per page |
|---|---|
| Standard parse | 1.5 |
| Agentic OCR | 3.0 |
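
The per-page rates make cost estimates simple arithmetic. The sketch below assumes the listed rate applies uniformly to every page of a document; the function name `estimate_credits` is illustrative.

```python
# Per-page rates from the credit table.
CREDITS_PER_PAGE = {"standard": 1.5, "agentic": 3.0}


def estimate_credits(pages: int, agentic: bool = False) -> float:
    """Estimate credits for a document, assuming one flat rate per page."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = CREDITS_PER_PAGE["agentic" if agentic else "standard"]
    return pages * rate


print(estimate_credits(40))                # → 60.0
print(estimate_credits(40, agentic=True))  # → 120.0
```

A 40-page document therefore costs about 60 credits with standard parsing and 120 with agentic OCR.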
Parse presets, public endpoints & CLI
Parse presets
| Endpoint | Description |
|---|---|
| GET /api/parse/presets | List presets |
| POST /api/parse/presets | Create preset |
| GET /api/parse/presets/:id | Get preset |
| PATCH /api/parse/presets/:id | Update preset |
| DELETE /api/parse/presets/:id | Delete preset |
Public endpoints
| Endpoint | Description |
|---|---|
| POST /api/pdf-to-text | PDF to plain text |
| POST /api/pdf-to-markdown | PDF to structured Markdown |
| POST /api/pdf-to-json | PDF to JSON (pages, blocks, tables) |
CLI
docld parse <file> (e.g. docld parse document.pdf). Use --agentic for agentic mode and -o for output directory when parsing folders.
Ready to parse your documents?
Get started with the Parse API in minutes. Sign up for free or read the full API reference for request formats, webhooks, and presets.