Read documents with human-like accuracy.

Extract text, tables, and structure from PDFs, images, spreadsheets, presentations, and Word documents. Parse offers layout-aware parsing, OCR in 50+ languages, sync and async APIs, and an optional agentic mode for complex layouts, with no local dependencies.

Parse

Read documents with human-like accuracy.

DocLD parses PDFs, images, spreadsheets, presentations, and Word documents with layout-aware extraction and optional OCR for scanned content. Text, tables, and structure are preserved so you can build search, extraction, and chat on a solid foundation.

Supported formats include PDF, PNG, JPG, XLSX, PPTX, DOCX, and more. Parsing runs in the cloud with results returned via API — no local dependencies required.

Supported formats

One API for the formats you use every day.

Parse handles PDF (native text and OCR for scanned), images (PNG, JPG, TIFF, etc.) with VLM-based OCR, spreadsheets (CSV, XLSX, XLS), presentations (PPTX, PPT), and documents (DOCX, TXT, HTML, RTF). Each format uses the right extractor so you get accurate text and structure without format-specific integrations.

Upload files up to 100MB, pass a URL, or reference a stored DocLD document to re-parse without re-uploading.

OCR & layout

50+ languages, tables, and layout preserved.

For images and scanned pages, Parse uses VLM-based OCR with 50+ languages, auto-detection, handwriting support, and table extraction. Layout and bounding boxes are preserved so downstream tools get structured, page-aware content.

Enable agentic mode for better table reconstruction, figure and chart analysis, and complex form parsing. It trades slightly slower processing for higher accuracy.

Chunking

Output ready for RAG and LLM context.

Parse outputs chunks tuned for retrieval and LLM context. Use semantic chunking (default) to split by meaning and natural boundaries, fixed-size for character-based control, or page-based for one chunk per page. Configure max chunk size and overlap to tune RAG quality.

Chunks include content, page range, and metadata so you can index them in your knowledge base or feed them directly to extraction and chat.

Sync & async API

Immediate results or webhooks for batch jobs.

Use synchronous parse for small documents and immediate results in the response body. Use asynchronous parse for large files or batch processing — you get a job ID and can poll the status URL or receive a webhook when done.

Save configurations as parse presets (user or organization scope) and reference them in requests. No need to repeat chunking or table-format options on every call.

Output & tools

Tables in markdown, HTML, or JSON. CLI and public endpoints.

Choose table output format: markdown, HTML, or JSON. Parse returns structured chunks with content, page, and metadata. Public endpoints (pdf-to-text, pdf-to-markdown, pdf-to-json) offer rate-limited, anonymous-friendly conversion; authenticated users get full documents.

Use the DocLD CLI to parse from the command line: run docld parse <file>, optionally with --agentic for agentic mode and -o to set the output directory. Parsing costs 1.5 credits per page for standard parsing and 3.0 credits per page for agentic OCR.

Supported formats

Parse extracts text and structure from the formats you use every day: native text extraction for PDFs and Office files, VLM-based OCR for images and scanned pages, and structured extraction for spreadsheets and presentations.

Format	Extensions	Notes
PDF	.pdf	Native parsing; OCR for scanned PDFs
Images	.png, .jpg, .jpeg, .gif, .bmp, .tiff	VLM-based OCR
Spreadsheets	.csv, .xlsx, .xls, .xlsm, .xltx, .xltm, .qpw	Structured extraction
Presentations	.pptx, .ppt	Slide content. Legacy .ppt files have no native text extraction; use OCR or convert them first.
Documents	.docx, .doc, .txt, .html, .rtf	Direct text extraction

Input options

Parse accepts files up to 100MB via direct upload, a public URL, or a DocLD document reference — so you can re-parse stored documents without re-uploading.

Input type	Format	Description
File upload	multipart/form-data	Upload file directly (max 100MB)
URL	{"input": "https://..."}	Fetch document from URL
DocLD reference	{"input": "docld://..."}	Parse previously uploaded document
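
As a rough illustration of the two JSON input forms in the table, the sketch below builds (but does not send) a request for the sync endpoint using only the Python standard library. The host https://docld.example and the API key value are placeholders, and the helper name build_parse_request is hypothetical; check the API reference for the exact request shape.

```python
import json
import urllib.request

BASE_URL = "https://docld.example"  # placeholder host; substitute the real DocLD base URL
API_KEY = "YOUR_API_KEY"            # placeholder credential


def build_parse_request(document_input: str) -> urllib.request.Request:
    """Build a sync parse request for a public URL or a docld:// reference."""
    if not document_input.startswith(("http://", "https://", "docld://")):
        raise ValueError("expected a public URL or a docld:// reference")
    # The body carries only the "input" field shown in the table above.
    body = json.dumps({"input": document_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/parse",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_parse_request("https://example.com/report.pdf")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

File uploads use multipart/form-data instead and are not covered by this sketch.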

OCR capabilities

For images and scanned pages, DocLD uses VLM-based OCR with support for 50+ languages, auto-detection, handwriting, table extraction, layout preservation, and bounding boxes. Enable agentic mode for better table reconstruction and complex form parsing.

Feature	Description
Multi-language	50+ languages supported
Auto-detection	Automatically detects document language
Handwriting	Recognizes handwritten text
Table extraction	Preserves table structure
Layout preservation	Maintains document layout
Bounding boxes	Returns coordinates for each text block

Chunking strategies

Parse outputs chunks ready for RAG and LLMs. Choose semantic (default), fixed-size, or page-based chunking; control size and overlap to tune retrieval quality.

Strategy	Description
semantic	Splits by meaning and natural boundaries; respects paragraphs and headings (default).
fixed	Splits by character count with configurable max_chunk_size and overlap.
page	One chunk per page.
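
To make the fixed strategy concrete, here is a minimal local sketch of character-based chunking with overlap. It only illustrates what max_chunk_size and overlap mean; DocLD's own splitter runs server-side and may handle boundaries differently. The function name fixed_chunks is hypothetical.

```python
def fixed_chunks(text: str, max_chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into windows of at most max_chunk_size characters.

    Consecutive windows share `overlap` characters.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be at least 0 and less than max_chunk_size")
    step = max_chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chunk_size])
        if start + max_chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(fixed_chunks("abcdefghij", max_chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

A larger overlap repeats more context across neighbouring chunks, which can help retrieval at the cost of more chunks to index.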

Configuration options

Tune Parse via the config object: table output (markdown, html, json), chunking strategy, chunk size, and overlap. Pipeline processing supports OCR and agentic options.

Option	Default	Description
formatting.table_output_format	markdown	Table format: markdown, html, json
chunking.strategy	semantic	semantic, fixed, or page
chunking.max_chunk_size	1000	Maximum characters per chunk
chunking.overlap	100	Character overlap between chunks
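
The dotted option names suggest a nested config object. The sketch below assumes that nesting (it is not confirmed on this page) and shows one way to start from the documented defaults and override individual options. The helper make_config is hypothetical.

```python
import copy
import json
from typing import Optional

# Defaults as listed in the options table above.
DEFAULT_CONFIG = {
    "formatting": {"table_output_format": "markdown"},
    "chunking": {"strategy": "semantic", "max_chunk_size": 1000, "overlap": 100},
}

# Allowed values for the enumerated options, also taken from the table.
ALLOWED_VALUES = {
    "formatting.table_output_format": {"markdown", "html", "json"},
    "chunking.strategy": {"semantic", "fixed", "page"},
}


def make_config(overrides: Optional[dict] = None) -> dict:
    """Return a copy of the defaults with dotted-key overrides applied."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    for dotted_key, value in (overrides or {}).items():
        section, _, option = dotted_key.partition(".")
        if section not in config or option not in config[section]:
            raise KeyError(f"unknown option: {dotted_key}")
        allowed = ALLOWED_VALUES.get(dotted_key)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{dotted_key} must be one of {sorted(allowed)}")
        config[section][option] = value
    return config


config = make_config({"chunking.strategy": "fixed", "chunking.max_chunk_size": 500})
print(json.dumps(config, indent=2))
```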

Sync vs async API

Parse offers both sync and async endpoints under /api/v1, authenticated with a Bearer API key. The dashboard uses the same handlers via the session routes POST /api/parse and POST /api/parse/async. Use sync for small documents and immediate results, or async for large files and batch jobs, with completion delivered by webhook callback or status polling (GET /api/v1/jobs/:id).

Aspect	Sync (POST /api/v1/parse, API key)	Async (POST /api/v1/parse/async, API key)
Use case	Small docs, immediate results	Large docs, batch processing
Response	Result in response body	job_id, status_url; result via webhook or poll
Webhooks	—	Optional webhook_url for completion callback
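
For the polling path, a client loop might look like the sketch below. The status values "completed" and "failed" and the "status" field name are assumptions, since this page does not list them. The fetch_status callable stands in for a real GET to the job's status URL, so the example runs without network access.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports a terminal status, then return that payload."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        # "completed" and "failed" are assumed terminal values (hypothetical).
        if payload.get("status") in {"completed", "failed"}:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_interval)


# Stand-in for real responses from GET /api/v1/jobs/:id.
fake_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "job_id": "job_123"},
])
result = wait_for_job(lambda: next(fake_responses), sleep=lambda _seconds: None)
print(result["status"])
```

If you can expose an endpoint of your own, passing webhook_url avoids polling altogether.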

Credit usage

Parse charges by page: 1.5 credits per page for standard parsing, 3.0 per page for agentic OCR. Use agentic for complex layouts and tables when you need higher accuracy.

Operation	Credits per page
Standard parse	1.5
Agentic OCR	3.0
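
A quick way to budget a job is to multiply the page count by the rate in the table. The helper below is a hypothetical convenience, not part of any DocLD SDK.

```python
# Per-page rates from the table above.
CREDITS_PER_PAGE = {"standard": 1.5, "agentic": 3.0}


def estimate_credits(page_count: int, agentic: bool = False) -> float:
    """Estimate the credits charged for parsing page_count pages."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    rate = CREDITS_PER_PAGE["agentic" if agentic else "standard"]
    return page_count * rate


print(estimate_credits(40))                # prints 60.0
print(estimate_credits(40, agentic=True))  # prints 120.0
```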

Parse presets, public endpoints & CLI

Parse presets

Save and reuse parsing configurations. List, create, get, update, and delete presets via the API (user or organization scope).

GET /api/parse/presets — List presets
POST /api/parse/presets — Create preset
GET /api/parse/presets/:id — Get preset
PATCH /api/parse/presets/:id — Update preset
DELETE /api/parse/presets/:id — Delete preset

Public endpoints

Rate-limited, anonymous-friendly endpoints for quick PDF conversion:

POST /api/tools/pdf-to-text — PDF to plain text
POST /api/tools/pdf-to-markdown — PDF to structured Markdown
POST /api/tools/pdf-to-json — PDF to JSON (pages, blocks, tables)

Anonymous users get 2 pages; authenticated users get full documents.

CLI

Parse documents from the command line with the DocLD CLI: docld parse <file> (e.g. docld parse document.pdf). Use --agentic for agentic mode and -o for output directory when parsing folders.

Ready to parse your documents?

Get started with the Parse API in minutes. Sign up for free or read the full API reference for request formats, webhooks, and presets.
