Commands
The DocLD CLI provides three main commands: parse, extract, and edit. Global options: --help, --version.
Parse
Convert documents into structured markdown output. See the Parse API and Document Parsing for API and format details.
# Parse a single file
docld parse document.pdf
# Parse an entire folder
docld parse ./documents
# Parse with AI enhancement (more accurate but slower)
docld parse document.pdf --agentic
# Include document metadata
docld parse document.pdf --hyperlinks --comments --highlights
Output: Creates <filename>.parse.md files with YAML frontmatter (job ID, page count, studio link) and structured markdown content.
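Because each .parse.md file starts with YAML frontmatter, the metadata can be read without touching the markdown body. The following is a minimal sketch, assuming the frontmatter is delimited by `---` lines at the very top of the file (the usual YAML frontmatter convention, not confirmed by this page). The helper name `print_frontmatter` is hypothetical and not part of the DocLD CLI.

```sh
# Print the YAML frontmatter block of a .parse.md file, without the
# surrounding "---" delimiter lines. Prints nothing if the file does not
# begin with a "---" line. Assumes Unix line endings.
print_frontmatter() {
    awk '
        NR == 1 && $0 != "---" { exit }   # no frontmatter at top of file
        NR == 1 { next }                  # skip the opening delimiter
        $0 == "---" { exit }              # stop at the closing delimiter
        { print }
    ' "$1"
}
```

Calling `print_frontmatter path/to/file.parse.md` prints only the metadata lines, which can then be passed to `grep` or a YAML-aware tool.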
Parse options
| Flag | Description |
|---|---|
| --agentic | Enable AI enhancement for text, tables, and figures |
| --change-tracking | Enable change tracking for document revisions |
| --hyperlinks | Include hyperlinks in output |
| --comments | Include document comments in output |
| --highlights | Include highlighted text in output |
| -o, --output <dir> | Output directory (default: same as input) |
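The flags above can be combined in a small wrapper. This sketch assumes the metadata flags and `-o` may be used together, and it creates the output directory first because this page does not say whether docld creates a missing directory on its own. The function names `parse_to_dir` and `count_parse_outputs` are hypothetical helpers, not DocLD commands.

```sh
# Parse INPUT with document metadata flags and write results to OUTDIR.
# Creating OUTDIR up front is an assumption: the source does not say
# whether docld creates a missing output directory itself.
parse_to_dir() {
    ptd_input=$1
    ptd_outdir=$2
    mkdir -p "$ptd_outdir" || return 1
    docld parse "$ptd_input" --hyperlinks --comments --highlights -o "$ptd_outdir"
}

# Count the .parse.md files found under a directory.
count_parse_outputs() {
    find "$1" -type f -name '*.parse.md' | wc -l | tr -d ' '
}
```

For example, `parse_to_dir ./documents ./parsed && count_parse_outputs ./parsed` runs the parse and then reports how many output files were written.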
Extract
Extract structured data from documents using JSON schemas. See the Extract API, Data Extraction, and Custom Extraction for schemas and field types.
# Extract with a schema file
docld extract invoice.pdf -s schemas/invoice.json
# Extract from multiple files
docld extract ./invoices -s schemas/invoice.json
# Include source citations
docld extract invoice.pdf -s schema.json --citations
Output: Creates <filename>.extract.json files. Extraction reuses existing .parse.md files when available to speed up processing.
Extract options
| Flag | Description |
|---|---|
| -s, --schema <path> | Path to JSON schema file (required) |
| --citations | Include source citations in output |
| -o, --output <dir> | Output directory |
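Since the schema flag is required and extraction reuses existing .parse.md files, one reasonable pattern is to check the schema path, parse first, and then extract. The sketch below assumes docld exits with a non-zero status on failure, which this page does not state. `parse_then_extract` is a hypothetical wrapper name.

```sh
# Parse INPUT, then extract from it using SCHEMA with citations enabled.
# Returns 2 if the schema file is missing, 1 if the parse step fails.
# Assumption: docld signals failure with a non-zero exit status.
parse_then_extract() {
    pe_input=$1
    pe_schema=$2
    if [ ! -f "$pe_schema" ]; then
        echo "schema file not found: $pe_schema" >&2
        return 2
    fi
    docld parse "$pe_input" || return 1
    docld extract "$pe_input" -s "$pe_schema" --citations
}
```

Usage: `parse_then_extract ./invoices schemas/invoice.json`. Running the parse step first means the extract step should find .parse.md files to reuse.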
Schema format
Schemas must be valid JSON Schema with a top-level "type": "object":
{
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "total_amount": { "type": "number" },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "price": { "type": "number" }
        }
      }
    }
  },
  "required": ["invoice_number", "total_amount"]
}
Edit
Modify documents with natural language instructions. Supports .pdf and .docx only. See the Edit API and Edit feature for programmatic access and concepts.
# Fill a form
docld edit form.pdf -i "Fill the client name as 'Acme Corp' and date as 'January 15, 2024'"
# Edit multiple documents
docld edit ./contracts -i "Replace 'OLD COMPANY' with 'NEW COMPANY' throughout"
Output: Creates <filename>.edited.<ext> files with the modifications applied.
Edit options
| Flag | Description |
|---|---|
| -i, --instructions <text> | Natural language editing instructions (required) |
| -o, --output <dir> | Output directory |
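Because edit supports only .pdf and .docx, a pre-flight check on the file extension can avoid sending unsupported files. This is a minimal sketch with hypothetical helper names (`is_editable`, `edit_if_supported`). It matches lowercase extensions only, since this page does not say whether uppercase extensions such as .PDF are accepted.

```sh
# Succeed only for paths ending in .pdf or .docx (lowercase).
is_editable() {
    case "$1" in
        *.pdf|*.docx) return 0 ;;
        *) return 1 ;;
    esac
}

# Run docld edit on FILE with INSTRUCTIONS, or skip it with a message
# on stderr when the extension is not supported.
edit_if_supported() {
    if is_editable "$1"; then
        docld edit "$1" -i "$2"
    else
        echo "skipping unsupported file: $1" >&2
        return 1
    fi
}
```

Usage: `edit_if_supported form.pdf "Fill the client name as 'Acme Corp'"`.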
Examples
Invoice processing pipeline
docld parse ./invoices --agentic
docld extract ./invoices -s schemas/invoice.json
# Results in .extract.json files
Contract editing
docld edit ./contracts -i "Replace 'Old Corp' with 'New Corp' throughout the document"
Batch processing
docld parse ./documents -o ./parsed
docld extract ./documents -s schema.json -o ./extracted
See also
- CLI overview — Installation and configuration
- Parse API — Programmatic parsing
- Extract API — Programmatic extraction
- Edit API — Programmatic edit and batch