Python SDK
The official DocLD SDK for Python. Published as docld on PyPI. Requires Python 3.9+ and uses requests for HTTP.
Installation
pip install docld
Configuration
Create a client with your API key and optional base URL:
from docld import DocLD
client = DocLD(
    api_key="docld_...",
    base_url="https://api.docld.com",  # optional
)
| Argument | Description |
|---|---|
| api_key | Your DocLD API key. Create one in Settings → API Keys. Can be omitted if DOCLD_API_KEY is set. |
| base_url | API base URL. Defaults to https://api.docld.com or the DOCLD_BASE_URL environment variable (for self-hosted or custom deployments). |
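The table above implies a lookup order for each setting. The following is a minimal sketch of that resolution, assuming an explicit argument wins over the environment variable, which in turn wins over the built-in default. The SDK's actual precedence is not stated here, and resolve_config is a hypothetical helper, not part of docld.

```python
import os

DEFAULT_BASE_URL = "https://api.docld.com"


def resolve_config(api_key=None, base_url=None, env=None):
    """Hypothetical helper: choose the API key and base URL to use.

    Assumed precedence: explicit argument, then environment variable,
    then the documented default (base URL only).
    """
    env = os.environ if env is None else env
    key = api_key or env.get("DOCLD_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set DOCLD_API_KEY")
    url = base_url or env.get("DOCLD_BASE_URL") or DEFAULT_BASE_URL
    # Strip a trailing slash so paths can be appended consistently.
    return key, url.rstrip("/")
```

Passing env as a plain dict keeps the helper easy to exercise without touching the real process environment.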
Quick start
from docld import DocLD
client = DocLD(api_key="docld_...")
# Parse a document from a URL
result = client.parse.parse("https://example.com/document.pdf")
print(result["result"]["chunks"])
# List documents
resp = client.documents.list(limit=10)
print(resp["documents"], resp["total"])
API reference
Upload
client.upload.upload(file, parsing_config=None, knowledge_base_id=None, organization_id=None, classification=None)
Upload a document. Returns a dict with upload metadata and document record.
file — One of:
- Path string: Path to a file on disk (e.g. "document.pdf").
- bytes: Raw file bytes.
- File-like object: Any object with a .read() method (and optionally .name).
Keyword arguments:
| Argument | Type | Description |
|---|---|---|
| parsing_config | dict \| None | Parsing options (e.g. enhance, settings). |
| knowledge_base_id | str \| None | Add the document to this knowledge base. |
| organization_id | str \| None | Scope to this organization. |
| classification | str \| None | One of: internal, confidential, pii, restricted, public. |
Returns: dict with:
- file_id: e.g. docld://<documentId>, for use in parse/extract.
- document: { id, name, status, file_type, file_format, parsing_config? }.
- filename, size, mime_type.
- queue: { queued, error }.
- usage: { num_pages, credits }.
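To connect the upload result to parsing, here is a small sketch that uploads in-memory bytes as a file-like object and then parses the stored document through its file_id. The helper name upload_bytes_then_parse is hypothetical; it relies only on the upload and parse calls and the return keys described in this reference.

```python
import io


def upload_bytes_then_parse(client, data, filename, **upload_options):
    """Upload raw bytes under a file name, then parse the stored document.

    upload_options are forwarded as keyword arguments to upload
    (for example classification="internal").
    """
    buffer = io.BytesIO(data)
    buffer.name = filename  # file-like objects may optionally expose .name
    uploaded = client.upload.upload(buffer, **upload_options)
    # file_id is a docld://<documentId> reference, which parse accepts as input.
    parsed = client.parse.parse(uploaded["file_id"])
    return uploaded["document"]["id"], parsed["result"]["chunks"]
```

Wrapping the bytes in io.BytesIO with a name is one way to give the upload a file name; passing the bytes directly is also allowed by the signature above.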
Parse
client.parse.parse(input, config=None)
Parse a document. Input can be a URL, a docld:// reference, a file path, bytes, or a file-like object.
input — One of:
- URL string: https://..., http://..., or docld://<documentId>.
- Path string: Path to a local file.
- bytes: Raw file content.
- File-like object: Object with .read() (and optionally .name).
config (optional) — dict with parsing options, e.g.:
- enhance: AI enhancement options.
- settings: e.g. change_tracking.
- formatting: e.g. include_hyperlinks, include_comments, include_highlights.
Returns: dict with:
- job_id, duration, usage (num_pages, credits).
- result: { type: 'full', chunks: [...] } (each chunk has content, blocks, optional metadata).
- studio_link: Link to view in DocLD Studio.
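A common next step is to flatten the returned chunks into plain text. The sketch below assumes each chunk's content is a string, which this reference does not state explicitly; chunks_to_text is a hypothetical helper.

```python
def chunks_to_text(parse_response, separator="\n\n"):
    """Join the content of every chunk in a parse response into one string."""
    chunks = parse_response.get("result", {}).get("chunks", [])
    # Assumption: chunk["content"] is a string; empty or missing content is skipped.
    return separator.join(c["content"] for c in chunks if c.get("content"))
```

With a live client this could be called as chunks_to_text(client.parse.parse("docld://<documentId>")).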
Documents
client.documents.list(search=None, status=None, file_type=None, file_format=None, knowledge_base_id=None, tags=None, date_from=None, date_to=None, sort_by=None, sort_order=None, limit=None, offset=None)
List documents with optional filters and pagination. All arguments are optional keyword arguments.
| Argument | Type | Description |
|---|---|---|
| search | str \| None | Filter by name or content. |
| status | str \| list[str] \| None | Filter by status. |
| file_type | str \| list[str] \| None | Filter by file type. |
| file_format | str \| list[str] \| None | Filter by format. |
| knowledge_base_id | str \| None | Only documents in this knowledge base. |
| tags | list[str] \| None | Filter by tags. |
| date_from | str \| None | Created on or after (ISO date). |
| date_to | str \| None | Created on or before (ISO date). |
| sort_by | str \| None | One of: name, created_at, file_size, status, updated_at. |
| sort_order | str \| None | asc or desc. |
| limit | int \| None | Page size (default 50, max 100). |
| offset | int \| None | Pagination offset. |
Returns: dict with documents, total, limit, offset, has_more.
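Because results are paged with limit and offset, listing everything takes a loop. A minimal sketch, assuming has_more is false on the last page; iter_documents is a hypothetical helper built only on documents.list and the response keys listed above.

```python
def iter_documents(client, page_size=100, **filters):
    """Yield every document matching the filters, one page at a time."""
    offset = 0
    while True:
        page = client.documents.list(limit=page_size, offset=offset, **filters)
        documents = page["documents"]
        yield from documents
        # Stop when the server reports no further pages, or on an empty page
        # (a guard against looping forever).
        if not page.get("has_more") or not documents:
            break
        offset += len(documents)
```

Any filter from the table can be passed through, for example iter_documents(client, search="invoice", sort_by="created_at").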
client.documents.get(id) — Get a single document by ID. Returns a document dict (includes file_url as signed URL when applicable).
client.documents.delete(id) — Delete a document. Returns None.
client.documents.get_parse(id) — Get parsed content for a document. Returns parsed result object.
client.documents.get_status(id) — Get processing status for a document. Returns status object.
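Since processing is asynchronous, callers often poll get_status until a document is ready. The shape of the status object is not described here, so this sketch takes a caller-supplied is_done predicate instead of assuming field names; wait_for_document and its parameters are hypothetical.

```python
import time


def wait_for_document(client, document_id, is_done,
                      interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll get_status until is_done(status) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = client.documents.get_status(document_id)
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Document {document_id} not ready after {timeout}s"
            )
        sleep(interval)
```

The predicate decides what "ready" means, so the helper keeps working whatever fields the status object turns out to contain.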
Extract
client.extract.run(input=None, document_id=None, schema_id=None, config=None, description=None, include_citations=False, variant_label=None, experiment_id=None)
Run extraction on a document. Provide exactly one of: input or document_id, and one of: schema_id, config, or description.
Keyword arguments:
| Argument | Type | Description |
|---|---|---|
| input | str \| None | Document reference: URL, docld://<id>, or jobid://<jobId>. |
| document_id | str \| None | Legacy: document ID. |
| schema_id | str \| None | ID of a saved schema (from schemas.list() or schemas.get()). |
| config | dict \| None | Inline extraction config (fields, instructions). |
| description | str \| None | Natural language description for zero-shot extraction. |
| include_citations | bool | Include per-field citations. |
| variant_label | str \| None | Experiment variant (when using experiments). |
| experiment_id | str \| None | Run within an experiment. |
Returns: dict with:
- success, job_id, extraction_id?, document_id.
- data: Extracted key-value object (or null).
- field_results?: Per-field value, confidence, and optional citations.
- overall_confidence?, processing_time?, usage?, error?.
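The argument rule for extract.run can be checked locally before a request is sent. This sketch reads "one of" as "exactly one" for both groups, which is an assumption for the schema_id/config/description group; check_extract_args is a hypothetical helper, not part of the SDK.

```python
def check_extract_args(input=None, document_id=None, schema_id=None,
                       config=None, description=None):
    """Raise ValueError unless the arguments match the documented rule.

    Returns the names of the chosen document source and extraction spec.
    """
    sources = [name for name, value in
               (("input", input), ("document_id", document_id))
               if value is not None]
    if len(sources) != 1:
        raise ValueError("Provide exactly one of: input, document_id")
    specs = [name for name, value in
             (("schema_id", schema_id), ("config", config),
              ("description", description))
             if value is not None]
    if len(specs) != 1:
        raise ValueError(
            "Provide exactly one of: schema_id, config, description"
        )
    return sources[0], specs[0]
```

Validating up front gives a clearer message than waiting for the server to reject the call.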
client.extract.schemas.list(organization=False, organization_only=False, organization_id=None) — List extraction schemas. Returns dict with schemas list.
client.extract.schemas.get(id) — Get a schema by ID. Returns schema dict.
client.extract.schemas.create(name, description='', instructions='Be precise and thorough.', fields=None, settings=None, ground_truth_template=None, organization_id=None, is_shared=False, version=None) — Create a schema. Returns schema dict. fields defaults to []; settings defaults to {"includeImages": False, "visionScope": "all"}.
Errors
Failed requests raise DocLDError (subclass of Exception) with:
- message: Human-readable message.
- code: Optional error code (e.g. NOT_FOUND, VALIDATION_ERROR, UNAUTHORIZED).
- details: Optional dict with extra context.
- errors: Optional list of { field, message } for validation errors.
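The attributes above can be folded into a single log-friendly line. A minimal sketch, using getattr so it tolerates missing optional attributes; describe_error is a hypothetical helper and the "UNKNOWN" label is a placeholder of this sketch, not an SDK error code.

```python
def describe_error(err):
    """Build a one-line summary from a DocLDError-like exception."""
    code = getattr(err, "code", None) or "UNKNOWN"  # placeholder label
    message = getattr(err, "message", None) or str(err)
    parts = [f"[{code}] {message}"]
    # errors is documented as an optional list of { field, message } items.
    for item in getattr(err, "errors", None) or []:
        parts.append(f"{item['field']}: {item['message']}")
    return "; ".join(parts)
```

Inside an except DocLDError as e block, print(describe_error(e)) would report the code, message, and any per-field validation problems together.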
from docld import DocLD, DocLDError
client = DocLD(api_key="docld_...")
try:
    client.documents.get("invalid-id")
except DocLDError as e:
    # code and details are optional and may be None
    print(e.code, e.message, e.details)
    raise
See also
- API Reference — Full REST endpoint documentation.
- JavaScript SDK — Official Node.js/browser client.