Python SDK

The official DocLD SDK for Python. Published as docld on PyPI. Requires Python 3.9+ and uses requests for HTTP.

Installation


pip install docld

Configuration

Create a client with your API key and optional base URL:


from docld import DocLD
 
client = DocLD(
    api_key="docld_...",
    base_url="https://api.docld.com",  # optional
)

| Argument | Description |
| --- | --- |
| `api_key` | Your DocLD API key. Create one in Settings → API Keys. Can be omitted if `DOCLD_API_KEY` is set. |
| `base_url` | API base URL. Defaults to `https://api.docld.com` or the `DOCLD_BASE_URL` environment variable (for self-hosted or custom deployments). |
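The fallback to environment variables described above can be made explicit in application code, for example to fail early when no key is configured. The sketch below is illustrative only: `resolve_client_settings` is a hypothetical helper, not part of the SDK, and the precedence it applies (explicit argument, then environment variable, then the default URL) is an assumption rather than documented behaviour.

```python
import os

DEFAULT_BASE_URL = "https://api.docld.com"


def resolve_client_settings(api_key=None, base_url=None, env=None):
    """Return keyword arguments suitable for constructing a client.

    Hypothetical helper. Assumed precedence: explicit argument first,
    then environment variable, then the documented default URL.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("DOCLD_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set DOCLD_API_KEY")
    url = base_url or env.get("DOCLD_BASE_URL") or DEFAULT_BASE_URL
    return {"api_key": key, "base_url": url}
```

The returned dict could then be unpacked into the constructor, as in `DocLD(**resolve_client_settings())`.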

Quick start


from docld import DocLD
 
client = DocLD(api_key="docld_...")
 
# Parse a document from a URL
result = client.parse.parse("https://example.com/document.pdf")
print(result["result"]["chunks"])
 
# List documents
resp = client.documents.list(limit=10)
print(resp["documents"], resp["total"])

API reference

Upload

client.upload.upload(file, parsing_config=None, knowledge_base_id=None, organization_id=None, classification=None)

Upload a document. Returns a dict with upload metadata and document record.

file — One of:

Path string — Path to a file on disk (e.g. "document.pdf").
bytes — Raw file bytes.
File-like object — Any object with a .read() method (and optionally .name).

Keyword arguments:

| Argument | Type | Description |
| --- | --- | --- |
| `parsing_config` | `dict \| None` | Parsing options (e.g. enhance, settings). |
| `knowledge_base_id` | `str \| None` | Add the document to this knowledge base. |
| `organization_id` | `str \| None` | Scope to this organization. |
| `classification` | `str \| None` | One of: `internal`, `confidential`, `pii`, `restricted`, `public`. |

Returns: dict with:

file_id — e.g. docld://<documentId> for use in parse/extract.
document — { id, name, status, file_type, file_format, parsing_config? }.
filename, size, mime_type.
queue — { queued, error }.
usage — { num_pages, credits }.
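As an illustration of the file-like input and the `classification` argument, the following sketch uploads in-memory bytes under a chosen filename. `upload_in_memory` is a hypothetical wrapper, not an SDK function. It assumes only what is documented above: that `client.upload.upload` accepts an object with `.read()` and `.name`, and that the response contains `file_id`.

```python
import io

# Classification values listed in the keyword-argument table above.
ALLOWED_CLASSIFICATIONS = {"internal", "confidential", "pii", "restricted", "public"}


def upload_in_memory(client, data, filename, classification=None, **kwargs):
    """Upload raw bytes as a named file and return the file_id reference.

    Hypothetical helper; `client` is expected to be a DocLD client.
    """
    if classification is not None and classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"Unknown classification: {classification!r}")
    buffer = io.BytesIO(data)
    buffer.name = filename  # gives the file-like object a .name attribute
    response = client.upload.upload(buffer, classification=classification, **kwargs)
    return response["file_id"]
```

The returned reference has the `docld://<documentId>` form, so it can be passed on to parse or extract calls.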

Parse

client.parse.parse(input, config=None)

Parse a document. Input can be a URL, a docld:// reference, a file path, bytes, or a file-like object.

input — One of:

URL string — https://..., http://..., or docld://<documentId>.
Path string — Path to a local file.
bytes — Raw file content.
File-like object — Object with .read() (and optionally .name).

config (optional) — dict with parsing options, e.g.:

enhance — AI enhancement options.
settings — e.g. change_tracking.
formatting — e.g. include_hyperlinks, include_comments, include_highlights.

Returns: dict with:

job_id, duration, usage (num_pages, credits).
result — { type: 'full', chunks: [...] } (each chunk has content, blocks, optional metadata).
studio_link — Link to view in DocLD Studio.
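A common next step is to turn the returned chunks into plain text. The sketch below shows one way to do that. `parse_to_text` is a hypothetical helper, and it assumes each chunk's `content` is a string, which the reference above does not state explicitly.

```python
def parse_to_text(client, source, config=None, separator="\n\n"):
    """Parse a document and join the content of all chunks.

    Hypothetical helper; assumes chunk["content"] is a string.
    `source` may be anything client.parse.parse accepts (URL,
    docld:// reference, path, bytes, or file-like object).
    """
    response = client.parse.parse(source, config=config)
    chunks = response["result"]["chunks"]
    return separator.join(chunk["content"] for chunk in chunks)
```

Passing `config=None` leaves the parsing options at whatever the SDK uses by default.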

Documents

client.documents.list(search=None, status=None, file_type=None, file_format=None, knowledge_base_id=None, tags=None, date_from=None, date_to=None, sort_by=None, sort_order=None, limit=None, offset=None)

List documents with optional filters and pagination. All arguments are optional keyword arguments.

| Argument | Type | Description |
| --- | --- | --- |
| `search` | `str \| None` | Filter by name or content. |
| `status` | `str \| list[str] \| None` | Filter by status. |
| `file_type` | `str \| list[str] \| None` | Filter by file type. |
| `file_format` | `str \| list[str] \| None` | Filter by format. |
| `knowledge_base_id` | `str \| None` | Only documents in this knowledge base. |
| `tags` | `list[str] \| None` | Filter by tags. |
| `date_from` | `str \| None` | Created on or after (ISO date). |
| `date_to` | `str \| None` | Created on or before (ISO date). |
| `sort_by` | `str \| None` | One of: `name`, `created_at`, `file_size`, `status`, `updated_at`. |
| `sort_order` | `str \| None` | `asc` or `desc`. |
| `limit` | `int \| None` | Page size (default 50, max 100). |
| `offset` | `int \| None` | Pagination offset. |

Returns: dict with documents, total, limit, offset, has_more.
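The `limit`, `offset`, and `has_more` fields can be combined to walk through every matching document. The generator below is a minimal sketch of that loop. `iter_documents` is a hypothetical helper, not part of the SDK, and it relies only on the response keys listed above.

```python
def iter_documents(client, page_size=50, **filters):
    """Yield every document matching `filters`, one page at a time.

    Hypothetical helper; `filters` are forwarded unchanged to
    client.documents.list (for example search="invoice").
    """
    offset = 0
    while True:
        page = client.documents.list(limit=page_size, offset=offset, **filters)
        documents = page["documents"]
        yield from documents
        # Stop when the server reports no further pages, or when a page
        # comes back empty (guards against looping forever).
        if not page.get("has_more") or not documents:
            break
        offset += len(documents)
```

Because it is a generator, callers can stop early without fetching the remaining pages.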

client.documents.get(id) — Get a single document by ID. Returns a document dict (includes file_url as signed URL when applicable).

client.documents.delete(id) — Delete a document. Returns None.

client.documents.get_parse(id) — Get parsed content for a document. Returns parsed result object.

client.documents.get_status(id) — Get processing status for a document. Returns status object.
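Since processing is asynchronous, callers may want to poll `get_status` until a document is ready. The shape of the status object is not described here, so the sketch below leaves that decision to a caller-supplied predicate. `wait_for_document` and its parameters are hypothetical and not part of the SDK.

```python
import time


def wait_for_document(client, document_id, is_done, timeout=300.0, interval=2.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll client.documents.get_status until is_done(status) is true.

    Hypothetical helper. `is_done` receives the raw status object and
    returns a bool; `sleep` and `clock` are injectable for testing.
    """
    deadline = clock() + timeout
    while True:
        status = client.documents.get_status(document_id)
        if is_done(status):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Document {document_id} not ready after {timeout}s")
        sleep(interval)
```

A call might look like `wait_for_document(client, doc_id, lambda s: s["status"] == "completed")`, where the `status` key and the `completed` value are placeholders to be replaced with whatever the status object actually contains.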

Extract

client.extract.run(input=None, document_id=None, schema_id=None, config=None, description=None, include_citations=False, variant_label=None, experiment_id=None)

Run extraction on a document. Provide exactly one of: input or document_id, and one of: schema_id, config, or description.

Keyword arguments:

| Argument | Type | Description |
| --- | --- | --- |
| `input` | `str \| None` | Document reference: URL, `docld://<id>`, or `jobid://<jobId>`. |
| `document_id` | `str \| None` | Legacy: document ID. |
| `schema_id` | `str \| None` | ID of a saved schema (from `schemas.list()` or `schemas.get()`). |
| `config` | `dict \| None` | Inline extraction config (fields, instructions). |
| `description` | `str \| None` | Natural language description for zero-shot extraction. |
| `include_citations` | `bool` | Include per-field citations. |
| `variant_label` | `str \| None` | Experiment variant (when using experiments). |
| `experiment_id` | `str \| None` | Run within an experiment. |

Returns: dict with:

success, job_id, extraction_id?, document_id.
data — Extracted key-value object (or None).
field_results? — Per-field value, confidence, and optional citations.
overall_confidence?, processing_time?, usage?, error?.
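The argument rules for `extract.run` can be checked on the client side before a request is sent. The sketch below does so and returns only the extracted `data`. `run_extraction` is a hypothetical wrapper, and it reads the rule as exactly one document source and exactly one of `schema_id`, `config`, or `description`, which is an interpretation of the description above.

```python
def run_extraction(client, *, input=None, document_id=None, schema_id=None,
                   config=None, description=None, **options):
    """Validate arguments, run extraction, and return the extracted data.

    Hypothetical helper. The name `input` shadows the builtin on purpose
    so that it matches the SDK's keyword argument.
    """
    sources = [v for v in (input, document_id) if v is not None]
    if len(sources) != 1:
        raise ValueError("Provide exactly one of input or document_id")
    methods = [v for v in (schema_id, config, description) if v is not None]
    if len(methods) != 1:
        raise ValueError("Provide exactly one of schema_id, config, or description")
    response = client.extract.run(
        input=input, document_id=document_id, schema_id=schema_id,
        config=config, description=description, **options,
    )
    if not response.get("success"):
        raise RuntimeError(response.get("error") or "Extraction failed")
    return response["data"]
```

Extra keyword arguments such as `include_citations=True` are forwarded unchanged through `**options`.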

client.extract.schemas.list(organization=False, organization_only=False, organization_id=None) — List extraction schemas. Returns dict with schemas list.

client.extract.schemas.get(id) — Get a schema by ID. Returns schema dict.

client.extract.schemas.create(name, description='', instructions='Be precise and thorough.', fields=None, settings=None, ground_truth_template=None, organization_id=None, is_shared=False, version=None) — Create a schema. Returns schema dict. fields defaults to []; settings defaults to {"includeImages": False, "visionScope": "all"}.
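The three schema calls can be combined into a get-or-create pattern so that repeated runs reuse a saved schema. The sketch below is one possible approach. `get_or_create_schema` is a hypothetical helper, and it assumes that the list response stores schemas under a `schemas` key and that each schema dict has a `name` key, neither of which is spelled out above.

```python
def get_or_create_schema(client, name, **create_kwargs):
    """Return the first schema called `name`, creating it if none exists.

    Hypothetical helper; assumes list() returns {"schemas": [...]} and
    that each schema dict carries a "name" key. `create_kwargs` are
    forwarded to client.extract.schemas.create (fields, instructions, ...).
    """
    listing = client.extract.schemas.list()
    for schema in listing.get("schemas", []):
        if schema.get("name") == name:
            return schema
    return client.extract.schemas.create(name, **create_kwargs)
```

Matching on name alone is a simplification; a real integration may prefer to store and reuse the schema ID.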

Errors

Failed requests raise DocLDError (subclass of Exception) with:

message — Human-readable message.
code — Optional error code (e.g. NOT_FOUND, VALIDATION_ERROR, UNAUTHORIZED).
details — Optional dict with extra context.
errors — Optional list of { field, message } for validation errors.


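from docld import DocLD, DocLDError

client = DocLD(api_key="docld_...")

try:
    client.documents.get("invalid-id")
except DocLDError as e:
    # code and details are optional; e.errors lists per-field validation problems
    print(e.code, e.message, e.details)
    raise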
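For logging, the error attributes listed above can be flattened into a single message. The sketch below uses `getattr` so that it also tolerates exceptions without these attributes. `describe_error` is a hypothetical helper, and the `UNKNOWN` placeholder code is an illustrative choice, not an SDK error code.

```python
def describe_error(error):
    """Format an exception as a multi-line, log-friendly string.

    Hypothetical helper; reads the optional code, message, and errors
    attributes documented for DocLDError, falling back gracefully.
    """
    code = getattr(error, "code", None) or "UNKNOWN"
    message = getattr(error, "message", None) or str(error)
    lines = [f"[{code}] {message}"]
    # Each validation entry is documented as {field, message}.
    for item in getattr(error, "errors", None) or []:
        lines.append(f"  {item['field']}: {item['message']}")
    return "\n".join(lines)
```

Inside an `except DocLDError as e:` block, `describe_error(e)` could be passed straight to a logger.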
