Split
Chunk documents for search and LLM context.
Split turns parsed content into semantic chunks that respect paragraphs, headings, and tables. Chunk sizes and overlap are configurable so you can tune for RAG retrieval quality and context window limits.
Chunks are vectorized and indexed in your knowledge base for semantic search and chat. Use the same pipeline for batch processing or real-time uploads.
Splitting methods
Manual, AI, or page-based: choose how to divide documents.
Use manual split to define exact page ranges and section names. Use AI split to let DocLD detect chapter headings, section titles, and structural patterns. Use page-based split for a fixed number of pages per section (e.g. 10 pages each).
Each method produces section documents linked to the parent document. You can then run extraction on each section, add it to a knowledge base, or trigger a workflow for it.
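To make page-based splitting concrete, here is a minimal sketch of the arithmetic behind fixed-size sections. The function name `page_ranges` is hypothetical, and the sketch assumes pages are 1-indexed and that the final section is simply shorter when the page count does not divide evenly; DocLD's actual handling of the remainder may differ.

```python
def page_ranges(total_pages: int, pages_per_section: int) -> list[tuple[int, int]]:
    """Return 1-indexed, inclusive (start, end) page ranges of a fixed size."""
    if total_pages < 1 or pages_per_section < 1:
        raise ValueError("total_pages and pages_per_section must be positive")
    return [
        # The last range is clipped to the end of the document.
        (start, min(start + pages_per_section - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_section)
    ]


# A 25-page document split into sections of 10 pages each.
print(page_ranges(25, 10))  # [(1, 10), (11, 20), (21, 25)]
```

The same list of ranges could also serve as the starting point for a manual split, where you would then rename or adjust individual sections.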






AI-powered splitting
Analyze first, review sections, then split.
Run analyze before splitting to get suggested section boundaries. Provide instructions (e.g. "Split by chapter headings and major sections") and optionally set a confidence threshold and a precision mode (accurate or fast).
AI detection looks for chapter headings, section titles, page breaks, table of contents entries, and structural patterns. Review and adjust suggested sections before applying.
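The review step can be sketched as a simple filter over the suggestions. This is an illustration only: the `SuggestedSection` fields and the idea that each suggestion carries a numeric confidence between 0 and 1 are assumptions, not the documented shape of the analyze response.

```python
from dataclasses import dataclass


@dataclass
class SuggestedSection:
    # Hypothetical shape of one suggested boundary; the real response may differ.
    name: str
    start_page: int
    end_page: int
    confidence: float


def review(sections: list[SuggestedSection], threshold: float):
    """Partition suggestions into (accepted, needs_review) by confidence."""
    accepted = [s for s in sections if s.confidence >= threshold]
    needs_review = [s for s in sections if s.confidence < threshold]
    return accepted, needs_review


suggested = [
    SuggestedSection("Chapter 1", 1, 12, 0.95),
    SuggestedSection("Appendix?", 13, 14, 0.40),
    SuggestedSection("Chapter 2", 15, 30, 0.88),
]
accepted, needs_review = review(suggested, threshold=0.8)
print([s.name for s in accepted])      # ['Chapter 1', 'Chapter 2']
print([s.name for s in needs_review])  # ['Appendix?']
```

Low-confidence suggestions are kept rather than dropped so that a person can adjust or merge them before the split is applied.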
Split configurations
Save and reuse split settings.
Save reusable split configurations with a name, description, instructions, and settings (method, min/max section pages, preserve headers, etc.). Reference a config by ID in run or batch requests so you don't repeat options on every call.
Configs can be user-scoped or organization-scoped. Use GET/POST/PATCH/DELETE on /api/split/configs to manage them.
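As a rough illustration of creating a config, the sketch below builds (but does not send) a POST request to /api/split/configs using only the Python standard library. The endpoint path comes from this page; the base URL, the bearer-token header, and every JSON field name (`name`, `settings`, `min_section_pages`, and so on) are assumptions chosen to mirror the prose, so check them against the API reference before use.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder host, not a real DocLD URL


def build_create_config_request(api_key: str) -> urllib.request.Request:
    """Build a POST request for a reusable split config (field names assumed)."""
    payload = {
        "name": "Chapters",
        "description": "Split long reports by chapter",
        "instructions": "Split by chapter headings and major sections",
        "settings": {
            "method": "ai",
            "min_section_pages": 1,
            "max_section_pages": 50,
            "preserve_headers": True,
        },
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/split/configs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_config_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect or log before any call is made.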






API & post-split actions
Sync run, async with webhooks, batch; optional post-split steps.
Use a synchronous split for immediate results, or an asynchronous split to receive a webhook callback when the job completes. Batch split processes multiple documents with the same config. After splitting, optionally add sections to a knowledge base, run an extraction schema on each section, or trigger a workflow.
Post-split actions run automatically when you pass post_split_knowledge_base_id, post_split_extract_schema_id, or post_split_workflow_id in the request.
Workflow integration
Split step in pipelines; section types for downstream steps.
Add a split step to workflows so documents are split automatically as part of a pipeline. Each section becomes a child document with a section type (detected, custom, page, or document) so downstream steps can route or process by type.
Use routing rules to run extraction or a workflow only on sections that match a classification. Combine with post-split actions for end-to-end automation.
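Routing by section type can be pictured as a lookup from type to action. The four section types below come from this page; the section dictionaries, the `section_type` key, and the action names are hypothetical, and real routing rules are configured in the workflow rather than written in client code.

```python
SECTION_TYPES = {"detected", "custom", "page", "document"}


def route_sections(sections: list[dict], rules: dict[str, str], default: str = "skip") -> dict[str, str]:
    """Map each section id to an action name based on its section type."""
    plan = {}
    for section in sections:
        section_type = section["section_type"]
        if section_type not in SECTION_TYPES:
            raise ValueError(f"unknown section type: {section_type!r}")
        # Types without an explicit rule fall back to the default action.
        plan[section["id"]] = rules.get(section_type, default)
    return plan


sections = [
    {"id": "s1", "section_type": "detected"},
    {"id": "s2", "section_type": "page"},
    {"id": "s3", "section_type": "custom"},
]
rules = {"detected": "extract", "custom": "run_workflow"}
print(route_sections(sections, rules))
# {'s1': 'extract', 's2': 'skip', 's3': 'run_workflow'}
```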



Splitting methods
| Method | Description |
|---|---|
| Manual | Define exact page ranges and section names. |
| AI | Let DocLD detect chapter headings, sections, and structural patterns. |
| Page | Split by fixed page count per section (e.g. 10 pages each). |
| Document | Treat the entire document as one section. |
Section types
| Type | Description |
|---|---|
| detected | AI-detected section. |
| custom | User-defined section (manual split). |
| page | Page-based split. |
| document | Entire document as one section. |
Split API
| Endpoint / usage | Description |
|---|---|
| POST /api/split/run | Split a document; sync or async with optional webhook. |
| POST /api/split/batch | Split multiple documents with the same config. |
| POST /api/split/analyze | Analyze document and return suggested section boundaries. |
| GET/POST/PATCH/DELETE /api/split/configs | List, create, update, and delete split configurations. |
| GET /api/split/export | Export split sections as ZIP or presigned download URLs. |
Post-split options
| Option | Description |
|---|---|
| Add to knowledge base | Add each section document to a knowledge base for search and chat. |
| Run extraction schema | Run an extraction schema on each section document. |
| Trigger workflow | Start a workflow for each section (e.g. extract then notify). |
| Webhook | Receive a callback when the split job completes (async only). |
Credits and limits
| Aspect | Details |
|---|---|
| Credits per split | 1.5 per completed split job |
| Rate limit | Split creation (run, batch, analyze) is rate-limited per user; retry later if exceeded. |
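The sketch below shows one way a client might budget credits and back off when rate-limited. The 1.5-credit figure comes from the table above; how a rate-limit rejection is signalled is not stated here, so the `RateLimited` exception is a hypothetical stand-in, and whether a batch counts as one job or one per document is also an open assumption.

```python
import time

CREDITS_PER_SPLIT = 1.5  # per completed split job, from the table above


def estimate_credits(completed_jobs: int) -> float:
    """Estimate credits consumed by a number of completed split jobs."""
    return CREDITS_PER_SPLIT * completed_jobs


class RateLimited(Exception):
    """Hypothetical signal that a split call was rejected by the rate limit."""


def with_retry(call, attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call `call()`, retrying with exponential backoff on RateLimited."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


print(estimate_credits(4))  # 6.0
```

Passing `sleep` as a parameter lets you substitute a no-op in tests so the backoff logic can be checked without waiting.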