Document Splitting
Split large documents into smaller, focused sections.
What is Splitting?
Document splitting divides large documents into logical sections:
- By page ranges (manual)
- By AI-detected boundaries (automatic)
- By structure (headings, chapters)
Use cases:
- Split large reports into chapters
- Separate multi-page forms
- Extract specific sections for processing
- Organize long documents
Splitting Documents
From the Dashboard
- Go to Split in the sidebar
- Select a document
- Choose splitting method:
  - Manual: Define page ranges
  - AI: Let DocLD detect sections
- Review suggested splits
- Apply
AI-Powered Splitting
DocLD can automatically detect logical section boundaries:
- Upload or select a document
- Click Analyze
- Review detected sections
- Adjust if needed
- Click Split
AI detection looks for:
- Chapter headings
- Section titles
- Page breaks
- Table of contents entries
- Structural patterns
Splitting Methods
Manual Split
Define exact page ranges:
{
  "method": "manual",
  "sections": [
    {
      "name": "Introduction",
      "page_start": 1,
      "page_end": 5
    },
    {
      "name": "Chapter 1",
      "page_start": 6,
      "page_end": 20
    },
    {
      "name": "Appendix",
      "page_start": 21,
      "page_end": 30
    }
  ]
}
AI Split
Let AI detect boundaries:
{
  "method": "ai",
  "instructions": "Split by chapter headings and major sections"
}
AI returns suggested sections with confidence scores.
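One way to act on those confidence scores is to accept high-confidence suggestions automatically and queue the rest for manual review. The sketch below is illustrative only: the response shape (a list of objects with a `confidence` field between 0 and 1), the `partition_by_confidence` helper, and the 0.8 threshold are all assumptions, not part of the documented API. See the Split API for the actual response format.

```python
# Hypothetical response shape: each suggestion is a dict with a
# "confidence" value between 0 and 1. Field names are assumptions.
def partition_by_confidence(suggestions, threshold=0.8):
    """Return (accepted, needs_review) lists based on the confidence score."""
    accepted, needs_review = [], []
    for suggestion in suggestions:
        # Suggestions without a score are treated as needing review.
        if suggestion.get("confidence", 0.0) >= threshold:
            accepted.append(suggestion)
        else:
            needs_review.append(suggestion)
    return accepted, needs_review


suggestions = [
    {"name": "Introduction", "page_start": 1, "page_end": 5, "confidence": 0.95},
    {"name": "Chapter 1", "page_start": 6, "page_end": 20, "confidence": 0.62},
]
accepted, needs_review = partition_by_confidence(suggestions)
```

Low-confidence boundaries are the ones most worth checking by hand before applying the split.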
Page-Based Split
Split by fixed page counts:
{
  "method": "page",
  "pages_per_section": 10
}
Split Configurations
Save reusable split configurations:
Creating a Config
{
  "name": "Legal Document Splitter",
  "description": "Split legal documents by article",
  "instructions": "Detect Article and Section headings",
  "settings": {
    "method": "ai",
    "min_section_pages": 2,
    "preserve_headers": true
  }
}
Using a Config
Apply a saved configuration:
- Select a document
- Click Apply Config
- Choose your configuration
- Review and apply
Saved configs appear in the Configurations tab on the Split page. You can also save the current configuration from that tab for reuse. Via the API, both saved and inline configs can include advanced options (e.g. settings.splitMode, settings.minSectionPages); see the Split API for the full shape.
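If you generate configurations programmatically, a small helper can keep the payload consistent. The following is a minimal sketch that reproduces the structure shown under Creating a Config. The `build_split_config` function is hypothetical, and the snake_case setting names are taken from that example; the API may expect different keys, so check the Split API before relying on this shape.

```python
import json

# Methods named in this guide: manual, ai, page.
KNOWN_METHODS = {"manual", "ai", "page"}


def build_split_config(name, description, instructions, *,
                       method="ai", min_section_pages=2, preserve_headers=True):
    """Build a config dict mirroring the example in Creating a Config."""
    if method not in KNOWN_METHODS:
        raise ValueError(f"unknown split method: {method!r}")
    if min_section_pages < 1:
        raise ValueError("min_section_pages must be at least 1")
    return {
        "name": name,
        "description": description,
        "instructions": instructions,
        "settings": {
            "method": method,
            "min_section_pages": min_section_pages,
            "preserve_headers": preserve_headers,
        },
    }


config = build_split_config(
    "Legal Document Splitter",
    "Split legal documents by article",
    "Detect Article and Section headings",
)
payload = json.dumps(config, indent=2)  # JSON text ready to store or send
```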
Section Types
| Type | Description |
|---|---|
| detected | AI-detected section |
| custom | User-defined section |
| page | Page-based split |
| document | Entire document |
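To illustrate how page sections relate to the pages_per_section setting of the page method, the sketch below computes fixed-size page ranges for a document. It assumes 1-based, inclusive page numbers and that the final section simply holds whatever pages remain; how DocLD actually treats a short final section is not documented here. The `page_ranges` helper and the generated section names are hypothetical.

```python
def page_ranges(total_pages, pages_per_section):
    """Return fixed-size sections covering pages 1..total_pages (inclusive)."""
    if total_pages < 1 or pages_per_section < 1:
        raise ValueError("total_pages and pages_per_section must be positive")
    sections = []
    for start in range(1, total_pages + 1, pages_per_section):
        # The last section may be shorter than pages_per_section.
        end = min(start + pages_per_section - 1, total_pages)
        sections.append({
            "name": f"Pages {start}-{end}",
            "page_start": start,
            "page_end": end,
        })
    return sections


ranges = page_ranges(25, 10)
```

For a 25-page document with 10 pages per section, this yields the ranges 1-10, 11-20, and 21-25.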
After Splitting
Each section becomes a new document that you can:
- Extract - Run extraction on specific sections
- Chat - Ask questions about specific sections
- Organize - Add sections to different knowledge bases
- Export - Download individual sections
Section Relationships
Sections maintain a relationship to the parent:
{
  "id": "section-uuid",
  "parent_document_id": "original-uuid",
  "section_name": "Chapter 1",
  "page_start": 6,
  "page_end": 20
}
Workflow Integration
Add splitting to workflows:
{
  "type": "split",
  "config": {
    "method": "ai",
    "instructions": "Split by section headings"
  }
}
This enables automatic document splitting as part of processing pipelines.
Best Practices
- Analyze first - Use AI analysis before splitting
- Review sections - Verify detected boundaries
- Name clearly - Use descriptive section names
- Save configs - Reuse configurations for similar documents
- Process sections - Extract from specific sections for accuracy
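The "Review sections" step can be partly automated for manual page ranges. The sketch below checks a list of sections, using the name, page_start, and page_end fields from the Manual Split example, for invalid ranges, overlaps, and uncovered pages. The `find_section_problems` helper is hypothetical, and treating gaps as a problem is an assumption: if you only want to extract a few sections, gaps may be intentional.

```python
def find_section_problems(sections, total_pages):
    """Return human-readable problems found in 1-based, inclusive page ranges."""
    problems = []
    next_page = 1  # first page not yet covered by any section
    for section in sorted(sections, key=lambda s: s["page_start"]):
        name = section["name"]
        start, end = section["page_start"], section["page_end"]
        if start < 1 or end > total_pages or start > end:
            problems.append(f"{name}: invalid range {start}-{end}")
            continue
        if start < next_page:
            problems.append(f"{name}: overlaps an earlier section at page {start}")
        elif start > next_page:
            problems.append(f"pages {next_page}-{start - 1} are not covered")
        next_page = max(next_page, end + 1)
    if next_page <= total_pages:
        problems.append(f"pages {next_page}-{total_pages} are not covered")
    return problems


sections = [
    {"name": "Introduction", "page_start": 1, "page_end": 5},
    {"name": "Chapter 1", "page_start": 6, "page_end": 20},
    {"name": "Appendix", "page_start": 21, "page_end": 30},
]
problems = find_section_problems(sections, total_pages=30)
```

An empty result means the ranges are valid, non-overlapping, and cover every page of the document.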
API Reference
See the Split API for programmatic access.