Building DocLD: How We're Building It, Why We're Here, and Where We Stand

We're building DocLD to turn documents into data and to give developers a single document intelligence platform for parsing, extraction, knowledge bases, and workflow automation. Most document intelligence tools today force you to cobble together point solutions: one vendor for OCR, another for extraction, a third for vector search, and yet another for workflows. Each integration adds latency, cost, and brittleness. DocLD flips that model. This post explains how we're building it, why we're building it, and how we're positioning ourselves in the market. See features, pricing, and about us for more.

How We're Building DocLD

DocLD is built as an end-to-end document intelligence platform, not a collection of point tools. Our philosophy is API-first: every capability—parsing, OCR, extraction, knowledge bases, chat, workflows—is available via a consistent REST API so you can compose pipelines without stitching together multiple vendors.

Product Shape and Document Lifecycle

We treat the journey as parse → extract → index → query → automate. Documents are parsed (including OCR and layout), optionally extracted into structured data, then can be added to knowledge bases for RAG-powered chat with citations. Workflows tie these steps together with triggers and built-in integrations. Analytics and organization support sit on top so teams can see usage, costs, and quality in one place.

System Architecture

At the product level, DocLD has three layers: clients (dashboard, API, CLI) call the platform, which processes documents and stores your data. Here's how the pieces fit together:

Layer	Components	Purpose
Clients	Web Dashboard, API Clients, CLI	User and programmatic access
DocLD Platform	Parsing, Extraction, RAG, Workflows	Document intelligence logic
Your Data	Documents, Vector Index, File Storage	Persistence, semantic search, file storage

API-First Design Philosophy

Every capability is exposed as a REST endpoint. You can parse a document, run extraction, add it to a knowledge base, or trigger a workflow—all via HTTP. Full API documentation is in our docs. Authentication uses Authorization: Bearer YOUR_API_KEY for API access, or session cookies for the dashboard. Webhooks can sign payloads for inbound integrations.

Developer Experience and Compliance

We focus on clear APIs, webhooks, SDKs, and CLI tooling. HIPAA and GDPR are built into how we handle data, retention, and consent. See our privacy policy and terms for details. Encryption is TLS 1.3 in transit and AES-256 at rest. Row-level security, API key scoping to organizations, and role-based access (owner, admin, member, viewer) give regulated teams the controls they need without retrofitting.

The Technical Pipeline

Understanding how documents move from upload to searchable content helps you reason about latency, costs, and configuration. The pipeline has six stages.

Processing Stages Overview

Upload — File is validated (type, size, quota) and stored in secure object storage.
Parse — Content extraction based on file type (PDF, images, DOCX, XLSX, PPTX).
OCR — For scanned documents and images, VLM-based OCR extracts text with bounding boxes.
Chunk — Content is split into semantic units using a configurable chunking strategy.
Embed — Chunks are converted to high-dimensional vector embeddings.
Index — Vectors are indexed for semantic search.

Chunking Strategies

Strategy	Description	Best For
Semantic	Split by meaning	Most documents
Fixed	Fixed character count	Simple text
Page	Split by page	Presentations

Chunking parameters: max_chunk_size (default 1000), overlap (default 100), and preserve_sentences to avoid mid-sentence breaks.
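
To make these parameters concrete, here is a minimal Python sketch of fixed-size chunking with overlap. It is not DocLD's chunker: the function name chunk_text is hypothetical, sizes are assumed to be measured in characters (the unit is not stated above), and preserve_sentences is approximated by ending a chunk at the last sentence terminator inside the window.

```python
def chunk_text(text, max_chunk_size=1000, overlap=100, preserve_sentences=True):
    """Split text into chunks of at most max_chunk_size characters.

    Consecutive chunks share `overlap` characters. Hypothetical helper;
    sizes are assumed to be in characters.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be in [0, max_chunk_size)")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chunk_size, len(text))
        if preserve_sentences and end < len(text):
            # Prefer to end the chunk right after the last sentence terminator.
            cut = max(text.rfind(p, start, end) for p in (". ", "! ", "? "))
            if cut > start + overlap:
                end = cut + 1  # keep the punctuation, drop the following space
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so neighbouring chunks share context
    return chunks
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of more chunks to embed and index.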

Processing Statuses

Documents progress through these statuses. You can poll GET /api/documents/{id} or stream updates via GET /api/documents/{id}/status (SSE).

Status	Description
`uploading`	File being uploaded
`pending`	Queued for processing
`processing`	Currently being processed
`parsing`	Text extraction
`ocr`	OCR processing
`chunking`	Content chunking
`vectorizing`	Creating embeddings
`completed`	Ready for use
`failed`	Processing error
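
The polling option can be sketched in a few lines of Python. This is an illustration only: wait_for_document is a hypothetical helper, fetch_status stands in for whatever code calls GET /api/documents/{id} and pulls the status string out of the response, and treating completed and failed as the only terminal statuses is an assumption based on the table above.

```python
import time

# Assumed terminal statuses, taken from the status table.
TERMINAL_STATUSES = {"completed", "failed"}

def wait_for_document(fetch_status, document_id, interval=2.0, timeout=120.0,
                      sleep=time.sleep):
    """Poll until the document reaches a terminal status, then return it.

    fetch_status is any callable that takes a document ID and returns the
    current status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(document_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"document {document_id} still {status!r} after {timeout}s"
            )
        sleep(interval)  # wait before asking again
```

Passing the fetcher and the sleep function as arguments keeps the loop independent of any HTTP library and easy to test.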

RAG Query Flow

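When you chat with a knowledge base, your question is matched against the indexed chunks by semantic search, and the answer is generated from the retrieved chunks and returned with citations.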
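
As a rough illustration of the retrieval step, the Python sketch below ranks chunks by cosine similarity to the question and assembles a prompt with numbered sources. Everything here is an assumption for illustration: embed is a toy bag-of-words stand-in for a real embedding model, and retrieve and build_prompt are hypothetical names, not DocLD internals.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks with non-zero similarity, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]

def build_prompt(query, retrieved):
    """Assemble the context a model would answer from, with citation markers."""
    context = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(retrieved, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{context}\n\nQuestion: {query}"
    )
```

The numbered markers are what let an answer point back to specific source passages.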

Queue System

Background processing uses a job queue. Triggers (upload, webhook, schedule) enqueue jobs; workers process them with automatic retries, rate limiting, and progress tracking.
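
One common way a worker implements automatic retries is exponential backoff. The Python sketch below shows the idea under stated assumptions: run_with_retries, the attempt limit, and the delay schedule are illustrative choices, not a description of DocLD's queue.

```python
import time

def run_with_retries(job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is reached.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2 ** (attempt - 1))
```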

Typical Processing Times

Document Type	Typical Time
PDF (10 pages)	5–15 seconds
Image	2–5 seconds
DOCX	2–10 seconds
Large PDF (100+ pages)	30–60 seconds

Why We're Building It

The Problem

Most organizations sit on huge amounts of unstructured content—PDFs, scans, forms, contracts—while the tools to use that content are either siloed (one product for OCR, another for extraction, another for search) or locked into a single vendor's ecosystem. That makes it hard to build coherent document workflows and to keep control over your data and your stack.

Our Mission

Our mission is to unlock the value hidden in unstructured documents so developers and teams can build applications that understand and process documents with human-like accuracy.

Positioning and Competitive Landscape

We don't try to be "the best at one thing." We aim to be the platform that covers the full document lifecycle so you don't have to assemble and maintain multiple services. Compare options on our comparisons page.

DocLD vs Point Solutions

Aspect	Point solutions	DocLD
Scope	One slice (e.g. extraction, PDF editing, cloud API)	Parse, extract, knowledge bases, chat, workflows, analytics in one platform
Knowledge bases and RAG chat	Often absent or add-on	Built-in; core to the product
Workflow automation	Rare or limited	Visual builder, templates, schedules, webhooks
Compliance	Varies	HIPAA and GDPR first-class; organizations, audit logging
Integration	Multiple vendors to wire together	One API, one billing relationship

End-to-End Platform

Many players excel at one slice—extraction, PDF editing, or cloud document APIs. DocLD combines parsing, extraction, knowledge bases with RAG chat, workflow automation, and analytics in a single platform. You get one integration, one billing relationship, and one place to manage documents and pipelines.

Knowledge Bases and Chat

We offer built-in knowledge bases and RAG chat with citations and session sharing. That's core to the product, not an add-on, so you can go from "upload documents" to "ask questions with sources" without standing up your own vector store and retrieval layer.

Workflow Automation and Compliance

We provide a visual workflow builder, templates (e.g. invoice, contract, resume), scheduled runs, and webhook triggers. That sets DocLD apart from vendors that focus only on APIs or only on extraction. HIPAA and GDPR are first-class; we support organizations, teams, audit logging, and granular controls so enterprises and regulated teams can adopt DocLD with confidence.

Transparent Pricing and Analytics

Usage, cost, and quality metrics are visible in the product. We want you to understand what you're using and what it costs. See pricing, the document processing cost calculator, and automation ROI calculator.

We're not the only option for document intelligence, and we don't need to be. If an integrated, end-to-end platform is the shape of product you need, we're building it for you. Try our free document tools (PDF to text, Markdown, compare, OCR) or read the documentation.

Technical Deep Dive

API Surface Overview

DocLD exposes a REST API. Endpoints are grouped by capability:

Category	Key Endpoints
Upload & Parse	`POST /api/upload`, `POST /api/parse`, `POST /api/parse/async`
Documents	`GET /api/documents`, `GET /api/documents/{id}`, `GET /api/documents/{id}/status`
Extract	`POST /api/extract/run`, `POST /api/extract/batch`, `GET/POST /api/extract/schemas`
Chat	`POST /api/chat`, `GET /api/chat/sessions`
Knowledge Bases	`GET/POST /api/knowledge-bases`, `GET/POST /api/knowledge-bases/{id}/documents`
Workflows	`GET/POST /api/workflows`, `POST /api/workflows/{id}/execute`
Additional	Split, Generate, Edit, Analytics, Organizations, API Keys, GDPR, Jobs

Parse Example

Parse a document synchronously by file upload:

curl -X POST "https://your-domain.com/api/parse" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"

Or from a URL or DocLD reference:

curl -X POST "https://your-domain.com/api/parse" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "https://example.com/document.pdf"}'
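
For readers calling the API from code, here is a Python equivalent of the URL-based request, using only the standard library. The host is the same placeholder as in the curl commands; build_parse_request and parse_document are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "https://your-domain.com"  # placeholder host, as in the curl examples

def build_parse_request(api_key, input_ref):
    """Build (but do not send) a POST /api/parse request for a URL or reference."""
    body = json.dumps({"input": input_ref}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/parse",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def parse_document(api_key, input_ref, timeout=60):
    """Send the request and decode the JSON response (assumed format)."""
    request = build_parse_request(api_key, input_ref)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes it possible to inspect or log the request without touching the network.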

Extraction and Schemas

Extraction uses JSON schemas to define fields. You specify field names, types (string, number, boolean, date, array, object), and optional instructions. The AI extracts values and returns confidence scores and citations.

Schema example:

{
  "name": "Invoice",
  "description": "Extract invoice data",
  "fields": [
    { "name": "invoice_number", "type": "string", "required": true },
    { "name": "total_amount", "type": "number", "required": true },
    { "name": "line_items", "type": "array", "items": { "type": "object" } }
  ]
}
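
A schema in this shape can also drive a client-side sanity check of extracted values before they enter downstream systems. The Python sketch below is one possible approach, not part of DocLD: validate_record is a hypothetical helper, the mapping from schema types to Python types is an assumption, and dates are assumed to arrive as ISO 8601 strings.

```python
from datetime import date

def _is_iso_date(value):
    """True if value parses as an ISO 8601 calendar date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False

# Assumed mapping from schema type names to Python checks.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "date": lambda v: isinstance(v, str) and _is_iso_date(v),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}

def validate_record(schema, record):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field in schema["fields"]:
        name = field["name"]
        value = record.get(name)
        if value is None:
            if field.get("required"):
                problems.append(f"missing required field: {name}")
            continue
        if not TYPE_CHECKS[field["type"]](value):
            problems.append(f"{name}: expected {field['type']}")
    return problems
```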

Run extraction:

curl -X POST "https://your-domain.com/api/extract/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc-uuid",
    "schema_id": "invoice-schema",
    "include_citations": true
  }'

Prebuilt schemas exist for invoices, contracts, resumes, and common forms. You can also generate schemas from a sample document or a natural language description.

Workflow Triggers and Step Types

Workflows support four trigger types:

Trigger	Use Case
Event	`document.uploaded`, `document.processed`, `extraction.completed`
Scheduled	Cron (e.g. `0 9 * * 1-5` for weekdays at 9 AM)
Webhook	External systems invoke via HTTP
Manual	Run on-demand via `POST /api/workflows/{id}/execute`
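
To show how a cron expression such as 0 9 * * 1-5 is read, here is a small Python matcher. It is a simplified sketch, not a full cron implementation and not DocLD's scheduler: it supports only *, single numbers, ranges, and comma lists, and it requires every field to match, whereas standard cron treats day-of-month and day-of-week as alternatives when both are restricted. The function names are hypothetical.

```python
from datetime import datetime

def _field_matches(expr, value):
    """Match one cron field supporting '*', 'a', 'a-b', and comma lists."""
    if expr == "*":
        return True
    for part in expr.split(","):
        if "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expression, moment):
    """Check whether a datetime falls on a 5-field cron schedule."""
    minute, hour, day, month, weekday = expression.split()
    # Cron convention: 0 = Sunday ... 6 = Saturday.
    cron_weekday = moment.isoweekday() % 7
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day, moment.day)
        and _field_matches(month, moment.month)
        and _field_matches(weekday, cron_weekday)
    )
```

Reading the fields left to right, 0 9 * * 1-5 means minute 0, hour 9, any day of the month, any month, Monday through Friday.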

Step types include parse, extract, condition (branching), transform, and integration (Slack, email, webhook). Integrations use template variables like {{document.name}}, {{extraction.invoice_number}}, and {{trigger.timestamp}}.
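
Template variables of this form are straightforward to resolve against a nested context. The Python sketch below shows one way to do it; render_template is a hypothetical helper, and leaving unknown placeholders untouched is a design choice made here for illustration, not documented DocLD behavior.

```python
import re

# Matches {{ dotted.path }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def render_template(template, context):
    """Replace {{dotted.path}} placeholders with values looked up in context."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # leave unknown placeholders untouched
            value = value[key]
        return str(value)
    return _PLACEHOLDER.sub(lookup, template)
```

For example, rendering "New invoice {{extraction.invoice_number}}" with a context whose extraction.invoice_number is "INV-42" yields "New invoice INV-42".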

Workflow definition example:

{
  "name": "Invoice Auto-Processing",
  "trigger_type": "event",
  "trigger_config": {
    "event": "document.uploaded",
    "filters": { "file_type": "pdf" }
  },
  "definition": {
    "steps": [
      { "id": "extract", "type": "extract", "config": { "schema_id": "invoice-schema" } },
      { "id": "notify", "type": "integration", "config": { "integration_id": "slack-uuid", "template": "New invoice {{extraction.invoice_number}}" } }
    ]
  }
}
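
To illustrate how a definition like this could be interpreted, the Python sketch below runs the steps in order and passes each step's result forward under its id. It is a mental model only: run_workflow and the handlers mapping are hypothetical, and branching via condition steps is deliberately left out.

```python
def run_workflow(workflow, handlers, context=None):
    """Run steps in order and collect each result under the step's id.

    handlers maps a step type to a callable taking (config, context).
    """
    context = dict(context or {})
    for step in workflow["definition"]["steps"]:
        handler = handlers[step["type"]]  # KeyError signals an unsupported type
        context[step["id"]] = handler(step.get("config", {}), context)
    return context
```

Because later steps receive the accumulated context, a notification step can read values produced by an earlier extract step.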

Frequently Asked Questions

What is DocLD and who is it for?

DocLD is an end-to-end document intelligence platform for developers and teams who need to parse, extract, chat, and automate over documents. It's built for teams that want one API and one platform instead of stitching together multiple vendors. See features and pricing.

What can I use DocLD for?

Use Case	Example
Invoice processing	Extract vendor, amount, line items from PDFs
Contract analysis	Review clauses, extract parties and dates
Resume parsing	Parse CVs into structured candidate data
Claims extraction	Pull key fields from insurance claims
RAG document Q&A	Chat with documents and get cited answers

How does document processing work?

Documents go through six stages. Track progress via GET /api/documents/{id} or stream status with GET /api/documents/{id}/status (SSE).

Upload — Validate and store
Parse — Extract text by file type
OCR — For scans and images
Chunk — Semantic, fixed, or page-based
Embed — Vector embeddings
Index — Semantic search

Supported formats:

Format	Supported
PDF	Yes
Images	Yes
DOCX	Yes
XLSX, CSV	Yes
PPTX	Yes

Chunking strategies:

Strategy	Description	Default
Semantic	Split by meaning	Yes
Fixed	Character count	No
Page	By page	No

Tune max_chunk_size and overlap in the parsing config.

How do I extract structured data from a document?

Define a schema — Field names, types (string, number, boolean, date, array, object), and optional instructions.
Call the API — POST /api/extract/run with document ID and schema ID.
Get results — Extracted values, confidence scores, and citations.

Minimal schema example:

{
  "fields": [
    { "name": "invoice_number", "type": "string", "required": true },
    { "name": "total_amount", "type": "number", "required": true }
  ]
}

Prebuilt schema categories: Invoice, Contract, Resume, Financial, Form (W-9, W-4, 1099). Generate schemas from a sample document or natural language description.

Should I use sync or async parsing?

Aspect	Sync	Async
Endpoint	`POST /api/parse`	`POST /api/parse/async`
When to use	Smaller docs, need results now	Large docs, non-blocking
Response	Immediate JSON	Webhook when complete

How do I set up a knowledge base and chat with it?

API sequence:

Create knowledge base — POST /api/knowledge-bases
Add documents — POST /api/knowledge-bases/{id}/documents
Chat — POST /api/chat with knowledge_base_id

DocLD handles chunking, embedding, and indexing; you get semantic search and citations without managing a vector store.

How do workflows work?

Triggers:

Trigger	Events / Config
Event	`document.uploaded`, `document.processed`, `extraction.completed`
Scheduled	Cron (e.g. `0 9 * * 1-5` for weekdays)
Webhook	External HTTP invocation
Manual	`POST /api/workflows/{id}/execute`

Step types: parse, extract, condition (branching), transform, integration.

Integrations: Slack, email, generic webhook. Use template variables ({{document.name}}, {{extraction.invoice_number}}) for dynamic content.

How does DocLD compare to point solutions?

Aspect	Point solutions	DocLD
Scope	One slice (extraction, PDF editing, etc.)	Full lifecycle in one platform
Integration	Multiple vendors to wire	One API, one billing
Compliance	Varies	HIPAA and GDPR first-class

DocLD covers parse, extract, knowledge bases, RAG chat, workflows, and analytics. We don't aim to be the best at any single capability; we're the option when you want everything integrated.