Document Pipeline
A document pipeline is a defined sequence of processing steps applied to documents. In DocLD, a typical pipeline is upload → parse → chunk → embed → index (for RAG) or upload → parse → extract (for structured data). Workflows let you define pipelines with triggers and steps.
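The two flows above can be sketched as ordered lists of step functions. This is an illustrative model only, not DocLD's API: every name here (`parse`, `chunk`, `embed`, `index`, `extract`, `run_pipeline`) is hypothetical, and the step bodies are placeholders for real parsing, embedding, and indexing.

```python
from typing import Callable, Dict, List

# A step takes the document state (a dict) and returns an updated copy.
Step = Callable[[Dict], Dict]


def parse(doc: Dict) -> Dict:
    # Placeholder parser: treat the raw upload as plain text.
    return {**doc, "text": doc["raw"].strip()}


def chunk(doc: Dict, size: int = 20) -> Dict:
    # Split the parsed text into fixed-size character chunks.
    text = doc["text"]
    return {**doc, "chunks": [text[i:i + size] for i in range(0, len(text), size)]}


def embed(doc: Dict) -> Dict:
    # Placeholder embedding: a one-dimensional vector holding the chunk length.
    # A real pipeline would call an embedding model here.
    return {**doc, "vectors": [[float(len(c))] for c in doc["chunks"]]}


def index(doc: Dict) -> Dict:
    # Placeholder for writing vectors to a search index.
    return {**doc, "indexed": True}


def extract(doc: Dict) -> Dict:
    # Placeholder extraction: derive one structured field from the text.
    return {**doc, "fields": {"word_count": len(doc["text"].split())}}


# The two typical flows, expressed as ordered step lists (upload is the input).
RAG_PIPELINE: List[Step] = [parse, chunk, embed, index]
EXTRACTION_PIPELINE: List[Step] = [parse, extract]


def run_pipeline(steps: List[Step], raw: str) -> Dict:
    # Apply each step in order, passing the document state along.
    doc: Dict = {"raw": raw}
    for step in steps:
        doc = step(doc)
    return doc
```

Modeling a pipeline as a plain list makes the ordering explicit: the RAG and extraction flows share the `parse` step and differ only in what follows it.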
Pipeline vs Workflow
| Concept | Description |
|---|---|
| Pipeline | The logical flow of steps (parse, chunk, extract, etc.) |
| Workflow | The automated definition with triggers, steps, and integrations |
A workflow implements a document pipeline by specifying when it runs (e.g., on upload or webhook) and which steps execute. Each workflow run is one execution of the pipeline.
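The relationship between a workflow, its triggers, and its runs might be modeled as follows. This is a sketch under stated assumptions rather than DocLD's actual schema: the `Workflow` and `WorkflowRun` classes, the `handle_event` method, and the trigger names `"upload"` and `"webhook"` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Step = Callable[[Dict], Dict]


def parse(doc: Dict) -> Dict:
    # Placeholder parse step.
    return {**doc, "text": doc["raw"].strip()}


def extract(doc: Dict) -> Dict:
    # Placeholder extract step.
    return {**doc, "fields": {"word_count": len(doc["text"].split())}}


@dataclass
class WorkflowRun:
    # One execution of the pipeline, started by a specific trigger.
    trigger: str
    completed_steps: List[str] = field(default_factory=list)
    result: Dict = field(default_factory=dict)


@dataclass
class Workflow:
    # The automated definition: when to run (triggers) and what to run (steps).
    name: str
    triggers: List[str]
    steps: List[Step]
    runs: List[WorkflowRun] = field(default_factory=list)

    def handle_event(self, event: str, doc: Dict) -> Optional[WorkflowRun]:
        # Ignore events this workflow is not configured to react to.
        if event not in self.triggers:
            return None
        run = WorkflowRun(trigger=event)
        for step in self.steps:
            doc = step(doc)
            run.completed_steps.append(step.__name__)
        run.result = doc
        self.runs.append(run)
        return run


workflow = Workflow(
    name="extract-on-upload",
    triggers=["upload", "webhook"],
    steps=[parse, extract],
)
```

Calling `workflow.handle_event("upload", {"raw": "some text"})` performs one run and records it in `workflow.runs`; an event that is not in `triggers` produces no run.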
Related Concepts
Document pipelines are realized through document processing and workflows. Jobs track each document through the pipeline; batch processing runs the same pipeline on many documents.