Document
A document in DocLD is a file or unit of content that you upload or ingest for processing. Documents can be PDFs, images, spreadsheets, presentations, or other supported file formats. Each document gets a unique document ID and can be parsed, chunked, indexed in a knowledge base, and used for extraction or RAG chat.
Document Lifecycle
- Upload — Add the file via document upload (dashboard or API).
- Parse — DocLD extracts text, layout, and tables (parsing, OCR if needed).
- Chunk & embed — Content is split into chunks and embedded for search.
- Use — The document appears in vector search and citation results; you can also run extraction against it.
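The lifecycle above can be pictured as a small state machine. The sketch below is purely illustrative and does not use the DocLD API: the `Stage` enum, the `Document` class, its fields, and the `doc_123` ID are all hypothetical names chosen to mirror the four steps.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the four lifecycle steps."""
    UPLOADED = 1
    PARSED = 2
    CHUNKED_AND_EMBEDDED = 3
    READY = 4  # usable for search, citations, and extraction


@dataclass
class Document:
    """Illustrative local model only; not DocLD's actual document schema."""
    document_id: str
    filename: str
    stage: Stage = Stage.UPLOADED

    def advance(self) -> Stage:
        """Move to the next lifecycle stage; a READY document stays READY."""
        if self.stage is not Stage.READY:
            self.stage = Stage(self.stage.value + 1)
        return self.stage


# Walk one made-up document through every stage in order.
doc = Document(document_id="doc_123", filename="report.pdf")
visited = [doc.stage]
while doc.stage is not Stage.READY:
    visited.append(doc.advance())
```

Modeling the stages as an ordered enum makes the point that a document only becomes usable once every earlier step has completed.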
A document can be associated with one or more knowledge bases. When a document is cited in chat or extraction results, it appears as a source document, with the page number and an excerpt of the cited passage.
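As a rough illustration of these relationships, the sketch below models a citation that carries a document ID, page, and excerpt, and looks up which knowledge bases contain that document. All class names, field names, and sample values are assumptions made for this example; they are not DocLD's real schema or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceCitation:
    """Hypothetical shape of a citation: which document, where, and what text."""
    document_id: str
    page: int
    excerpt: str


@dataclass
class KnowledgeBase:
    """Hypothetical knowledge base holding a set of document IDs."""
    name: str
    document_ids: set = field(default_factory=set)


def knowledge_bases_for(document_id, knowledge_bases):
    """Return the names of all knowledge bases that include the document."""
    return [kb.name for kb in knowledge_bases if document_id in kb.document_ids]


# One document can belong to several knowledge bases at once.
kbs = [
    KnowledgeBase("contracts", {"doc_123"}),
    KnowledgeBase("finance", {"doc_123", "doc_456"}),
]
citation = SourceCitation("doc_123", page=4, excerpt="Payment is due within 30 days.")
containing = knowledge_bases_for(citation.document_id, kbs)
```

Here `containing` lists both knowledge bases, since `doc_123` was added to each of them.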
Related Concepts
Source document refers to the document that supplied a specific passage or value. Document processing and document pipeline describe how documents move through parsing and indexing.