Indexing

Indexing is the process of adding document content to a searchable vector index. After parsing and chunking, each chunk is embedded and stored in the index so vector search and RAG can retrieve relevant passages.

How Indexing Works in DocLD

Parse — Extract text and structure from the document.
Chunk — Split into segments suitable for embedding.
Embed — Convert each chunk to a vector using an embedding model.
Upsert — Write vectors and metadata (e.g. document ID, knowledge base) to the vector index (e.g. vector database).
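The four steps above can be sketched end to end in a few lines. This is a minimal illustration, not DocLD's implementation: the function names (`parse`, `chunk`, `embed`, `index_document`, `search`), the in-memory dictionary standing in for a vector database, and the hashed bag-of-words "embedding" are all assumptions made for the example. A real pipeline would call an actual embedding model and a real vector store.

```python
import hashlib
import math
import re


def parse(raw: str) -> str:
    """Toy parser (assumption): collapse whitespace. Real parsing also extracts structure."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk(text: str, max_words: int = 8) -> list[str]:
    """Split text into fixed-size word windows suitable for embedding."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str, dim: int = 16) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def index_document(index: dict, doc_id: str, knowledge_base: str, raw: str) -> int:
    """Parse, chunk, embed, then upsert each chunk under a stable key."""
    chunks = chunk(parse(raw))
    for i, piece in enumerate(chunks):
        # Upsert: the same key overwrites an existing entry instead of duplicating it.
        index[f"{doc_id}:{i}"] = {
            "vector": embed(piece),
            "metadata": {
                "document_id": doc_id,
                "knowledge_base": knowledge_base,
                "text": piece,
            },
        }
    return len(chunks)


def search(index: dict, query: str, top_k: int = 3) -> list[str]:
    """Rank stored chunks by cosine similarity (vectors are already normalized)."""
    q = embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, entry["vector"])), key)
        for key, entry in index.items()
    ]
    return [key for _, key in sorted(scored, reverse=True)[:top_k]]
```

Because each chunk is written under a key derived from the document ID and chunk position, running `index_document` twice on the same input leaves the index the same size, which is the behavior the term "upsert" implies.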

Once indexed, the document is searchable. If the source file or chunking settings change, you may need to reindex to refresh the index.
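One way to decide whether a document needs reindexing is to store a fingerprint of the source text together with the chunking settings, and compare it on the next run. The sketch below assumes this approach; the helper names (`fingerprint`, `needs_reindex`, `mark_indexed`), the settings keys, and the plain dictionary used as a fingerprint store are hypothetical and not part of DocLD.

```python
import hashlib
import json


def fingerprint(source_text: str, chunk_settings: dict) -> str:
    """Hash the source text and chunking settings into one stable digest."""
    payload = json.dumps(
        {"text": source_text, "settings": chunk_settings}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_reindex(stored: dict, doc_id: str, source_text: str, chunk_settings: dict) -> bool:
    """True if the document was never indexed or its text or settings changed."""
    return stored.get(doc_id) != fingerprint(source_text, chunk_settings)


def mark_indexed(stored: dict, doc_id: str, source_text: str, chunk_settings: dict) -> None:
    """Record the fingerprint after a successful indexing run."""
    stored[doc_id] = fingerprint(source_text, chunk_settings)
```

A caller would check `needs_reindex` before re-running the pipeline and call `mark_indexed` afterwards. When a reindex produces fewer chunks than before, the leftover entries from the earlier run would also need to be deleted so that stale passages are not retrieved.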

Related Concepts

Reindexing re-runs indexing for documents that are already in the index. The vector index and the vector database store the indexed vectors. Ingestion often covers parsing, chunking, and indexing together.
