Embedding
An embedding is a numerical vector representation of text that captures its semantic meaning. Similar concepts produce similar vectors, which enables semantic search: finding documents or passages that match a query by meaning rather than by exact keywords.
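A minimal sketch of this idea: similarity between embeddings is commonly measured with cosine similarity, which scores how closely two vectors point in the same direction. The three-dimensional vectors below are made up for illustration; real models such as llama-text-embed-v2 produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    # 1.0 means identical direction; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for three phrases (invented for this sketch).
king = [0.9, 0.8, 0.1]
monarch = [0.85, 0.75, 0.2]   # similar concept -> nearby vector
bicycle = [0.1, 0.2, 0.9]     # unrelated concept -> distant vector

assert cosine_similarity(king, monarch) > cosine_similarity(king, bicycle)
```

Because similar meanings land near each other in vector space, a query vector can be compared against stored chunk vectors to retrieve the closest matches.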
How Embeddings Work in DocLD
DocLD uses Pinecone integrated embeddings with the llama-text-embed-v2 model. When documents are processed:
- Content is chunked into semantic segments
- Each chunk is sent to Pinecone for embedding generation
- Embeddings are stored in the vector index alongside metadata
- Query text is also embedded using the same model for vector search
Embedding happens server-side: you don't call a separate embedding API. Pinecone handles both embedding generation and storage.
Why Embeddings Matter
| Use Case | How embeddings help |
|---|---|
| RAG chat | Find relevant chunks to answer questions accurately |
| Semantic search | Match queries by meaning, not exact words |
| Document retrieval | Return documents similar to a reference |
Related Concepts
Embeddings power RAG and vector search. Quality depends on chunking strategy and the underlying model. DocLD stores embeddings in Pinecone.