Embedding Model
An embedding model is the AI model that converts text into numerical vectors for embedding and vector search. The same model must be used for documents and queries so that similarity comparisons are meaningful.
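The "similarity comparison" is typically cosine similarity between the two vectors. A minimal sketch, using toy four-dimensional vectors in place of real embeddings (real models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for a document chunk's embedding
# and a query's embedding produced by the same model.
doc_vec = [0.1, 0.9, 0.2, 0.0]
query_vec = [0.0, 0.8, 0.3, 0.1]

print(round(cosine_similarity(doc_vec, query_vec), 3))  # close to 1.0 = similar
```

If the two vectors came from different models, the comparison would be meaningless even when the dimensions happen to match, because each model lays out its vector space differently.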
How DocLD Uses Embedding Models
DocLD uses Pinecone integrated embeddings with llama-text-embed-v2. Content is chunked, sent to Pinecone for embedding generation, and stored in the vector index. Queries are embedded with the same model for vector search and RAG.
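The flow above can be sketched end to end with in-memory stand-ins. The hashed bag-of-words `embed()` and the list-based index below are hypothetical placeholders for llama-text-embed-v2 and the Pinecone vector index; the point is only that chunks and the query pass through the same embedding function:

```python
import math
import zlib

DIM = 64  # real models emit a fixed size, e.g. 1024 for llama-text-embed-v2

def embed(text: str) -> list[float]:
    # Deterministic hashed bag-of-words; NOT a real embedding model.
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# "Chunking" here is just a prepared list of chunks.
chunks = [
    "embedding models convert text into numerical vectors",
    "chunking splits documents into smaller pieces",
    "the vector index stores one vector per chunk",
]
index = [(chunk, embed(chunk)) for chunk in chunks]  # stand-in for the index

# The query MUST go through the same embed() used for the chunks.
query_vec = embed("which models convert text into vectors")
ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
print(ranked[0][0])  # the chunk about embedding models ranks first
```

With Pinecone integrated embeddings, both halves of this, embedding generation and vector storage, happen on Pinecone's side, so DocLD sends raw chunk text and query text rather than vectors.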
Why Model Choice Matters
- Dimensionality — Each model produces vectors of a fixed size (e.g., 768 or 1024 dimensions)
- Language — Some models handle multilingual text better
- Domain — General-purpose vs. domain-specific models affect retrieval quality
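The dimensionality point has a practical consequence: vectors from different models (or different dimension settings of the same model) have different sizes and cannot be compared against the same index. A hypothetical guard, assuming the 1024-dimension output mentioned above (the constant and function name are illustrative, not DocLD's actual code):

```python
# Vectors must match the index's fixed dimension before a query can run.
INDEX_DIM = 1024  # e.g. the index was created for 1024-dim embeddings

def check_query_dims(vec: list[float], expected: int = INDEX_DIM) -> None:
    if len(vec) != expected:
        raise ValueError(
            f"dimension mismatch: query has {len(vec)} dims, index expects {expected}"
        )

check_query_dims([0.0] * 1024)     # same dimension: accepted silently
try:
    check_query_dims([0.0] * 768)  # e.g. a 768-dim vector from another model
except ValueError as err:
    print(err)  # → dimension mismatch: query has 768 dims, index expects 1024
```

Managed integrated embeddings avoid this class of error entirely, since the index and the embedding model are configured together.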
Related Concepts
Embedding models power vector search and RAG. Pinecone manages embedding generation in DocLD. Chunking determines what text gets embedded.