Similarity Score
A similarity score is a numeric measure of how closely two embedding vectors match. For similarity metrics such as cosine similarity, a higher score indicates greater semantic similarity; for distance metrics, a lower value does. Vector search uses these scores to rank chunks by relevance to a query.
Common Metrics
| Metric | Description |
|---|---|
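| Cosine similarity | Cosine of the angle between two vectors, ignoring magnitude; ranges from -1 to 1 (0 to 1 when all components are non-negative) |
| Dot product | Magnitude-aware; equals cosine similarity when both vectors are normalized to unit length |
| Euclidean distance | Straight-line distance between vectors; a distance rather than a similarity, so lower = more similar |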
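
The three metrics in the table can be sketched in a few lines of plain Python. This is only an illustration: the function names and the toy two-dimensional vectors are assumptions made for the example, and real embeddings have hundreds or thousands of dimensions.

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products (sensitive to magnitude)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector magnitude."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy vectors (hypothetical, for illustration only).
query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # same angle as query, larger magnitude
orthogonal = [0.0, 3.0]       # at a right angle to query

print(cosine_similarity(query, same_direction))  # magnitude is ignored
print(dot(query, same_direction))                # magnitude is not ignored
print(euclidean_distance(query, orthogonal))
```

Note how `same_direction` gets the maximum cosine similarity even though it is twice as long as `query`, while the dot product and Euclidean distance both change with that extra length.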
DocLD uses Pinecone for vector search. Retrieved chunks are ranked by similarity score, and the top-k setting controls how many of the highest-ranked chunks are returned. Reranking can then refine the order of those chunks.
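
Ranking by similarity and cutting off at top-k can be sketched as a brute-force search over an in-memory dictionary. This is not Pinecone's API or DocLD's implementation; the `top_k` helper, the chunk IDs, and the vectors are hypothetical stand-ins, and a real vector database would use an index rather than scoring every chunk.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, chunks, k):
    """Score every chunk against the query and return the k best as (score, id)."""
    scored = [
        (cosine_similarity(query, vector), chunk_id)
        for chunk_id, vector in chunks.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical chunk embeddings keyed by chunk ID.
chunks = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.0, 1.0],
    "chunk-c": [1.0, 1.0],
}

results = top_k([1.0, 0.0], chunks, k=2)
for score, chunk_id in results:
    print(f"{chunk_id}: {score:.3f}")
```

With `k=2`, the chunk pointing in the same direction as the query ranks first, the diagonal chunk second, and the orthogonal chunk is dropped. A reranking step would take this short list and reorder it with a separate, usually more expensive, relevance model.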
Related Concepts
Similarity scores power vector search and RAG retrieval. Cosine similarity is a common metric. Confidence scores measure extraction reliability, not vector similarity.