Vector Databases

Store and Search Embeddings at Scale

Understanding Vector Databases

Vector Database Architecture & Workflow

[Diagram: Vector database pipeline. Input data (text, images, audio, documents) → embedding model (OpenAI Ada, Cohere, Sentence-BERT) producing vectors like [0.12, -0.45, ...] → vector index (HNSW, IVF, LSH, Annoy) → search methods (cosine similarity, Euclidean, dot product, hybrid search) → applications (RAG systems, semantic search, recommendations, similarity matching, anomaly detection, clustering). The diagram also previews popular databases (Pinecone, Weaviate, ChromaDB, Qdrant) and key metrics: latency <100 ms, recall 95-99%, billion-vector scale, roughly $0.10/GB/mo.]

🔍 What are Vector Databases?

Vector databases are specialized storage systems designed to efficiently store, index, and search high-dimensional vectors (embeddings). Unlike traditional databases that work with structured data and exact matches, vector databases excel at similarity search - finding items that are semantically similar based on their vector representations.

Key Concepts:
Embeddings: Dense numerical representations of data (text, images, audio) in high-dimensional space
Similarity Search: Finding nearest neighbors based on distance metrics (cosine, Euclidean)
Indexing: Specialized algorithms (HNSW, IVF, LSH) for fast approximate nearest neighbor search
Hybrid Search: Combining vector similarity with metadata filtering
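The concepts above can be made concrete with a toy in-memory vector store: upsert id/vector/metadata triples, then rank stored vectors by cosine similarity to a query. This is a teaching sketch only (the class and method names are invented for illustration, and real databases replace the linear scan with an index):

```python
import math

class ToyVectorStore:
    """Minimal in-memory vector store: upsert vectors, query by cosine similarity."""

    def __init__(self):
        self.vectors = {}  # id -> (vector, metadata)

    def upsert(self, vec_id, vector, metadata=None):
        self.vectors[vec_id] = (vector, metadata or {})

    @staticmethod
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norms

    def query(self, vector, top_k=3):
        # Brute-force scan; HNSW/IVF/LSH exist to avoid exactly this O(n) pass.
        scored = [
            (vec_id, self.cosine(vector, vec), meta)
            for vec_id, (vec, meta) in self.vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]

store = ToyVectorStore()
store.upsert("a", [1.0, 0.0], {"text": "cats"})
store.upsert("b", [0.9, 0.1], {"text": "kittens"})
store.upsert("c", [0.0, 1.0], {"text": "finance"})
results = store.query([1.0, 0.05], top_k=2)
print([r[0] for r in results])  # ['a', 'b'] — the two vectors closest in direction
```

The query vector points almost the same direction as "a" and "b", so those two win regardless of magnitude; that direction-over-magnitude behavior is what makes cosine similarity the default metric for text embeddings.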

Vector Database Comparison

| Database | Type | Best For | Pros | Cons | Pricing |
|---|---|---|---|---|---|
| Pinecone | Managed cloud | Production SaaS | Fully managed, auto-scaling, simple API | Vendor lock-in, cost at scale | $70/mo starter |
| Weaviate | Open source | Enterprise | Hybrid search, GraphQL, multi-modal | Complex setup, resource-intensive | Free / cloud option |
| ChromaDB | Open source | Prototypes | Simple, Python-native, lightweight | Limited scale, basic features | Free |
| Qdrant | Open source | High performance | Fast, Rust-based, advanced filtering | Newer ecosystem | Free / cloud option |
| Milvus | Open source | Large scale | Billion-scale, GPU support | Complex operations | Free / Zilliz Cloud |
| pgvector | PostgreSQL extension | Existing PG users | SQL integration, familiar | Limited vector features | Free |

Implementation Examples

Pinecone - Quick Start

import pinecone
from openai import OpenAI

# Initialize the Pinecone client and connect to an index
pinecone.init(api_key="YOUR_API_KEY")
index = pinecone.Index("my-index")

# Create an embedding for the input text
client = OpenAI()
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Sample text to embed"
)
embedding = response.data[0].embedding

# Upsert the vector with metadata
index.upsert(vectors=[
    ("vec1", embedding, {"text": "Sample text"})
])

# Query for the 5 most similar vectors
results = index.query(
    vector=embedding,
    top_k=5,
    include_metadata=True
)

ChromaDB - Local Development

import chromadb
from chromadb.utils import embedding_functions

# Create an in-memory client
client = chromadb.Client()

# Create a collection that embeds documents via OpenAI
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_functions.OpenAIEmbeddingFunction(
        api_key="YOUR_API_KEY"
    )
)

# Add documents with metadata and ids
collection.add(
    documents=["This is a document", "This is another"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["id1", "id2"]
)

# Query for the 5 most similar documents
results = collection.query(
    query_texts=["Find similar documents"],
    n_results=5
)

Weaviate - Hybrid Search

import weaviate

# Connect to a local Weaviate instance
client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        "X-OpenAI-Api-Key": "YOUR_API_KEY"
    }
)

# Create a schema whose objects are vectorized with OpenAI
schema = {
    "class": "Article",
    "vectorizer": "text2vec-openai",
    "properties": [
        {"name": "title", "dataType": ["string"]},
        {"name": "content", "dataType": ["text"]}
    ]
}
client.schema.create_class(schema)

# Hybrid search (vector + keyword); alpha=0.5 weights both signals equally
result = client.query.get("Article", ["title", "content"]) \
    .with_hybrid(query="machine learning", alpha=0.5) \
    .with_limit(5) \
    .do()

Best Practices

Vector Database Best Practices

  • Choose the Right Embedding Model: Match model to your data type and use case
  • Optimize Index Type: HNSW for accuracy, IVF for speed, LSH for memory efficiency
  • Normalize Vectors: Use unit vectors for cosine similarity
  • Batch Operations: Insert and query in batches for better performance
  • Metadata Filtering: Combine vector search with metadata for precise results
  • Monitor Metrics: Track latency, recall, and resource usage
  • Version Embeddings: Track embedding model versions for reproducibility
  • Implement Caching: Cache frequent queries to reduce latency
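The normalization tip above can be shown directly: once vectors are scaled to unit length, cosine similarity reduces to a plain dot product, which indexes can compute more cheaply (pure-Python sketch):

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

a, b = [3.0, 4.0], [6.0, 8.0]
ua, ub = normalize(a), normalize(b)

# Dot product of unit vectors equals the cosine similarity of the originals
dot_of_units = sum(x * y for x, y in zip(ua, ub))
print(round(cosine(a, b), 6), round(dot_of_units, 6))  # both 1.0: same direction
```

Because `b` is just `a` scaled by 2, both measures come out to exactly 1.0; this is also why stores configured for cosine distance often normalize on ingest.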

Use Cases & Applications

Real-World Applications

  • RAG (Retrieval Augmented Generation): Enhance LLMs with relevant context
  • Semantic Search: Find content by meaning, not just keywords
  • Recommendation Systems: Find similar products, content, or users
  • Anomaly Detection: Identify outliers in high-dimensional space
  • Duplicate Detection: Find similar or duplicate content
  • Question Answering: Match questions to relevant answers
  • Image Similarity: Find visually similar images
  • Customer Support: Route tickets to similar resolved issues
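As a concrete instance of the anomaly-detection use case: an embedding that sits far from the centroid of the collection is an outlier candidate. A toy sketch with hand-made 2-D "embeddings" and an illustrative threshold of 1.5× the mean distance:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Three similar points and one obvious outlier
embeddings = {"e1": [1.0, 1.0], "e2": [1.1, 0.9], "e3": [0.9, 1.1], "e4": [10.0, 10.0]}

center = centroid(list(embeddings.values()))
distances = {k: euclidean(v, center) for k, v in embeddings.items()}

# Flag points much farther from the centroid than average
mean_dist = sum(distances.values()) / len(distances)
outliers = [k for k, d in distances.items() if d > 1.5 * mean_dist]
print(outliers)  # ['e4']
```

Real systems use more robust statistics (the outlier drags the centroid toward itself), but the principle — distance in embedding space as an anomaly score — is the same.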

Popular Vector Database Solutions

🔷 Pinecone

Meaning: Managed, scalable vector DB - the "Firebase" of vector databases.
Example: E-commerce site uses Pinecone so customers can search "red running shoes" and find similar sneakers, not just keyword matches.

Key Features:

  • Fully managed (no infrastructure to maintain)
  • Real-time indexing
  • Hybrid search (vectors + metadata filtering)
  • Auto-scaling based on usage

🔮 Weaviate

Meaning: Open-source vector DB with modular extensions (image search, question answering).
Example: HR tool uses Weaviate to let recruiters search "Python developers with fintech experience" across resumes.

Key Features:

  • GraphQL query language
  • Built-in ML models
  • Multi-modal search (text + images)
  • Automatic schema generation

🎨 Chroma

Meaning: Lightweight, developer-friendly vector DB (popular in prototypes).
Example: Startup builds a quick chatbot that answers from company documents using Chroma.
import chromadb

# Create a client and collection
client = chromadb.Client()
collection = client.create_collection("docs")

# Add documents with embeddings
collection.add(
    documents=["AI is transforming healthcare"],
    ids=["1"]
)

# Query for similar documents
results = collection.query(
    query_texts=["healthcare AI"],
    n_results=1
)
print(results)

Why Developers Love It:

  • Simple API
  • Runs locally or in-memory
  • Perfect for RAG prototypes
  • Minimal setup required

🚀 Milvus

Meaning: High-performance vector DB for large-scale AI apps.
Example: Video platform uses Milvus to enable similarity search → "find videos like this one."

Enterprise Features:

  • Billion-scale vector support
  • GPU acceleration
  • Multiple index types (IVF, HNSW, etc.)
  • Distributed architecture
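The IVF index type mentioned above trades a little recall for a lot of speed: vectors are partitioned into buckets around coarse centroids, and a query scans only the bucket(s) nearest to it. A minimal sketch with fixed centroids (real IVF learns them with k-means and probes several buckets; the class name here is invented for illustration):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVFIndex:
    """Partition vectors into buckets by nearest centroid; search only the query's bucket."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroid(self, vec):
        return min(range(len(self.centroids)),
                   key=lambda i: euclidean(vec, self.centroids[i]))

    def add(self, vec_id, vec):
        self.buckets[self._nearest_centroid(vec)].append((vec_id, vec))

    def search(self, query, top_k=1):
        # Only the query's own bucket is scanned — the speed/recall trade-off
        bucket = self.buckets[self._nearest_centroid(query)]
        ranked = sorted(bucket, key=lambda item: euclidean(query, item[1]))
        return [vec_id for vec_id, _ in ranked[:top_k]]

index = ToyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("near_origin", [0.5, 0.5])
index.add("also_near", [1.0, 0.0])
index.add("far_away", [9.0, 9.5])
print(index.search([0.4, 0.6], top_k=1))  # ['near_origin']
```

A true nearest neighbor sitting just across a bucket boundary can be missed — which is why production IVF probes multiple buckets (`nprobe`-style settings) and why recall is reported as 95-99% rather than 100%.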

Choosing the Right Vector Database

Decision Matrix

🔷 Choose Pinecone if:

  • You want zero infrastructure management
  • Need production-ready from day one
  • Budget for managed services

🔮 Choose Weaviate if:

  • Need multi-modal search capabilities
  • Want built-in ML models
  • Prefer open-source with enterprise features

🎨 Choose Chroma if:

  • Building a prototype or POC
  • Want minimal setup complexity
  • Need local development environment

🚀 Choose Milvus if:

  • Handling billions of vectors
  • Need maximum performance
  • Have dedicated infrastructure team