AIStackWatch

What is a Vector Database?

A vector database is a datastore purpose-built to index and query high-dimensional float arrays — the embeddings produced by models like OpenAI text-embedding-3-large or Cohere's embed-v3. Its core operation is approximate nearest-neighbor (ANN) search: given a query vector, return the k most similar stored vectors.

Why you need one

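A naive cosine-similarity scan compares the query against every stored vector, so over a million high-dimensional vectors a single query can take on the order of a second or more. ANN indexes (HNSW, IVF, DiskANN) bring that down to milliseconds by trading a small amount of recall for a large speedup. A general-purpose database has no such index out of the box; it needs an extension such as pgvector.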
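To make "trading recall for speed" concrete, here is a minimal sketch of how recall@k is commonly measured: compare the IDs an approximate index returns against the exact top-k from a brute-force scan. The function name and the sample ID lists are illustrative assumptions, not output from any particular store.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k that the approximate search also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Hypothetical results for one query: exact scan vs. an ANN index.
exact = [12, 7, 33, 4, 19]
approx = [12, 33, 7, 19, 58]  # misses id 4, returns id 58 instead

score = recall_at_k(exact, approx, k=5)
print(score)  # 4 of the 5 true neighbors were found
```

Order within the top-k is ignored here; only membership counts. In practice the score is averaged over many queries.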

Vector databases also handle:

  • Metadata filters — "similar to X, but only where tenant_id = 7."
  • Hybrid search — combining vector similarity with BM25 keyword matching.
  • Writes and updates — embeddings change as documents change.
  • Replication and backups — you care about this once you have real data.
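As an illustration of the hybrid search item above, one widely used way to merge a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). This is a sketch of that one technique, not a description of how any specific store implements hybrid search; the document IDs are made up, and k=60 is the conventional default constant, assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists; each list adds 1 / (k + rank) per ID."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever for the same query.
vector_hits = ["a", "b", "c"]
bm25_hits = ["c", "a", "d"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that cosine scores and BM25 scores live on different scales.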

Dedicated vs. embedded

  • Dedicated services: Pinecone, Weaviate, Turbopuffer. Managed, scale past a billion vectors, priced per GB stored plus per query.
  • Extensions on existing databases: pgvector on Postgres, Redis Stack. One less service to run; lower ceiling on scale.

For most products under 10M vectors, pgvector on the Postgres you already have is the simplest choice. Past that, a dedicated store earns its keep.

What to look at when picking one

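  • Recall at latency SLO. For most RAG workloads, a store that hits 99% recall at 50 ms p95 beats one that hits 95% at 20 ms: a missed chunk usually hurts answer quality more than an extra 30 ms of retrieval time.
  • Metadata filter quality. Post-filtering (search first, filter after) silently kills recall when the filter matches only a small fraction of the corpus, because most of the top-k candidates get discarded and fewer than k results come back. Look for stores that apply the filter during the index search (pre-filtering).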
  • Pricing model. Per-query, per-GB, per-pod — all legitimate; match to your traffic shape.
  • Multi-tenancy. Namespace isolation matters if you serve per-customer data.
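The post-filtering failure mode in the list above can be reproduced with a toy example. This sketch uses an exact scan in place of an ANN index, and the corpus layout, field names, and tenant assignment are all hypothetical, chosen so the effect is easy to see.

```python
import math

# Hypothetical corpus: 100 unit vectors fanned out from the query direction.
# Every 20th item belongs to tenant 7; the rest belong to tenant 1.
items = [
    {
        "id": i,
        "tenant_id": 7 if i % 20 == 19 else 1,
        "vec": (math.cos(i * 0.01), math.sin(i * 0.01)),
    }
    for i in range(100)
]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(candidates, query, k):
    ranked = sorted(candidates, key=lambda it: cosine(query, it["vec"]), reverse=True)
    return ranked[:k]


def post_filter(query, k, tenant_id):
    # Search the whole corpus first, then drop rows that fail the filter.
    hits = top_k(items, query, k)
    return [it["id"] for it in hits if it["tenant_id"] == tenant_id]


def pre_filter(query, k, tenant_id):
    # Restrict to matching rows first, then search only those.
    allowed = [it for it in items if it["tenant_id"] == tenant_id]
    return [it["id"] for it in top_k(allowed, query, k)]


query = (1.0, 0.0)
post_ids = post_filter(query, k=3, tenant_id=7)
pre_ids = pre_filter(query, k=3, tenant_id=7)
print(post_ids)  # empty: the 3 nearest vectors all belong to tenant 1
print(pre_ids)   # the 3 nearest vectors among tenant 7's rows
```

Only 5% of rows match the filter, so the unfiltered top 3 contains none of them. Real stores mitigate this by over-fetching or by filtering inside the index traversal, and the details vary by product.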

When NOT to use a vector database

Small corpora (under 10k chunks) — keep embeddings in a Parquet or SQLite file and do brute-force cosine. No index, no infra, correct every time.
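A minimal sketch of the SQLite variant, using only the Python standard library. The table name, column names, and the choice to store each embedding as a packed float32 BLOB are assumptions for illustration; an in-memory database stands in for a file on disk.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats as float32 bytes."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack(): 4 bytes per float32 value."""
    return struct.unpack(f"{len(blob) // 4}f", blob)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(conn, query, k):
    """Exact search: score every row, return the k best chunk IDs."""
    rows = conn.execute("SELECT id, embedding FROM chunks")
    scored = [(cosine(query, unpack(blob)), chunk_id) for chunk_id, blob in rows]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [chunk_id for _, chunk_id in scored[:k]]


conn = sqlite3.connect(":memory:")  # use a file path for a persistent store
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, embedding BLOB NOT NULL)")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        (1, pack([1.0, 0.0, 0.0])),
        (2, pack([0.0, 1.0, 0.0])),
        (3, pack([0.5, 0.5, 0.0])),
    ],
)

result = top_k(conn, [1.0, 0.0, 0.0], k=2)
print(result)
```

Because every row is scored, the result is exact; there is no index to tune or rebuild. For larger corpora, a vectorized library would replace the pure-Python loop, but the structure stays the same.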