22 May 2026

What is a Vector Database?

Traditional databases find data by exact values. Vector databases find data by *similarity*, enabling AI apps to search by meaning, context, and concept. Here's how they work.

If you've been following the AI boom, you've probably heard the term "vector database" thrown around. Semantic search, retrieval-augmented generation (RAG), recommendation engines, and image search are all commonly built on vector search under the hood. But what exactly is a vector database, and why do we need a whole new category of database for AI?

The Problem with Traditional Databases

Traditional databases, SQL or otherwise, are built around exact matching. You query for rows where status = 'active' or name LIKE 'John%'. They're extraordinarily good at that. But they fall apart when the question becomes fuzzy: find me documents that are about roughly the same thing as this sentence.

There's no SQL clause for "similar meaning."

What is a Vector, Anyway?

A vector is just a list of numbers that act as coordinates in a high-dimensional space. The key insight behind modern AI is that you can represent any piece of data (a sentence, an image, a product) as a vector in such a way that similar things end up close together in that space.

This is done by a model called an embedding model. Feed it a sentence, get back a vector of, say, 768 numbers. Feed it a semantically similar sentence, and you get a very similar vector. Feed it something completely unrelated, and the vectors are far apart.

For example:

"The cat sat on the mat" → [0.12, -0.87, 0.45, ...]
"A kitten rested on the rug" → [0.11, -0.84, 0.47, ...] ← nearby
"The GDP of France grew by 2%" → [-0.73, 0.21, -0.62, ...] ← far away

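To make "nearby" and "far away" concrete, here is a minimal sketch that scores the three example vectors with cosine similarity, one common closeness measure. It assumes toy 3-dimensional vectors built from only the leading components shown above; real embeddings have hundreds of dimensions, and the function name is illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: just the first three components of the examples above (an assumption).
cat = [0.12, -0.87, 0.45]
kitten = [0.11, -0.84, 0.47]
gdp = [-0.73, 0.21, -0.62]

cat_vs_kitten = cosine_similarity(cat, kitten)
cat_vs_gdp = cosine_similarity(cat, gdp)

print(f"cat vs kitten: {cat_vs_kitten:.3f}")  # close to 1: nearly the same direction
print(f"cat vs gdp:    {cat_vs_gdp:.3f}")     # negative: pointing roughly the opposite way
```
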
What a Vector Database Does

A vector database stores these embedding vectors alongside your original data, and its core job is to answer one question efficiently: given a query vector, which stored vectors are closest to it?

This is nearest-neighbor search, and doing it quickly is non-trivial. A naive approach, comparing your query against every single stored vector, works fine for thousands of records but becomes impractically slow at millions or billions. Vector databases use Approximate Nearest Neighbor (ANN) indexes, such as HNSW (Hierarchical Navigable Small World graphs) or IVF (inverted file) indexes, which trade a small amount of accuracy for search that stays fast even at scale.
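
For reference, the naive approach described above can be sketched in a few lines. This is exact brute-force search rather than an ANN index: it scores every stored vector, so the cost grows linearly with the collection. The function name, document ids, and toy vectors are assumptions made for illustration.

```python
import heapq
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector and keep the k highest scores."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)

# Hypothetical collection of id -> toy 3-dimensional vector.
stored = {
    "cat": [0.12, -0.87, 0.45],
    "kitten": [0.11, -0.84, 0.47],
    "gdp": [-0.73, 0.21, -0.62],
}

top = brute_force_top_k([0.10, -0.85, 0.46], stored, k=2)
top_ids = [doc_id for _, doc_id in top]
print(top_ids)  # the two animal sentences, in order of similarity
```

An index like HNSW exists to avoid this full scan: it visits only a small fraction of the stored vectors, at the cost of occasionally missing a true nearest neighbor.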

A typical query flow looks like this:

You have a question: "How do I reset my password?"
Your app embeds it into a vector using an embedding model
The vector database finds the top-K most similar vectors in your collection
You retrieve the original documents those vectors represent
You pass those documents to an LLM as context

This pattern is the backbone of RAG (Retrieval-Augmented Generation), the technique that lets LLMs answer questions about your data without retraining.
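
The five steps above can be sketched end to end. Everything here is a stand-in: `toy_embed` is a hypothetical keyword counter, not a real embedding model (it does not capture meaning), the "database" is a plain dictionary, and the final LLM call is left out so the sketch stops at building the prompt.

```python
import math
import re

# Hypothetical vocabulary for the toy embedding below.
VOCAB = ["reset", "password", "settings", "invoices", "emailed", "month"]

def toy_embed(text):
    """Stand-in for an embedding model: counts vocabulary words. Not semantic."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat all-zero vectors as unrelated

documents = {
    "reset": "To reset your password, open Settings and choose Reset Password.",
    "billing": "Invoices are emailed on the first day of each month.",
}

# Index time: embed each document once and store the vector under its id.
index = {doc_id: toy_embed(text) for doc_id, text in documents.items()}

def retrieve(question, k=1):
    """Steps 2-3: embed the question, then rank stored vectors by similarity."""
    query_vec = toy_embed(question)
    ranked = sorted(
        index,
        key=lambda doc_id: cosine_similarity(query_vec, index[doc_id]),
        reverse=True,
    )
    return ranked[:k]

question = "How do I reset my password?"                # step 1
top_ids = retrieve(question)                            # steps 2-3
context = "\n".join(documents[d] for d in top_ids)      # step 4
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"  # step 5
print(prompt)
```

In a real system, `toy_embed` would be replaced by a call to an embedding model and the dictionary scan by a vector database query; the shape of the flow stays the same.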

Vector Search vs. Full-Text Search

It's worth distinguishing vector search from full-text search (FTS), since they're often confused:

Full-Text Search

Matches keywords and their word forms ("car" → "cars")
Very fast, minimal setup
Language-specific; misses synonyms and related concepts

Vector Search

Matches by meaning ("car" → "automobile", "vehicle")
Fast at scale with ANN indexing
Multilingual-friendly; requires an embedding model

Neither is strictly better. Hybrid search, which combines both, is often the most robust approach for production systems.
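
One common way to combine the two result lists is reciprocal rank fusion (RRF), sketched below. This is one option among several, not necessarily what any particular product does. The document ids and the two input rankings are made up for illustration, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for the same query from each search method.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # full-text search order
vector_hits = ["doc_b", "doc_d", "doc_a"]   # vector search order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF only needs rank positions, so it sidesteps the problem that keyword scores and vector similarities live on different scales. Documents that both methods rank highly rise to the top.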

Do You Always Need a Dedicated Vector Database?

Not necessarily. The landscape breaks down roughly like this:

Pure vector databases (Pinecone, Weaviate, Qdrant, Chroma): built from the ground up for vector workloads, rich filtering, cloud-native scaling.
Vector extensions (pgvector for PostgreSQL, sqlite-vec for SQLite): add vector search to a database you already run, which is great if you don't want another system.
Embedded/local (HNSW libraries, FAISS): run entirely in-process, no server needed, ideal for desktop apps or small-scale use.

For most early-stage projects, an extension or embedded library is sufficient. You reach for a dedicated vector database when you need cloud scale, real-time updates across many nodes, or advanced filtering at high vector counts.

When Should You Use One?

Vector databases shine when your application needs one of the following:

Semantic search: search documents, emails, notes, or code by meaning
RAG: give an LLM access to a private knowledge base
Recommendations: suggest similar products, articles, or songs
Deduplication: detect near-duplicate content at scale
Multimodal search: search images or audio using text queries

If your search problem can be solved with exact keyword matching and simple filters, a traditional database is simpler and faster. But the moment you need "find me things like this," a vector database is the right tool.

The Bigger Picture

Vector databases are one of the foundational primitives of the current AI application stack, sitting alongside embedding models and LLMs. They're what allow AI systems to have long-term memory, to search over private data, and to ground their responses in real information rather than hallucinating.

Understanding how they work and when to use them is quickly becoming table stakes for anyone building with AI.

If you want to see vector search in action on your own files (privately, on-device, no cloud required), that's exactly what ThinkableSpace is built for.
