Data July 8, 2023 3 min read

Vector databases: what changed and why it matters for business

Why vector databases became a topic alongside the LLM wave, and what this actually means for companies working with internal documents.

Over the past few months the term "RAG" has been appearing more and more in AI conversations, usually next to "vector database". To many people it sounds like one more technical term with no practical meaning. But underneath it sits a concrete architectural problem that almost any company faces when it tries to apply LLMs to its own documents.

Let me explain the substance without the mathematics.

Why ordinary search does not work for LLMs

When an employee asks a question of an internal system, they expect an answer by meaning, not by keyword match. "What is our travel policy for regions with special conditions?" is a question that may not share a single word with the document that answers it.

Classic full-text search looks for word overlap. If the regulation says "elevated risk zones" and the employee asks about "special conditions", the system finds nothing, or returns something irrelevant.
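To make the failure concrete, here is a minimal sketch of word-overlap matching. The `keyword_overlap` helper, the stop-word list, and the sample regulation sentence are all hypothetical. Real full-text engines add stemming and ranking, but they still depend on shared words.

```python
import re

# A tiny stop-word list (assumption) so that words like "for" do not count as matches.
STOP_WORDS = {"the", "a", "is", "for", "in", "with", "our", "what", "to", "of", "by"}

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_overlap(query: str, document: str) -> set[str]:
    """Return the meaningful words shared by the query and the document."""
    return (tokenize(query) & tokenize(document)) - STOP_WORDS

query = "What is our travel policy for regions with special conditions?"
document = "Business trips to elevated risk zones require approval by the security office."

# Prints set(): there is no shared word for a keyword search to match on.
print(keyword_overlap(query, document))
```

The two sentences are about the same thing, yet they have no words in common, so a purely word-based match has nothing to rank.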

Vector search works differently. Each text fragment is turned into a numerical vector that encodes meaning. Texts that are close in meaning end up close together in that space. Search becomes a search by meaning, not by words.
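For readers who want to see what "close together" means, here is a small sketch using cosine similarity, one common way to compare vectors. The three-dimensional vectors are made up by hand for illustration. A real embedding model would produce them automatically, with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made stand-ins for embeddings (assumption: not produced by any real model).
vectors = {
    "regions with special conditions": [0.9, 0.1, 0.2],
    "elevated risk zones": [0.8, 0.2, 0.3],
    "office lunch menu": [0.1, 0.9, 0.1],
}

query_vector = vectors["regions with special conditions"]
for text, vector in vectors.items():
    print(f"{cosine_similarity(query_vector, vector):.2f}  {text}")
```

With these toy numbers, "elevated risk zones" scores far higher against the query than "office lunch menu", even though neither shares a word with it.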

What RAG is and why you need a vector database

RAG (retrieval-augmented generation) is a pattern in which the LLM does not try to answer from memory. It first retrieves relevant fragments from a knowledge base and then formulates the answer based on them.

The pattern is straightforward: a user asks a question, the system searches a vector database for matching documents, passes them to the model as context, and the model answers based on that context.
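The flow above can be sketched in a few lines. This is an outline under stated assumptions, not a production implementation: `answer_with_rag`, `fake_search`, and `fake_generate` are hypothetical names, and the two stand-in functions take the place of a real vector database query and a real LLM call.

```python
from typing import Callable

def answer_with_rag(
    question: str,
    search: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve relevant fragments, then ask the model to answer from them."""
    fragments = search(question, top_k)
    context = "\n\n".join(fragments)
    prompt = (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)

# Stand-ins so the sketch runs without any external service (assumptions).
def fake_search(question: str, top_k: int) -> list[str]:
    knowledge_base = ["Trips to elevated risk zones require approval from the security office."]
    return knowledge_base[:top_k]

def fake_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"

print(answer_with_rag(
    "What is our travel policy for regions with special conditions?",
    fake_search,
    fake_generate,
))
```

Passing `search` and `generate` in as parameters keeps the pattern independent of any particular database or model vendor.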

This mitigates the "hallucination" problem, where an LLM answers confidently without having real data to work from. In a RAG architecture the model answers from specific documents, not from whatever accumulated in its weights during training on internet text. It does not remove the problem entirely: if retrieval returns the wrong fragments, the answer will still be wrong.

The vector database here is a store optimised specifically for holding vectors and searching for nearest neighbours. It does not replace an ordinary database. It complements it where semantic search is needed.
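As a rough illustration of that division of labour, here is a toy in-memory store. `TinyVectorStore` and the record ids are hypothetical, and the search is a brute-force scan. Real vector databases typically use approximate nearest-neighbour indexes to stay fast at scale.

```python
import math
from dataclasses import dataclass, field

@dataclass
class TinyVectorStore:
    """Brute-force nearest-neighbour search over (record_id, vector) pairs."""
    items: list[tuple[str, list[float]]] = field(default_factory=list)

    def add(self, record_id: str, vector: list[float]) -> None:
        # Only the id is kept next to the vector; the document itself stays in
        # the ordinary database or file store and is looked up by this id.
        self.items.append((record_id, vector))

    def search(self, query: list[float], k: int = 2) -> list[tuple[str, float]]:
        """Return the k closest record ids with their Euclidean distances."""
        scored = [(record_id, math.dist(query, vector)) for record_id, vector in self.items]
        return sorted(scored, key=lambda pair: pair[1])[:k]

store = TinyVectorStore()
store.add("policy-17", [0.8, 0.2, 0.3])
store.add("menu-03", [0.1, 0.9, 0.1])
store.add("policy-04", [0.7, 0.3, 0.2])

# The two policy records come back first; the menu record is far away.
print(store.search([0.9, 0.1, 0.2], k=2))
```

The store returns ids and distances only. Fetching the text, checking access rights, and everything else remain the job of the systems that already hold the documents.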

What you actually need before the vector database

Before thinking about the technology, it is worth answering a few questions about the data.

Which documents will make up the knowledge base? Where are they stored now and in what format? Who is responsible for keeping them current?

These are not technical questions. They are questions about how the company manages its knowledge. If documents are scattered across dozens of places, some are outdated, and nobody owns them, the vector database will be indexing chaos. Once again, it comes back to data readiness for AI.

The second issue is chunking. Vector search works with text fragments of a certain size, not entire documents. How to cut a 200-page regulation into chunks where each piece is coherent and self-contained is a separate problem with no universal answer.
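One simple starting point, sketched below, is to pack whole paragraphs into chunks up to a size limit. This is only one of many possible strategies: `chunk_by_paragraph` and the 500-character default are assumptions for illustration, and real documents often need splitting by headings or sections instead.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole paragraphs into chunks of roughly max_chars.

    A paragraph is never split, so a single paragraph longer than max_chars
    becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # The next paragraph does not fit: close the chunk and start a new one.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# A made-up five-paragraph "regulation" for demonstration.
regulation = "\n\n".join(f"Section {n}. " + "Rule text. " * 10 for n in range(1, 6))
for index, chunk in enumerate(chunk_by_paragraph(regulation, max_chars=300)):
    print(index, len(chunk), chunk[:12])
```

Keeping paragraphs intact avoids cutting a sentence in half, but it does nothing for a rule whose exceptions sit three pages later. That is where the "no universal answer" part begins.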

A practical filter

Before moving to architecture, I ask a few questions:

Is there a specific internal-document search task that is being handled poorly or not at all right now?
Are the documents structured enough and current enough to be the basis of answers?
Is there someone on the team who will own the index - updating it when documents change?
Do we understand how we will check the quality of the system's answers?
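For the last question, one modest way to start is a small set of real employee questions paired with the document that should answer each one, and a check of how often search finds it. The sketch below assumes exactly that. `retrieval_hit_rate`, the test cases, and the stub search are hypothetical, and this measures retrieval only, not the quality of the generated answer.

```python
from typing import Callable

def retrieval_hit_rate(
    test_cases: list[tuple[str, str]],
    search: Callable[[str, int], list[str]],
    k: int = 3,
) -> float:
    """Share of questions whose expected document id appears in the top k results."""
    if not test_cases:
        return 0.0
    hits = sum(
        1 for question, expected_id in test_cases
        if expected_id in search(question, k)
    )
    return hits / len(test_cases)

# Stub search standing in for a real vector database query (assumption).
def stub_search(question: str, k: int) -> list[str]:
    canned = {
        "travel to special regions": ["policy-17", "policy-04"],
        "expense limits": ["menu-03"],
    }
    return canned.get(question, [])[:k]

cases = [
    ("travel to special regions", "policy-17"),
    ("expense limits", "policy-09"),
]
print(retrieval_hit_rate(cases, stub_search, k=3))  # 0.5: one of two questions found its document
```

Even twenty such pairs, written by the people who actually ask the questions, give a baseline to compare against when the chunking or the index changes.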

If the answers exist, this is an architecture conversation. If not, the first step is to bring order to the documents.

Vector databases are now an accessible and mature tool. But a tool only works when you know exactly what it is supposed to do.

Contact

If this resonated, write to me. I reply personally.