Skip to main content

Design an In-Memory Vector Search Engine (Vector Database-lite)

Problem Statement

Design an in-memory high-dimensional Vector Search Engine (similar to the retrieval cores of Pinecone or Milvus). The engine must store vector embeddings (arrays of floating-point numbers representing semantic text meaning), support registering custom similarity distance algorithms (like Cosine Similarity or Euclidean Distance), and execute K-Nearest Neighbors (K-NN) query searches using priority queues to retrieve the Top-K matching documents with their corresponding similarity scores.

Asked In Companies
Pinecone OpenAI Google AWS

Design Decisions & Patterns Used

Modern generative AI systems use vector databases to implement Retrieval-Augmented Generation (RAG) and long-term memory. When an LLM needs context, we convert the query into a vector embedding and search our database for the closest match. To design this in Java, we need a flat structure of vector entries, a similarity metric interface to calculate scores, and a priority queue to filter and sort the highest-scoring records efficiently.

We will utilize the following Design Patterns:

  • Strategy Pattern: Defining interchangeable algorithms (strategies) to calculate vector distance and similarity scores (e.g., Cosine Similarity, Dot Product, and Euclidean Distance).
  • Iterator Pattern: Standardizing list traversals during similarity calculations across all database records.

Functional Requirements

  • Register vector records containing unique IDs, float embedding arrays, and plain text metadata.
  • Support different similarity distance metrics: Cosine Similarity and Dot Product.
  • Execute K-Nearest Neighbors (K-NN) searches to retrieve the Top-K closest vector records for a query vector.
  • Sort results in descending order of similarity, returning both the records and their calculated similarity scores.
  • Ensure thread safety during concurrent vector write insertions and queries.

Objects Required

  • VectorRecord (Data class representing a database row)
  • SimilarityStrategy (Interface defining vector math calculation contracts)
  • CosineSimilarity, DotProduct (Concrete math calculation strategies)
  • SearchResult (Value wrapper pairing vector records with their calculated scores)
  • VectorDatabase (Core database engine executing additions and K-NN searches)

VectorRecord Class

The VectorRecord class represents a database row. It stores a unique ID, a high-dimensional float array (embedding), and related text metadata.


public class VectorRecord {
    private final String id;
    private final float[] embedding;
    private final String metadata;

    public VectorRecord(String id, float[] embedding, String metadata) {
        this.id = id;
        this.embedding = embedding.clone(); // Clone to prevent external mutations
        this.metadata = metadata;
    }

    public String getId() { return id; }
    public float[] getEmbedding() { return embedding; }
    public String getMetadata() { return metadata; }
}

The constructor clones the input float array to enforce immutability, preventing other threads from modifying the vector coordinates after registration.


SimilarityStrategy Interface & Implementations

We apply the **Strategy Pattern** to keep vector similarity calculations pluggable. This allows the system to swap math algorithms easily depending on whether the embeddings are normalized.


public interface SimilarityStrategy {
    double calculate(float[] v1, float[] v2);
}

Let's implement the concrete similarity strategies: CosineSimilarity and DotProduct.


public class CosineSimilarity implements SimilarityStrategy {
    @Override
    public double calculate(float[] v1, float[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException("Vector dimension mismatch: " + v1.length + " vs " + v2.length);
        }

        double dotProduct = 0.0;
        double normA = 0.0;
        double normB = 0.0;

        for (int i = 0; i < v1.length; i++) {
            dotProduct += v1[i] * v2[i];
            normA += v1[i] * v1[i];
            normB += v2[i] * v2[i];
        }

        if (normA == 0.0 || normB == 0.0) return 0.0; // Avoid division by zero
        return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}

The CosineSimilarity class evaluates the angular similarity between vectors. It calculates the dot product and divides it by the product of their magnitudes (norms), returning a value between -1.0 and 1.0.


public class DotProduct implements SimilarityStrategy {
    @Override
    public double calculate(float[] v1, float[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException("Vector dimension mismatch: " + v1.length + " vs " + v2.length);
        }

        double product = 0.0;
        for (int i = 0; i < v1.length; i++) {
            product += v1[i] * v2[i];
        }
        return product;
    }
}

The DotProduct strategy calculates the scalar product of two vectors, which is ideal when working with pre-normalized vector embeddings.


SearchResult Class

The SearchResult class is a helper class that pairs a matching VectorRecord with its calculated similarity score, implementing the Comparable interface to support sorting in priority queues.


public class SearchResult implements Comparable<SearchResult> {
    private final VectorRecord record;
    private final double score;

    public SearchResult(VectorRecord record, double score) {
        this.record = record;
        this.score = score;
    }

    public VectorRecord getRecord() { return record; }
    public double getScore() { return score; }

    @Override
    public int compareTo(SearchResult other) {
        // Sort in descending order of similarity score (highest score first)
        return Double.compare(other.score, this.score);
    }
}

The overridden compareTo() method sorts search results by score in descending order, ensuring that high-similarity items bubble to the top of priority queues.


VectorDatabase Class

The VectorDatabase class manages the vector records and executes K-NN search queries using a priority queue.


import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.CopyOnWriteArrayList;

public class VectorDatabase {
    private final List<VectorRecord> registry;

    public VectorDatabase() {
        this.registry = new CopyOnWriteArrayList<>();
    }

    public void addRecord(VectorRecord record) {
        registry.add(record);
        System.out.println("Registered Vector Record: '" + record.getId() + "' - Dimension: " + record.getEmbedding().length);
    }

    public List<SearchResult> query(float[] queryVector, int k, SimilarityStrategy strategy) {
        if (registry.isEmpty()) {
            return new ArrayList<>();
        }

        // Priority Queue to store and sort results (highest similarity score first)
        PriorityQueue<SearchResult> queue = new PriorityQueue<>();

        // Calculate similarity score for all registered records
        for (VectorRecord record : registry) {
            double score = strategy.calculate(queryVector, record.getEmbedding());
            queue.add(new SearchResult(record, score));
        }

        // Extract the Top-K matching results
        List<SearchResult> results = new ArrayList<>();
        int limit = Math.min(k, queue.size());
        for (int i = 0; i < limit; i++) {
            results.add(queue.poll());
        }

        return results;
    }
}

Let's break down the logic of every method in the VectorDatabase class:

  • VectorDatabase(): The constructor initializes the vector registry using a thread-safe CopyOnWriteArrayList to support concurrent writes and queries.
  • addRecord(record): Adds a vector record to the registry list.
  • query(queryVector, k, strategy): Executes a K-NN query search. It iterates through the registered records, calculates their similarity scores using the provided strategy, adds the results to a priority queue, and extracts and returns the top k matching items.

Main Driver Class

This class tests our in-memory vector database by registering documents with 3-dimensional embeddings and querying them using Cosine Similarity.


import java.util.List;

public class Main {
    public static void main(String[] args) {
        VectorDatabase db = new VectorDatabase();

        System.out.println("==========================================");
        System.out.println("Scenario 1: Registering Vector Records");
        System.out.println("==========================================");

        // Register documents with mock semantic embeddings
        db.addRecord(new VectorRecord("doc-1", new float[]{1.0f, 0.0f, 0.0f}, "Semantic: Apple iPhone"));
        db.addRecord(new VectorRecord("doc-2", new float[]{0.0f, 1.0f, 0.0f}, "Semantic: Banana Fruit"));
        db.addRecord(new VectorRecord("doc-3", new float[]{0.8f, 0.1f, 0.0f}, "Semantic: Apple iPad"));

        System.out.println("\n==========================================");
        System.out.println("Scenario 2: Querying the Vector Database (Top-2)");
        System.out.println("==========================================");

        // Query vector represents a device close to "Apple iPhone" (e.g., [0.9f, 0.0f, 0.0f])
        float[] queryVector = new float[]{0.9f, 0.0f, 0.0f};
        SimilarityStrategy cosine = new CosineSimilarity();

        System.out.println("Querying database with vector: [0.9, 0.0, 0.0]...");
        List<SearchResult> matches = db.query(queryVector, 2, cosine);

        for (int i = 0; i < matches.size(); i++) {
            SearchResult match = matches.get(i);
            System.out.println((i + 1) + ". Match: '" + match.getRecord().getId() + 
                    "' | Score: " + String.format("%.4f", match.getScore()) + 
                    " | Content: " + match.getRecord().getMetadata());
        }
    }
}

The main() driver configures the vector database, registers mock semantic documents, queries the database with a search vector, and prints the top 2 matching documents with their similarity scores.


Also See

Comments

Popular posts from this blog

Designing a Parking Lot - Low Level Design

Problem Statement Design a parking lot that can handle vehicles entering and leaving while managing parking across multiple floors. Each vehicle should be assigned a suitable parking spot based on its type, and the spot should be freed once the vehicle exits. The design should also support generating a ticket at entry and optionally calculating the parking fee based on the duration of stay. Asked In Companies Amazon Google Microsoft Uber Walmart Flipkart Meta PayPal Oracle Salesforce Adobe Apple Intuit LinkedIn Atlassian Functional Requirements The design should support multiple vehicle types such as bikes, cars, and trucks A vehicle must be assigned a parking spot compatible with its type A parking spot cannot be assigned to more than one vehicle at a time The parking lot should support multiple levels (floors) The design should search and allocate an availa...

Most Frequently Asked Low Level Design(LLD) Interview Questions

Below are the curated list of most commonly asked Low Level Design (LLD) interview problems. Each problem includes a short description and a link to the complete solution with code and class diagrams. Design Parking Lot System The system should handle parking for different vehicle types such as bikes, cars, and trucks. It should manage slot allocation, availability tracking, and entry/exit flow. The design also ensures efficient usage of parking space under varying load conditions. View Solution Design Elevator / Lift System The system should support multiple elevators operating across floors with request handling logic. It focuses on scheduling algorithms to minimize wait time and optimize movement. It also manages direction control and concurrent floor requests. View Solution Design Movie Ticket Booking System The system should allow users to browse movies, select shows, and book seats. It handles seat ...

Software Design Patterns for LLD Interviews: A Complete Guide

Software Design Patterns for LLD Interviews: A Complete Guide In Software Development Engineer (SDE) interviews—especially for mid-level and senior roles—low-level design (LLD) rounds assess your ability to write clean, reusable, maintainable, and extensible code. The foundation of resolving these architectural challenges lies in the standard Gang of Four (GoF) Design Patterns. Rather than memorizing theoretical definitions, interviewers expect you to apply these patterns to real-world scenarios, identifying the trade-offs of each. Below is a comprehensive guide to the 12 most frequently asked design patterns in LLD interviews, categorized by their classification (Creational, Structural, and Behavioral). Each pattern contains a concrete, real-world Java implementation and a detailed breakdown of design decisions. Creational Design Patterns Creational design patterns deal with object creation mechanisms. They abstract the instantiation process, making a system independent of how...