Skip to main content

Design a Excel Workbook (Spreadsheet with Formula)

Problem Statement

Design a lightweight, in-memory spreadsheet evaluation engine (similar to the core formula evaluation system in Microsoft Excel). The spreadsheet must manage a dynamic grid of cells addressable by alphanumeric coordinates (such as A1, B2). Cells can contain either static numeric values or formulas that reference other cells (e.g., =A1+B1). When a cell's value is modified, all other cells that reference it must recalculate their values automatically to ensure the spreadsheet remains consistently up to date.

Asked In Companies
Microsoft Google Apple

Design Decisions & Patterns Used

Building a spreadsheet engine requires a clean way to manage relationships between cells. If cell C1 contains a formula pointing to A1 and B1, any change to A1 or B1 must trigger an update in C1. To handle this interaction without tightly coupling the cells together, we apply the Observer Pattern.

In this design:

  • A cell acts as a Subject (Publisher) to the cells that reference it. It maintains a list of observer cells and notifies them whenever its value changes.
  • A cell acts as an Observer (Subscriber) to the cells it references in its formula. It listens for value changes in those cells and triggers its own recalculation when notified.
  • We apply the Composite Pattern by designing the Cell class to hide the internal difference between a formula cell and a constant value cell. The external workbook invokes getNumericValue() uniformly, regardless of how the cell is constructed.

Functional Requirements

  • Maintain a dynamic grid of cells addressable by coordinate strings (e.g., "A1", "B5").
  • Support assigning raw inputs to cells, including constant numbers (e.g., "15.5") and formulas (e.g., "=A1+B1").
  • Identify and extract cell coordinates referenced inside formulas dynamically.
  • Recalculate cells automatically and propagate the changes downstream to all observer cells whenever a referenced cell's value is modified.

Objects Required

  • Cell (The core domain object representing a single spreadsheet cell, managing its values, formula references, and observer list)
  • Workbook (The coordinator class acting as the spreadsheet grid, parsing raw formulas, matching cell relationships, and managing updates)

Cell Class (Subject & Observer)

The Cell class is the building block of our spreadsheet. It represents a single cell in the grid and holds references to both the cells it observes (referenced inputs) and the cells that observe it (observer outputs).


import java.util.HashSet;
import java.util.Set;

public class Cell {
    private final String id;
    private String rawValue;
    private double numericValue;
    
    // The cells this cell references in its formula (Inputs)
    private final Set<Cell> referencedCells;
    
    // The cells that reference this cell (Outputs/Observers)
    private final Set<Cell> observers;

    public Cell(String id) {
        this.id = id;
        this.rawValue = "";
        this.numericValue = 0.0;
        this.referencedCells = new HashSet<>();
        this.observers = new HashSet<>();
    }

    public synchronized void addObserver(Cell observer) {
        observers.add(observer);
    }

    public synchronized void removeObserver(Cell observer) {
        observers.remove(observer);
    }

    public synchronized void addReference(Cell referenceCell) {
        referencedCells.add(referenceCell);
        referenceCell.addObserver(this);
    }

    public synchronized void clearReferences() {
        for (Cell ref : referencedCells) {
            ref.removeObserver(this);
        }
        referencedCells.clear();
    }

    public synchronized void evaluate() {
        if (!rawValue.startsWith("=")) {
            try {
                this.numericValue = rawValue.isEmpty() ? 0.0 : Double.parseDouble(rawValue);
            } catch (NumberFormatException e) {
                this.numericValue = 0.0; // Fallback to 0.0 for non-numeric plain text
            }
        } else {
            // Formula Evaluation: Sum the values of all referenced cells
            double sum = 0.0;
            for (Cell ref : referencedCells) {
                sum += ref.getNumericValue();
            }
            this.numericValue = sum;
        }
    }

    public String getId() { return id; }
    
    public synchronized String getRawValue() { return rawValue; }
    public synchronized void setRawValue(String rawValue) { this.rawValue = rawValue; }

    public synchronized double getNumericValue() { return numericValue; }
    
    public synchronized Set<Cell> getReferencedCells() { return new HashSet<>(referencedCells); }
    public synchronized Set<Cell> getObservers() { return new HashSet<>(observers); }
}

Let's break down the logic of every method in the Cell class:

  • Cell(id): The constructor initializes the cell with its unique identifier. It sets up empty HashSet collections for references and observers. Working with sets ensures we don't accidentally register duplicate observer relationships.
  • addObserver(observer): Adds another cell to this cell's observer set. When this cell's value changes, this observer will be notified to recalculate. This method is synchronized to ensure thread safety.
  • removeObserver(observer): Removes a cell from the observer set. This is used when a dependent cell is updated with a new formula that no longer references this cell.
  • addReference(referenceCell): Links this cell to a referenced cell. It adds the target cell to our local referencedCells set and registers this cell as an observer of the target cell.
  • clearReferences(): Clears all existing reference associations. It iterates through the referenced cells, removes this cell from their observer sets, and clears the local reference set. This is critical when changing a cell's formula to prevent memory leaks and outdated updates.
  • evaluate(): Evaluates the cell's value. If the raw value is a constant, it parses it into a double. If it is a formula (starts with =), it sums the values of all referenced cells.

Workbook Class

The Workbook class coordinates updates, parses formulas using regular expressions, and propagates changes recursively down the dependency tree.


import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Workbook {
    private final Map<String, Cell> grid;
    private static final Pattern CELL_REF_PATTERN = Pattern.compile("[A-Z]+\\d+");

    public Workbook() {
        this.grid = new ConcurrentHashMap<>();
    }

    public synchronized Cell getCell(String id) {
        String normalizedId = id.toUpperCase();
        return grid.computeIfAbsent(normalizedId, Cell::new);
    }

    public synchronized void setCellValue(String id, String rawValue) {
        Cell cell = getCell(id);
        
        // Clear references associated with the old formula/value
        cell.clearReferences();
        cell.setRawValue(rawValue);

        if (rawValue.startsWith("=")) {
            // Extract referenced cell coordinates (e.g., =A1+B1 -> A1, B1)
            Matcher matcher = CELL_REF_PATTERN.matcher(rawValue.toUpperCase());
            while (matcher.find()) {
                String refId = matcher.group();
                // Avoid registering self-references
                if (!refId.equals(cell.getId())) {
                    cell.addReference(getCell(refId));
                }
            }
        }

        // Recalculate values and propagate changes downstream
        recalculate(cell);
    }

    private void recalculate(Cell cell) {
        cell.evaluate();
        System.out.println("Cell " + cell.getId() + " recalculated. New Value: " + cell.getNumericValue());
        
        // Propagate updates recursively to all observer cells
        for (Cell observer : cell.getObservers()) {
            recalculate(observer);
        }
    }
}

Let's break down the logic of every method in the Workbook class:

  • Workbook(): The constructor initializes the grid map using a thread-safe ConcurrentHashMap to support concurrent updates.
  • getCell(id): Retrieves the cell with the specified ID. It normalizes the ID to uppercase to prevent case-sensitivity issues, and initializes the cell lazily if it does not already exist in the grid.
  • setCellValue(id, rawValue): Updates a cell's value. It clears the cell's old references, sets the new raw value, parses formula structures using regex to link referenced cells, and triggers the recalculation process.
  • recalculate(cell): Evaluates the cell's value and recursively triggers updates on all observer cells down the tree. This ensures changes propagate immediately throughout the sheet.

Main Driver Class

This class tests our spreadsheet engine. It sets constant values, defines formulas, and verifies update propagation.


public class Main {
    public static void main(String[] args) {
        Workbook workbook = new Workbook();

        System.out.println("==========================================");
        System.out.println("Scenario 1: Setting Constant Values and Formulas");
        System.out.println("==========================================");

        workbook.setCellValue("A1", "10");
        workbook.setCellValue("B1", "20");
        
        // Set formula: C1 = A1 + B1 (Expected: 30)
        System.out.println("\nSetting formula C1 = A1 + B1:");
        workbook.setCellValue("C1", "=A1+B1");

        // Set formula: D1 = C1 (Expected: 30)
        System.out.println("\nSetting formula D1 = C1:");
        workbook.setCellValue("D1", "=C1");

        System.out.println("\n--- Scenario 2: Modifying A1 (Updates should propagate to C1 and D1) ---");
        workbook.setCellValue("A1", "30"); // A1 becomes 30, C1 becomes 50, D1 becomes 50

        // Verify values
        System.out.println("\nVerify final C1 value: " + workbook.getCell("C1").getNumericValue());
        System.out.println("Verify final D1 value: " + workbook.getCell("D1").getNumericValue());
    }
}

The main() driver configures values, establishes formula chains, and modifies root cells to verify change propagation.


Also See

Comments

Popular posts from this blog

Designing a Parking Lot - Low Level Design

Problem Statement Design a parking lot that can handle vehicles entering and leaving while managing parking across multiple floors. Each vehicle should be assigned a suitable parking spot based on its type, and the spot should be freed once the vehicle exits. The design should also support generating a ticket at entry and optionally calculating the parking fee based on the duration of stay. Asked In Companies Amazon Google Microsoft Uber Walmart Flipkart Meta PayPal Oracle Salesforce Adobe Apple Intuit LinkedIn Atlassian Functional Requirements The design should support multiple vehicle types such as bikes, cars, and trucks A vehicle must be assigned a parking spot compatible with its type A parking spot cannot be assigned to more than one vehicle at a time The parking lot should support multiple levels (floors) The design should search and allocate an availa...

Most Frequently Asked Low Level Design(LLD) Interview Questions

Below are the curated list of most commonly asked Low Level Design (LLD) interview problems. Each problem includes a short description and a link to the complete solution with code and class diagrams. Design Parking Lot System The system should handle parking for different vehicle types such as bikes, cars, and trucks. It should manage slot allocation, availability tracking, and entry/exit flow. The design also ensures efficient usage of parking space under varying load conditions. View Solution Design Elevator / Lift System The system should support multiple elevators operating across floors with request handling logic. It focuses on scheduling algorithms to minimize wait time and optimize movement. It also manages direction control and concurrent floor requests. View Solution Design Movie Ticket Booking System The system should allow users to browse movies, select shows, and book seats. It handles seat ...

Software Design Patterns for LLD Interviews: A Complete Guide

Software Design Patterns for LLD Interviews: A Complete Guide In Software Development Engineer (SDE) interviews—especially for mid-level and senior roles—low-level design (LLD) rounds assess your ability to write clean, reusable, maintainable, and extensible code. The foundation of resolving these architectural challenges lies in the standard Gang of Four (GoF) Design Patterns. Rather than memorizing theoretical definitions, interviewers expect you to apply these patterns to real-world scenarios, identifying the trade-offs of each. Below is a comprehensive guide to the 12 most frequently asked design patterns in LLD interviews, categorized by their classification (Creational, Structural, and Behavioral). Each pattern contains a concrete, real-world Java implementation and a detailed breakdown of design decisions. Creational Design Patterns Creational design patterns deal with object creation mechanisms. They abstract the instantiation process, making a system independent of how...