
Design an AI Agent Tool Execution Coordinator

Problem Statement

Design an autonomous AI Agent Tool Execution Coordinator, similar to the orchestration cores of LangChain, Semantic Kernel, or the Claude Desktop agent runtime. The engine parses structured tool-call requests emitted by a Large Language Model (LLM), invokes the corresponding local Java methods by name through reflection, appends each execution output to the context history, and repeats the agentic loop until the model produces a final text answer. To avoid token-limit errors in long-running sessions, the engine must also prune the context window proactively, dropping the oldest messages while always preserving the system instructions at the head of the context.

Asked In Companies
OpenAI, Anthropic, Google, Microsoft

Design Decisions & Patterns Used

Modern agentic AI systems use dynamic tool execution to interact with external databases, APIs, and file systems. When an LLM emits a tool request, the agent loop must intercept it, run the tool, and feed the result back to the model. To design this in Java, we need a reflective tool registry, a memory manager that holds the context history, and a message-processing pipeline.

We will utilize the following Design Patterns:

  • Mediator Pattern: The AgentCoordinator mediates between the conversation log, the token limit, tool execution, and LLM responses, so these components never depend on each other directly.
  • Command Pattern: Each local method is wrapped in an interchangeable ToolCommand that is executed by name with a map of arguments supplied at runtime.
  • Chain of Responsibility: Each conversation message passes through a serial pipeline of processors (token counting, then the context-window check and pruning).

Functional Requirements

  • Dynamically register methods in any service class instance as tools using reflection and custom annotations.
  • Execute registered tools by resolving dynamic parameter maps and executing the underlying methods reflectively.
  • Enforce token size limitations, calculating message weights and pruning the oldest messages when context boundaries are breached.
  • Implement a System Prompt Preservation Strategy, guaranteeing that root instructions are never removed during pruning.
  • Manage iterative multi-turn loops to support nested tool calls before returning a final answer.

Objects Required

  • AgentTool, ToolParam (Custom annotations marking tool methods and parameter metadata)
  • ChatMessage (Value object capturing role, content, tool ID, and token size weight)
  • ConversationMemory (State manager holding conversation logs and pruning window slots)
  • ToolCommand (Interface abstracting tool operations)
  • ReflectiveToolCommand (Concrete implementation running methods via reflection)
  • ToolRegistry (Repository class loading and resolving tool execution mappings)
  • MessageProcessor (Pipeline filter interface processing message lifecycle boundaries)
  • AgentCoordinator (Core mediator coordinating agent iteration flows)

AgentTool & ToolParam Annotations

The AgentTool and ToolParam annotations let developers mark methods and parameters so that the registry can discover them through reflection when a service instance is registered.


import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface AgentTool {
    String name();
    String description();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
public @interface ToolParam {
    String name();
    String description() default "";
}

The @AgentTool annotation registers a method with the agent loop under the given name. The @ToolParam annotation pins down the argument key for each parameter: unless the code is compiled with the javac -parameters flag, parameter names are not kept in the class file and reflection reports synthetic names such as arg0 and arg1.
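To make the naming issue concrete, here is a minimal sketch of the key-resolution rule. The class ParamNameSketch, the nested @Named annotation, and the lookup() method are hypothetical stand-ins invented for this illustration; they are not part of the design above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Parameter;

class ParamNameSketch {
    // Hypothetical stand-in for @ToolParam, nested here so the sketch compiles on its own.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Named {
        String value();
    }

    // Sample tool method: the first parameter is annotated, the second is not.
    static String lookup(@Named("location") String loc, int days) {
        return loc + ":" + days;
    }

    // Prefer the annotation value; otherwise fall back to whatever name reflection reports.
    static String resolveKey(Parameter p) {
        Named ann = p.getAnnotation(Named.class);
        return ann != null ? ann.value() : p.getName();
    }

    // Fetches the parameters of lookup(), wrapping the checked reflection exception.
    static Parameter[] lookupParams() {
        try {
            return ParamNameSketch.class
                    .getDeclaredMethod("lookup", String.class, int.class)
                    .getParameters();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Parameter p : lookupParams()) {
            // isNamePresent() is true only when the class was compiled with -parameters.
            System.out.println(resolveKey(p) + " (source name available: " + p.isNamePresent() + ")");
        }
    }
}
```

The annotated parameter always resolves to "location", while the key of the unannotated one depends on how the class was compiled. That is why annotating every tool parameter is the safer default.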


ChatMessage Class

The ChatMessage class represents a single chat record inside the conversation history, cataloging the role (SYSTEM, USER, ASSISTANT, TOOL), the raw text, an optional tool call ID, and the token count.


public class ChatMessage {
    private final MessageRole role;
    private final String content;
    private final String toolCallId; // Used only for TOOL responses and ASSISTANT calls
    private int tokenCount;

    public ChatMessage(MessageRole role, String content) {
        this(role, content, null);
    }

    public ChatMessage(MessageRole role, String content, String toolCallId) {
        this.role = role;
        this.content = content;
        this.toolCallId = toolCallId;
        this.tokenCount = 0;
    }

    public MessageRole getRole() { return role; }
    public String getContent() { return content; }
    public String getToolCallId() { return toolCallId; }
    public int getTokenCount() { return tokenCount; }
    public void setTokenCount(int tokenCount) { this.tokenCount = tokenCount; }
}

public enum MessageRole {
    SYSTEM, USER, ASSISTANT, TOOL
}

The constructor assigns fields, initializing tokenCount to zero. This count is computed later in the pipeline before committing the message to the active window.


ConversationMemory Class

The ConversationMemory class holds the conversation logs. It computes active memory footprint weights and handles context-window pruning while preserving system prompts.


import java.util.ArrayList;
import java.util.List;

public class ConversationMemory {
    private final List<ChatMessage> history = new ArrayList<>();
    private final TokenCounter tokenCounter;

    public ConversationMemory(TokenCounter tokenCounter) {
        this.tokenCounter = tokenCounter;
    }

    public synchronized void addMessage(ChatMessage message) {
        if (message.getTokenCount() <= 0) {
            message.setTokenCount(tokenCounter.countTokens(message.getContent()));
        }
        history.add(message);
    }

    public synchronized List<ChatMessage> getMessages() {
        return new ArrayList<>(history);
    }

    public synchronized int getTotalTokenCount() {
        return history.stream().mapToInt(ChatMessage::getTokenCount).sum();
    }

    public synchronized void prune(int maxTokenLimit) {
        // The size() > 1 guard keeps at least one message, so a lone SYSTEM prompt is never removed.
        while (getTotalTokenCount() > maxTokenLimit && history.size() > 1) {
            // System Prompt Preservation Strategy: skip index 0 if it is SYSTEM,
            // otherwise prune the oldest message.
            int indexToPrune = (history.get(0).getRole() == MessageRole.SYSTEM) ? 1 : 0;

            ChatMessage removed = history.remove(indexToPrune);
            System.out.println("[Memory] Pruned old message: " + removed.getRole() +
                               " | Freed " + removed.getTokenCount() + " tokens.");
        }
    }
}

public interface TokenCounter {
    int countTokens(String text);
}

Let's break down the logic of every method in the ConversationMemory class:

  • addMessage(message): Adds a message to history, computing its token count with the pluggable counter if the count has not been set yet.
  • getMessages(): Returns a defensive copy of the history, so callers cannot mutate the internal list.
  • getTotalTokenCount(): Sums the token counts of all history entries to obtain the current footprint.
  • prune(maxTokenLimit): While the total exceeds the limit and more than one message remains, removes the oldest message, skipping a SYSTEM prompt at index 0.
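The pruning rule can be traced on a small worked example. The following is a minimal sketch, not the class above: PruneSketch and its Msg record are hypothetical simplifications (a role label plus a precomputed weight), and it keeps a running total instead of re-summing the list on every pass. It assumes Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.List;

class PruneSketch {
    // Hypothetical minimal message: a role label plus a precomputed token weight.
    record Msg(String role, int tokens) {}

    // Removes the oldest non-system messages until the total fits; returns the tokens freed.
    static int prune(List<Msg> history, int maxTokens) {
        int total = history.stream().mapToInt(Msg::tokens).sum();
        int freed = 0;
        while (total > maxTokens && history.size() > 1) {
            // Skip index 0 when it holds the system prompt.
            int victim = "SYSTEM".equals(history.get(0).role()) ? 1 : 0;
            Msg removed = history.remove(victim);
            total -= removed.tokens();
            freed += removed.tokens();
        }
        return freed;
    }

    // Builds a sample history with a total weight of 52.
    static List<Msg> sample() {
        List<Msg> h = new ArrayList<>();
        h.add(new Msg("SYSTEM", 11));
        h.add(new Msg("USER", 9));
        h.add(new Msg("ASSISTANT", 6));
        h.add(new Msg("TOOL", 7));
        h.add(new Msg("ASSISTANT", 19));
        return h;
    }

    public static void main(String[] args) {
        List<Msg> h = sample();
        int freed = prune(h, 40); // drops USER (9), then ASSISTANT (6): 52 -> 43 -> 37
        System.out.println("freed=" + freed + " remaining=" + h.size());
    }
}
```

In this trace the TOOL entry survives while the ASSISTANT entry that requested it is pruned. Some chat APIs reject a tool result that has no preceding tool request, so a production version may want to prune such messages as a group.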

ToolCommand Interface & ReflectiveToolCommand Class

We apply the Command Pattern to encapsulate tools. ReflectiveToolCommand maps the incoming argument map onto the target method's parameters and invokes the method reflectively.


import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Map;

public interface ToolCommand {
    String getName();
    String getDescription();
    String execute(Map<String, Object> args) throws Exception;
}

public class ReflectiveToolCommand implements ToolCommand {
    private final Object targetInstance;
    private final Method method;
    private final String name;
    private final String description;

    public ReflectiveToolCommand(Object targetInstance, Method method, String name, String description) {
        this.targetInstance = targetInstance;
        this.method = method;
        this.name = name;
        this.description = description;
        this.method.setAccessible(true);
    }

    @Override
    public String getName() { return name; }

    @Override
    public String getDescription() { return description; }

    @Override
    public String execute(Map<String, Object> args) throws Exception {
        Parameter[] parameters = method.getParameters();
        Object[] invokeArgs = new Object[parameters.length];

        for (int i = 0; i < parameters.length; i++) {
            Parameter param = parameters[i];
            ToolParam paramAnn = param.getAnnotation(ToolParam.class);
            String key = (paramAnn != null) ? paramAnn.name() : param.getName();

            Object rawValue = args.get(key);

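            if (rawValue != null) {
                Class<?> type = param.getType();
                // Numeric arguments may arrive as any Number subtype, so convert via Number.
                if (type == int.class || type == Integer.class) {
                    invokeArgs[i] = ((Number) rawValue).intValue();
                } else if (type == long.class || type == Long.class) {
                    invokeArgs[i] = ((Number) rawValue).longValue();
                } else if (type == double.class || type == Double.class) {
                    invokeArgs[i] = ((Number) rawValue).doubleValue();
                } else if (type == boolean.class || type == Boolean.class) {
                    invokeArgs[i] = Boolean.valueOf(rawValue.toString());
                } else {
                    // Reference types only: Class.cast() on a remaining primitive type
                    // (e.g. float.class) throws ClassCastException.
                    invokeArgs[i] = type.cast(rawValue);
                }
            } else {
                // Unspecified parameters map to null; invoke() rejects null for primitive parameters.
                invokeArgs[i] = null;
            }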
        }

        try {
            Object res = method.invoke(targetInstance, invokeArgs);
            return res != null ? res.toString() : "Success";
        } catch (InvocationTargetException e) {
            throw new Exception("Tool execution error: " + e.getCause().getMessage(), e.getCause());
        }
    }
}

Let's break down the logic of the methods in ReflectiveToolCommand:

  • execute(args): Inspects method signatures, maps argument keys using annotations, performs type conversions, and invokes the target method reflectively.
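The type-conversion step is where loosely typed arguments most often go wrong, for example when a number arrives as a string. The sketch below shows one possible stricter conversion helper. CoerceSketch and coerce() are hypothetical names, accepting numeric strings is an assumption about what the argument parser may deliver, and the code assumes Java 16 or later for pattern matching.

```java
class CoerceSketch {
    // Converts a loosely typed argument to the declared parameter type.
    static Object coerce(Object raw, Class<?> type) {
        if (raw == null) {
            if (type.isPrimitive()) {
                // Fail with a clear message instead of letting Method.invoke() reject the null.
                throw new IllegalArgumentException("Missing value for primitive parameter of type " + type);
            }
            return null;
        }
        if (type == String.class) return raw.toString();
        if (type == boolean.class || type == Boolean.class) return Boolean.parseBoolean(raw.toString());
        if (type == int.class || type == Integer.class) return toNumber(raw).intValue();
        if (type == long.class || type == Long.class) return toNumber(raw).longValue();
        if (type == double.class || type == Double.class) return toNumber(raw).doubleValue();
        return type.cast(raw); // any other type must already match
    }

    // Accepts numeric values and numeric strings such as "42".
    // Parsing via double loses precision for very large long values.
    private static Number toNumber(Object raw) {
        if (raw instanceof Number n) return n;
        return Double.valueOf(raw.toString().trim());
    }

    public static void main(String[] args) {
        System.out.println(coerce("42", int.class));      // 42
        System.out.println(coerce(3.0, long.class));      // 3
        System.out.println(coerce("true", boolean.class)); // true
    }
}
```

A helper like this could replace the inline if/else chain in execute(), keeping the invocation logic separate from the conversion rules.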

ToolRegistry Class

The ToolRegistry stores tool commands by name. Its registerToolsFromObject() method scans a service instance's class for annotated methods and builds the reflective command objects automatically.


import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

public class ToolRegistry {
    private final Map<String, ToolCommand> tools = new HashMap<>();

    public void registerTool(ToolCommand tool) {
        tools.put(tool.getName(), tool);
        System.out.println("Registered Tool Command: " + tool.getName());
    }

    public void registerToolsFromObject(Object serviceInstance) {
        Class<?> clazz = serviceInstance.getClass();
        for (Method method : clazz.getDeclaredMethods()) {
            if (method.isAnnotationPresent(AgentTool.class)) {
                AgentTool ann = method.getAnnotation(AgentTool.class);
                ToolCommand cmd = new ReflectiveToolCommand(serviceInstance, method, ann.name(), ann.description());
                tools.put(ann.name(), cmd);
                System.out.println("Reflectively Registered Tool: '" + ann.name() + "' from " + method.getName());
            }
        }
    }

    public ToolCommand getTool(String name) { return tools.get(name); }
    public boolean hasTool(String name) { return tools.containsKey(name); }
}

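The registerToolsFromObject() method checks the methods declared directly on the instance's class (getDeclaredMethods() does not return inherited methods) for the @AgentTool annotation and registers each match. Because registration uses Map.put(), a later tool with the same name replaces the earlier one. The coordinator can then look up and invoke the registered tools by name.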
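The registry above stores commands with tools.put(), so registering a second tool under an existing name silently replaces the first. The sketch below shows one way to fail fast on duplicates and to render the stored descriptions, for example for inclusion in a system prompt. StrictRegistrySketch is a hypothetical simplification that maps names to descriptions rather than to ToolCommand objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

class StrictRegistrySketch {
    // Tool name -> description, kept in registration order.
    private final Map<String, String> tools = new LinkedHashMap<>();

    // Rejects a duplicate name instead of overwriting the existing entry.
    void register(String name, String description) {
        Objects.requireNonNull(name, "name");
        Objects.requireNonNull(description, "description");
        if (tools.putIfAbsent(name, description) != null) {
            throw new IllegalStateException("Duplicate tool name: " + name);
        }
    }

    // Renders one line per tool in registration order.
    String describeAll() {
        StringBuilder sb = new StringBuilder();
        tools.forEach((name, desc) ->
                sb.append("- ").append(name).append(": ").append(desc).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        StrictRegistrySketch registry = new StrictRegistrySketch();
        registry.register("get_weather", "Query weather details for location.");
        registry.register("read_logs", "Query local system errors.");
        System.out.print(registry.describeAll());
    }
}
```

Whether a duplicate should throw, warn, or overwrite is a design choice. Throwing surfaces naming collisions between service classes at registration time rather than at call time.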


MessageProcessor Pipeline (Chain of Responsibility)

We apply the Chain of Responsibility pattern to process messages through sequential steps, separating token calculation from memory management.


public interface MessageProcessor {
    void setNext(MessageProcessor next);
    void process(ChatMessage message, ConversationMemory memory, int maxTokenLimit);
}

public abstract class BaseMessageProcessor implements MessageProcessor {
    protected MessageProcessor next;

    @Override
    public void setNext(MessageProcessor next) {
        this.next = next;
    }

    protected void executeNext(ChatMessage message, ConversationMemory memory, int maxTokenLimit) {
        if (next != null) {
            next.process(message, memory, maxTokenLimit);
        }
    }
}

public class TokenCalculationProcessor extends BaseMessageProcessor {
    private final TokenCounter tokenCounter;

    public TokenCalculationProcessor(TokenCounter tokenCounter) {
        this.tokenCounter = tokenCounter;
    }

    @Override
    public void process(ChatMessage message, ConversationMemory memory, int maxTokenLimit) {
        int count = tokenCounter.countTokens(message.getContent());
        message.setTokenCount(count);
        System.out.println("[Pipeline] Computed " + message.getRole() + " weight: " + count + " tokens.");
        executeNext(message, memory, maxTokenLimit);
    }
}

public class ContextPruningProcessor extends BaseMessageProcessor {
    @Override
    public void process(ChatMessage message, ConversationMemory memory, int maxTokenLimit) {
        memory.addMessage(message);
        int total = memory.getTotalTokenCount();
        System.out.println("[Pipeline] Added message to memory. Footprint: " + total + "/" + maxTokenLimit);
        if (total > maxTokenLimit) {
            System.out.println("[Pipeline] Footprint limit exceeded. Pruning...");
            memory.prune(maxTokenLimit);
        }
        executeNext(message, memory, maxTokenLimit);
    }
}

The pipeline processes messages by routing them through TokenCalculationProcessor to compute weight and ContextPruningProcessor to add the message to history and prune if necessary.
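Because each processor forwards to its successor, the order in which links are wired determines the order of execution. The sketch below isolates that wiring with a small helper. ChainSketch, Step, NamedStep, and chain() are hypothetical names, and the step interface is pared down to a trace list where the real MessageProcessor also receives the memory and the token limit.

```java
import java.util.ArrayList;
import java.util.List;

class ChainSketch {
    // Hypothetical pared-down processor interface.
    interface Step {
        void setNext(Step next);
        void process(List<String> trace);
    }

    // A step that records its name and then forwards to the next link, if any.
    static final class NamedStep implements Step {
        private final String name;
        private Step next;

        NamedStep(String name) { this.name = name; }

        @Override
        public void setNext(Step next) { this.next = next; }

        @Override
        public void process(List<String> trace) {
            trace.add(name);
            if (next != null) next.process(trace);
        }
    }

    // Links the steps left to right and returns the head of the chain.
    static Step chain(Step... steps) {
        if (steps.length == 0) {
            throw new IllegalArgumentException("At least one step is required");
        }
        for (int i = 0; i < steps.length - 1; i++) {
            steps[i].setNext(steps[i + 1]);
        }
        return steps[0];
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        chain(new NamedStep("count-tokens"), new NamedStep("add-and-prune")).process(trace);
        System.out.println(trace); // [count-tokens, add-and-prune]
    }
}
```

A helper of this kind removes the manual setNext() calls from the driver and makes the intended order visible in a single expression.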


LlmResponse and MockLlmClient Components

ToolCallRequest and LlmResponse model the LLM's output, and the MockLlmClient simulates a language model that generates text responses and tool calls.


import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToolCallRequest {
    private final String id;
    private final String toolName;
    private final Map<String, Object> arguments;

    public ToolCallRequest(String id, String toolName, Map<String, Object> arguments) {
        this.id = id;
        this.toolName = toolName;
        this.arguments = arguments;
    }

    public String getId() { return id; }
    public String getToolName() { return toolName; }
    public Map<String, Object> getArguments() { return arguments; }
}

public class LlmResponse {
    private final String textResponse;
    private final List<ToolCallRequest> toolCalls;

    public LlmResponse(String textResponse, List<ToolCallRequest> toolCalls) {
        this.textResponse = textResponse;
        this.toolCalls = toolCalls;
    }

    public String getTextResponse() { return textResponse; }
    public List<ToolCallRequest> getToolCalls() { return toolCalls; }
    public boolean hasToolCalls() { return toolCalls != null && !toolCalls.isEmpty(); }
}

public class MockLlmClient {
    public LlmResponse generateResponse(List<ChatMessage> history) {
        ChatMessage lastMessage = history.get(history.size() - 1);

        if (lastMessage.getRole() == MessageRole.USER) {
            String prompt = lastMessage.getContent().toLowerCase();
            if (prompt.contains("weather") && prompt.contains("berlin")) {
                List<ToolCallRequest> calls = new ArrayList<>();
                Map<String, Object> args = new HashMap<>();
                args.put("location", "Berlin");
                calls.add(new ToolCallRequest("call-w01", "get_weather", args));
                return new LlmResponse(null, calls);
            } else if (prompt.contains("analyze") && prompt.contains("system")) {
                List<ToolCallRequest> calls = new ArrayList<>();
                Map<String, Object> args1 = new HashMap<>();
                args1.put("logType", "ERROR");
                calls.add(new ToolCallRequest("call-t01", "read_logs", args1));

                Map<String, Object> args2 = new HashMap<>();
                args2.put("metric", "CPU");
                calls.add(new ToolCallRequest("call-t02", "get_system_metric", args2));
                return new LlmResponse(null, calls);
            } else {
                return new LlmResponse("I am an AI assistant. How can I help you today?", null);
            }
        } else if (lastMessage.getRole() == MessageRole.TOOL) {
            StringBuilder text = new StringBuilder("Analysis summary:\n");
            for (ChatMessage msg : history) {
                if (msg.getRole() == MessageRole.TOOL) {
                    text.append(" - [").append(msg.getToolCallId()).append("] Output: ").append(msg.getContent()).append("\n");
                }
            }
            text.append("All operations completed successfully.");
            return new LlmResponse(text.toString(), null);
        }
        return new LlmResponse("Processed successfully.", null);
    }
}

The MockLlmClient inspects the conversational state. If a user asks for weather info or diagnostic logs, it outputs tool calls. Once these tools are executed and the results appended to the history, the model returns a final text summary.


AgentCoordinator Class

The AgentCoordinator acts as the mediator to coordinate conversations, tool calls, and LLM responses.


import java.util.List;
import java.util.Map;

public class AgentCoordinator {
    private final ToolRegistry toolRegistry;
    private final ConversationMemory memory;
    private final MockLlmClient llmClient;
    private final MessageProcessor pipeline;
    private final int maxTokenLimit;
    private final int maxIterations = 5;

    public AgentCoordinator(ToolRegistry toolRegistry, ConversationMemory memory, 
                            MockLlmClient llmClient, MessageProcessor pipeline, int maxTokenLimit) {
        this.toolRegistry = toolRegistry;
        this.memory = memory;
        this.llmClient = llmClient;
        this.pipeline = pipeline;
        this.maxTokenLimit = maxTokenLimit;
    }

    public String run(String userPrompt) {
        System.out.println("\n[Coordinator] Starting execution loop for: \"" + userPrompt + "\"");
        
        ChatMessage userMsg = new ChatMessage(MessageRole.USER, userPrompt);
        pipeline.process(userMsg, memory, maxTokenLimit);

        int loops = 0;
        while (loops < maxIterations) {
            loops++;
            System.out.println("\n--- [Coordinator] Execution loop iteration: " + loops + " ---");

            List<ChatMessage> activeHistory = memory.getMessages();
            LlmResponse response = llmClient.generateResponse(activeHistory);

            if (response.hasToolCalls()) {
                System.out.println("[Coordinator] LLM requested tool execution. Registering execution intention...");
                
                StringBuilder tracker = new StringBuilder("Intention to invoke: ");
                for (ToolCallRequest req : response.getToolCalls()) {
                    tracker.append(req.getToolName()).append(" ");
                }
                ChatMessage assistantIntention = new ChatMessage(MessageRole.ASSISTANT, tracker.toString());
                pipeline.process(assistantIntention, memory, maxTokenLimit);

                for (ToolCallRequest toolCall : response.getToolCalls()) {
                    String name = toolCall.getToolName();
                    Map<String, Object> args = toolCall.getArguments();
                    String callId = toolCall.getId();
                    String output;

                    if (toolRegistry.hasTool(name)) {
                        try {
                            ToolCommand cmd = toolRegistry.getTool(name);
                            System.out.println("[Coordinator] Executing " + name + " dynamically with reflection...");
                            output = cmd.execute(args);
                        } catch (Exception e) {
                            output = "Execution failure: " + e.getMessage();
                        }
                    } else {
                        output = "Error: Tool not found.";
                    }

                    System.out.println("[Coordinator] Tool response: " + output);
                    ChatMessage toolResponseMsg = new ChatMessage(MessageRole.TOOL, output, callId);
                    pipeline.process(toolResponseMsg, memory, maxTokenLimit);
                }
            } else {
                String reply = response.getTextResponse();
                ChatMessage assistantFinalMsg = new ChatMessage(MessageRole.ASSISTANT, reply);
                pipeline.process(assistantFinalMsg, memory, maxTokenLimit);
                return reply;
            }
        }
        return "Failure: Maximum execution iteration count exceeded.";
    }
}

Let's break down the logic of the run(userPrompt) method:

  • run(userPrompt): Passes the user prompt through the processing pipeline, then loops: it queries the LLM, executes any requested tool calls reflectively, and appends the outputs to the conversation memory. The loop ends when the LLM returns a final text response or after maxIterations (5) rounds, in which case a failure message is returned.
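The coordinator above depends on the concrete MockLlmClient, so swapping in a real model client would mean extracting an interface. The sketch below reduces the loop to its control flow behind such an abstraction. LoopSketch, Reply, and Model are hypothetical names, the history is reduced to plain strings, each reply carries at most one tool request, and the code assumes Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class LoopSketch {
    // Hypothetical model reply: a tool request (toolName != null) or a final text.
    record Reply(String toolName, String toolArg, String text) {}

    // Hypothetical model abstraction; a real client would call an LLM API here.
    interface Model {
        Reply next(List<String> history);
    }

    static String run(Model model, Map<String, Function<String, String>> tools,
                      String prompt, int maxIterations) {
        List<String> history = new ArrayList<>();
        history.add("USER: " + prompt);
        for (int i = 0; i < maxIterations; i++) {
            Reply reply = model.next(history);
            if (reply.toolName() == null) {
                history.add("ASSISTANT: " + reply.text());
                return reply.text(); // a final answer ends the loop
            }
            Function<String, String> tool = tools.get(reply.toolName());
            String output = (tool == null)
                    ? "Error: Tool not found."
                    : tool.apply(reply.toolArg());
            history.add("TOOL: " + output); // feed the result back to the model
        }
        return "Failure: Maximum execution iteration count exceeded.";
    }

    public static void main(String[] args) {
        // Scripted model: request a tool first, then answer once a TOOL entry is the latest message.
        Model scripted = history -> {
            String last = history.get(history.size() - 1);
            return last.startsWith("TOOL: ")
                    ? new Reply(null, null, "Done: " + last.substring(6))
                    : new Reply("echo", "hi", null);
        };
        Map<String, Function<String, String>> tools = Map.of("echo", arg -> "echo " + arg);
        System.out.println(run(scripted, tools, "say hi", 5)); // Done: echo hi
    }
}
```

The iteration cap matters because nothing else stops a model that keeps requesting tools. Without it, a misbehaving model would loop indefinitely.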

Main Driver Class

This class tests our Agent Coordinator by registering a local tool service, setting a context token limit, and running a single-tool flow and a multi-tool flow (several tool calls requested in one LLM turn, executed sequentially here) to exercise context pruning.


import java.util.List;

class LocalToolsService {
    @AgentTool(name = "get_weather", description = "Query weather details for location.")
    public String checkWeather(@ToolParam(name = "location") String location) {
        System.out.println("  [Tool Execution] checkWeather called with location: " + location);
        if ("Berlin".equalsIgnoreCase(location)) {
            return "17°C, Cloudy with rain showers.";
        }
        return "Unknown location.";
    }

    @AgentTool(name = "read_logs", description = "Query local system errors.")
    public String checkErrors(@ToolParam(name = "logType") String logType) {
        System.out.println("  [Tool Execution] checkErrors called with type: " + logType);
        if ("ERROR".equalsIgnoreCase(logType)) {
            return "[ERR-101] DB connection deadlock detected.\n[ERR-502] Bad gateway.";
        }
        return "No errors.";
    }

    @AgentTool(name = "get_system_metric", description = "Query system CPU.")
    public String checkCpu(@ToolParam(name = "metric") String metric) {
        System.out.println("  [Tool Execution] checkCpu called with metric: " + metric);
        if ("CPU".equalsIgnoreCase(metric)) {
            return "CPU Usage: 84.5%.";
        }
        return "Unknown metric.";
    }
}

class SimpleWordTokenCounter implements TokenCounter {
    @Override
    public int countTokens(String text) {
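        if (text == null || text.trim().isEmpty()) return 0;
        // Trim first: split() on text with leading whitespace yields an empty first element
        // that would be counted as a word.
        return (int) Math.ceil(text.trim().split("\\s+").length * 1.35);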
    }
}

public class Main {
    public static void main(String[] args) {
        ToolRegistry registry = new ToolRegistry();
        LocalToolsService service = new LocalToolsService();
        registry.registerToolsFromObject(service);

        TokenCounter counter = new SimpleWordTokenCounter();
        ConversationMemory memory = new ConversationMemory(counter);

        // Seed initial system instructions. This message must never be pruned!
        ChatMessage systemPrompt = new ChatMessage(MessageRole.SYSTEM, 
            "Analyze operations using local tools. Always summarize findings.");
        memory.addMessage(systemPrompt);

        // Configure pipeline chain
        MessageProcessor calculator = new TokenCalculationProcessor(counter);
        MessageProcessor pruner = new ContextPruningProcessor();
        calculator.setNext(pruner);

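        // Limit active token capacity to demonstrate pruning behavior.
        // With SimpleWordTokenCounter the two scenarios below add up to about 126 tokens,
        // so the limit must be lower than that for pruning to trigger.
        int tokenLimit = 100;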

        AgentCoordinator coordinator = new AgentCoordinator(
            registry, memory, new MockLlmClient(), calculator, tokenLimit
        );

        System.out.println("\n==========================================");
        System.out.println("Scenario 1: Single Reflective Tool Call");
        System.out.println("==========================================");
        String finalAnswer1 = coordinator.run("What is the weather in Berlin?");
        System.out.println("\nFinal Result:\n" + finalAnswer1);

        System.out.println("\n==========================================");
        System.out.println("Scenario 2: Parallel Tool Calls with Context Pruning");
        System.out.println("==========================================");
        String finalAnswer2 = coordinator.run("Analyze the system load and errors.");
        System.out.println("\nFinal Result:\n" + finalAnswer2);

        System.out.println("\n==========================================");
        System.out.println("Scenario 3: Verifying Conversation History Status");
        System.out.println("==========================================");
        System.out.println("Current active memory history:");
        for (ChatMessage msg : memory.getMessages()) {
            System.out.println(" - [" + msg.getRole() + "] (Weight: " + msg.getTokenCount() + ") " + msg.getContent());
        }
        System.out.println("Total Active Weight: " + memory.getTotalTokenCount() + "/" + tokenLimit + " tokens.");
    }
}

The main() driver initializes the coordinator, registers annotated tools, seeds the conversation memory with system instructions, and runs user queries to verify dynamic execution and selective pruning.
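To check by hand whether a given limit is ever reached, it helps to compute the weights of individual messages. The sketch below applies the same words-times-1.35 heuristic as SimpleWordTokenCounter (with the input trimmed before splitting) to two strings from the driver. TokenMathSketch is a hypothetical name used only for this illustration.

```java
class TokenMathSketch {
    // Whitespace-separated word count times 1.35, rounded up.
    static int countTokens(String text) {
        if (text == null || text.trim().isEmpty()) return 0;
        return (int) Math.ceil(text.trim().split("\\s+").length * 1.35);
    }

    public static void main(String[] args) {
        String system = "Analyze operations using local tools. Always summarize findings.";
        String user = "What is the weather in Berlin?";
        System.out.println(countTokens(system)); // 8 words -> ceil(10.8) = 11
        System.out.println(countTokens(user));   // 6 words -> ceil(8.1)  = 9
    }
}
```

Summing such weights over every message a scenario adds gives the footprint that the pipeline compares against the configured limit.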

