Building Multi-Robot Teams with LLM-Based Agentic AI: A Practical Guide

Overview

This guide translates the cutting-edge research from the Johns Hopkins Applied Physics Laboratory (APL) into a hands-on tutorial for developers and roboticists. We'll explore how to leverage large language models (LLMs) as the cognitive backbone for heterogeneous robot teams, enabling them to cooperate, adapt, and execute complex missions. You'll learn the core architecture, step-by-step implementation strategies, and common pitfalls to avoid. By the end, you'll have a blueprint for turning a group of individual robots into a coordinated, intelligent swarm.

Building Multi-Robot Teams with LLM-Based Agentic AI: A Practical Guide — Source: spectrum.ieee.org

Prerequisites

Before diving in, ensure you have the following foundational knowledge and tools:

Robotics Basics: Familiarity with ROS 2 (Robot Operating System), robot kinematics, and sensor integration.
AI/ML Experience: Understanding of LLM APIs (e.g., OpenAI, Claude) and prompt engineering.
Programming: Proficiency in Python (≥3.8) and experience with asynchronous programming (e.g., asyncio).
Hardware: A heterogeneous robot team (e.g., a drone, a ground rover, and a manipulator) or simulation environment (Gazebo).
Networking: Knowledge of WebSocket or MQTT for inter-robot communication.

Step-by-Step Implementation

Step 1: Define Agent Roles and Team Structure

Start by mapping the mission requirements to distinct agent roles. For a search-and-rescue scenario, assign roles like Scout (drone), Hauler (rover), and Manipulator (arm). Each agent will run an LLM-based agent that understands its capabilities, constraints, and current state. Create a shared ontology (e.g., in JSON Schema) that describes the environment, tasks, and agent capabilities.


{
  "roles": {
    "scout": {
      "capabilities": ["fly", "camera", "GPS"],
      "constraints": ["battery_30min", "no_lift"]
    },
    "hauler": {
      "capabilities": ["drive", "carry_5kg", "GPS"],
      "constraints": ["max_speed_2m/s"]
    }
  }
}

Step 2: Design the Agentic Architecture

The architecture from APL uses a layered approach: a reasoning layer (LLM-based), a planning layer (task decomposition), and an execution layer (robot-specific controllers). Implement each agent as a separate process that communicates via a message bus. The LLM acts as the central decision-maker, but is buffered by a perception module that translates sensor data into natural language observations for the LLM.

Step 3: Set Up LLM Integration

Select an LLM API and create a session manager that handles context history, token limits, and rate limits. For each agent, define a system prompt that includes its role, capabilities, and team structure. Example system prompt fragment:


"You are a Scout drone. Your team includes a Hauler rover and a Manipulator arm. Your goal is to locate victims. Report observations to the team. Use JSON format for structured responses."

Implement a function that sends the agent's observation + incoming team messages to the LLM, and parses the structured output (e.g., {"action": "move_to", "coordinates": [x,y]}).

Step 4: Implement Inter-Agent Communication

Use a publish/subscribe pattern (e.g., MQTT with a shared broker). Each agent publishes its state, intentions, and requests. The LLM-generated actions are converted to commands. Example code snippet for a Python agent using paho-mqtt:


import paho.mqtt.client as mqtt
def on_message(client, userdata, msg):
    decision = llm_process(msg.payload)
    client.publish(f"{agent_id}/command", decision)

client = mqtt.Client()
client.on_message = on_message
client.subscribe("team/observations")

Step 5: Implement Coordination Mechanisms

For teams, coordination can be either centralized (one agent acts as coordinator) or decentralized (each agent negotiates). APL's approach uses a hybrid: a lightweight centralized coordinator that resolves conflicts using an LLM. For example, when two agents both want the same resource (a recharging station), the coordinator agent runs a negotiation prompt:


"Team: Scout battery at 20%, Hauler battery at 80%. Both need the charging station. What is the optimal allocation? Respond with JSON."

The LLM output then informs the final decision, which is broadcast to all agents.

Step 6: Handle Dynamic Task Allocation

Instead of predefined plans, agents react to environmental changes. Implement a continuous observation-loop where the LLM periodically re-evaluates the team's status. Use a sliding window of recent events to keep context size manageable. Include a failover protocol: if an agent goes offline, the others detect via heartbeat and the coordinator reassigns its tasks.

Step 7: Integrate with Robot Hardware (or Simulation)

If using real robots, wrap each robot's ROS 2 nodes with an agent interface. The interface subscribes to sensor topics, converts data to text (e.g., "Battery: 30%, at (10,20), obstacle ahead"), and publishes action commands (e.g., "/cmd_vel"). For simulation, use Gazebo with ROS 2. Test each agent in isolation first, then as a team. APL demonstrated this with a drone, rover, and arm working together to locate and retrieve an object.

Step 8: Monitor and Iterate

Log all LLM calls, agent states, and decisions. Build a dashboard to visualize the team's reasoning. Common metrics: task completion time, number of LLM calls, conflict resolution success. Use these logs to refine prompts and escape routes.

Common Mistakes

Ignoring Latency: LLM calls can take 1–5 seconds. Do not call the LLM for every sensor reading. Batch observations and use periodic decision cycles (e.g., every 2 seconds).
Overloading Context: Sending full sensor streams as text quickly hits token limits. Summarize: instead of 100 lidar points, say "obstacle at 2m ahead left".
Hardcoding Robot IDs: Agents should discover each other. Use a dynamic roster via MQTT retained messages.
Ignoring Failure Modes: The LLM may produce invalid JSON or impossible commands. Implement a validator that rejects non-conforming actions and falls back to a safe state (e.g., hover for a drone).
Lack of Grounding: Ensure the LLM understands its physical constraints. For example, a rover cannot climb stairs—include this in the system prompt and error-correction loop.

Summary

Agentic AI for robot teams is still an emerging field, but the framework developed at APL provides a solid foundation. By treating LLMs as reasoning engines, combined with a modular architecture and careful communication design, you can build robot teams that are far more flexible and resilient than traditional hard-coded systems. Start small—simulate two robots with a simple task—then scale to heterogeneous swarms. The lessons from APL's ongoing research highlight the importance of robust error handling and minimal, grounded prompts. With this guide, you're equipped to begin your own experiments and push the boundaries of collaborative robotics.