agent-pard0x/multi-agent-supervisor/docs/UNDERSTANDING_TRANSFERS.md
Gaetan Hurel d33cddef1e
wip
2025-06-26 18:02:43 +02:00

7.5 KiB
Raw Blame History

Understanding Multi-Agent Transfers

What "Successfully transferred..." means

When you see messages like:

  • Successfully transferred to system_info_worker
  • Successfully transferred back to supervisor

These are tool execution results from the LangGraph supervisor pattern. Here's what's happening:

🔄 The Transfer Flow

  1. Supervisor receives user query: "Nginx returns 502 Bad Gateway on my server. What can I do?"

  2. Supervisor analyzes and delegates: Based on the SUPERVISOR_PROMPT in config.py, it decides to start with system_info_worker

  3. Transfer tool execution: Supervisor calls transfer_to_system_info_worker tool

    • Result: "Successfully transferred to system_info_worker"
    • Meaning: Control is now handed to the system_info_worker agent
  4. Agent executes: The system_info_worker gets:

    • Full conversation context (including the original user query)
    • Its own specialized prompt from agents/system_agents.py
    • Access to its tools (shell commands for system info)
  5. Agent completes and returns: Agent calls transfer_back_to_supervisor

    • Result: "Successfully transferred back to supervisor"
    • Meaning: Agent finished its task and returned control
    • Important: Agent's results are now part of the conversation history
  6. Supervisor decides next step: Based on accumulated results, supervisor either:

    • Delegates to another agent (e.g., service_inventory_worker)
    • Provides final response to user
    • Key: Supervisor can see ALL previous agent results when making decisions

🧠 How Prompts Work

Supervisor Prompt (config.py)

SUPERVISOR_PROMPT = """
You are the supervisor of a team of specialised sysadmin agents.
Decide which agent to delegate to based on the user's query **or** on results already collected.
Available agents:
- system_info_worker: gather system metrics
- service_inventory_worker: list running services  
- mariadb_analyzer: analyse MariaDB
...
Always start with `system_info_worker` and `service_inventory_worker` before drilling into a specific service.
"""

Agent Prompts (agents/*.py)

Each agent has its own specialized prompt, for example:

# system_info_worker prompt
"""
You are a Linux sysadmin. Use shell commands like `lscpu`, `free -h`, and `df -h` to gather CPU, RAM, and disk usage. 
Return a concise plaintext summary. Only run safe, readonly commands.
"""

🎯 What Each Agent Receives

When an agent is activated via transfer:

  • Full conversation history: All previous messages between user, supervisor, and other agents
  • Specialized prompt: Guides how the agent should interpret and act on the conversation
  • Tools: Shell access, specific analyzers, etc.
  • Context: Results from previous agents in the conversation

🔄 How Agent Results Flow Back to Supervisor

This is the key mechanism that makes the multi-agent system intelligent:

  1. Agent produces results: Each agent generates an AIMessage with its findings/analysis

  2. Results become part of conversation: The agent's response is added to the shared message history

  3. Supervisor sees everything: When control returns to supervisor, it has access to:

    • Original user query
    • All previous agent responses
    • Tool execution results
    • Complete conversation context
  4. Supervisor strategy updates: Based on accumulated knowledge, supervisor can:

    • Decide which agent to call next
    • Skip unnecessary agents if enough info is gathered
    • Synthesize results from multiple agents
    • Provide final comprehensive response

Example Flow:

User: "Nginx 502 error, help!"
├── Supervisor → system_info_worker
│   └── Returns: "502 usually means upstream server issues, check logs..."
├── Supervisor (now knows about upstream issues) → service_inventory_worker  
│   └── Returns: "Check PHP-FPM status, verify upstream config..."
└── Supervisor (has both perspectives) → Final synthesis
    └── "Based on system analysis and service inventory, here's comprehensive solution..."

📤 What Workers Pass Back to Supervisor

Key Insight: Workers don't explicitly "return" data. Instead, all their work becomes part of the shared conversation history that the supervisor can access.

What Gets Added to the Message History

When a worker (like network_diag) executes:

  1. AIMessages - Agent's reasoning and analysis

    "I'll start by checking external connectivity..."
    "DNS resolution appears to be working correctly..."
    "Network Analysis Summary: All systems operational..."
    
  2. ToolMessages - Raw command outputs

    "PING 8.8.8.8 (8.8.8.8): 56 data bytes\n64 bytes from 8.8.8.8..."
    "google.com. 300 IN A 142.250.80.46"
    "tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN"
    
  3. Transfer Confirmation - When worker completes

    "Successfully transferred back to supervisor"
    

Complete Message Flow Example

# After network_diag completes, state["messages"] contains:
[
    HumanMessage("My website is slow"),                        # Original query
    AIMessage("I'll check network connectivity..."),          # Supervisor decision
    ToolMessage("Successfully transferred to network_diag"),   # Transfer confirmation
    AIMessage("Starting network diagnostics..."),             # Worker starts
    ToolMessage("PING 8.8.8.8: 64 bytes from 8.8.8.8..."),  # Command result 1
    AIMessage("External connectivity is good, checking DNS"), # Worker analysis
    ToolMessage("google.com. 300 IN A 142.250.80.46"),       # Command result 2
    AIMessage("DNS working. Checking local services..."),     # Worker continues
    ToolMessage("tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN"),      # Command result 3
    AIMessage("Network Summary: All good, issue elsewhere"),  # Worker's final analysis
    ToolMessage("Successfully transferred back to supervisor") # Transfer back
]

How Supervisor Uses This Information

The supervisor receives ALL these messages and can:

  1. Read command outputs to understand technical details
  2. See agent reasoning to understand what was checked
  3. Access final analysis to make informed decisions
  4. Decide next steps based on accumulated evidence

Why This Design Works

  • Full Transparency: Supervisor sees everything the worker did
  • Rich Context: Both raw data and interpreted analysis available
  • Cumulative Knowledge: Each agent builds on previous work
  • Intelligent Routing: Supervisor can adapt strategy based on findings

Example: Multi-Agent Collaboration

User: "Website is slow"
├── network_diag finds: "Network is fine"
├── cert_checker finds: "Certificate expires tomorrow!" 
└── Supervisor synthesis: "Issue is expiring certificate, not network"

The supervisor can correlate findings across multiple workers because it sees all their work in the message history.

📋 Key Takeaways

  • "Successfully transferred" = Control handoff confirmation, not data transfer
  • Each agent gets the full conversation context INCLUDING previous agent results
  • Agent prompts determine how they process that context
  • Supervisor orchestrates the workflow based on its prompt strategy
  • The conversation builds up context as each agent contributes their expertise
  • Results accumulate: Each agent can see and build upon previous agents' work
  • Supervisor learns: Strategy updates based on what agents discover
  • Dynamic workflow: Supervisor can skip agents or change direction based on results