2025-06-29 15:15:27 +02:00

7.2 KiB

Raw Blame History

Multi-Agent Supervisor System for Sysadmin Tasks

This directory contains a sophisticated multi-agent system with a supervisor pattern for comprehensive system administration and troubleshooting.

Sources

https://github.com/langchain-ai/langgraph-supervisor-py https://langchain-ai.github.io/langgraph/concepts/multi_agent/#supervisor

Overview

The multi-agent supervisor system uses multiple specialized agents coordinated by a supervisor to handle complex sysadmin tasks:

Supervisor Agent: Orchestrates and delegates tasks to specialized workers
Specialized Workers: Each agent is an expert in a specific domain
Parallel Processing: Multiple agents can work simultaneously
Intelligent Routing: Tasks are routed to the most appropriate specialist

Architecture

User Input → Supervisor → Specialized Agents → Aggregated Response
                ↓
    ┌─────────────────────────────────────────────────┐
    │  system_info │ nginx │ mariadb │ network │ ...  │
    └─────────────────────────────────────────────────┘

Specialized Agents

Core System Agents

system_info_worker: CPU, RAM, disk usage monitoring
service_inventory_worker: Lists running services

Service-Specific Agents

mariadb_analyzer: MariaDB configuration and log analysis
nginx_analyzer: Nginx configuration validation and log analysis
phpfpm_analyzer: PHP-FPM performance and error analysis

Network & Security Agents

network_diag: Network connectivity and DNS diagnostics
cert_checker: TLS certificate validation and expiry alerts

Analysis & Action Agents

risk_scorer: Aggregates findings and assigns severity levels
remediation_worker: Proposes safe fixes for detected issues
harmonizer_worker: Applies security hardening best practices

Features

Core Capabilities

Local System Access: Execute shell commands on the local machine
Remote Server Access: Execute commands on remote servers via SSH
Persistent SSH Connections: Efficient remote operations with connection reuse
Cross-Platform Support: Works on Linux, macOS, BSD, and Windows systems

Advanced Capabilities

Intelligent Delegation: Supervisor routes tasks to appropriate specialists
Parallel Execution: Multiple agents can work simultaneously
Severity Assessment: Risk scoring with Critical/High/Medium/Low levels
Safe Remediation: Proposes fixes with confirmation requests
Security Hardening: Automated best-practice application

Execution Modes

Invoke Mode: Complete analysis with final result
Stream Mode: Real-time step-by-step execution visibility

Files

main-multi-agent.py: Complete multi-agent supervisor implementation
agents/: Directory containing specialized agent implementations
custom_tools/: Custom tools used by the agents
supervisor.py: Supervisor agent coordination logic
utils.py: Utility functions and configurations

Usage

cd multi-agent-supervisor
python main-multi-agent.py

The script includes both execution modes:

1. Invoke Mode (Complete Analysis)

result = supervisor.invoke(query)
print(result["messages"][-1]["content"])

2. Stream Mode (Step-by-Step)

for chunk in supervisor.stream(query):
    # Real-time agent execution monitoring
    print(f"🤖 ACTIVE AGENT: {current_agent}")
    print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")

Example Workflow

For the query: "Nginx returns 502 Bad Gateway on my server. What can I do?"

Supervisor analyzes the request
system_info_worker checks system resources (local or remote)
service_inventory_worker lists running services
nginx_analyzer validates Nginx configuration and checks logs
phpfpm_analyzer checks PHP-FPM status (common 502 cause)
risk_scorer assesses the severity
remediation_worker proposes specific fixes

Example Queries

The multi-agent system can handle both local and remote system administration:

Local System Administration

"Check local system performance and identify bottlenecks"
"Analyze recent system errors in local logs"
"What services are running on this machine?"

Remote Server Management

"Connect to my remote server and check disk usage"
"Compare performance between local and remote systems"  
"Check if nginx is running on the remote server"
"Analyze logs on my remote server for error patterns"

Multi-System Analysis

"Perform comprehensive health check across all systems"
"Compare configurations between local and remote servers"
"Identify performance differences between environments"

Pros and Cons

✅ Pros

Domain Expertise: Each agent specializes in specific areas
Parallel Processing: Multiple agents work simultaneously
Comprehensive Analysis: Systematic approach to complex problems
Risk Assessment: Built-in severity scoring
Intelligent Routing: Tasks go to the right specialist
Scalable: Easy to add new specialized agents

❌ Cons

Complexity: More sophisticated setup and debugging
Resource Intensive: Higher computational overhead
Coordination Overhead: Supervisor management complexity
Potential Over-engineering: May be overkill for simple tasks

When to Use

Choose the multi-agent supervisor when:

You need comprehensive system analysis
Multiple services/components are involved
You want parallel processing capabilities
Risk assessment and severity scoring are important
You're dealing with complex, multi-faceted problems
You need specialized domain expertise

Agent Interaction Flow

graph TD
    A[User Query] --> B[Supervisor]
    B --> C[system_info_worker]
    B --> D[service_inventory_worker]
    B --> E[Service Specialists]
    E --> F[nginx_analyzer]
    E --> G[mariadb_analyzer]
    E --> H[phpfpm_analyzer]
    C --> I[risk_scorer]
    D --> I
    F --> I
    G --> I
    H --> I
    I --> J[remediation_worker]
    J --> K[Final Response]

Customization

Adding New Agents

new_agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[shell_tool, custom_tools],
    prompt="Your specialized agent prompt...",
    name="new_specialist"
)

# Add to supervisor
supervisor = create_supervisor(
    agents=[...existing_agents, new_agent],
    model=model,
    prompt=updated_supervisor_prompt
)

Custom Tools

class CustomTool(BaseTool):
    name = "custom_tool"
    description = "Tool description"
    
    def _run(self, **kwargs):
        # Tool implementation
        return result

Requirements

pip install langchain-openai langgraph langgraph-supervisor langchain-community
export OPENAI_API_KEY="your-api-key"

Performance Considerations

Token Usage: Higher due to multiple agent interactions
Execution Time: May be longer due to coordination overhead
Memory: Higher memory usage with multiple concurrent agents
Rate Limits: Monitor API rate limits with parallel requests

7.2 KiB Raw Blame History