Multi-Agent Supervisor System for Sysadmin Tasks
This directory contains a multi-agent system that uses the supervisor pattern for system administration and troubleshooting.
Sources
- https://github.com/langchain-ai/langgraph-supervisor-py
- https://langchain-ai.github.io/langgraph/concepts/multi_agent/#supervisor
Overview
The multi-agent supervisor system uses multiple specialized agents coordinated by a supervisor to handle complex sysadmin tasks:
- Supervisor Agent: Orchestrates and delegates tasks to specialized workers
- Specialized Workers: Each agent is an expert in a specific domain
- Parallel Processing: Multiple agents can work simultaneously
- Intelligent Routing: Tasks are routed to the most appropriate specialist
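The delegation idea above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the real supervisor lets a language model choose the worker, whereas this sketch assumes a hypothetical keyword table (`ROUTES`) and plain callables standing in for agents.

```python
from typing import Callable, Dict, List

# Hypothetical keyword table (an assumption for illustration only);
# the actual supervisor relies on an LLM to pick the specialist.
ROUTES: Dict[str, List[str]] = {
    "nginx_analyzer": ["nginx", "502", "gateway"],
    "mariadb_analyzer": ["mariadb", "mysql", "database"],
    "network_diag": ["dns", "ping", "network"],
    "cert_checker": ["tls", "certificate", "ssl"],
}


def route(query: str) -> List[str]:
    """Return the names of workers whose keywords appear in the query."""
    text = query.lower()
    selected = [
        name for name, words in ROUTES.items()
        if any(word in text for word in words)
    ]
    # Fall back to a general-purpose worker when nothing matches.
    return selected or ["system_info_worker"]


def supervise(query: str, workers: Dict[str, Callable[[str], str]]) -> str:
    """Delegate the query to each selected worker and join the replies."""
    replies = [
        f"{name}: {workers[name](query)}"
        for name in route(query)
        if name in workers
    ]
    return "\n".join(replies)
```

For example, `route("Nginx returns 502 Bad Gateway")` selects only `nginx_analyzer`, while a query with no known keyword falls back to `system_info_worker`.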
Architecture
User Input → Supervisor → Specialized Agents → Aggregated Response
↓
┌─────────────────────────────────────────────────┐
│ system_info │ nginx │ mariadb │ network │ ... │
└─────────────────────────────────────────────────┘
Specialized Agents
Core System Agents
- system_info_worker: CPU, RAM, and disk usage monitoring
- service_inventory_worker: Lists running services
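As a rough idea of the data a resource-monitoring worker might collect, here is a standard-library sketch. The function names are hypothetical, and the project's agents run shell commands rather than this code. RAM is omitted because the standard library has no portable call for it.

```python
import os
import shutil


def disk_usage_percent(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return round(usage.used / usage.total * 100, 1)


def system_snapshot(path: str = os.path.abspath(os.sep)) -> dict:
    """Collect a few basic metrics (hypothetical helper, not project code)."""
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_percent": disk_usage_percent(path),
    }
    # os.getloadavg is not available on Windows and may fail elsewhere.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return snapshot
```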
Service-Specific Agents
- mariadb_analyzer: MariaDB configuration and log analysis
- nginx_analyzer: Nginx configuration validation and log analysis
- phpfpm_analyzer: PHP-FPM performance and error analysis
Network & Security Agents
- network_diag: Network connectivity and DNS diagnostics
- cert_checker: TLS certificate validation and expiry alerts
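A certificate expiry check of the kind cert_checker performs could look roughly like the sketch below, which uses only the Python standard library. The helper names and the 30-day alert threshold are assumptions, not values taken from the project.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan 15 12:00:00 2030 GMT".
    """
    seconds = ssl.cert_time_to_seconds(not_after)
    expires = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return (expires - now).days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from a live server (requires network access)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_alert(days_left: int, threshold_days: int = 30) -> bool:
    """Flag certificates expiring within the (assumed) threshold."""
    return days_left <= threshold_days
```

A caller would combine the pieces as `needs_alert(days_until_expiry(fetch_not_after(host), datetime.now(timezone.utc)))`.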
Analysis & Action Agents
- risk_scorer: Aggregates findings and assigns severity levels
- remediation_worker: Proposes safe fixes for detected issues
- harmonizer_worker: Applies security hardening best practices
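One simple way to aggregate findings into a single severity is to take the worst level reported. The sketch below assumes that rule and a hypothetical `(description, severity)` finding shape. The actual risk_scorer is prompt-driven and may weigh findings differently.

```python
from enum import IntEnum
from typing import Iterable, Tuple


class Severity(IntEnum):
    """The four levels used by the risk scorer, ordered by urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def overall_severity(findings: Iterable[Tuple[str, Severity]]) -> Severity:
    """Return the highest severity among findings (LOW when there are none)."""
    return max((severity for _, severity in findings), default=Severity.LOW)
```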
Features
Core Capabilities
- Local System Access: Execute shell commands on the local machine
- Remote Server Access: Execute commands on remote servers via SSH
- Persistent SSH Connections: Efficient remote operations with connection reuse
- Cross-Platform Support: Works on Linux, macOS, BSD, and Windows systems
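Connection reuse over SSH is commonly achieved with OpenSSH connection multiplexing. The sketch below assumes that approach and an OpenSSH client on a Unix-like host. The project may instead keep a session open through an SSH library, and the function names here are hypothetical.

```python
import subprocess
from typing import List


def ssh_command(host: str, remote_cmd: str, control_dir: str = "~/.ssh") -> List[str]:
    """Build an ssh argv that reuses one master connection per host."""
    return [
        "ssh",
        "-o", "ControlMaster=auto",                       # open or reuse a master
        "-o", f"ControlPath={control_dir}/cm-%r@%h:%p",   # socket per user/host/port
        "-o", "ControlPersist=60s",                       # keep master alive when idle
        host,
        remote_cmd,
    ]


def run_remote(host: str, remote_cmd: str, timeout: float = 30.0) -> str:
    """Run a command remotely and return its standard output."""
    result = subprocess.run(
        ssh_command(host, remote_cmd),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

With these options, the first call pays the cost of the SSH handshake and later calls to the same host reuse the existing connection.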
Advanced Capabilities
- Intelligent Delegation: Supervisor routes tasks to appropriate specialists
- Parallel Execution: Multiple agents can work simultaneously
- Severity Assessment: Risk scoring with Critical/High/Medium/Low levels
- Safe Remediation: Proposes fixes with confirmation requests
- Security Hardening: Automated best-practice application
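The "propose, then confirm" behaviour of safe remediation can be sketched as a gate in front of command execution. The `ProposedFix` type and both callbacks are hypothetical; in the real system the confirmation is requested from the user in the conversation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ProposedFix:
    description: str  # human-readable explanation of the change
    command: str      # shell command that would apply it


def apply_fixes(
    fixes: Iterable[ProposedFix],
    confirm: Callable[[ProposedFix], bool],
    execute: Callable[[str], None],
) -> List[str]:
    """Execute only the fixes that the confirm callback approves."""
    applied = []
    for fix in fixes:
        if confirm(fix):
            execute(fix.command)
            applied.append(fix.command)
    return applied
```

Passing `confirm=lambda fix: input(f"{fix.description}? [y/N] ").lower() == "y"` would give an interactive prompt, while a dry run can pass `confirm=lambda fix: False`.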
Execution Modes
- Invoke Mode: Complete analysis with final result
- Stream Mode: Real-time step-by-step execution visibility
Files
- main-multi-agent.py: Complete multi-agent supervisor implementation
- agents/: Directory containing specialized agent implementations
- custom_tools/: Custom tools used by the agents
- supervisor.py: Supervisor agent coordination logic
- utils.py: Utility functions and configurations
Usage
cd multi-agent-supervisor
python main-multi-agent.py
The script includes both execution modes:
1. Invoke Mode (Complete Analysis)
result = supervisor.invoke(query)
print(result["messages"][-1].content)  # messages are message objects, not dicts
2. Stream Mode (Step-by-Step)
for chunk in supervisor.stream(query):
    # Real-time agent execution monitoring
    # (current_agent and tool_calls are derived from each chunk; extraction omitted here)
    print(f"🤖 ACTIVE AGENT: {current_agent}")
    print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")
Example Workflow
For the query: "Nginx returns 502 Bad Gateway on my server. What can I do?"
- Supervisor analyzes the request
- system_info_worker checks system resources (local or remote)
- service_inventory_worker lists running services
- nginx_analyzer validates Nginx configuration and checks logs
- phpfpm_analyzer checks PHP-FPM status (common 502 cause)
- risk_scorer assesses the severity
- remediation_worker proposes specific fixes
Example Queries
The multi-agent system can handle both local and remote system administration:
Local System Administration
"Check local system performance and identify bottlenecks"
"Analyze recent system errors in local logs"
"What services are running on this machine?"
Remote Server Management
"Connect to my remote server and check disk usage"
"Compare performance between local and remote systems"
"Check if nginx is running on the remote server"
"Analyze logs on my remote server for error patterns"
Multi-System Analysis
"Perform comprehensive health check across all systems"
"Compare configurations between local and remote servers"
"Identify performance differences between environments"
Pros and Cons
✅ Pros
- Domain Expertise: Each agent specializes in specific areas
- Parallel Processing: Multiple agents work simultaneously
- Comprehensive Analysis: Systematic approach to complex problems
- Risk Assessment: Built-in severity scoring
- Intelligent Routing: Tasks go to the right specialist
- Scalable: Easy to add new specialized agents
❌ Cons
- Complexity: More sophisticated setup and debugging
- Resource Intensive: Higher computational overhead
- Coordination Overhead: Supervisor management complexity
- Potential Over-engineering: May be overkill for simple tasks
When to Use
Choose the multi-agent supervisor when:
- You need comprehensive system analysis
- Multiple services/components are involved
- You want parallel processing capabilities
- Risk assessment and severity scoring are important
- You're dealing with complex, multi-faceted problems
- You need specialized domain expertise
Agent Interaction Flow
graph TD
A[User Query] --> B[Supervisor]
B --> C[system_info_worker]
B --> D[service_inventory_worker]
B --> E[Service Specialists]
E --> F[nginx_analyzer]
E --> G[mariadb_analyzer]
E --> H[phpfpm_analyzer]
C --> I[risk_scorer]
D --> I
F --> I
G --> I
H --> I
I --> J[remediation_worker]
J --> K[Final Response]
Customization
Adding New Agents
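from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

# Define the new specialist
new_agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[shell_tool, custom_tools],
    prompt="Your specialized agent prompt...",
    name="new_specialist",
)

# Add to supervisor (existing_agents is the list of agents already defined)
supervisor = create_supervisor(
    agents=[*existing_agents, new_agent],
    model=model,
    prompt=updated_supervisor_prompt,
).compile()  # compile the graph before calling invoke() or stream()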
Custom Tools
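from langchain_core.tools import BaseTool

class CustomTool(BaseTool):
    # Type annotations are required on these fields in recent LangChain versions
    name: str = "custom_tool"
    description: str = "Tool description"

    def _run(self, **kwargs):
        # Tool implementation: compute `result` here
        return result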
Requirements
pip install langchain-openai langgraph langgraph-supervisor langchain-community
export OPENAI_API_KEY="your-api-key"
Performance Considerations
- Token Usage: Higher due to multiple agent interactions
- Execution Time: May be longer due to coordination overhead
- Memory: Higher memory usage with multiple concurrent agents
- Rate Limits: Monitor API rate limits with parallel requests
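One way to keep parallel agents within API rate limits is to cap how many run at once. The sketch below assumes a thread pool with a fixed worker count. The `run_workers` helper and its `max_parallel` default are illustrative and not part of the project.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_workers(
    workers: Dict[str, Callable[[], str]],
    max_parallel: int = 2,
) -> Dict[str, str]:
    """Run worker callables concurrently, at most `max_parallel` at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Submit everything; the pool queues tasks beyond the worker limit.
        futures = {name: pool.submit(fn) for name, fn in workers.items()}
        # result() blocks until each task finishes and re-raises its exception.
        return {name: future.result() for name, future in futures.items()}
```

Lowering `max_parallel` trades execution time for fewer simultaneous API requests.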