# Multi-Agent Supervisor System for Sysadmin Tasks

This directory contains a sophisticated multi-agent system with a supervisor pattern for comprehensive system administration and troubleshooting.

## Sources

https://github.com/langchain-ai/langgraph-supervisor-py
https://langchain-ai.github.io/langgraph/concepts/multi_agent/#supervisor

## Overview

The multi-agent supervisor system uses multiple specialized agents coordinated by a supervisor to handle complex sysadmin tasks:

1. **Supervisor Agent**: Orchestrates and delegates tasks to specialized workers
2. **Specialized Workers**: Each agent is an expert in a specific domain
3. **Parallel Processing**: Multiple agents can work simultaneously
4. **Intelligent Routing**: Tasks are routed to the most appropriate specialist

## Architecture

```
User Input → Supervisor → Specialized Agents → Aggregated Response
                ↓
    ┌─────────────────────────────────────────────────┐
    │  system_info │ nginx │ mariadb │ network │ ...  │
    └─────────────────────────────────────────────────┘
```

## Specialized Agents

### Core System Agents
- **`system_info_worker`**: CPU, RAM, disk usage monitoring
- **`service_inventory_worker`**: Lists running services

### Service-Specific Agents
- **`mariadb_analyzer`**: MariaDB configuration and log analysis
- **`nginx_analyzer`**: Nginx configuration validation and log analysis
- **`phpfpm_analyzer`**: PHP-FPM performance and error analysis

### Network & Security Agents
- **`network_diag`**: Network connectivity and DNS diagnostics
- **`cert_checker`**: TLS certificate validation and expiry alerts

### Analysis & Action Agents
- **`risk_scorer`**: Aggregates findings and assigns severity levels
- **`remediation_worker`**: Proposes safe fixes for detected issues
- **`harmonizer_worker`**: Applies security hardening best practices

## Features

### Advanced Capabilities
- **Intelligent Delegation**: Supervisor routes tasks to appropriate specialists
- **Parallel Execution**: Multiple agents can work simultaneously
- **Severity Assessment**: Risk scoring with Critical/High/Medium/Low levels
- **Safe Remediation**: Proposes fixes with confirmation requests
- **Security Hardening**: Automated best-practice application

### Execution Modes
- **Invoke Mode**: Complete analysis with final result
- **Stream Mode**: Real-time step-by-step execution visibility

## Files

- `main-multi-agent.py`: Complete multi-agent supervisor implementation
- `agents/`: Directory containing specialized agent implementations
- `custom_tools/`: Custom tools used by the agents
- `supervisor.py`: Supervisor agent coordination logic
- `utils.py`: Utility functions and configurations

## Usage

```bash
cd multi-agent-supervisor
python main-multi-agent.py
```

The script includes both execution modes:

### 1. Invoke Mode (Complete Analysis)
```python
result = supervisor.invoke(query)
print(result["messages"][-1]["content"])
```

### 2. Stream Mode (Step-by-Step)
```python
for chunk in supervisor.stream(query):
    # Real-time agent execution monitoring
    print(f"🤖 ACTIVE AGENT: {current_agent}")
    print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")
```

## Example Workflow

For the query: *"Nginx returns 502 Bad Gateway on my server. What can I do?"*

1. **Supervisor** analyzes the request
2. **system_info_worker** checks system resources
3. **service_inventory_worker** lists running services
4. **nginx_analyzer** validates Nginx configuration and checks logs
5. **phpfpm_analyzer** checks PHP-FPM status (common 502 cause)
6. **risk_scorer** assesses the severity
7. **remediation_worker** proposes specific fixes

## Pros and Cons

### ✅ Pros
- **Domain Expertise**: Each agent specializes in specific areas
- **Parallel Processing**: Multiple agents work simultaneously
- **Comprehensive Analysis**: Systematic approach to complex problems
- **Risk Assessment**: Built-in severity scoring
- **Intelligent Routing**: Tasks go to the right specialist
- **Scalable**: Easy to add new specialized agents

### ❌ Cons
- **Complexity**: More sophisticated setup and debugging
- **Resource Intensive**: Higher computational overhead
- **Coordination Overhead**: Supervisor management complexity
- **Potential Over-engineering**: May be overkill for simple tasks

## When to Use

Choose the multi-agent supervisor when:
- You need comprehensive system analysis
- Multiple services/components are involved
- You want parallel processing capabilities
- Risk assessment and severity scoring are important
- You're dealing with complex, multi-faceted problems
- You need specialized domain expertise

## Agent Interaction Flow

```mermaid
graph TD
    A[User Query] --> B[Supervisor]
    B --> C[system_info_worker]
    B --> D[service_inventory_worker]
    B --> E[Service Specialists]
    E --> F[nginx_analyzer]
    E --> G[mariadb_analyzer]
    E --> H[phpfpm_analyzer]
    C --> I[risk_scorer]
    D --> I
    F --> I
    G --> I
    H --> I
    I --> J[remediation_worker]
    J --> K[Final Response]
```

## Customization

### Adding New Agents
```python
new_agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[shell_tool, custom_tools],
    prompt="Your specialized agent prompt...",
    name="new_specialist"
)

# Add to supervisor
supervisor = create_supervisor(
    agents=[...existing_agents, new_agent],
    model=model,
    prompt=updated_supervisor_prompt
)
```

### Custom Tools
```python
class CustomTool(BaseTool):
    name = "custom_tool"
    description = "Tool description"
    
    def _run(self, **kwargs):
        # Tool implementation
        return result
```

## Requirements

```bash
pip install langchain-openai langgraph langgraph-supervisor langchain-community
export OPENAI_API_KEY="your-api-key"
```

## Performance Considerations

- **Token Usage**: Higher due to multiple agent interactions
- **Execution Time**: May be longer due to coordination overhead
- **Memory**: Higher memory usage with multiple concurrent agents
- **Rate Limits**: Monitor API rate limits with parallel requests