implement 2 strategies
90
multi-agent-supervisor/README-modular.md
Normal file
@@ -0,0 +1,90 @@
# Multi-Agent Sysadmin Assistant

A modular multi-agent system for system administration tasks using LangChain and LangGraph.

## Architecture

The system is organized into several modules for better maintainability:

### 📁 Project Structure

```
multi-agent-supervisor/
├── main-multi-agent.py          # Main entry point
├── config.py                    # Configuration and settings
├── supervisor.py                # Supervisor orchestration
├── utils.py                     # Utility functions
├── requirements.txt             # Dependencies
├── custom_tools/                # Custom tool implementations
│   ├── __init__.py
│   ├── log_tail_tool.py         # Log reading tool
│   └── shell_tool_wrapper.py    # Shell tool wrapper
└── agents/                      # Agent definitions
    ├── __init__.py
    ├── system_agents.py         # System monitoring agents
    ├── service_agents.py        # Service-specific agents
    ├── network_agents.py        # Network and security agents
    └── analysis_agents.py       # Analysis and remediation agents
```

## Agents

### System Agents
- **System Info Worker**: Gathers CPU, RAM, and disk usage
- **Service Inventory Worker**: Lists running services

### Service Agents
- **MariaDB Analyzer**: Checks MariaDB configuration and logs
- **Nginx Analyzer**: Validates Nginx configuration and logs
- **PHP-FPM Analyzer**: Monitors PHP-FPM status and performance

### Network Agents
- **Network Diagnostics**: Uses ping, traceroute, and dig
- **Certificate Checker**: Monitors TLS certificate expiration

### Analysis Agents
- **Risk Scorer**: Aggregates findings and assigns severity levels
- **Remediation Worker**: Proposes safe fixes for issues
- **Harmonizer Worker**: Applies system hardening best practices

## Benefits of Modular Architecture

1. **Separation of Concerns**: Each module has a single responsibility
2. **Reusability**: Tools and agents can be easily reused across projects
3. **Maintainability**: Easy to update individual components
4. **Testability**: Each module can be tested independently
5. **Scalability**: Easy to add new agents or tools
6. **Code Organization**: Clear structure makes navigation easier

## Usage

```python
from supervisor import create_sysadmin_supervisor

# Create supervisor with all agents
supervisor = create_sysadmin_supervisor()

# Run analysis
query = {
    "messages": [
        {
            "role": "user",
            "content": "Check if my web server is running properly"
        }
    ]
}

result = supervisor.invoke(query)
```

## Adding New Agents

1. Create an agent factory function in the appropriate module under `agents/`
2. Import it and add it to the agent list in `supervisor.py`
3. Update the supervisor prompt (`SUPERVISOR_PROMPT`) in `config.py`
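
Step 3 is easy to miss, because nothing fails at import time when an agent is absent from the prompt. The following is a minimal sketch of a consistency check, assuming you keep the registered agent names in a list. The helper `agents_missing_from_prompt`, the sample prompt, and `redis_analyzer` are hypothetical and not part of this project.

```python
def agents_missing_from_prompt(agent_names, prompt):
    """Return the agent names that never appear in the supervisor prompt."""
    # Plain substring matching: good enough for a sketch, but a short name
    # such as "nginx" would also match "nginx_analyzer".
    return [name for name in agent_names if name not in prompt]


# Hypothetical prompt text in the same style as SUPERVISOR_PROMPT.
sample_prompt = """
Available agents:
- system_info_worker: gather system metrics
- nginx_analyzer: analyse Nginx
"""

# "redis_analyzer" stands in for a newly added agent (hypothetical name).
registered = ["system_info_worker", "nginx_analyzer", "redis_analyzer"]
missing = agents_missing_from_prompt(registered, sample_prompt)
print(missing)
```

A check like this could run in a unit test so that a new agent cannot be registered without also being described to the supervisor.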

## Adding New Tools

1. Create a tool class in `custom_tools/`
2. Export it from `custom_tools/__init__.py`
3. Import and use it in the agent definitions
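
Before wrapping logic in a tool class, it can help to write and test the core function on its own. The sketch below is a pure-Python alternative to shelling out to `tail`. It is not the project's `LogTailTool`; the function name `tail_lines` and its return conventions are assumptions made for illustration.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, lines=500):
    """Return the last `lines` lines of a text file as one string."""
    file_path = Path(path)
    if not file_path.is_file():
        return f"Log file not found: {path}"
    with file_path.open("r", encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent lines in memory.
        return "".join(deque(handle, maxlen=lines))
```

A tool class in `custom_tools/` could then call a function like this from its `_run` method, which keeps the file handling testable without any agent framework installed.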
185
multi-agent-supervisor/README.md
Normal file
@@ -0,0 +1,185 @@
# Multi-Agent Supervisor System for Sysadmin Tasks

This directory contains a sophisticated multi-agent system with a supervisor pattern for comprehensive system administration and troubleshooting.

## Overview

The multi-agent supervisor system uses multiple specialized agents coordinated by a supervisor to handle complex sysadmin tasks:

1. **Supervisor Agent**: Orchestrates and delegates tasks to specialized workers
2. **Specialized Workers**: Each agent is an expert in a specific domain
3. **Parallel Processing**: Multiple agents can work simultaneously
4. **Intelligent Routing**: Tasks are routed to the most appropriate specialist

## Architecture

```
User Input → Supervisor → Specialized Agents → Aggregated Response
                ↓
    ┌─────────────────────────────────────────────────┐
    │ system_info │ nginx │ mariadb │ network │ ... │
    └─────────────────────────────────────────────────┘
```

## Specialized Agents

### Core System Agents
- **`system_info_worker`**: CPU, RAM, disk usage monitoring
- **`service_inventory_worker`**: Lists running services

### Service-Specific Agents
- **`mariadb_analyzer`**: MariaDB configuration and log analysis
- **`nginx_analyzer`**: Nginx configuration validation and log analysis
- **`phpfpm_analyzer`**: PHP-FPM performance and error analysis

### Network & Security Agents
- **`network_diag`**: Network connectivity and DNS diagnostics
- **`cert_checker`**: TLS certificate validation and expiry alerts
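
The `cert_checker` prompt in `agents/network_agents.py` asks the agent to run `openssl x509 -noout -enddate` and to alert when a certificate expires in fewer than 30 days. The sketch below shows how that rule could be computed deterministically from the command's `notAfter=...` output line. The function names are hypothetical; in this project the comparison is left to the LLM.

```python
from datetime import datetime, timezone


def days_until_expiry(enddate_line, now=None):
    """Parse a `notAfter=...` line and return whole days until expiry."""
    _, _, value = enddate_line.strip().partition("=")
    # Example value: "Jun  1 12:00:00 2025 GMT"
    expires = datetime.strptime(value, "%b %d %H:%M:%S %Y %Z")
    expires = expires.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def needs_alert(enddate_line, threshold_days=30, now=None):
    """True when the certificate expires in fewer than `threshold_days` days."""
    return days_until_expiry(enddate_line, now) < threshold_days
```

Passing `now` explicitly makes the check reproducible in tests; a negative result from `days_until_expiry` means the certificate has already expired.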

### Analysis & Action Agents
- **`risk_scorer`**: Aggregates findings and assigns severity levels
- **`remediation_worker`**: Proposes safe fixes for detected issues
- **`harmonizer_worker`**: Applies security hardening best practices

## Features

### Advanced Capabilities
- **Intelligent Delegation**: Supervisor routes tasks to appropriate specialists
- **Parallel Execution**: Multiple agents can work simultaneously
- **Severity Assessment**: Risk scoring with Critical/High/Medium/Low levels
- **Safe Remediation**: Proposes fixes with confirmation requests
- **Security Hardening**: Automated best-practice application
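
In this project the `risk_scorer` agent has no tools and assigns severity purely through LLM reasoning. For comparison, the sketch below shows one deterministic way to aggregate the same four levels, under the assumption that the worst single finding determines the overall level. The `Severity` enum and `overall_severity` function are illustrative and do not exist in the codebase.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def overall_severity(findings):
    """Return the highest severity among (description, Severity) pairs."""
    if not findings:
        return None
    return max(severity for _, severity in findings)


findings = [
    ("Disk usage at 91% on /var", Severity.MEDIUM),
    ("PHP-FPM service is not running", Severity.HIGH),
]
print(overall_severity(findings).name)
```

Using an ordered enum keeps the comparison explicit and avoids relying on string ordering of the level names.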

### Execution Modes
- **Invoke Mode**: Complete analysis with final result
- **Stream Mode**: Real-time step-by-step execution visibility

## Files

- `main-multi-agent.py`: Complete multi-agent supervisor implementation
- `loghub/`: Symbolic link to log files directory

## Usage

```bash
cd multi-agent-supervisor
python main-multi-agent.py
```

The script demonstrates both execution modes:

### 1. Invoke Mode (Complete Analysis)
```python
result = supervisor.invoke(query)
# Messages in the result are message objects, so use attribute access
print(result["messages"][-1].content)
```

### 2. Stream Mode (Step-by-Step)
```python
for chunk in supervisor.stream(query):
    # Each chunk maps the active node (agent) name to its state update
    for current_agent, update in chunk.items():
        last_message = update["messages"][-1]
        tool_calls = getattr(last_message, "tool_calls", None) or []
        print(f"🤖 ACTIVE AGENT: {current_agent}")
        print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")
```
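
The loop above depends on the shape of each streamed chunk. Based on how `print_step_info` in `utils.py` reads chunks, the sketch below assumes a chunk is a dict that maps a node name to a dict holding a `messages` list. The helper `summarize_chunk` and the fake chunk are hypothetical and use plain Python values so the logic can be exercised without a model.

```python
def summarize_chunk(chunk):
    """Return (node_name, message_count) pairs for one streamed chunk."""
    summary = []
    if not isinstance(chunk, dict):
        return summary
    for node_name, update in chunk.items():
        # Tolerate updates that are missing or carry no messages.
        messages = update.get("messages", []) if isinstance(update, dict) else []
        summary.append((node_name, len(messages)))
    return summary


# Hypothetical chunk with placeholder strings instead of message objects.
fake_chunk = {"nginx_analyzer": {"messages": ["msg-1", "msg-2"]}}
print(summarize_chunk(fake_chunk))
```

Guarding against unexpected chunk shapes keeps a monitoring loop from crashing partway through a long run.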

## Example Workflow

For the query: *"Nginx returns 502 Bad Gateway on my server. What can I do?"*

1. **Supervisor** analyzes the request
2. **system_info_worker** checks system resources
3. **service_inventory_worker** lists running services
4. **nginx_analyzer** validates Nginx configuration and checks logs
5. **phpfpm_analyzer** checks PHP-FPM status (common 502 cause)
6. **risk_scorer** assesses the severity
7. **remediation_worker** proposes specific fixes

## Pros and Cons

### ✅ Pros
- **Domain Expertise**: Each agent specializes in specific areas
- **Parallel Processing**: Multiple agents work simultaneously
- **Comprehensive Analysis**: Systematic approach to complex problems
- **Risk Assessment**: Built-in severity scoring
- **Intelligent Routing**: Tasks go to the right specialist
- **Scalable**: Easy to add new specialized agents

### ❌ Cons
- **Complexity**: More sophisticated setup and debugging
- **Resource Intensive**: Higher computational overhead
- **Coordination Overhead**: Supervisor management complexity
- **Potential Over-engineering**: May be overkill for simple tasks

## When to Use

Choose the multi-agent supervisor when:
- You need comprehensive system analysis
- Multiple services/components are involved
- You want parallel processing capabilities
- Risk assessment and severity scoring are important
- You're dealing with complex, multi-faceted problems
- You need specialized domain expertise

## Agent Interaction Flow

```mermaid
graph TD
    A[User Query] --> B[Supervisor]
    B --> C[system_info_worker]
    B --> D[service_inventory_worker]
    B --> E[Service Specialists]
    E --> F[nginx_analyzer]
    E --> G[mariadb_analyzer]
    E --> H[phpfpm_analyzer]
    C --> I[risk_scorer]
    D --> I
    F --> I
    G --> I
    H --> I
    I --> J[remediation_worker]
    J --> K[Final Response]
```

## Customization

### Adding New Agents
```python
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

new_agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[shell_tool, *custom_tools],  # custom_tools: a list of extra tool instances
    prompt="Your specialized agent prompt...",
    name="new_specialist"
)

# Add to supervisor
supervisor = create_supervisor(
    agents=[*existing_agents, new_agent],
    model=model,
    prompt=updated_supervisor_prompt
)
```

### Custom Tools
```python
from langchain_core.tools import BaseTool

class CustomTool(BaseTool):
    # Type annotations are required on these fields
    name: str = "custom_tool"
    description: str = "Tool description"

    def _run(self, **kwargs):
        # Tool implementation
        result = "..."  # placeholder: compute the tool's output here
        return result
```

## Requirements

```bash
pip install langchain-openai langgraph langgraph-supervisor langchain-community
export OPENAI_API_KEY="your-api-key"
```

## Performance Considerations

- **Token Usage**: Higher due to multiple agent interactions
- **Execution Time**: May be longer due to coordination overhead
- **Memory**: Higher memory usage with multiple concurrent agents
- **Rate Limits**: Monitor API rate limits with parallel requests
143
multi-agent-supervisor/UNDERSTANDING_TRANSFERS.md
Normal file
@@ -0,0 +1,143 @@
# Understanding Multi-Agent Transfers

## What "Successfully transferred..." means

When you see messages like:
- `Successfully transferred to system_info_worker`
- `Successfully transferred back to supervisor`

These are **tool execution results** from the LangGraph supervisor pattern. Here's what's happening:

## 🔄 The Transfer Flow

1. **Supervisor receives user query**: "Nginx returns 502 Bad Gateway on my server. What can I do?"

2. **Supervisor analyzes and delegates**: Based on the `SUPERVISOR_PROMPT` in `config.py`, it decides to start with `system_info_worker`

3. **Transfer tool execution**: Supervisor calls `transfer_to_system_info_worker` tool
   - **Result**: "Successfully transferred to system_info_worker"
   - **Meaning**: Control is now handed to the system_info_worker agent

4. **Agent executes**: The `system_info_worker` gets:
   - Full conversation context (including the original user query)
   - Its own specialized prompt from `agents/system_agents.py`
   - Access to its tools (shell commands for system info)

5. **Agent completes and returns**: Agent calls `transfer_back_to_supervisor`
   - **Result**: "Successfully transferred back to supervisor"
   - **Meaning**: Agent finished its task and returned control
   - **Important**: Agent's results are now part of the conversation history

6. **Supervisor decides next step**: Based on **accumulated results**, supervisor either:
   - Delegates to another agent (e.g., `service_inventory_worker`)
   - Provides final response to user
   - **Key**: Supervisor can see ALL previous agent results when making decisions
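
The steps above can be sketched as a toy simulation in plain Python. This is not how LangGraph implements handoffs: the fixed `plan` replaces the supervisor's LLM decisions, and `run_supervisor`, `system_info`, and `service_inventory` are hypothetical stand-ins. It only illustrates that every participant reads from and appends to one shared history.

```python
def run_supervisor(user_query, plan, workers):
    """Simulate supervisor handoffs over one shared message history."""
    history = [("user", user_query)]
    for agent_name in plan:
        history.append(("tool", f"Successfully transferred to {agent_name}"))
        # The worker sees the full history, including earlier agents' results.
        result = workers[agent_name](history)
        history.append((agent_name, result))
        history.append(("tool", "Successfully transferred back to supervisor"))
    history.append(("supervisor", f"Final answer based on {len(plan)} agent result(s)"))
    return history


def system_info(history):
    return "CPU, RAM, and disk look healthy"


def service_inventory(history):
    # Count results from earlier agents to show that context accumulates.
    earlier = [who for who, _ in history if who not in ("user", "tool", "supervisor")]
    return f"nginx is running; saw {len(earlier)} earlier result(s)"


workers = {
    "system_info_worker": system_info,
    "service_inventory_worker": service_inventory,
}
history = run_supervisor("Nginx returns 502", list(workers), workers)
for sender, content in history:
    print(f"{sender}: {content}")
```

Note that the transfer confirmations carry no findings themselves; the agent results are separate entries in the same history.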

## 🧠 How Prompts Work

### Supervisor Prompt (config.py)
```python
SUPERVISOR_PROMPT = """
You are the supervisor of a team of specialised sysadmin agents.
Decide which agent to delegate to based on the user's query **or** on results already collected.
Available agents:
- system_info_worker: gather system metrics
- service_inventory_worker: list running services
- mariadb_analyzer: analyse MariaDB
...
Always start with `system_info_worker` and `service_inventory_worker` before drilling into a specific service.
"""
```

### Agent Prompts (agents/*.py)
Each agent has its own specialized prompt, for example:

```python
# system_info_worker prompt
"""
You are a Linux sysadmin. Use shell commands like `lscpu`, `free -h`, and `df -h` to gather CPU, RAM, and disk usage.
Return a concise plain‑text summary. Only run safe, read‑only commands.
"""
```

## 🎯 What Each Agent Receives

When an agent is activated via transfer:
- **Full conversation history**: All previous messages between user, supervisor, and other agents
- **Specialized prompt**: Guides how the agent should interpret and act on the conversation
- **Tools**: Shell access, specific analyzers, etc.
- **Context**: Results from previous agents in the conversation

## 🔄 How Agent Results Flow Back to Supervisor

**This is the key mechanism that makes the multi-agent system intelligent:**

1. **Agent produces results**: Each agent generates an `AIMessage` with its findings/analysis
2. **Results become part of conversation**: The agent's response is added to the shared message history
3. **Supervisor sees everything**: When control returns to supervisor, it has access to:
   - Original user query
   - All previous agent responses
   - Tool execution results
   - Complete conversation context

4. **Supervisor strategy updates**: Based on accumulated knowledge, supervisor can:
   - Decide which agent to call next
   - Skip unnecessary agents if enough info is gathered
   - Synthesize results from multiple agents
   - Provide final comprehensive response

### Example Flow:
```
User: "Nginx 502 error, help!"
├── Supervisor → system_info_worker
│   └── Returns: "502 usually means upstream server issues, check logs..."
├── Supervisor (now knows about upstream issues) → service_inventory_worker
│   └── Returns: "Check PHP-FPM status, verify upstream config..."
└── Supervisor (has both perspectives) → Final synthesis
    └── "Based on system analysis and service inventory, here's a comprehensive solution..."
```

## 🔍 Enhanced Debugging

The updated `utils.py` now shows:
- **Transfer explanations**: What each "Successfully transferred" means
- **Conversation context**: Last few messages to understand the flow
- **Tool call details**: What tools are being used and why
- **Agent delegation**: Which agent is being called and for what purpose

## 🔍 Observing Result Flow in Practice

To see how results flow back to the supervisor, run the enhanced debugging and watch for:

1. **Agent Results**: Look for `AIMessage` from agents (not just transfer confirmations)
2. **Conversation Context**: The expanding message history in each step
3. **Supervisor Decision Changes**: How the supervisor's next choice is influenced by results
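
When reading a long trace, it helps to separate real agent findings from transfer confirmations. The sketch below does this for a list of `(sender, content)` pairs, assuming transfer confirmations always begin with the text "Successfully transferred" as in the examples in this document. The function `split_messages` and the sample data are hypothetical.

```python
def split_messages(messages):
    """Split (sender, content) pairs into agent results and transfer notices."""
    results, transfers = [], []
    for sender, content in messages:
        if content.startswith("Successfully transferred"):
            transfers.append((sender, content))
        else:
            results.append((sender, content))
    return results, transfers


# Hypothetical trace in simplified form.
trace = [
    ("supervisor", "Successfully transferred to system_info_worker"),
    ("system_info_worker", "Disk usage is at 91% on /var"),
    ("system_info_worker", "Successfully transferred back to supervisor"),
]
results, transfers = split_messages(trace)
print(results)
```

With real message objects, the same idea would apply to each message's text content rather than to tuples.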

### Example Debug Output Analysis:
```
🔄 STEP 2: system_info_worker
💬 MESSAGE TYPE: AIMessage ← AGENT'S ACTUAL RESULT
📄 CONTENT: "502 typically indicates upstream server issues..."

🔄 STEP 4: service_inventory_worker
💬 MESSAGE TYPE: AIMessage ← AGENT'S ACTUAL RESULT
📄 CONTENT: "Check PHP-FPM status, verify upstream config..."

🔄 STEP 5: supervisor
💬 MESSAGE TYPE: AIMessage ← SUPERVISOR'S SYNTHESIS
📄 CONTENT: "Based on system analysis and service inventory..."
📚 CONVERSATION CONTEXT (12 messages) ← SUPERVISOR SEES ALL RESULTS
```

The supervisor's final response demonstrates it has processed and synthesized results from both agents!

## 📋 Key Takeaways

- **"Successfully transferred"** = Control handoff confirmation, not data transfer
- **Each agent** gets the full conversation context INCLUDING previous agent results
- **Agent prompts** determine how they process that context
- **Supervisor** orchestrates the workflow based on its prompt strategy
- **The conversation** builds up context as each agent contributes their expertise
- **Results accumulate**: Each agent can see and build upon previous agents' work
- **Supervisor learns**: Strategy updates based on what agents discover
- **Dynamic workflow**: Supervisor can skip agents or change direction based on results
33
multi-agent-supervisor/agents/__init__.py
Normal file
@@ -0,0 +1,33 @@
"""Agent definitions for the multi-agent sysadmin system."""

from .system_agents import (
    create_system_info_worker,
    create_service_inventory_worker,
)
from .service_agents import (
    create_mariadb_worker,
    create_nginx_worker,
    create_phpfpm_worker,
)
from .network_agents import (
    create_network_worker,
    create_cert_worker,
)
from .analysis_agents import (
    create_risk_worker,
    create_remediation_worker,
    create_harmonizer_worker,
)

__all__ = [
    "create_system_info_worker",
    "create_service_inventory_worker",
    "create_mariadb_worker",
    "create_nginx_worker",
    "create_phpfpm_worker",
    "create_network_worker",
    "create_cert_worker",
    "create_risk_worker",
    "create_remediation_worker",
    "create_harmonizer_worker",
]
42
multi-agent-supervisor/agents/analysis_agents.py
Normal file
@@ -0,0 +1,42 @@
"""Analysis and remediation agents."""

from langgraph.prebuilt import create_react_agent
from custom_tools import get_shell_tool


def create_risk_worker():
    """Create risk assessment agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[],  # pure‑LLM reasoning
        prompt="""
Aggregate the findings from other agents and assign a severity: Critical, High, Medium, or Low.
Output a short report.
""",
        name="risk_scorer"
    )


def create_remediation_worker():
    """Create remediation agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
Propose safe bash commands or configuration edits to fix detected issues.
NEVER run destructive commands automatically; always request confirmation.
""",
        name="remediation_worker"
    )


def create_harmonizer_worker():
    """Create system hardening agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
Apply best‑practice hardening (`ulimit`, `sysctl`, journald rotation) in dry‑run mode unless severity is High.
""",
        name="harmonizer_worker"
    )
29
multi-agent-supervisor/agents/network_agents.py
Normal file
@@ -0,0 +1,29 @@
"""Network and security monitoring agents."""

from langgraph.prebuilt import create_react_agent
from custom_tools import get_shell_tool


def create_network_worker():
    """Create network diagnostics agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
Diagnose network issues using `ping`, `traceroute`, and `dig`.
""",
        name="network_diag"
    )


def create_cert_worker():
    """Create certificate checking agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
Check TLS certificates on disk with `openssl x509 -noout -enddate -in <cert>`.
Raise an alert when a certificate expires in fewer than 30 days.
""",
        name="cert_checker"
    )
42
multi-agent-supervisor/agents/service_agents.py
Normal file
@@ -0,0 +1,42 @@
"""Service-specific monitoring agents."""

from langgraph.prebuilt import create_react_agent
from custom_tools import get_shell_tool, LogTailTool


def create_mariadb_worker():
    """Create MariaDB analysis agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool(), LogTailTool()],
        prompt="""
You are a MariaDB expert. Check config files in /etc/mysql and inspect `/var/log/mysql/*.log` for errors.
Use `mysqladmin status` and other read‑only commands. Use the `tail_log` tool for logs.
""",
        name="mariadb_analyzer"
    )


def create_nginx_worker():
    """Create Nginx analysis agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool(), LogTailTool()],
        prompt="""
You are an Nginx expert. Validate configuration with `nginx -t` and inspect access/error logs.
Use the `tail_log` tool for `/var/log/nginx/error.log`.
""",
        name="nginx_analyzer"
    )


def create_phpfpm_worker():
    """Create PHP-FPM analysis agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool(), LogTailTool()],
        prompt="""
You are a PHP‑FPM expert. Check `systemctl status php*-fpm` and look for memory leaks or timeouts in the logs.
""",
        name="phpfpm_analyzer"
    )
30
multi-agent-supervisor/agents/system_agents.py
Normal file
@@ -0,0 +1,30 @@
"""System monitoring agents."""

from langgraph.prebuilt import create_react_agent
from custom_tools import get_shell_tool


def create_system_info_worker():
    """Create system information gathering agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
You are a Linux sysadmin. Use shell commands like `lscpu`, `free -h`, and `df -h` to gather CPU, RAM, and disk usage.
Return a concise plain‑text summary. Only run safe, read‑only commands.
""",
        name="system_info_worker"
    )


def create_service_inventory_worker():
    """Create service inventory agent."""
    return create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[get_shell_tool()],
        prompt="""
List all running services using `systemctl list-units --type=service --state=running`.
Return a JSON array of service names.
""",
        name="service_inventory_worker"
    )
26
multi-agent-supervisor/config.py
Normal file
@@ -0,0 +1,26 @@
"""Configuration settings for the multi-agent system."""

from langchain_openai import ChatOpenAI


def get_base_model():
    """Get the base LLM model configuration."""
    return ChatOpenAI(model="gpt-4o-mini", temperature=0)


SUPERVISOR_PROMPT = """
You are the supervisor of a team of specialised sysadmin agents.
Decide which agent to delegate to based on the user's query **or** on results already collected.
Available agents:
- system_info_worker: gather system metrics
- service_inventory_worker: list running services
- mariadb_analyzer: analyse MariaDB
- nginx_analyzer: analyse Nginx
- phpfpm_analyzer: analyse PHP‑FPM
- network_diag: diagnose network issues
- cert_checker: check TLS certificates
- risk_scorer: aggregate severity
- remediation_worker: propose fixes
- harmonizer_worker: apply hardening
Always start with `system_info_worker` and `service_inventory_worker` before drilling into a specific service.
"""
6
multi-agent-supervisor/custom_tools/__init__.py
Normal file
@@ -0,0 +1,6 @@
"""Custom tools for the multi-agent sysadmin system."""

from .log_tail_tool import LogTailTool
from .shell_tool_wrapper import get_shell_tool

__all__ = ["LogTailTool", "get_shell_tool"]
24
multi-agent-supervisor/custom_tools/log_tail_tool.py
Normal file
@@ -0,0 +1,24 @@
"""Log tail tool for reading log files."""

import subprocess
from langchain_core.tools import BaseTool


class LogTailTool(BaseTool):
    """Tail the last N lines from a log file."""

    name: str = "tail_log"
    description: str = "Tail the last N lines of a log file given its path and optional number of lines."

    def _run(self, path: str, lines: int = 500):  # type: ignore[override]
        """Run the tool to tail log files."""
        try:
            return subprocess.check_output(["tail", "-n", str(lines), path], text=True)
        except subprocess.CalledProcessError as e:
            # `tail` exits non-zero when the log file is missing or unreadable
            return f"Error reading log file {path}: {e}"
        except FileNotFoundError:
            # Raised when the `tail` executable itself cannot be found
            return f"`tail` command not found; cannot read {path}"

    async def _arun(self, *args, **kwargs):  # noqa: D401
        """Async version not implemented."""
        raise NotImplementedError("Use the synchronous version of this tool.")
@@ -0,0 +1,8 @@
"""Shell tool wrapper for consistent access."""

from langchain_community.tools import ShellTool


def get_shell_tool() -> ShellTool:
    """Get a configured shell tool instance."""
    return ShellTool()
0
multi-agent-supervisor/examples.py
Normal file
1
multi-agent-supervisor/loghub
Symbolic link
@@ -0,0 +1 @@
../loghub
68
multi-agent-supervisor/main-multi-agent.py
Normal file
@@ -0,0 +1,68 @@
# Multi-agent sysadmin assistant using LangChain + LangGraph Supervisor
# Requires: `pip install langchain-openai langgraph langgraph-supervisor langchain-community`

from __future__ import annotations

from supervisor import create_sysadmin_supervisor
from utils import print_step_info, explain_supervisor_pattern

if __name__ == "__main__":
    # Create the supervisor
    supervisor = create_sysadmin_supervisor()

    # Example run - demonstrating both invoke and streaming with debug output
    query = {
        "messages": [
            {
                "role": "user",
                "content": "Nginx returns 502 Bad Gateway on my server. What can I do?",
            }
        ]
    }

    print("🚀 Starting multi-agent sysadmin analysis...")
    print(f"📝 User Query: {query['messages'][0]['content']}")
    print("=" * 80)

    # Show explanation of the supervisor pattern
    explain_supervisor_pattern()

    print("\n=== Using invoke() method ===")
    result = supervisor.invoke(query)

    print("\n📊 FINAL RESULT:")
    print("-" * 40)
    print(result["messages"][-1].content)
    print("-" * 40)

    print(f"\n📈 Total messages exchanged: {len(result['messages'])}")

    print("\n=== Using stream() method for detailed step-by-step analysis ===")
    step_count = 0
    max_steps = 20  # Prevent infinite loops

    try:
        chunks_processed = []
        for chunk in supervisor.stream(query):
            step_count += 1
            chunks_processed.append(chunk)
            print_step_info(step_count, chunk)

            # Safety check to prevent infinite loops
            if step_count >= max_steps:
                print(f"\n⚠️ Reached maximum steps ({max_steps}), stopping stream...")
                break

        print(f"\n✅ Streaming completed successfully with {step_count} steps")
        print(f"📊 Total chunks processed: {len(chunks_processed)}")

        # Check if the last chunk contains a complete final response
        if chunks_processed:
            last_chunk = chunks_processed[-1]
            print(f"🔍 Last chunk keys: {list(last_chunk.keys()) if isinstance(last_chunk, dict) else type(last_chunk)}")

    except Exception as e:
        print(f"\n❌ Streaming error after {step_count} steps: {e}")
        print("💡 The invoke() method worked fine, so the supervisor itself is functional.")
        import traceback
        traceback.print_exc()
37
multi-agent-supervisor/supervisor.py
Normal file
@@ -0,0 +1,37 @@
"""Multi-agent supervisor for sysadmin tasks."""

from langchain_openai import ChatOpenAI
from langgraph_supervisor import create_supervisor

from agents.system_agents import create_system_info_worker, create_service_inventory_worker
from agents.service_agents import create_mariadb_worker, create_nginx_worker, create_phpfpm_worker
from agents.network_agents import create_network_worker, create_cert_worker
from agents.analysis_agents import create_risk_worker, create_remediation_worker, create_harmonizer_worker
from config import get_base_model, SUPERVISOR_PROMPT


def create_sysadmin_supervisor():
    """Create a supervisor that coordinates sysadmin agents."""

    # Create all the specialized agents
    agents = [
        create_system_info_worker(),
        create_service_inventory_worker(),
        create_mariadb_worker(),
        create_nginx_worker(),
        create_phpfpm_worker(),
        create_network_worker(),
        create_cert_worker(),
        create_risk_worker(),
        create_remediation_worker(),
        create_harmonizer_worker(),
    ]

    # Create and return the supervisor
    supervisor = create_supervisor(
        agents=agents,
        model=get_base_model(),
        prompt=SUPERVISOR_PROMPT
    )

    return supervisor.compile()
142
multi-agent-supervisor/utils.py
Normal file
@@ -0,0 +1,142 @@
"""Utility functions for the multi-agent system."""


def explain_supervisor_pattern():
    """Explain how the LangGraph supervisor pattern works."""
    print("🏗️ MULTI-AGENT SUPERVISOR PATTERN EXPLANATION:")
    print("=" * 60)
    print("1. 🎯 SUPERVISOR: Receives user query and decides which agent to delegate to")
    print("2. 🔄 TRANSFER: Uses transfer tools (e.g., transfer_to_system_info_worker)")
    print("3. 🤖 AGENT: Specialized agent executes its task with its own prompt/tools")
    print("4. 🔙 RETURN: Agent uses transfer_back_to_supervisor when done")
    print("5. 🧠 DECISION: Supervisor analyzes results and decides next agent or final response")
    print()
    print("📋 WHAT 'Successfully transferred' MEANS:")
    print(" - It's the response from a transfer tool call")
    print(" - Indicates control handoff between supervisor and agent")
    print(" - Each agent gets the full conversation context")
    print(" - Agent's prompt guides how it processes that context")
    print()
    print("🔍 SUPERVISOR PROMPT (from config.py):")
    print(" - Defines available agents and their specialties")
    print(" - Guides delegation strategy (start with system_info & service_inventory)")
    print(" - Agent prompts are in agents/*.py files")
    print("=" * 60)
    print()


# Node names that can appear as keys in a streamed chunk
KNOWN_AGENTS = (
    'system_info_worker', 'service_inventory_worker', 'mariadb_analyzer',
    'nginx_analyzer', 'phpfpm_analyzer', 'network_diag', 'cert_checker',
    'risk_scorer', 'remediation_worker', 'harmonizer_worker', 'supervisor',
)


def _tool_call_field(tool_call, field, default=None):
    """Read a field from a tool call, whether it is a dict or an object.

    LangChain exposes AIMessage.tool_calls as dicts, so a plain getattr()
    would always fall back to the default.
    """
    if isinstance(tool_call, dict):
        return tool_call.get(field, default)
    return getattr(tool_call, field, default)


def print_step_info(step_count: int, chunk):
    """Print formatted step information during streaming."""
    print(f"\n🔄 STEP {step_count}:")
    print("-" * 30)

    try:
        # Extract agent information from chunk
        if isinstance(chunk, dict):
            # Look for agent names in the chunk keys
            agent_names = [key for key in chunk if key in KNOWN_AGENTS]

            if agent_names:
                current_agent = agent_names[0]
                print(f"🤖 ACTIVE AGENT: {current_agent}")

                # Show the messages from this agent (the update may be None)
                agent_data = chunk[current_agent] or {}
                messages = agent_data.get('messages') or []
                if messages:
                    last_message = messages[-1]
                    # Get message type from the class name
                    message_type = type(last_message).__name__
                    print(f"💬 MESSAGE TYPE: {message_type}")

                    # Always bind content as a string so later checks cannot
                    # hit an unbound name or a list of content blocks
                    raw_content = getattr(last_message, 'content', '')
                    content = raw_content if isinstance(raw_content, str) else str(raw_content or '')

                    # Show content preview if available
                    if content:
                        print(f"📏 CONTENT LENGTH: {len(content)} characters")

                        # Show full content for final AI responses, abbreviated for others
                        if message_type == 'AIMessage':
                            print("📄 FULL CONTENT:")
                            print(content)
                            print()  # Extra line for readability
                        else:
                            # Truncate other message types for brevity
                            preview = content[:200] + "..." if len(content) > 200 else content
                            print("📄 CONTENT PREVIEW:")
                            print(preview)
                            print()  # Extra line for readability

                    # Show tool calls if any
                    tool_calls = getattr(last_message, 'tool_calls', None)
                    if tool_calls:
                        print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")
                        for i, tool_call in enumerate(tool_calls):
                            tool_name = _tool_call_field(tool_call, 'name') or 'unknown'
                            print(f" {i+1}. {tool_name}")
                            # Show transfer details for supervisor delegation
                            if tool_name.startswith('transfer_to_'):
                                target_agent = tool_name.replace('transfer_to_', '')
                                print(f" 🎯 DELEGATING to: {target_agent}")
                                # Show the arguments/context being passed
                                args = _tool_call_field(tool_call, 'args')
                                if args:
                                    print(f" 📋 Context/Args: {args}")

                    # Show additional info for ToolMessage
                    if message_type == 'ToolMessage':
                        # name can be missing or None, so normalize it first
                        tool_name = getattr(last_message, 'name', None) or ''
                        if tool_name:
                            print(f"🔧 TOOL NAME: {tool_name}")

                            # Explain what "Successfully transferred" means
                            if "transfer" in tool_name and "Successfully transferred" in content:
                                if tool_name.startswith('transfer_to_'):
                                    target_agent = tool_name.replace('transfer_to_', '')
                                    print(f" ℹ️ EXPLANATION: Supervisor delegated control to {target_agent}")
                                    print(f" ℹ️ The {target_agent} will now execute its specialized tasks")
                                elif tool_name == 'transfer_back_to_supervisor':
                                    print(" ℹ️ EXPLANATION: Agent completed its task and returned control to supervisor")
                                    print(" ℹ️ Supervisor will decide the next step based on results")

                        if hasattr(last_message, 'tool_call_id'):
                            print(f"🔧 TOOL CALL ID: {last_message.tool_call_id}")

                # Show conversation context for better understanding
                if len(messages) > 1:
                    print(f"\n📚 CONVERSATION CONTEXT ({len(messages)} messages):")
                    for i, msg in enumerate(messages[-3:], start=max(0, len(messages) - 3)):
                        msg_type = type(msg).__name__
                        if hasattr(msg, 'content') and msg.content:
                            text = msg.content if isinstance(msg.content, str) else str(msg.content)
                            preview = text[:100].replace('\n', ' ')
                            if len(text) > 100:
                                preview += "..."
                            print(f" {i+1}. {msg_type}: {preview}")
                        elif hasattr(msg, 'tool_calls') and msg.tool_calls:
                            tool_names = [_tool_call_field(tc, 'name') or 'unknown' for tc in msg.tool_calls]
                            print(f" {i+1}. {msg_type}: Tool calls: {tool_names}")
                        else:
                            print(f" {i+1}. {msg_type}: (no content)")

                print()  # Extra spacing for readability
            else:
                print("📋 CHUNK DATA:")
                # Show first few keys for debugging
                chunk_keys = list(chunk.keys())[:3]
                print(f" Keys: {chunk_keys}")
        else:
            print(f"📦 CHUNK TYPE: {type(chunk)}")
            print(f"📄 CONTENT: {str(chunk)[:100]}...")

    except Exception as e:
        print(f"❌ Error processing chunk: {e}")
        print(f"📦 CHUNK TYPE: {type(chunk)}")
        if hasattr(chunk, '__dict__'):
            print(f"📄 CHUNK ATTRIBUTES: {list(chunk.__dict__.keys())}")

    print("-" * 30)
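The transfer-tool naming that `print_step_info` relies on can be exercised in isolation. The sketch below is a minimal, standalone version of the two lookups involved: reading a tool call's name and deriving the target agent from a `transfer_to_<agent>` tool name. The helper names `tool_call_name` and `transfer_target` are hypothetical and not part of this commit. It assumes tool calls may arrive either as dicts (the shape LangChain uses for `AIMessage.tool_calls`) or as objects with a `name` attribute.

```python
from typing import Any, Optional

TRANSFER_PREFIX = "transfer_to_"


def tool_call_name(tool_call: Any) -> str:
    """Return the tool name from a dict-shaped or attribute-shaped tool call."""
    if isinstance(tool_call, dict):
        name = tool_call.get("name")
    else:
        name = getattr(tool_call, "name", None)
    # Fall back to a placeholder when the name is missing or empty.
    return name or "unknown"


def transfer_target(tool_name: str) -> Optional[str]:
    """Return the agent a transfer tool hands off to, or None for other tools."""
    if tool_name.startswith(TRANSFER_PREFIX):
        # Strip only the leading prefix, not later occurrences of the same text.
        return tool_name[len(TRANSFER_PREFIX):]
    return None


if __name__ == "__main__":
    call = {"name": "transfer_to_nginx_analyzer", "args": {}, "id": "call_1"}
    print(transfer_target(tool_call_name(call)))
```

Slicing off the prefix rather than calling `str.replace` is a deliberate choice here: it leaves the rest of the name untouched even if the prefix text happens to recur inside an agent name.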