implement 2 strategies

2025-06-26 14:52:36 +02:00
parent 90ac5e9e82
commit 331e2e434d
23 changed files with 1080 additions and 747 deletions
--- a/multi-agent-supervisor/README-modular.md
+++ b/multi-agent-supervisor/README-modular.md
@@ -0,0 +1,90 @@
+# Multi-Agent Sysadmin Assistant
+
+A modular multi-agent system for system administration tasks using LangChain and LangGraph.
+
+## Architecture
+
+The system is organized into several modules for better maintainability:
+
+### 📁 Project Structure
+
+```
+multi-agent-supervisor/
+├── main-multi-agent.py      # Main entry point
+├── config.py                # Configuration and settings
+├── supervisor.py            # Supervisor orchestration
+├── utils.py                 # Utility functions
+├── requirements.txt         # Dependencies
+├── custom_tools/            # Custom tool implementations
+│   ├── __init__.py
+│   ├── log_tail_tool.py     # Log reading tool
+│   └── shell_tool_wrapper.py # Shell tool wrapper
+└── agents/                  # Agent definitions
+    ├── __init__.py
+    ├── system_agents.py     # System monitoring agents
+    ├── service_agents.py    # Service-specific agents
+    ├── network_agents.py    # Network and security agents
+    └── analysis_agents.py   # Analysis and remediation agents
+```
+
+## Agents
+
+### System Agents
+- **System Info Worker**: Gathers CPU, RAM, and disk usage
+- **Service Inventory Worker**: Lists running services
+
+### Service Agents  
+- **MariaDB Analyzer**: Checks MariaDB configuration and logs
+- **Nginx Analyzer**: Validates Nginx configuration and logs
+- **PHP-FPM Analyzer**: Monitors PHP-FPM status and performance
+
+### Network Agents
+- **Network Diagnostics**: Uses ping, traceroute, and dig
+- **Certificate Checker**: Monitors TLS certificate expiration
+
+### Analysis Agents
+- **Risk Scorer**: Aggregates findings and assigns severity levels
+- **Remediation Worker**: Proposes safe fixes for issues
+- **Harmonizer Worker**: Applies system hardening best practices
+
+## Benefits of Modular Architecture
+
+1. **Separation of Concerns**: Each module has a single responsibility
+2. **Reusability**: Tools and agents can be easily reused across projects
+3. **Maintainability**: Easy to update individual components
+4. **Testability**: Each module can be tested independently
+5. **Scalability**: Easy to add new agents or tools
+6. **Code Organization**: Clear structure makes navigation easier
+
+## Usage
+
+```python
+from supervisor import create_sysadmin_supervisor
+
+# Create supervisor with all agents
+supervisor = create_sysadmin_supervisor()
+
+# Run analysis
+query = {
+    "messages": [
+        {
+            "role": "user", 
+            "content": "Check if my web server is running properly"
+        }
+    ]
+}
+
+result = supervisor.invoke(query)
+```
+
+## Adding New Agents
+
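+1. Create the agent factory function in the appropriate module under `agents/`
+2. Import it and add it to the `agents` list in `supervisor.py`
+3. Update the supervisor prompt in `config.py` so the supervisor knows the new agent exists
+4. Add the agent's name to the list in `print_step_info` (`utils.py`) so streamed steps are labelled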
+
+## Adding New Tools
+
+1. Create tool class in `custom_tools/`
+2. Export from `custom_tools/__init__.py`
+3. Import and use in agent definitions
--- a/multi-agent-supervisor/README.md
+++ b/multi-agent-supervisor/README.md
@@ -0,0 +1,185 @@
+# Multi-Agent Supervisor System for Sysadmin Tasks
+
+This directory contains a sophisticated multi-agent system with a supervisor pattern for comprehensive system administration and troubleshooting.
+
+## Overview
+
+The multi-agent supervisor system uses multiple specialized agents coordinated by a supervisor to handle complex sysadmin tasks:
+
+1. **Supervisor Agent**: Orchestrates and delegates tasks to specialized workers
+2. **Specialized Workers**: Each agent is an expert in a specific domain
+3. **Parallel Processing**: Multiple agents can work simultaneously
+4. **Intelligent Routing**: Tasks are routed to the most appropriate specialist
+
+## Architecture
+
+```
+User Input → Supervisor → Specialized Agents → Aggregated Response
+                ↓
+    ┌─────────────────────────────────────────────────┐
+    │  system_info │ nginx │ mariadb │ network │ ...  │
+    └─────────────────────────────────────────────────┘
+```
+
+## Specialized Agents
+
+### Core System Agents
+- **`system_info_worker`**: CPU, RAM, disk usage monitoring
+- **`service_inventory_worker`**: Lists running services
+
+### Service-Specific Agents
+- **`mariadb_analyzer`**: MariaDB configuration and log analysis
+- **`nginx_analyzer`**: Nginx configuration validation and log analysis
+- **`phpfpm_analyzer`**: PHP-FPM performance and error analysis
+
+### Network & Security Agents
+- **`network_diag`**: Network connectivity and DNS diagnostics
+- **`cert_checker`**: TLS certificate validation and expiry alerts
+
+### Analysis & Action Agents
+- **`risk_scorer`**: Aggregates findings and assigns severity levels
+- **`remediation_worker`**: Proposes safe fixes for detected issues
+- **`harmonizer_worker`**: Applies security hardening best practices
+
+## Features
+
+### Advanced Capabilities
+- **Intelligent Delegation**: Supervisor routes tasks to appropriate specialists
+- **Parallel Execution**: Multiple agents can work simultaneously
+- **Severity Assessment**: Risk scoring with Critical/High/Medium/Low levels
+- **Safe Remediation**: Proposes fixes with confirmation requests
+- **Security Hardening**: Automated best-practice application
+
+### Execution Modes
+- **Invoke Mode**: Complete analysis with final result
+- **Stream Mode**: Real-time step-by-step execution visibility
+
+## Files
+
+- `main-multi-agent.py`: Entry point that runs the supervisor in both invoke and stream modes
+- `loghub/`: Symbolic link to log files directory
+
+## Usage
+
+```bash
+cd multi-agent-supervisor
+python main-multi-agent.py
+```
+
+The script includes both execution modes:
+
+### 1. Invoke Mode (Complete Analysis)
+```python
+result = supervisor.invoke(query)
+print(result["messages"][-1].content)  # messages are message objects, not dicts
+```
+
+### 2. Stream Mode (Step-by-Step)
+```python
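+for chunk in supervisor.stream(query):
+    # Each chunk maps the active node's name to its state update
+    for current_agent, update in chunk.items():
+        messages = update.get("messages", [])
+        tool_calls = getattr(messages[-1], "tool_calls", []) if messages else []
+        print(f"🤖 ACTIVE AGENT: {current_agent}")
+        print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")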
+```
+
+## Example Workflow
+
+For the query: *"Nginx returns 502 Bad Gateway on my server. What can I do?"*
+
+1. **Supervisor** analyzes the request
+2. **system_info_worker** checks system resources
+3. **service_inventory_worker** lists running services
+4. **nginx_analyzer** validates Nginx configuration and checks logs
+5. **phpfpm_analyzer** checks PHP-FPM status (common 502 cause)
+6. **risk_scorer** assesses the severity
+7. **remediation_worker** proposes specific fixes
+
+## Pros and Cons
+
+### ✅ Pros
+- **Domain Expertise**: Each agent specializes in specific areas
+- **Parallel Processing**: Multiple agents work simultaneously
+- **Comprehensive Analysis**: Systematic approach to complex problems
+- **Risk Assessment**: Built-in severity scoring
+- **Intelligent Routing**: Tasks go to the right specialist
+- **Scalable**: Easy to add new specialized agents
+
+### ❌ Cons
+- **Complexity**: More sophisticated setup and debugging
+- **Resource Intensive**: Higher computational overhead
+- **Coordination Overhead**: Supervisor management complexity
+- **Potential Over-engineering**: May be overkill for simple tasks
+
+## When to Use
+
+Choose the multi-agent supervisor when:
+- You need comprehensive system analysis
+- Multiple services/components are involved
+- You want parallel processing capabilities
+- Risk assessment and severity scoring are important
+- You're dealing with complex, multi-faceted problems
+- You need specialized domain expertise
+
+## Agent Interaction Flow
+
+```mermaid
+graph TD
+    A[User Query] --> B[Supervisor]
+    B --> C[system_info_worker]
+    B --> D[service_inventory_worker]
+    B --> E[Service Specialists]
+    E --> F[nginx_analyzer]
+    E --> G[mariadb_analyzer]
+    E --> H[phpfpm_analyzer]
+    C --> I[risk_scorer]
+    D --> I
+    F --> I
+    G --> I
+    H --> I
+    I --> J[remediation_worker]
+    J --> K[Final Response]
+```
+
+## Customization
+
+### Adding New Agents
+```python
+new_agent = create_react_agent(
+    model="openai:gpt-4o-mini",
+    tools=[shell_tool, custom_tools],
+    prompt="Your specialized agent prompt...",
+    name="new_specialist"
+)
+
+# Add to supervisor
+supervisor = create_supervisor(
+    agents=[*existing_agents, new_agent],
+    model=model,
+    prompt=updated_supervisor_prompt
+)
+```
+
+### Custom Tools
+```python
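+from langchain_core.tools import BaseTool
+
+
+class CustomTool(BaseTool):
+    # Fields are annotated, as in custom_tools/log_tail_tool.py
+    name: str = "custom_tool"
+    description: str = "Tool description"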
+    
+    def _run(self, **kwargs):
+        # Tool implementation
+        return result
+```
+
+## Requirements
+
+```bash
+pip install langchain-openai langgraph langgraph-supervisor langchain-community
+export OPENAI_API_KEY="your-api-key"
+```
+
+## Performance Considerations
+
+- **Token Usage**: Higher due to multiple agent interactions
+- **Execution Time**: May be longer due to coordination overhead
+- **Memory**: Higher memory usage with multiple concurrent agents
+- **Rate Limits**: Monitor API rate limits with parallel requests
--- a/multi-agent-supervisor/UNDERSTANDING_TRANSFERS.md
+++ b/multi-agent-supervisor/UNDERSTANDING_TRANSFERS.md
@@ -0,0 +1,143 @@
+# Understanding Multi-Agent Transfers
+
+## What "Successfully transferred..." means
+
+When you see messages like:
+- `Successfully transferred to system_info_worker`
+- `Successfully transferred back to supervisor`
+
+These are **tool execution results** from the LangGraph supervisor pattern. Here's what's happening:
+
+## 🔄 The Transfer Flow
+
+1. **Supervisor receives user query**: "Nginx returns 502 Bad Gateway on my server. What can I do?"
+
+2. **Supervisor analyzes and delegates**: Based on the `SUPERVISOR_PROMPT` in `config.py`, it decides to start with `system_info_worker`
+
+3. **Transfer tool execution**: Supervisor calls `transfer_to_system_info_worker` tool
+   - **Result**: "Successfully transferred to system_info_worker"
+   - **Meaning**: Control is now handed to the system_info_worker agent
+
+4. **Agent executes**: The `system_info_worker` gets:
+   - Full conversation context (including the original user query)
+   - Its own specialized prompt from `agents/system_agents.py`
+   - Access to its tools (shell commands for system info)
+
+5. **Agent completes and returns**: Agent calls `transfer_back_to_supervisor`
+   - **Result**: "Successfully transferred back to supervisor"
+   - **Meaning**: Agent finished its task and returned control
+   - **Important**: Agent's results are now part of the conversation history
+
+6. **Supervisor decides next step**: Based on **accumulated results**, supervisor either:
+   - Delegates to another agent (e.g., `service_inventory_worker`)
+   - Provides final response to user
+   - **Key**: Supervisor can see ALL previous agent results when making decisions
+
+## 🧠 How Prompts Work
+
+### Supervisor Prompt (config.py)
+```python
+SUPERVISOR_PROMPT = """
+You are the supervisor of a team of specialised sysadmin agents.
+Decide which agent to delegate to based on the user's query **or** on results already collected.
+Available agents:
+- system_info_worker: gather system metrics
+- service_inventory_worker: list running services  
+- mariadb_analyzer: analyse MariaDB
+...
+Always start with `system_info_worker` and `service_inventory_worker` before drilling into a specific service.
+"""
+```
+
+### Agent Prompts (agents/*.py)
+Each agent has its own specialized prompt, for example:
+
+```python
+# system_info_worker prompt
+"""
+You are a Linux sysadmin. Use shell commands like `lscpu`, `free -h`, and `df -h` to gather CPU, RAM, and disk usage. 
+Return a concise plain‑text summary. Only run safe, read‑only commands.
+"""
+```
+
+## 🎯 What Each Agent Receives
+
+When an agent is activated via transfer:
+- **Full conversation history**: All previous messages between user, supervisor, and other agents
+- **Specialized prompt**: Guides how the agent should interpret and act on the conversation
+- **Tools**: Shell access, specific analyzers, etc.
+- **Context**: Results from previous agents in the conversation
+
+## 🔄 How Agent Results Flow Back to Supervisor
+
+**This is the key mechanism that makes the multi-agent system intelligent:**
+
+1. **Agent produces results**: Each agent generates an `AIMessage` with its findings/analysis
+2. **Results become part of conversation**: The agent's response is added to the shared message history
+3. **Supervisor sees everything**: When control returns to supervisor, it has access to:
+   - Original user query
+   - All previous agent responses
+   - Tool execution results
+   - Complete conversation context
+
+4. **Supervisor strategy updates**: Based on accumulated knowledge, supervisor can:
+   - Decide which agent to call next
+   - Skip unnecessary agents if enough info is gathered
+   - Synthesize results from multiple agents
+   - Provide final comprehensive response
+
+### Example Flow:
+```
+User: "Nginx 502 error, help!"
+├── Supervisor → system_info_worker
+│   └── Returns: "502 usually means upstream server issues, check logs..."
+├── Supervisor (now knows about upstream issues) → service_inventory_worker  
+│   └── Returns: "Check PHP-FPM status, verify upstream config..."
+└── Supervisor (has both perspectives) → Final synthesis
+    └── "Based on system analysis and service inventory, here's comprehensive solution..."
+```
+
+## 🔍 Enhanced Debugging
+
+The updated `utils.py` now shows:
+- **Transfer explanations**: What each "Successfully transferred" means
+- **Conversation context**: Last few messages to understand the flow
+- **Tool call details**: What tools are being used and why
+- **Agent delegation**: Which agent is being called and for what purpose
+
+## 🔍 Observing Result Flow in Practice
+
+To see how results flow back to the supervisor, run the enhanced debugging and watch for:
+
+1. **Agent Results**: Look for `AIMessage` from agents (not just transfer confirmations)
+2. **Conversation Context**: The expanding message history in each step
+3. **Supervisor Decision Changes**: How supervisor's next choice is influenced by results
+
+### Example Debug Output Analysis:
+```
+🔄 STEP 2: system_info_worker
+💬 MESSAGE TYPE: AIMessage  ← AGENT'S ACTUAL RESULT
+📄 CONTENT: "502 typically indicates upstream server issues..."
+
+🔄 STEP 4: service_inventory_worker  
+💬 MESSAGE TYPE: AIMessage  ← AGENT'S ACTUAL RESULT
+📄 CONTENT: "Check PHP-FPM status, verify upstream config..."
+
+🔄 STEP 5: supervisor
+💬 MESSAGE TYPE: AIMessage  ← SUPERVISOR'S SYNTHESIS
+📄 CONTENT: "Based on system analysis and service inventory..."
+📚 CONVERSATION CONTEXT (12 messages)  ← SUPERVISOR SEES ALL RESULTS
+```
+
+The supervisor's final response demonstrates it has processed and synthesized results from both agents!
+
+## 📋 Key Takeaways
+
+- **"Successfully transferred"** = Control handoff confirmation, not data transfer
+- **Each agent** gets the full conversation context INCLUDING previous agent results
+- **Agent prompts** determine how they process that context
+- **Supervisor** orchestrates the workflow based on its prompt strategy
+- **The conversation** builds up context as each agent contributes their expertise
+- **Results accumulate**: Each agent can see and build upon previous agents' work
+- **Supervisor learns**: Strategy updates based on what agents discover
+- **Dynamic workflow**: Supervisor can skip agents or change direction based on results
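The transfer flow described in UNDERSTANDING_TRANSFERS.md can be mimicked without any LLM to make the message bookkeeping concrete. The sketch below is a toy model, not the LangGraph implementation: `Message`, `Conversation`, `pick_next_agent`, and the two stub agents are hypothetical names, and the routing rule (run each agent once, in order) is an assumption standing in for the supervisor's prompt-driven decision.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Message:
    sender: str   # who produced the message
    kind: str     # "human", "ai", or "tool"
    content: str


@dataclass
class Conversation:
    messages: list[Message] = field(default_factory=list)

    def add(self, sender: str, kind: str, content: str) -> None:
        self.messages.append(Message(sender, kind, content))


# Hypothetical stand-ins for LLM-backed agents; each receives the full history.
def system_info_worker(history: list[Message]) -> str:
    return "CPU and RAM look healthy; disk usage is moderate."


def service_inventory_worker(history: list[Message]) -> str:
    return '["nginx.service", "mariadb.service"]'


AGENTS: dict[str, Callable[[list[Message]], str]] = {
    "system_info_worker": system_info_worker,
    "service_inventory_worker": service_inventory_worker,
}


def pick_next_agent(history: list[Message]) -> Optional[str]:
    """Toy routing rule: call each agent once, skipping those already heard from."""
    heard_from = {m.sender for m in history if m.kind == "ai"}
    for name in AGENTS:
        if name not in heard_from:
            return name
    return None


def run(query: str) -> Conversation:
    convo = Conversation()
    convo.add("user", "human", query)
    while (agent := pick_next_agent(convo.messages)) is not None:
        # Handoff confirmation: a control transfer, not a data transfer
        convo.add("supervisor", "tool", f"Successfully transferred to {agent}")
        # The agent's actual result joins the shared history
        convo.add(agent, "ai", AGENTS[agent](convo.messages))
        convo.add(agent, "tool", "Successfully transferred back to supervisor")
    reports = sum(1 for m in convo.messages if m.kind == "ai")
    convo.add("supervisor", "ai", f"Synthesis of {reports} agent report(s)")
    return convo
```

Calling `run("Nginx returns 502 Bad Gateway")` yields a history in which every "Successfully transferred" entry is a tool-style confirmation, while the agents' findings are separate `ai` entries that the supervisor can read before its next decision.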
--- a/multi-agent-supervisor/agents/__init__.py
+++ b/multi-agent-supervisor/agents/__init__.py
@@ -0,0 +1,33 @@
+"""Agent definitions for the multi-agent sysadmin system."""
+
+from .system_agents import (
+    create_system_info_worker,
+    create_service_inventory_worker,
+)
+from .service_agents import (
+    create_mariadb_worker,
+    create_nginx_worker,
+    create_phpfpm_worker,
+)
+from .network_agents import (
+    create_network_worker,
+    create_cert_worker,
+)
+from .analysis_agents import (
+    create_risk_worker,
+    create_remediation_worker,
+    create_harmonizer_worker,
+)
+
+__all__ = [
+    "create_system_info_worker",
+    "create_service_inventory_worker", 
+    "create_mariadb_worker",
+    "create_nginx_worker",
+    "create_phpfpm_worker",
+    "create_network_worker",
+    "create_cert_worker",
+    "create_risk_worker",
+    "create_remediation_worker",
+    "create_harmonizer_worker",
+]
--- a/multi-agent-supervisor/agents/analysis_agents.py
+++ b/multi-agent-supervisor/agents/analysis_agents.py
@@ -0,0 +1,42 @@
+"""Analysis and remediation agents."""
+
+from langgraph.prebuilt import create_react_agent
+from custom_tools import get_shell_tool
+
+
+def create_risk_worker():
+    """Create risk assessment agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[],  # pure‑LLM reasoning
+        prompt="""
+Aggregate the findings from other agents and assign a severity: Critical, High, Medium, or Low.
+Output a short report.
+""",
+        name="risk_scorer"
+    )
+
+
+def create_remediation_worker():
+    """Create remediation agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+Propose safe bash commands or configuration edits to fix detected issues.
+NEVER run destructive commands automatically; always request confirmation.
+""",
+        name="remediation_worker"
+    )
+
+
+def create_harmonizer_worker():
+    """Create system hardening agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+Apply best‑practice hardening (`ulimit`, `sysctl`, journald rotation) in dry‑run mode unless severity is High.
+""",
+        name="harmonizer_worker"
+    )
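The `risk_scorer` agent in analysis_agents.py relies purely on LLM reasoning to pick one of the four severity levels. For comparison, a deterministic fallback could look like the sketch below. The "worst finding wins" rule, the `Severity` enum, and both helper functions are assumptions for illustration and are not part of this commit.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The four levels named in the risk_scorer prompt, ordered by urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def overall_severity(findings: list[tuple[str, Severity]]) -> Severity:
    """Assumed rule: the overall severity is the worst individual finding."""
    if not findings:
        return Severity.LOW
    return max(severity for _, severity in findings)


def short_report(findings: list[tuple[str, Severity]]) -> str:
    """Render a one-line summary followed by one line per finding."""
    overall = overall_severity(findings)
    lines = [f"Overall severity: {overall.name.capitalize()} ({len(findings)} findings)"]
    for source, severity in findings:
        lines.append(f"- {source}: {severity.name.capitalize()}")
    return "\n".join(lines)
```

A rule like this is cheap to unit-test, which can make it a useful cross-check on the LLM's verdict.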
--- a/multi-agent-supervisor/agents/network_agents.py
+++ b/multi-agent-supervisor/agents/network_agents.py
@@ -0,0 +1,29 @@
+"""Network and security monitoring agents."""
+
+from langgraph.prebuilt import create_react_agent
+from custom_tools import get_shell_tool
+
+
+def create_network_worker():
+    """Create network diagnostics agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+Diagnose network issues using `ping`, `traceroute`, and `dig`.
+""",
+        name="network_diag"
+    )
+
+
+def create_cert_worker():
+    """Create certificate checking agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+Check TLS certificates on disk with `openssl x509 -noout -enddate -in <cert>`.
+Raise an alert when a certificate expires in fewer than 30 days.
+""",
+        name="cert_checker"
+    )
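The `cert_checker` prompt above asks the model to read the output of `openssl x509 -noout -enddate` and alert under 30 days. The date arithmetic can be sketched as follows, assuming the command prints a single line of the form `notAfter=Jun 26 12:00:00 2025 GMT`. The function names and the `now` parameter are hypothetical, and month names are parsed with the default C locale.

```python
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(enddate_line: str, now: Optional[datetime] = None) -> int:
    """Return whole days remaining for a line like 'notAfter=Jun 26 12:00:00 2025 GMT'."""
    stamp = enddate_line.strip().split("=", 1)[1]
    # Assumption: the timestamp is always reported in GMT
    stamp = stamp.removesuffix(" GMT")
    expires = datetime.strptime(stamp, "%b %d %H:%M:%S %Y").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def needs_alert(enddate_line: str, now: Optional[datetime] = None,
                threshold_days: int = 30) -> bool:
    """True when the certificate expires in fewer than threshold_days days."""
    return days_until_expiry(enddate_line, now) < threshold_days
```

Passing `now` explicitly keeps the check reproducible in tests; in normal use it defaults to the current UTC time.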
--- a/multi-agent-supervisor/agents/service_agents.py
+++ b/multi-agent-supervisor/agents/service_agents.py
@@ -0,0 +1,42 @@
+"""Service-specific monitoring agents."""
+
+from langgraph.prebuilt import create_react_agent
+from custom_tools import get_shell_tool, LogTailTool
+
+
+def create_mariadb_worker():
+    """Create MariaDB analysis agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool(), LogTailTool()],
+        prompt="""
+You are a MariaDB expert. Check config files in /etc/mysql and inspect `/var/log/mysql/*.log` for errors.
+Use `mysqladmin status` and other read‑only commands. Use the `tail_log` tool for logs.
+""",
+        name="mariadb_analyzer"
+    )
+
+
+def create_nginx_worker():
+    """Create Nginx analysis agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool(), LogTailTool()],
+        prompt="""
+You are an Nginx expert. Validate configuration with `nginx -t` and inspect access/error logs.
+Use the `tail_log` tool for `/var/log/nginx/error.log`.
+""",
+        name="nginx_analyzer"
+    )
+
+
+def create_phpfpm_worker():
+    """Create PHP-FPM analysis agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool(), LogTailTool()],
+        prompt="""
+You are a PHP‑FPM expert. Check `systemctl status php*-fpm` and look for memory leaks or timeouts in the logs.
+""",
+        name="phpfpm_analyzer"
+    )
--- a/multi-agent-supervisor/agents/system_agents.py
+++ b/multi-agent-supervisor/agents/system_agents.py
@@ -0,0 +1,30 @@
+"""System monitoring agents."""
+
+from langgraph.prebuilt import create_react_agent
+from custom_tools import get_shell_tool
+
+
+def create_system_info_worker():
+    """Create system information gathering agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+You are a Linux sysadmin. Use shell commands like `lscpu`, `free -h`, and `df -h` to gather CPU, RAM, and disk usage. 
+Return a concise plain‑text summary. Only run safe, read‑only commands.
+""",
+        name="system_info_worker"
+    )
+
+
+def create_service_inventory_worker():
+    """Create service inventory agent."""
+    return create_react_agent(
+        model="openai:gpt-4o-mini",
+        tools=[get_shell_tool()],
+        prompt="""
+List all running services using `systemctl list-units --type=service --state=running`. 
+Return a JSON array of service names.
+""",
+        name="service_inventory_worker"
+    )
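The `service_inventory_worker` prompt leaves the conversion from `systemctl` output to a JSON array up to the model. A deterministic version of that step might look like the sketch below. The `SAMPLE` text is an invented illustration of the usual column layout rather than captured output, and `service_names` is a hypothetical helper.

```python
import json

# Illustrative sample only; real output varies by distribution and systemd version.
SAMPLE = """\
  UNIT              LOAD   ACTIVE SUB     DESCRIPTION
  nginx.service     loaded active running A high performance web server
  mariadb.service   loaded active running MariaDB database server

LOAD   = Reflects whether the unit definition was properly loaded.
2 loaded units listed.
"""


def service_names(systemctl_output: str) -> list[str]:
    """Extract unit names from `systemctl list-units --type=service` style text."""
    names = []
    for line in systemctl_output.splitlines():
        fields = line.split()
        # The unit name is the first column; header, legend, and blank lines are skipped
        if fields and fields[0].endswith(".service"):
            names.append(fields[0])
    return names


if __name__ == "__main__":
    print(json.dumps(service_names(SAMPLE)))
```

Doing the parsing in code rather than in the prompt keeps the output format stable regardless of how the model phrases its answer.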
--- a/multi-agent-supervisor/config.py
+++ b/multi-agent-supervisor/config.py
@@ -0,0 +1,26 @@
+"""Configuration settings for the multi-agent system."""
+
+from langchain_openai import ChatOpenAI
+
+
+def get_base_model():
+    """Get the base LLM model configuration."""
+    return ChatOpenAI(model="gpt-4o-mini", temperature=0)
+
+
+SUPERVISOR_PROMPT = """
+You are the supervisor of a team of specialised sysadmin agents.
+Decide which agent to delegate to based on the user's query **or** on results already collected.
+Available agents:
+- system_info_worker: gather system metrics
+- service_inventory_worker: list running services  
+- mariadb_analyzer: analyse MariaDB
+- nginx_analyzer: analyse Nginx
+- phpfpm_analyzer: analyse PHP‑FPM
+- network_diag: diagnose network issues
+- cert_checker: check TLS certificates
+- risk_scorer: aggregate severity
+- remediation_worker: propose fixes
+- harmonizer_worker: apply hardening
+Always start with `system_info_worker` and `service_inventory_worker` before drilling into a specific service.
+"""
--- a/multi-agent-supervisor/custom_tools/__init__.py
+++ b/multi-agent-supervisor/custom_tools/__init__.py
@@ -0,0 +1,6 @@
+"""Custom tools for the multi-agent sysadmin system."""
+
+from .log_tail_tool import LogTailTool
+from .shell_tool_wrapper import get_shell_tool
+
+__all__ = ["LogTailTool", "get_shell_tool"]
--- a/multi-agent-supervisor/custom_tools/log_tail_tool.py
+++ b/multi-agent-supervisor/custom_tools/log_tail_tool.py
@@ -0,0 +1,24 @@
+"""Log tail tool for reading log files."""
+
+import subprocess
+from langchain_core.tools import BaseTool
+
+
+class LogTailTool(BaseTool):
+    """Tail the last N lines from a log file."""
+
+    name: str = "tail_log"
+    description: str = "Tail the last N lines of a log file given its path and optional number of lines."
+
+    def _run(self, path: str, lines: int = 500):  # type: ignore[override]
+        """Run the tool to tail log files."""
+        try:
+            return subprocess.check_output(["tail", "-n", str(lines), path], text=True)
+        except subprocess.CalledProcessError as e:
+            return f"Error reading log file {path}: {e}"
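+        except FileNotFoundError:
+            # Raised when the `tail` executable itself is missing; a missing
+            # log file makes `tail` exit non-zero and is handled above.
+            return "`tail` executable not found on PATH"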
+
+    async def _arun(self, *args, **kwargs):  # noqa: D401
+        """Async version not implemented."""
+        raise NotImplementedError("Use the synchronous version of this tool.")
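`LogTailTool` shells out to the `tail` binary. Where that dependency is unwanted, the same behaviour can be approximated in pure Python. The sketch below is an alternative under that assumption, not the tool's actual implementation, and `tail_lines` is a hypothetical name.

```python
from collections import deque
from pathlib import Path


def tail_lines(path: str, lines: int = 500) -> str:
    """Return the last `lines` lines of a text file, or an error string."""
    try:
        with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
            # A deque with maxlen keeps only the most recent lines in memory
            return "".join(deque(handle, maxlen=lines))
    except FileNotFoundError:
        return f"Log file not found: {path}"
    except PermissionError:
        return f"Permission denied reading log file: {path}"
```

This reads the file from the start, so for very large logs the `tail` binary (which seeks from the end) is likely to be faster.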
--- a/multi-agent-supervisor/custom_tools/shell_tool_wrapper.py
+++ b/multi-agent-supervisor/custom_tools/shell_tool_wrapper.py
@@ -0,0 +1,8 @@
+"""Shell tool wrapper for consistent access."""
+
+from langchain_community.tools import ShellTool
+
+
+def get_shell_tool() -> ShellTool:
+    """Get a configured shell tool instance."""
+    return ShellTool()
--- a/multi-agent-supervisor/examples.py
+++ b/multi-agent-supervisor/examples.py
--- a/multi-agent-supervisor/loghub
+++ b/multi-agent-supervisor/loghub
@@ -0,0 +1 @@
+../loghub
--- a/multi-agent-supervisor/main-multi-agent.py
+++ b/multi-agent-supervisor/main-multi-agent.py
@@ -0,0 +1,68 @@
+# Multi-agent sysadmin assistant using LangChain + LangGraph Supervisor
+# Requires: `pip install langchain-openai langgraph langgraph-supervisor langchain-community`
+
+from __future__ import annotations
+
+from supervisor import create_sysadmin_supervisor
+from utils import print_step_info, explain_supervisor_pattern
+
+if __name__ == "__main__":
+    # Create the supervisor
+    supervisor = create_sysadmin_supervisor()
+    
+    # Example run - demonstrating both invoke and streaming with debug output
+    query = {
+        "messages": [
+            {
+                "role": "user",
+                "content": "Nginx returns 502 Bad Gateway on my server. What can I do?",
+            }
+        ]
+    }
+    
+    print("🚀 Starting multi-agent sysadmin analysis...")
+    print(f"📝 User Query: {query['messages'][0]['content']}")
+    print("=" * 80)
+    
+    # Show explanation of the supervisor pattern
+    explain_supervisor_pattern()
+    
+    print("\n=== Using invoke() method ===")
+    result = supervisor.invoke(query)
+    
+    print("\n📊 FINAL RESULT:")
+    print("-" * 40)
+    print(result["messages"][-1].content)
+    print("-" * 40)
+    
+    print(f"\n📈 Total messages exchanged: {len(result['messages'])}")
+    
+    print("\n=== Using stream() method for detailed step-by-step analysis ===")
+    step_count = 0
+    max_steps = 20  # Prevent infinite loops
+    
+    try:
+        chunks_processed = []
+        for chunk in supervisor.stream(query):
+            step_count += 1
+            chunks_processed.append(chunk)
+            print_step_info(step_count, chunk)
+            
+            # Safety check to prevent infinite loops
+            if step_count >= max_steps:
+                print(f"\n⚠️ Reached maximum steps ({max_steps}), stopping stream...")
+                break
+                
+        print(f"\n✅ Streaming completed successfully with {step_count} steps")
+        print(f"📊 Total chunks processed: {len(chunks_processed)}")
+        
+        # Check if the last chunk contains a complete final response
+        if chunks_processed:
+            last_chunk = chunks_processed[-1]
+            print(f"🔍 Last chunk keys: {list(last_chunk.keys()) if isinstance(last_chunk, dict) else type(last_chunk)}")
+        
+    except Exception as e:
+        print(f"\n❌ Streaming error after {step_count} steps: {e}")
+        print("💡 The invoke() method worked fine, so the supervisor itself is functional.")
+        import traceback
+        traceback.print_exc()
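The streaming loop in main-multi-agent.py caps iteration with a manual counter. An equivalent cap can be expressed with `itertools.islice`, shown here against a stand-in generator so it runs without an API key. `fake_stream` and `active_nodes` are hypothetical, and the chunk shape (node name mapped to a dict with a `messages` list) follows what `print_step_info` in utils.py expects.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def fake_stream() -> Iterator[dict[str, Any]]:
    """Hypothetical stand-in for supervisor.stream(query)."""
    yield {"supervisor": {"messages": ["delegating"]}}
    yield {"system_info_worker": {"messages": ["metrics"]}}
    yield {"supervisor": {"messages": ["final answer"]}}


def active_nodes(stream: Iterable[dict[str, Any]], max_steps: int = 20) -> list[str]:
    """Consume at most max_steps chunks and return the node names seen, in order."""
    nodes: list[str] = []
    for chunk in islice(stream, max_steps):
        nodes.extend(chunk.keys())
    return nodes
```

With `islice` the loop body needs no counter or `break`, though the explicit counter in the original has the advantage of being able to print a warning when the cap is hit.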
--- a/multi-agent-supervisor/supervisor.py
+++ b/multi-agent-supervisor/supervisor.py
@@ -0,0 +1,37 @@
+"""Multi-agent supervisor for sysadmin tasks."""
+
+from langchain_openai import ChatOpenAI
+from langgraph_supervisor import create_supervisor
+
+from agents.system_agents import create_system_info_worker, create_service_inventory_worker
+from agents.service_agents import create_mariadb_worker, create_nginx_worker, create_phpfpm_worker
+from agents.network_agents import create_network_worker, create_cert_worker
+from agents.analysis_agents import create_risk_worker, create_remediation_worker, create_harmonizer_worker
+from config import get_base_model, SUPERVISOR_PROMPT
+
+
+def create_sysadmin_supervisor():
+    """Create a supervisor that coordinates sysadmin agents."""
+    
+    # Create all the specialized agents
+    agents = [
+        create_system_info_worker(),
+        create_service_inventory_worker(),
+        create_mariadb_worker(),
+        create_nginx_worker(),
+        create_phpfpm_worker(),
+        create_network_worker(),
+        create_cert_worker(),
+        create_risk_worker(),
+        create_remediation_worker(),
+        create_harmonizer_worker(),
+    ]
+    
+    # Create and return the supervisor
+    supervisor = create_supervisor(
+        agents=agents,
+        model=get_base_model(),
+        prompt=SUPERVISOR_PROMPT
+    )
+    
+    return supervisor.compile()
--- a/multi-agent-supervisor/utils.py
+++ b/multi-agent-supervisor/utils.py
@@ -0,0 +1,142 @@
+"""Utility functions for the multi-agent system."""
+
+
+def explain_supervisor_pattern():
+    """Explain how the LangGraph supervisor pattern works."""
+    print("🏗️  MULTI-AGENT SUPERVISOR PATTERN EXPLANATION:")
+    print("=" * 60)
+    print("1. 🎯 SUPERVISOR: Receives user query and decides which agent to delegate to")
+    print("2. 🔄 TRANSFER: Uses transfer tools (e.g., transfer_to_system_info_worker)")
+    print("3. 🤖 AGENT: Specialized agent executes its task with its own prompt/tools")
+    print("4. 🔙 RETURN: Agent uses transfer_back_to_supervisor when done")
+    print("5. 🧠 DECISION: Supervisor analyzes results and decides next agent or final response")
+    print()
+    print("📋 WHAT 'Successfully transferred' MEANS:")
+    print("   - It's the response from a transfer tool call")
+    print("   - Indicates control handoff between supervisor and agent")
+    print("   - Each agent gets the full conversation context")
+    print("   - Agent's prompt guides how it processes that context")
+    print()
+    print("🔍 SUPERVISOR PROMPT (from config.py):")
+    print("   - Defines available agents and their specialties")
+    print("   - Guides delegation strategy (start with system_info & service_inventory)")
+    print("   - Agent prompts are in agents/*.py files")
+    print("=" * 60)
+    print()
+
+
+def print_step_info(step_count: int, chunk):
+    """Print formatted step information during streaming."""
+    print(f"\n🔄 STEP {step_count}:")
+    print("-" * 30)
+    
+    try:
+        # Extract agent information from chunk
+        if isinstance(chunk, dict):
+            # Look for agent names in the chunk keys
+            agent_names = [key for key in chunk.keys() if key in [
+                'system_info_worker', 'service_inventory_worker', 'mariadb_analyzer',
+                'nginx_analyzer', 'phpfpm_analyzer', 'network_diag', 'cert_checker',
+                'risk_scorer', 'remediation_worker', 'harmonizer_worker', 'supervisor'
+            ]]
+            
+            if agent_names:
+                current_agent = agent_names[0]
+                print(f"🤖 ACTIVE AGENT: {current_agent}")
+                
+                # Show the messages from this agent
+                agent_data = chunk[current_agent]
+                if 'messages' in agent_data:
+                    messages = agent_data['messages']
+                    if messages:
+                        last_message = messages[-1]
+                        # Get message type from the class name
+                        message_type = type(last_message).__name__
+                        print(f"💬 MESSAGE TYPE: {message_type}")
+                        
+                        # Show content preview if available
+                        if hasattr(last_message, 'content') and last_message.content:
+                            # Content may be a list of content blocks rather than a string; normalize it
+                            content = last_message.content
+                            if not isinstance(content, str):
+                                content = str(content)
+                            content_length = len(content)
+                            print(f"📏 CONTENT LENGTH: {content_length} characters")
+                            
+                            # Show full content for AI responses, abbreviated for others
+                            if message_type == 'AIMessage':
+                                print("📄 FULL CONTENT:")
+                                print(content)
+                                print()  # Extra line for readability
+                            else:
+                                # Truncate other message types for brevity
+                                preview = (content[:200] + "...") if len(content) > 200 else content
+                                print("📄 CONTENT PREVIEW:")
+                                print(preview)
+                                print()  # Extra line for readability
+                        
+                        # Show tool calls if any
+                        if hasattr(last_message, 'tool_calls') and last_message.tool_calls:
+                            tool_calls = last_message.tool_calls
+                            print(f"🔧 TOOL CALLS: {len(tool_calls)} tool(s)")
+                            for i, tool_call in enumerate(tool_calls, start=1):
+                                # LangChain exposes tool calls as dicts, so getattr() would always
+                                # yield 'unknown'; fall back to attributes for other objects
+                                if isinstance(tool_call, dict):
+                                    tool_name = tool_call.get('name') or 'unknown'
+                                    tool_args = tool_call.get('args')
+                                else:
+                                    tool_name = getattr(tool_call, 'name', None) or 'unknown'
+                                    tool_args = getattr(tool_call, 'args', None)
+                                print(f"   {i}. {tool_name}")
+                                # Show transfer details for supervisor delegation
+                                if tool_name.startswith('transfer_to_'):
+                                    target_agent = tool_name[len('transfer_to_'):]
+                                    print(f"      🎯 DELEGATING to: {target_agent}")
+                                    # Show the arguments/context being passed
+                                    if tool_args:
+                                        print(f"      📋 Context/Args: {tool_args}")
+                        
+                        # Show additional info for ToolMessage
+                        if message_type == 'ToolMessage':
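+                            # name can be None, so check its value rather than only its presence
+                            tool_name = getattr(last_message, 'name', None)
+                            if tool_name:
+                                print(f"🔧 TOOL NAME: {tool_name}")
+                                
+                                # Explain what "Successfully transferred" means.
+                                # Read the content directly: the earlier `content` variable is unset when the message is empty.
+                                tool_output = str(getattr(last_message, 'content', '') or '')
+                                if "transfer" in tool_name and "Successfully transferred" in tool_output: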
+                                    if tool_name.startswith('transfer_to_'):
+                                        target_agent = tool_name.replace('transfer_to_', '')
+                                        print(f"   ℹ️  EXPLANATION: Supervisor delegated control to {target_agent}")
+                                        print(f"   ℹ️  The {target_agent} will now execute its specialized tasks")
+                                    elif tool_name == 'transfer_back_to_supervisor':
+                                        print("   ℹ️  EXPLANATION: Agent completed its task and returned control to supervisor")
+                                        print("   ℹ️  Supervisor will decide the next step based on results")
+                                        
+                            if hasattr(last_message, 'tool_call_id'):
+                                print(f"🔧 TOOL CALL ID: {last_message.tool_call_id}")
+                
+                # Show conversation context for better understanding
+                agent_data = chunk[current_agent]
+                messages = agent_data.get('messages', []) if isinstance(agent_data, dict) else []
+                if len(messages) > 1:
+                    print(f"\n📚 CONVERSATION CONTEXT ({len(messages)} messages):")
+                    # Show the last three messages, numbered by their position in the full history
+                    for i, msg in enumerate(messages[-3:], start=max(0, len(messages) - 3)):
+                        msg_type = type(msg).__name__
+                        if hasattr(msg, 'content') and msg.content:
+                            text = msg.content if isinstance(msg.content, str) else str(msg.content)
+                            preview = text[:100].replace('\n', ' ')
+                            if len(text) > 100:
+                                preview += "..."
+                            print(f"   {i+1}. {msg_type}: {preview}")
+                        elif hasattr(msg, 'tool_calls') and msg.tool_calls:
+                            # Tool calls are dicts in LangChain; handle attribute-style objects too
+                            tool_names = [
+                                (tc.get('name') if isinstance(tc, dict) else getattr(tc, 'name', None)) or 'unknown'
+                                for tc in msg.tool_calls
+                            ]
+                            print(f"   {i+1}. {msg_type}: Tool calls: {tool_names}")
+                        else:
+                            print(f"   {i+1}. {msg_type}: (no content)")
+                            
+                print()  # Extra spacing for readability
+            else:
+                print("📋 CHUNK DATA:")
+                # Show first few keys for debugging
+                chunk_keys = list(chunk.keys())[:3]
+                print(f"   Keys: {chunk_keys}")
+        else:
+            print(f"📦 CHUNK TYPE: {type(chunk)}")
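+            # Append an ellipsis only when the text was actually truncated
+            chunk_text = str(chunk)
+            print(f"📄 CONTENT: {chunk_text[:100]}{'...' if len(chunk_text) > 100 else ''}")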
+    
+    except Exception as e:
+        print(f"❌ Error processing chunk: {e}")
+        print(f"📦 CHUNK TYPE: {type(chunk)}")
+        if hasattr(chunk, '__dict__'):
+            print(f"📄 CHUNK ATTRIBUTES: {list(chunk.__dict__.keys())}")
+    
+    print("-" * 30)