# 🤖 LangGraph Sysadmin Debugging Agent A LangGraph-powered AI agent designed to assist system administrators in their daily debugging tasks by analyzing log files and executing shell commands with intelligent reasoning. ## 🛠️ Technology Stack This is a **LangGraph program** (AI agent) that combines: - **LangGraph**: State-based AI agent framework for building conversational AI workflows - **OpenAI GPT-4o-mini**: Large Language Model for intelligent reasoning and tool usage - **LangChain Tools**: - [**ShellTool** (prebuilt)](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.shell.tool.ShellTool.html): Executes shell commands for system investigation - **log_analyzer** (custom tool): Structured log file analysis with pattern recognition - [**Loghub Dataset**](https://github.com/logpai/loghub): Comprehensive collection of real-world system logs as git submodule ## 🎯 Agent Goals This agent helps sysadmins by: - **Log Analysis**: Automatically detect error patterns, frequency anomalies, and timeline issues - **Shell Operations**: Execute diagnostic commands (`grep`, `awk`, `tail`, `ps`, `netstat`, etc.) - **Pattern Recognition**: Identify common system issues across different log types - **Interactive Debugging**: Maintain conversation context for multi-step troubleshooting - **Knowledge Transfer**: Demonstrate best practices for log analysis and system debugging ## 📊 Dataset The agent uses the **Loghub** repository as a git submodule, providing access to: - **Distributed Systems**: HDFS, Hadoop, Spark, Zookeeper, OpenStack - **Supercomputers**: BGL, HPC, Thunderbird - **Operating Systems**: Windows, Linux, Mac - **Mobile Systems**: Android, HealthApp - **Server Applications**: Apache, OpenSSH - **Standalone Software**: Proxifier ## 🚀 Setup Instructions ### Prerequisites - Python 3.8+ - OpenAI API key - Git ### Installation 1. **Clone the repository with submodules:** ```bash git clone --recurse-submodules https://github.com/your-username/langgraph-pard0x.git cd langgraph-pard0x ``` 2. **Install dependencies:** ```bash # Using uv (recommended) uv sync # Or using pip pip install -r requirements.txt ``` 3. **Set up OpenAI API key:** ```bash export OPENAI_API_KEY='your-api-key-here' # Or create a .env file echo "OPENAI_API_KEY=your-api-key-here" > .env ``` 4. **Initialize the loghub submodule (if not cloned with --recurse-submodules):** ```bash git submodule update --init --recursive ``` ### Running the Agent ```bash python main.py ``` ## 💡 Usage Examples ### Multi-steps multi-tools debugging: ``` User: Where is the log file named Linux_2k.log on my system? Agent: I'll search the file Linux_2k.log on your system and return its path. [Executes shell tool to `find / -name "Linux_2k.log"] User: Analyze this log file and tell me if there are any issues or anomalies on my system Agent: [Use log analysis tools on Linux_2k.log] ``` ### Specific Analysis Types ``` User: Get a frequency analysis of Apache error patterns Agent: [Uses analyze_log_file with analysis_type="frequency" on Apache logs] User: Show me timeline patterns in Hadoop logs Agent: [Uses analyze_log_file with analysis_type="timeline" on Hadoop logs] User: Give me a summary of the Windows event logs Agent: [Uses analyze_log_file with analysis_type="summary" on Windows logs] ``` ### Combined Approach ``` User: Find all critical errors in the system and suggest fixes Agent: 1. [Analyzes multiple log files for error patterns] 2. [Executes shell commands to gather system state] 3. [Provides structured analysis and recommendations] ``` ## 🔧 Available Analysis Types The custom `log_analyzer` tool supports: - **error_patterns**: Detects error keywords (error, fail, exception, critical, fatal, denied, refused, timeout) - **frequency**: Identifies most common log patterns by normalizing timestamps, IPs, and UUIDs - **timeline**: Extracts and analyzes timestamp patterns for chronological debugging - **summary**: Provides basic statistics and sample content overview ## 📁 Project Structure ``` langgraph-pard0x/ ├── main.py # Main LangGraph agent ├── log_analyzer.py # Custom log analysis tool ├── loghub/ # Git submodule with log datasets │ ├── Linux/ │ ├── Apache/ │ ├── OpenSSH/ │ └── ... ├── pyproject.toml # Project dependencies └── README.md # This file ``` ## ⚠️ Safety Note This agent has shell access for diagnostic purposes. Use with caution and only in safe environments. The agent is designed to help with debugging, not to make system modifications. ## Troubleshooting ### "Please set your OPENAI_API_KEY environment variable" - Make sure you've set your OpenAI API key using one of the methods above - Verify your API key is valid and active ### "Error initializing chatbot" - Check your internet connection - Verify your OpenAI API key has sufficient credits - Make sure all dependencies are installed correctly ### Loghub submodule not found - Run `git submodule update --init --recursive` to initialize the submodule - Ensure you have proper git access to the loghub repository ## 🤔 Open Questions This project raises several interesting technical and architectural questions worth exploring: ### Conversation History Management - **Should we pass [all the conversation history](https://www.reddit.com/r/LangChain/comments/1f3nqud/sending_the_entire_conversation_on_each_call_to/) to the LLM or not?** ### Framework Classification - **Is this a LangChain program or a [LangChain agent](https://langchain-ai.github.io/langgraph/agents/overview/)?** ### Tool Architecture Decisions - **When to use LangChain [prebuilt tools and custom tools](https://langchain-ai.github.io/langgraph/agents/tools/#prebuilt-tools) vs. [MCP (Model Context Protocol) integrations](https://langchain-ai.github.io/langgraph/agents/mcp/)?**