update README

This commit is contained in:
Gaetan Hurel 2025-06-25 15:27:53 +02:00
parent 2f9beb96cb
commit 19e9e12235
# 🤖 LangGraph Sysadmin Debugging Agent

A LangGraph-powered AI agent designed to assist system administrators in their daily debugging tasks by analyzing log files and executing shell commands with intelligent reasoning.
## 🛠️ Technology Stack

This is a **LangGraph program** (AI agent) that combines:

- **LangGraph**: State-based AI agent framework for building conversational AI workflows
- **OpenAI GPT-4o-mini**: Large Language Model for intelligent reasoning and tool usage
- **LangChain Tools**:
  - [**ShellTool** (prebuilt)](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.shell.tool.ShellTool.html): Executes shell commands for system investigation
  - **log_analyzer** (custom tool): Structured log file analysis with pattern recognition
- [**Loghub Dataset**](https://github.com/logpai/loghub): A comprehensive collection of real-world system logs, included as a git submodule
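To make the custom tool concrete, here is a minimal pure-Python sketch of what the error-keyword analysis inside `log_analyzer` might look like. The function name, return shape, and keyword list are assumptions based on this README's description, not the actual implementation, and the LangChain `@tool` wrapper is omitted:

```python
from collections import Counter

# Keywords mirroring the error_patterns analysis described in this README.
ERROR_KEYWORDS = ("error", "fail", "exception", "critical",
                  "fatal", "denied", "refused", "timeout")

def analyze_log_text(text: str) -> dict:
    """Return a small error-pattern summary for raw log text."""
    lines = text.splitlines()
    hits = [line for line in lines
            if any(k in line.lower() for k in ERROR_KEYWORDS)]
    keyword_counts = Counter(
        k for line in hits for k in ERROR_KEYWORDS if k in line.lower()
    )
    return {
        "total_lines": len(lines),
        "error_lines": len(hits),
        "top_keywords": keyword_counts.most_common(3),
    }

sample = "sshd: Connection refused\nkernel: all good\ncron: job FAILED"
print(analyze_log_text(sample)["error_lines"])  # → 2
```

The real tool additionally dispatches on `analysis_type` (see the analysis types listed later in this README); this sketch only covers the error-pattern case.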
## 🎯 Agent Goals

This agent helps sysadmins by:

- **Log Analysis**: Automatically detect error patterns, frequency anomalies, and timeline issues
- **Shell Operations**: Execute diagnostic commands (`grep`, `awk`, `tail`, `ps`, `netstat`, etc.)
- **Pattern Recognition**: Identify common system issues across different log types
- **Interactive Debugging**: Maintain conversation context for multi-step troubleshooting
- **Knowledge Transfer**: Demonstrate best practices for log analysis and system debugging
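For illustration, the kind of diagnostic one-liners the agent issues through ShellTool might look like the following. The sample log file and its contents are made up for this example:

```shell
# Build a tiny sample log so the commands below are reproducible.
printf 'Jan 01 10:00:01 host sshd[1]: error: auth failure\nJan 01 10:00:02 host cron[2]: job started\nJan 01 10:00:03 host sshd[3]: error: timeout\n' > /tmp/sample.log

grep -c "error" /tmp/sample.log    # how many error lines? → 2
tail -n 1 /tmp/sample.log          # most recent entry
# events per source process, most frequent first
awk '{print $5}' /tmp/sample.log | cut -d'[' -f1 | sort | uniq -c | sort -rn
```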
## 📊 Dataset

The agent uses the **Loghub** repository as a git submodule, providing access to:

- **Distributed Systems**: HDFS, Hadoop, Spark, Zookeeper, OpenStack
- **Supercomputers**: BGL, HPC, Thunderbird
- **Operating Systems**: Windows, Linux, Mac
- **Mobile Systems**: Android, HealthApp
- **Server Applications**: Apache, OpenSSH
- **Standalone Software**: Proxifier
## 🚀 Setup Instructions

### Prerequisites

- Python 3.8+
- OpenAI API key
- Git

### Installation

1. **Clone the repository with submodules:**

   ```bash
   git clone --recurse-submodules https://github.com/your-username/langgraph-pard0x.git
   cd langgraph-pard0x
   ```

2. **Install dependencies:**

   ```bash
   # Using uv (recommended)
   uv sync

   # Or using pip
   pip install -r requirements.txt
   ```

3. **Set up your OpenAI API key:**

   ```bash
   export OPENAI_API_KEY='your-api-key-here'
   # Or create a .env file
   echo "OPENAI_API_KEY=your-api-key-here" > .env
   ```

4. **Initialize the loghub submodule (if not cloned with `--recurse-submodules`):**

   ```bash
   git submodule update --init --recursive
   ```

### Running the Agent

```bash
python main.py
```
## 💡 Usage Examples

### Multi-step, multi-tool debugging
```
User: Where is the log file named Linux_2k.log on my system?
Agent: I'll search for Linux_2k.log on your system and return its path.
       [Executes the shell tool: `find / -name "Linux_2k.log"`]

User: Analyze this log file and tell me if there are any issues or anomalies on my system
Agent: [Uses the log analysis tool on Linux_2k.log]
```
### Specific Analysis Types

```
User: Get a frequency analysis of Apache error patterns
Agent: [Uses analyze_log_file with analysis_type="frequency" on Apache logs]

User: Show me timeline patterns in Hadoop logs
Agent: [Uses analyze_log_file with analysis_type="timeline" on Hadoop logs]

User: Give me a summary of the Windows event logs
Agent: [Uses analyze_log_file with analysis_type="summary" on Windows logs]
```
### Combined Approach

```
User: Find all critical errors in the system and suggest fixes
Agent:
1. [Analyzes multiple log files for error patterns]
2. [Executes shell commands to gather system state]
3. [Provides structured analysis and recommendations]
```
## 🔧 Available Analysis Types

The custom `log_analyzer` tool supports:

- **error_patterns**: Detects error keywords (error, fail, exception, critical, fatal, denied, refused, timeout)
- **frequency**: Identifies the most common log patterns by normalizing timestamps, IPs, and UUIDs
- **timeline**: Extracts and analyzes timestamp patterns for chronological debugging
- **summary**: Provides basic statistics and a sample content overview
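The frequency analysis works because collapsing volatile fields makes recurring log templates compare equal. A minimal sketch of that normalization step (the regexes here are illustrative assumptions, not the tool's actual patterns):

```python
import re
from collections import Counter

def normalize(line: str) -> str:
    """Collapse volatile fields so identical log templates group together."""
    line = re.sub(r"\b\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}\b", "<TS>", line)  # ISO timestamps
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", line)                 # IPv4 addresses
    line = re.sub(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b", "<UUID>", line)
    line = re.sub(r"\b\d+\b", "<N>", line)                                      # remaining numbers
    return line

logs = [
    "2024-01-01 10:00:00 login from 10.0.0.1 failed",
    "2024-01-02 11:30:00 login from 192.168.1.9 failed",
    "2024-01-02 11:30:05 disk usage at 91 percent",
]
freq = Counter(normalize(line) for line in logs)
print(freq.most_common(1))  # → [('<TS> login from <IP> failed', 2)]
```

Two login failures from different hosts at different times collapse into one template with count 2, which is exactly the signal the frequency analysis reports.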
## 📁 Project Structure

```
langgraph-pard0x/
├── main.py           # Main LangGraph agent
├── log_analyzer.py   # Custom log analysis tool
├── loghub/           # Git submodule with log datasets
│   ├── Linux/
│   ├── Apache/
│   ├── OpenSSH/
│   └── ...
├── pyproject.toml    # Project dependencies
└── README.md         # This file
```
## ⚠️ Safety Note

This agent has shell access for diagnostic purposes. Use it with caution and only in safe environments. The agent is designed to help with debugging, not to make system modifications.
## Troubleshooting

- Verify your OpenAI API key has sufficient credits
- Make sure all dependencies are installed correctly

### Import Errors

- Run `uv sync` to ensure all dependencies are properly installed
- Check that you're using the virtual environment created by uv

### Loghub submodule not found

- Run `git submodule update --init --recursive` to initialize the submodule
- Ensure you have proper git access to the loghub repository
## 🤔 Open Questions

This project raises several interesting technical and architectural questions worth exploring:

### Conversation History Management

- **Should we pass [all the conversation history](https://www.reddit.com/r/LangChain/comments/1f3nqud/sending_the_entire_conversation_on_each_call_to/) to the LLM or not?**
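One common middle ground for the history question is trimming: always keep the system prompt, but send only the most recent turns. A minimal sketch of the idea, using plain lists of dicts rather than LangGraph's actual message types:

```python
def trim_history(messages, keep_last=6):
    """Keep system messages plus only the most recent `keep_last` turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a sysadmin assistant."}]
history += [{"role": "user", "content": f"question {i}"} for i in range(10)]

trimmed = trim_history(history, keep_last=3)
print(len(trimmed))  # → 4 (system prompt + last 3 turns)
```

The trade-off is losing references to earlier findings in a long debugging session, which is why summarization or persistent memory are also worth considering.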
### Framework Classification

- **Is this a LangChain program or a [LangChain agent](https://langchain-ai.github.io/langgraph/agents/overview/)?**

### Tool Architecture Decisions

- **When to use LangChain [prebuilt tools and custom tools](https://langchain-ai.github.io/langgraph/agents/tools/#prebuilt-tools) vs. [MCP (Model Context Protocol) integrations](https://langchain-ai.github.io/langgraph/agents/mcp/)?**