Building an AI Agent with LangChain and OpenAI
Introduction
AI agents represent a significant leap forward in how we interact with AI systems. Unlike traditional chatbots that simply generate responses, agents can reason about tasks, select appropriate tools, execute actions, and synthesize results. In this tutorial, we'll explore how to build a simple but powerful AI agent using LangChain and OpenAI that can answer questions using Wikipedia and perform mathematical calculations.
The agent we'll examine uses the ReAct (Reasoning + Acting) pattern, which allows it to:
- Reason about what tool to use for a given question
- Act by calling the selected tool
- Observe the tool's output
- Respond with a synthesized answer
Architecture Overview
The AI agent follows a modular architecture:
- Language Model (LLM): OpenAI's GPT model that powers the agent's reasoning
- Agent Framework: LangChain's zero-shot ReAct agent that orchestrates tool selection
- Tools: Custom tools that the agent can use (Wikipedia search and calculator)
- Tool Tracking: Callback system to monitor agent behavior and statistics
Core Components
1. Agent Initialization
The agent is set up using LangChain's initialize_agent function with a zero-shot ReAct description agent type:
from langchain_openai import OpenAI
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.tools import BaseTool

# WikipediaTool, CalculatorTool, and ToolTracker are defined in the sections below.
# The stats keys must match the ones ToolTracker increments.
stats = {"openai_queries": 0, "wiki_queries": 0, "calculator_calls": 0}
tool_tracker = ToolTracker(stats)

# Initialize the LLM
llm = OpenAI(temperature=0, callbacks=[tool_tracker])

# Wrap custom tools into LangChain's Tool format
wiki_tool = WikipediaTool()
calc_tool = CalculatorTool()
tools = [
    Tool(
        name=wiki_tool.name,
        func=wiki_tool.run,
        description=wiki_tool.description
    ),
    Tool(
        name=calc_tool.name,
        func=calc_tool.run,
        description=calc_tool.description
    ),
]

# Create the agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
    callbacks=[tool_tracker]
)
Key Components Explained:
- AgentType.ZERO_SHOT_REACT_DESCRIPTION: This agent type uses the ReAct pattern without requiring examples. It reads tool descriptions and decides which tool to use based on the question.
- temperature=0: Minimizes sampling randomness so responses are as consistent as possible across runs
- verbose=True: Enables detailed logging of the agent's reasoning process
- handle_parsing_errors=True: Gracefully handles cases where the agent's output doesn't match the expected format
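To make handle_parsing_errors more concrete: the agent's raw text has to be parsed into either a tool call or a final answer, and that parse can fail. The sketch below is a simplified stand-in for that step, not LangChain's actual parser; the function name parse_react_output and the regex are illustrative assumptions.

```python
import re
from typing import Optional, Tuple

# Simplified pattern for the "Action / Action Input" lines of one ReAct step.
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_react_output(text: str) -> Tuple[str, Optional[str], str]:
    """Return ("final", None, answer) or ("action", tool_name, tool_input).

    Raises ValueError when the text matches neither shape, which is the
    kind of situation handle_parsing_errors=True is meant to recover from.
    """
    if "Final Answer:" in text:
        return ("final", None, text.split("Final Answer:", 1)[1].strip())
    match = ACTION_RE.search(text)
    if match:
        return ("action", match.group(1).strip(), match.group(2).strip())
    raise ValueError(f"Could not parse LLM output: {text!r}")
```

A free-form reply such as "I think the answer is 323" matches neither shape and raises ValueError here; with error handling enabled, an agent can feed that failure back to the model instead of crashing.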
How ReAct Works:
The ReAct pattern follows this loop:
- Thought: The agent analyzes the question and decides what to do
- Action: The agent selects a tool and prepares the input
- Observation: The tool executes and returns a result
- Thought: The agent processes the observation
- Final Answer: The agent synthesizes the information and responds
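The loop above can be sketched without any LLM at all. In this toy version a scripted function stands in for the model, and the names run_react_loop, scripted_policy, and toy_add are hypothetical; it only illustrates the control flow, not how LangChain implements it.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A policy maps (question, observations so far) to (thought, action, action_input).
# action=None means "finish", with action_input holding the final answer.
Decision = Tuple[str, Optional[str], str]
Policy = Callable[[str, List[str]], Decision]


def run_react_loop(
    policy: Policy,
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(question, observations)  # Thought
        if action is None:
            return action_input                                         # Final Answer
        observations.append(tools[action](action_input))                # Action + Observation
    return "Stopped: step limit reached."


def scripted_policy(question: str, observations: List[str]) -> Decision:
    """Stand-in for the LLM: call the calculator once, then answer."""
    if not observations:
        return ("I should use the calculator.", "calculator", "2 + 3")
    return ("I now know the answer.", None, f"The result is {observations[-1]}.")


def toy_add(expression: str) -> str:
    """Toy tool that only understands sums such as '2 + 3'."""
    return str(sum(int(part) for part in expression.split("+")))


toy_tools = {"calculator": toy_add}
print(run_react_loop(scripted_policy, toy_tools, "What is 2 + 3?"))
# prints: The result is 5.
```

The max_steps cap is a design choice worth keeping in any real loop, since a model that never emits a final answer would otherwise run indefinitely.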
2. Wikipedia Tool
The Wikipedia tool allows the agent to search Wikipedia for general knowledge questions:
from langchain.tools import BaseTool
import wikipedia
class WikipediaTool(BaseTool):
    name: str = "wikipedia"
    description: str = (
        "Useful for answering general knowledge questions. "
        "Input should be a single string; it returns the summary of the top page."
    )

    def _run(self, query: str) -> str:
        """
        Searches Wikipedia for the given query and returns a summary.

        Args:
            query: The search term to look up on Wikipedia
        Returns:
            str: A two-sentence summary of the Wikipedia page or an error message
        """
        try:
            # Get a concise 2-sentence summary
            summary = wikipedia.summary(query, sentences=2)
            return summary
        except Exception as e:
            return f"❗ Could not fetch Wikipedia page for '{query}'. Error: {e}"
How It Works:
- Inheritance: Inherits from BaseTool, which provides the interface LangChain expects
- Name & Description: The name and description are crucial: the agent uses these to decide when to use this tool
- _run Method: This is where the actual tool logic lives. It:
  - Takes a search query string
  - Uses the wikipedia library to fetch a summary
  - Returns a concise 2-sentence summary
  - Handles errors gracefully
Why Two Sentences?
Limiting to 2 sentences keeps responses concise and reduces token usage. The agent can always ask follow-up questions if more detail is needed.
3. Calculator Tool
The calculator tool safely evaluates mathematical expressions without using eval(), which would be a security risk:
import re
import ast
import operator
from typing import Dict, Any, Union, Type, Callable
class CalculatorTool(BaseTool):
    name: str = "calculator"
    description: str = (
        "Useful for math questions. Input should be a valid math expression "
        "(e.g., 16 * 4 + 3). For square roots, use 'x ** 0.5' instead of 'sqrt(x)'. "
        "Do not use quotes around the expression."
    )

    def _run(self, expression: str) -> str:
        """Evaluates a simple math expression and returns the result."""
        try:
            # Strip any quotes from the expression
            expression = expression.strip("'\"")
            # Handle square root requests
            if "square root" in expression.lower() or "sqrt" in expression.lower():
                # Accept integers and decimals (e.g. 144 or 2.25)
                match = re.search(r"\d+(?:\.\d+)?", expression)
                if match:
                    number = float(match.group())
                    return str(number ** 0.5)
                else:
                    return "❗ Could not identify a number to calculate square root."
            # Validate expression - only allow safe characters
            if not re.fullmatch(r"[0-9+\-*/().\s]+", expression):
                return "❗ Invalid characters in expression."
            # Parse the expression into an Abstract Syntax Tree (AST)
            node = ast.parse(expression, mode='eval')
            # Define allowed operators (whitelist approach)
            operators: Dict[Type[ast.AST], Callable] = {
                ast.Add: operator.add,
                ast.Sub: operator.sub,
                ast.Mult: operator.mul,
                ast.Div: operator.truediv,
                ast.USub: operator.neg,  # Unary minus
                ast.Pow: operator.pow,   # Power operator
            }

            # Recursively evaluate the AST
            def eval_expr(node: Any) -> Union[int, float]:
                # ast.Constant replaces ast.Num, which is deprecated since Python 3.8
                if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                    return node.value
                elif isinstance(node, ast.BinOp):  # Binary operation (a + b)
                    if type(node.op) not in operators:
                        raise ValueError(f"Unsupported operation: {type(node.op).__name__}")
                    return operators[type(node.op)](
                        eval_expr(node.left),
                        eval_expr(node.right)
                    )
                elif isinstance(node, ast.UnaryOp):  # Unary operation (-a)
                    if type(node.op) not in operators:
                        raise ValueError(f"Unsupported operation: {type(node.op).__name__}")
                    return operators[type(node.op)](eval_expr(node.operand))
                elif isinstance(node, ast.Expression):  # Root expression node
                    return eval_expr(node.body)
                else:
                    raise ValueError(f"Unsupported node type: {type(node).__name__}")

            result = eval_expr(node)
            return str(result)
        except Exception as e:
            return f"❗ Error evaluating expression: {e}"
Security Features:
- Input Validation: Uses regex to ensure only safe characters are present
- AST Parsing: Converts the expression to an Abstract Syntax Tree instead of using eval()
- Operator Whitelisting: Only allows specific operations (add, subtract, multiply, divide, power, unary minus)
- Recursive Evaluation: Safely traverses the AST, only executing whitelisted operations
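These layers can be exercised outside LangChain. The following is a minimal standalone sketch of the same whitelist idea; safe_eval is a hypothetical name, and the MAX_EXPONENT cap is an added assumption (not part of the tool above) to keep the power operator from producing enormous numbers.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg}
MAX_EXPONENT = 100  # assumed cap, chosen only for illustration


def safe_eval(expression: str):
    """Evaluate +, -, *, /, ** and unary minus on numeric literals only."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only plain int/float literals are accepted (no strings, no booleans).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            left, right = walk(node.left), walk(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
                raise ValueError("Exponent too large")
            return _BINARY[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Calls, names, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported node type: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


print(safe_eval("17 * (24 - 5)"))  # prints: 323
```

An input such as `__import__('os').system('echo hi')` parses successfully but is rejected at the first Call node, because function calls are not on the whitelist.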
Why Not Use eval()?
Using eval() on user input is extremely dangerous. A malicious user could execute arbitrary Python code:
# DANGEROUS - Never do this!
result = eval(user_input) # Could execute: __import__('os').system('rm -rf /')
# SAFE - Our approach
node = ast.parse(user_input, mode='eval')
result = eval_expr(node)  # Only evaluates whitelisted operations
AST Evaluation Process:
- Parse: ast.parse() converts the string into a tree structure
- Traverse: Recursively walk the tree
- Evaluate: For each node type, apply the corresponding operation
- Return: The final computed value
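To see what the parse step produces, the tree for the expression used later in this tutorial can be inspected directly with the standard library. This is only an illustration of the tree's shape; the variable names are arbitrary.

```python
import ast
from collections import Counter

tree = ast.parse("17 * (24 - 5)", mode="eval")
root = tree.body  # the outermost operation: 17 * (...)

# The multiplication sits at the top; the subtraction is its right child.
print(type(root.op).__name__)        # prints: Mult
print(root.left.value)               # prints: 17
print(type(root.right.op).__name__)  # prints: Sub

# ast.walk visits every node in the tree (the visiting order is not
# specified by the documentation, so the nodes are counted instead).
counts = Counter(type(node).__name__ for node in ast.walk(tree))
print(counts["Constant"], counts["BinOp"])  # prints: 3 2
```

Because the parentheses are already encoded in the tree's shape, the evaluator never has to deal with operator precedence itself; it just evaluates children before parents.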
4. Tool Tracking with Callbacks
The agent includes a callback system to track tool usage and statistics:
from langchain.callbacks.base import BaseCallbackHandler
from typing import List, Optional, Any, Dict
from uuid import UUID
class ToolTracker(BaseCallbackHandler):
    def __init__(self, stats):
        super().__init__()
        self.stats = stats

    def on_tool_start(
        self,
        serialized: Dict[str, Any],
        input_str: str,
        *,
        run_id: UUID,
        parent_run_id: Optional[UUID] = None,
        tags: Optional[List[str]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        **kwargs: Any
    ) -> None:
        tool_name = serialized.get("name", "")
        if tool_name == "wikipedia":
            self.stats["wiki_queries"] += 1
        elif tool_name == "calculator":
            self.stats["calculator_calls"] += 1

    def on_llm_start(
        self,
        serialized: Dict[str, Any],
        prompts: List[str],
        *,
        run_id: UUID,
        parent_run_id: Optional[UUID] = None,
        tags: Optional[List[str]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        **kwargs: Any
    ) -> None:
        self.stats["openai_queries"] += 1
How Callbacks Work:
- on_tool_start: Called whenever a tool is invoked. We track which tool is being used.
- on_llm_start: Called whenever the LLM makes a request. We count API calls for cost tracking.
- Statistics: The stats dictionary tracks usage patterns, which is useful for:
  - Monitoring costs
  - Understanding agent behavior
  - Optimizing tool selection
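One practical detail: the handler increments counters with +=, so the dictionary passed in must already contain those keys, or default them to zero. The sketch below uses a plain class (ToyTracker, a hypothetical stand-in so it runs without LangChain) and a defaultdict to show the counting behavior in isolation.

```python
from collections import defaultdict
from typing import DefaultDict


class ToyTracker:
    """Stand-in for ToolTracker with the same counting logic, no LangChain needed."""

    TOOL_KEYS = {"wikipedia": "wiki_queries", "calculator": "calculator_calls"}

    def __init__(self, stats):
        self.stats = stats

    def on_tool_start(self, serialized, input_str, **kwargs):
        key = self.TOOL_KEYS.get(serialized.get("name", ""))
        if key is not None:
            self.stats[key] += 1

    def on_llm_start(self, serialized, prompts, **kwargs):
        self.stats["openai_queries"] += 1


# defaultdict(int) makes missing counters start at 0 instead of raising KeyError.
stats: DefaultDict[str, int] = defaultdict(int)
tracker = ToyTracker(stats)

# Simulate one agent step: LLM call, tool call, then a second LLM call.
tracker.on_llm_start({}, ["prompt"])
tracker.on_tool_start({"name": "calculator"}, "2 + 2")
tracker.on_llm_start({}, ["prompt"])

print(dict(stats))  # prints: {'openai_queries': 2, 'calculator_calls': 1}
```

With a regular dict, the equivalent is to initialize every key to 0 up front.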
5. Error Handling
The agent includes comprehensive error handling for various OpenAI API errors:
import openai

try:
    answer = agent.invoke({"input": question})["output"]
    print(f"Answer:\n{answer}")
except openai.AuthenticationError:
    print("Error: Authentication failed. Check your OpenAI API key.")
except openai.RateLimitError:
    print("Error: OpenAI API rate limit exceeded. Please try again later.")
except openai.APIConnectionError:
    # Must come before APIError, which is its base class
    print("Error: Could not connect to the OpenAI API. Please try again later.")
except openai.APIError as e:
    print(f"OpenAI API Error: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
Error Types Handled:
- AuthenticationError: Invalid or missing API key
- RateLimitError: Too many requests (should implement retry with backoff in production)
- APIConnectionError: Network connectivity issues
- APIError: General API errors. It is the base class of the three errors above, so its except clause has to come after theirs
- Generic Exception: Catches any unexpected errors
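The retry-with-backoff idea mentioned for RateLimitError could look roughly like this. It is a minimal sketch under stated assumptions: call_with_backoff is a hypothetical helper, RateLimited is a stand-in exception so the example runs offline, and the delay values are arbitrary.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for openai.RateLimitError so the sketch runs offline."""


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (RateLimited,),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            attempt += 1
            if attempt >= max_attempts:
                raise  # Out of attempts: surface the original error
            # Delay doubles each retry (1x, 2x, 4x, ...) plus random jitter.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            sleep(delay)
```

With the real client, one would presumably pass retry_on=(openai.RateLimitError,) and wrap the call as func=lambda: agent.invoke({"input": question}). The sleep parameter is injectable so the logic can be tested without actually waiting.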
Example Interactions
Let's see how the agent handles different types of questions:
Example 1: General Knowledge Question
question = "Who was Ada Lovelace and why is she important?"
# Agent's reasoning process (with verbose=True):
# Thought: I need to find information about Ada Lovelace. This is a general knowledge question.
# Action: Use Wikipedia tool
# Action Input: Ada Lovelace
# Observation: Ada Lovelace was an English mathematician and writer...
# Thought: I have enough information to answer the question.
# Final Answer: Ada Lovelace was an English mathematician and writer...
answer = agent.invoke({"input": question})["output"]
What Happens:
- Agent recognizes this is a factual question
- Selects the Wikipedia tool
- Searches for "Ada Lovelace"
- Receives a summary from Wikipedia
- Synthesizes the answer
Example 2: Mathematical Calculation
question = "Calculate 17 * (24 - 5)"
# Agent's reasoning process:
# Thought: This is a mathematical calculation. I should use the calculator tool.
# Action: Use calculator tool
# Action Input: 17 * (24 - 5)
# Observation: 323
# Thought: The calculation is complete.
# Final Answer: The result is 323.
answer = agent.invoke({"input": question})["output"]
What Happens:
- Agent identifies this as a math problem
- Selects the calculator tool
- Passes the expression to the calculator
- Calculator safely evaluates: 17 * (24 - 5) = 17 * 19 = 323
- Returns the result
Example 3: Square Root Calculation
question = "Calculate the square root of 144"
# The calculator tool handles this specially:
# - Detects "square root" in the expression
# - Extracts the number (144)
# - Calculates: 144 ** 0.5 = 12.0
# - Returns: "12.0"
Running the Agent
Here's how to set up and run the agent:
1. Installation
# Clone the repository
git clone https://github.com/nyandiekaFelix/ai-agent
cd ai-agent
# Create virtual environment
python3 -m venv myvenv
source myvenv/bin/activate # On Windows: myvenv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
2. Configuration
Create a .env file with your OpenAI API key:
OPENAI_API_KEY="sk-your_openai_api_key_here"
Or export it in your shell:
export OPENAI_API_KEY="sk-your_openai_api_key_here"
3. Execution
python -m simple_agent.agent_demo
Output Example:
Simple Agent Demo (LangChain with OpenAI Compatible LLM)
--------------------------------------------------------------------------------
Question: Who was Ada Lovelace and why is she important?
Answer:
Ada Lovelace was an English mathematician and writer, chiefly known for her work on Charles Babbage's proposed mechanical general-purpose computer, the Analytical Engine. She is considered by many to be the first computer programmer because she wrote the first algorithm intended to be processed by a machine.
--------------------------------------------------------------------------------
Question: Calculate 17 * (24 - 5)
Answer:
The result is 323.
--------------------------------------------------------------------------------
Statistics:
- Runtime..............: 12.45 seconds
- OpenAI API calls.....: 8
- Wikipedia tool calls.: 2
- Calculator tool calls: 2
Extending the Agent
You can easily extend the agent by adding new tools:
Step 1: Create a New Tool
# tools.py
class WeatherTool(BaseTool):
    name: str = "weather"
    description: str = (
        "Useful for getting current weather information. "
        "Input should be a city name."
    )

    def _run(self, city: str) -> str:
        # Your weather API logic here
        try:
            # Call weather API (get_weather is a placeholder for your own client)
            weather_data = get_weather(city)
            return f"The weather in {city} is {weather_data['temperature']}°C"
        except Exception as e:
            return f"❗ Could not fetch weather for {city}. Error: {e}"
Step 2: Register the Tool
# agent_demo.py
from .tools import WikipediaTool, CalculatorTool, WeatherTool
# ... existing code ...
weather_tool = WeatherTool()
tools = [
Tool(name=wiki_tool.name, func=wiki_tool.run, description=wiki_tool.description),
Tool(name=calc_tool.name, func=calc_tool.run, description=calc_tool.description),
Tool(name=weather_tool.name, func=weather_tool.run, description=weather_tool.description),
]
# ... rest of the code ...
The agent picks up the new tool from its name and description, which are included in the prompt. No retraining or extra wiring is needed.
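To illustrate why the description alone is enough, here is an approximation of how a zero-shot ReAct prompt lists the available tools. The exact template belongs to LangChain and may differ; ToolSpec and render_tool_prompt are hypothetical names used only for this sketch.

```python
from typing import List, NamedTuple


class ToolSpec(NamedTuple):
    name: str
    description: str


def render_tool_prompt(tools: List[ToolSpec]) -> str:
    """Render tools as 'name: description' lines plus the list of valid actions."""
    lines = [f"{tool.name}: {tool.description}" for tool in tools]
    names = ", ".join(tool.name for tool in tools)
    return "\n".join(lines) + f"\n\nAction: one of [{names}]"


specs = [
    ToolSpec("wikipedia", "Useful for answering general knowledge questions."),
    ToolSpec("calculator", "Useful for math questions."),
    ToolSpec("weather", "Useful for getting current weather information."),
]
print(render_tool_prompt(specs))
```

Adding the weather entry changes nothing but the prompt text, which is why a vague description directly degrades tool selection.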
Best Practices
1. Tool Descriptions
Write clear, specific tool descriptions. The agent uses these to decide which tool to use:
# Good description
description = "Useful for math questions. Input should be a valid math expression."
# Bad description (too vague)
description = "Does calculations."
2. Error Handling
Always handle errors gracefully in tools:
def _run(self, input: str) -> str:
    try:
        # Tool logic
        return result
    except Exception as e:
        return f"❗ Error: {e}"  # Return error message, don't raise
3. Input Validation
Validate inputs in tools to prevent errors:
def _run(self, expression: str) -> str:
    if not expression or not expression.strip():
        return "❗ Empty expression provided."
    # ... rest of logic
4. Cost Monitoring
Use callbacks to track API usage:
# Keys must match the ones ToolTracker increments
stats = {"openai_queries": 0, "wiki_queries": 0, "calculator_calls": 0}
tracker = ToolTracker(stats)
llm = OpenAI(callbacks=[tracker])
Limitations and Future Improvements
Current Limitations
- Single Tool per Question: The agent typically uses one tool per question. For complex queries requiring multiple tools, you'd need a more sophisticated agent.
- No Memory: The agent doesn't remember previous conversations. Each question is independent.
- LangChain Deprecation: initialize_agent and the legacy AgentType-based agents used here are deprecated in LangChain in favor of LangGraph.
Future Improvements
- LangGraph Migration: Migrate to LangGraph for more flexible agent workflows
- Conversation Memory: Add memory to maintain context across multiple turns
- Tool Chaining: Enable the agent to use multiple tools in sequence
- Streaming Responses: Stream agent responses for better user experience
- Retry Logic: Add exponential backoff for rate limit errors
Dependencies Breakdown
- langchain: Core framework for building LLM applications
- langchain-openai: OpenAI integration for LangChain
- openai: Official OpenAI Python client
- wikipedia: Python wrapper for Wikipedia API
Conclusion
This AI agent demonstrates the power of combining LLMs with tools. By giving the agent access to Wikipedia and a calculator, we've created a system that can answer both factual and mathematical questions accurately.
Key Takeaways:
- Tool Descriptions Matter: The agent relies heavily on tool descriptions to make decisions
- Security First: Always validate and sanitize inputs, especially when evaluating code
- Error Handling: Graceful error handling improves user experience
- Monitoring: Track tool usage and API calls to understand costs and behavior
- Extensibility: The modular design makes it easy to add new capabilities
The agent pattern is powerful because it combines the reasoning capabilities of LLMs with the precision of specialized tools. As you extend this agent, consider:
- What tools would be useful for your use case?
- How can you improve tool descriptions?
- What error cases need handling?
- How can you optimize for cost and performance?
For production use, consider migrating to LangGraph, which offers more flexibility, better state management, and support for complex agent workflows.