MySQL MCP Server
A Model Context Protocol (MCP) server for MySQL databases. It provides a unified interface for exploring and querying MySQL databases through MCP, and includes a LangGraph-based agent that translates natural language into SQL with automatic schema discovery and error recovery.
Features
- MySQL Support: Connect to MySQL databases
- Unified Interface: Consistent tools and API for MySQL operations
- Database-Specific Optimizations: Uses MySQL-optimized SQL syntax
- Schema Exploration: List databases, tables, and relationships
- Query Execution: Run SQL queries with proper parameter handling
- Resource Support: MCP resource endpoints for table data
- LangGraph Text-to-SQL Agent: Intelligent agent that converts natural language to SQL queries with automatic schema exploration and error recovery
Installation
- Install dependencies:
pip install -r requirements.txt
The requirements include:
- MySQL connector for database access
- MCP and FastMCP for the server
- LangChain and LangGraph for the text-to-SQL agent
- LangChain OpenAI integration for LLM support
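A plausible requirements.txt matching that list (package names inferred from the imports shown in this README; exact contents and version pins may differ):

mysql-connector-python
mcp
fastmcp
langchain
langgraph
langchain-openai
langchain-mcp-adapters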
Usage
Connection Strings
MySQL
# Using connection string
python mysql-db-server.py --conn "mysql://user:password@host:port/database"
# Using environment variable
export DATABASE_CONNECTION_STRING="mysql://user:password@host:port/database"
python mysql-db-server.py
Command Line Options
python mysql-db-server.py [OPTIONS]
Options:
--conn TEXT MySQL connection string (format: mysql://user:password@host:port/database)
--transport TEXT Transport protocol: stdio, sse, or streamable-http (default: stdio)
--host TEXT Host to bind for SSE/HTTP transports (default: 127.0.0.1)
--port INTEGER Port to bind for SSE/HTTP transports (default: 8000)
--mount TEXT Optional mount path for SSE transport (e.g., /mcp)
--readonly Enable read-only mode (prevents INSERT, UPDATE, DELETE, etc.)
--help Show this message and exit
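For example, to serve over HTTP in read-only mode (connection details are placeholders):

python mysql-db-server.py \
  --conn "mysql://user:password@host:3306/mydb" \
  --transport streamable-http \
  --host 0.0.0.0 \
  --port 8000 \
  --readonly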
Environment Variables
- DATABASE_CONNECTION_STRING: MySQL connection string
- DATABASE_READONLY: Set to "true", "1", or "yes" to enable read-only mode
- DATABASE_STATEMENT_TIMEOUT_MS: Query timeout in milliseconds
- MCP_TRANSPORT: Transport protocol (stdio, sse, streamable-http)
- MCP_HOST: Host for network transports
- MCP_PORT: Port for network transports
- MCP_SSE_MOUNT: Mount path for SSE transport
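The same settings can be supplied through the environment instead of flags, for example:

export DATABASE_CONNECTION_STRING="mysql://user:password@host:3306/mydb"
export DATABASE_READONLY="true"
export DATABASE_STATEMENT_TIMEOUT_MS="30000"
export MCP_TRANSPORT="streamable-http"
export MCP_PORT="8000"
python mysql-db-server.py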
Read-Only Mode
Read-only mode prevents any write operations (INSERT, UPDATE, DELETE, DROP, etc.) and only allows SELECT, SHOW, WITH, VALUES, and EXPLAIN queries.
Enable via command-line:
python mysql-db-server.py --conn "mysql://user:password@host:port/database" --readonly
Enable via environment variable:
export DATABASE_READONLY="true"
python mysql-db-server.py --conn "mysql://user:password@host:port/database"
Check if read-only mode is enabled:
# Get tools from MCP client
tools = await client.get_tools()
server_info_tool = next((t for t in tools if t.name == "server_info"), None)
if server_info_tool:
    result = await server_info_tool.ainvoke({})
    print(result["readonly"])  # True if read-only mode is enabled
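As a quick sanity check, a write statement issued in read-only mode should be refused. The exact behavior is implementation-dependent (the tool may raise or return an error payload), so treat this as an illustrative sketch:

# Hypothetical check: the server should refuse writes in read-only mode
run_query_tool = next((t for t in tools if t.name == "run_query"), None)
try:
    result = await run_query_tool.ainvoke({"input": {"sql": "DELETE FROM users", "format": "markdown"}})
    print(result)  # expect an error payload describing the blocked statement
except Exception as e:
    print(f"Write rejected: {e}")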
LangGraph Text-to-SQL Agent
The project includes a sophisticated LangGraph-based agent that converts natural language questions into SQL queries. The agent automatically explores the database schema, generates SQL queries, executes them, and refines queries if errors occur.
Features
- Intelligent Schema Exploration:
- Automatically identifies relevant tables using LLM analysis (skips LLM call for ≤3 tables)
- Only explores tables needed for the query (much faster!)
- Describes table structures with exact column names
- Fetches foreign key relationships for better JOINs
- Caches schema information across queries (session-based)
- Query Validation:
- Validates that user queries are actually database questions before generating SQL
- Uses fast heuristics (set-based keyword matching) first, then LLM only for ambiguous cases
- Rejects gibberish, greetings, and non-database questions early
- Optimized to avoid LLM calls for 90%+ of cases
- Intelligent SQL Generation:
- Uses LLM with chain-of-thought reasoning for step-by-step query generation
- Handles multi-part questions (e.g., "find X and then find Y") with subqueries
- Window Function Support: Automatically uses window functions (ROW_NUMBER(), RANK(), DENSE_RANK(), LAG(), LEAD(), SUM() OVER, etc.) for appropriate scenarios like "top N per group", rankings within groups, comparing rows, running totals, percentiles, moving averages
- Tie Handling: Correctly handles ties in "most/least" queries by returning ALL entities with the max/min value (see the SQL sketch after this list)
- Validates SQL syntax and auto-fixes simple issues
- Confidence Scoring & Auto-Refinement:
- Calculates confidence scores based on actual query execution results
- Explicitly checks if SQL answers the question (not just syntax correctness)
- Automatically refines queries when confidence is low or errors detected
- Provides detailed analysis and reasoning for every query
- Error Recovery:
- Intelligently parses SQL errors to extract actionable information
- Automatically retries with improved queries when execution fails
- Preserves query structure when fixing simple errors (e.g., column names)
- Performance Optimized:
- Parallel execution of schema exploration operations
- Compiled regex patterns for faster text processing
- Reuses test query results when possible (avoids redundant executions)
- State-based SQL storage (avoids re-extraction)
- Set-based lookups for O(1) table/column checks
- LRU cache management for schema and column caches (prevents unbounded growth)
- Timeout handling for LLM and database calls (prevents hanging)
- Robust Error Handling:
- Input validation (rejects None, empty, or invalid queries)
- Timeout protection for all LLM and database calls (validates timeout > 0)
- Graceful fallbacks when LLM returns empty/malformed responses
- Type safety for all conversions (float, int) with try/except
- Regex group validation (checks groups are not empty before use)
- Cache structure validation (handles corrupted cache gracefully)
- Safe string parsing (handles malformed table resources, error messages)
- Tool call structure validation (validates dict structure before processing)
- Comprehensive logging for debugging (optional, can be disabled)
- MCP Integration: Seamlessly uses MCP tools for database operations
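To make the window-function and tie-handling bullets above concrete, here is the kind of SQL the agent aims to produce. Both queries assume a hypothetical authors table with name, country, and book_count columns; window functions require MySQL 8.0+:

-- "Top 3 authors per country" via a window function
SELECT name, country, book_count
FROM (
    SELECT name, country, book_count,
           ROW_NUMBER() OVER (PARTITION BY country ORDER BY book_count DESC) AS rn
    FROM authors
) ranked
WHERE rn <= 3;

-- Tie-safe "author(s) with the most books": returns ALL rows sharing the maximum
SELECT name, book_count
FROM authors
WHERE book_count = (SELECT MAX(book_count) FROM authors);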
Setup
- Start the MCP Server (in one terminal):
python mysql-db-server.py --conn "mysql://user:password@host:port/database" --transport streamable-http --port 8000
- Use the Agent (in Python/Jupyter):
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI
from text_to_sql_agent import TextToSQLAgent
# Connect to MCP server
client = MultiServerMCPClient({
    "mysql-server": {
        "url": "http://localhost:8000/mcp",
        "transport": "streamable_http"
    }
})
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# Create agent with optional configuration
agent = TextToSQLAgent(
    mcp_client=client,
    llm=llm,
    max_query_attempts=3,        # Maximum retry attempts
    llm_timeout=60,              # Timeout for LLM calls (seconds)
    query_timeout=30,            # Timeout for database queries (seconds)
    max_schema_cache_size=1000,  # Maximum table descriptions to cache
    max_column_cache_size=500,   # Maximum column name extractions to cache
    enable_logging=True          # Enable logging for debugging
)
# Alternatively, create the agent with minimal configuration;
# pass api_key explicitly if OPENAI_API_KEY is not set in the environment
llm = ChatOpenAI(
    api_key="your-openai-api-key",
    model="gpt-4o-mini",
    temperature=0
)
agent = TextToSQLAgent(
    mcp_client=client,
    llm=llm,
    max_query_attempts=3  # Maximum retry attempts
)
Usage Examples
Basic Query
# Ask a natural language question
result = await agent.query("How many authors are in the database?")
# Get the final answer
answer = agent.get_final_answer(result)
print(answer)
Query with Filtering
# Complex queries with filters
result = await agent.query("Show me all authors born after 1950")
print(agent.get_final_answer(result))
Aggregation Queries
# Statistical queries
result = await agent.query("What is the average birth year of all authors?")
print(agent.get_final_answer(result))
How It Works
The agent follows an optimized workflow:
1. Schema Exploration (with caching and parallelization):
  - Lists all available tables (cached after the first call)
  - Intelligently identifies relevant tables using the LLM (skipped for ≤3 tables)
  - Describes table structures for relevant tables only (parallel execution)
  - Fetches foreign key relationships for better JOINs (parallel execution)
  - Caches all schema information for subsequent queries
2. Query Validation (before SQL generation):
  - Validates that the user query is a valid database question
  - Uses fast heuristics (keyword matching) first
  - Falls back to LLM validation only for ambiguous cases
  - Rejects invalid queries early to avoid unnecessary processing
3. SQL Generation (with confidence scoring):
  - Uses the LLM with chain-of-thought reasoning for step-by-step generation
  - Includes schema, foreign keys, and column names in the prompt
  - Executes a test query (LIMIT 3) to get sample results
  - Calculates a confidence score and analysis (single LLM call)
  - Confidence scoring explicitly checks whether the SQL answers the question
  - Validates SQL syntax and detects critical issues
  - Sets refinement flags for edge routing (does not refine directly)
  - Stores the final SQL in state to avoid re-extraction
4. SQL Refinement (if needed):
  - A separate node handles refinement when confidence is low or errors are detected
  - Fetches missing schema if needed
  - Refines the SQL using analysis and error context
  - Re-executes the test query and recalculates confidence
5. Query Execution (optimized to avoid redundancy):
  - Uses the SQL stored in state (avoids re-extraction)
  - Reuses test query results if they contain all the data (avoids redundant execution)
  - Respects the original LIMIT clause when reusing test results (e.g., LIMIT 1 returns 1 row, not all test results)
  - Executes the full query only when needed
6. Error Recovery (with intelligent parsing; see the sketch after this list):
  - Parses SQL errors to extract the error type, column, and table
  - Passes error context to SQL generation (optional refinement)
  - Preserves query structure when fixing simple errors
  - Automatically retries failed queries (up to max_query_attempts)
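For intuition, here is a minimal sketch of the kind of error parsing step 6 describes. The helper and its output format are hypothetical, not the agent's actual implementation; MySQL reports unknown columns as error 1054:

import re

# MySQL error 1054 looks like: "Unknown column 'birthyear' in 'field list'"
UNKNOWN_COLUMN = re.compile(r"Unknown column '([^']+)' in '([^']+)'")

def parse_sql_error(error_message: str) -> dict:
    """Extract actionable details from a MySQL error message (illustrative)."""
    match = UNKNOWN_COLUMN.search(error_message)
    if match and match.group(1):  # validate the group is non-empty before use
        return {
            "error_type": "unknown_column",
            "column": match.group(1),
            "context": match.group(2),
        }
    return {"error_type": "unknown", "raw": error_message}

print(parse_sql_error("1054 (42S22): Unknown column 'birthyear' in 'field list'"))
# -> {'error_type': 'unknown_column', 'column': 'birthyear', 'context': 'field list'}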
Agent State Graph
The agent uses a LangGraph state machine with the following nodes:
- explore_schema: Discovers and caches database schema (with conditional routing for completion)
- generate_sql: Converts natural language to SQL using LLM, calculates confidence, sets refinement flags
- refine_sql: Refines SQL based on confidence score and analysis (separate node for clarity)
- execute_query: Runs SQL queries via MCP tools (reuses test results when possible)
- refine_query: Improves queries based on error feedback (routes back to generate_sql)
- tools: Handles tool calls for schema exploration
Architecture: The agent follows LangGraph best practices with clear separation:
- Nodes: Process state and return updates (no conditional logic)
- Edges: Make routing decisions based on state flags (all orchestration logic)
For a detailed explanation of the architecture, workflow graph, and how to extend the agent, see AGENT_ARCHITECTURE.md.
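A minimal sketch of that nodes-vs-edges split, with hypothetical state fields and node names (not the agent's actual code):

from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    sql: str
    needs_refinement: bool

def generate_sql(state: AgentState) -> dict:
    # Node: computes state updates only; no routing logic here
    return {"sql": "SELECT 1", "needs_refinement": False}

def refine_sql(state: AgentState) -> dict:
    # Node: improves the SQL, then clears the flag
    return {"needs_refinement": False}

def route_after_generate(state: AgentState) -> str:
    # Edge: all orchestration decisions read state flags
    return "refine_sql" if state["needs_refinement"] else END

graph = StateGraph(AgentState)
graph.add_node("generate_sql", generate_sql)
graph.add_node("refine_sql", refine_sql)
graph.set_entry_point("generate_sql")
graph.add_conditional_edges("generate_sql", route_after_generate)
graph.add_edge("refine_sql", "generate_sql")
app = graph.compile()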
Advanced Usage
Custom Configuration
# Run with custom LangGraph config
result = await agent.query(
    "Find all authors with more than 5 books",
    config={"recursion_limit": 50}
)
Access Full State
# Get complete agent state including all messages
result = await agent.query("Show me the database schema")
# Access messages, schema info, and query attempts
messages = result["messages"]
schema_info = result["schema_info"]
attempts = result["query_attempts"]
Helper Function for Clean Results
from langchain_core.messages import ToolMessage
def get_answer(result):
    """Extract the final answer from the agent result."""
    messages = result.get("messages", [])
    for msg in reversed(messages):
        if isinstance(msg, ToolMessage) and "successfully" in msg.content.lower():
            return msg.content
    return messages[-1].content if messages else "No answer"
# Use it
result = await agent.query("How many tables are in the database?")
print(get_answer(result))
Tips for Best Results
1. Be Specific: Clear, specific questions work best
  - ✅ "Show me all authors born after 1950"
  - ❌ "authors stuff"
2. Use Table Names: If you know table names, mention them
  - ✅ "List all books in the authors table"
  - ✅ "How many records are in the authors table?"
3. Specify Filters: Be explicit about filtering criteria
  - ✅ "Find authors where birth_year is greater than 1950"
  - ✅ "Show me authors with names starting with 'G'"
4. Ask for Aggregations: The agent handles COUNT, SUM, AVG, etc.
  - ✅ "What is the average birth year?"
  - ✅ "Count the total number of authors"
Limitations
- Currently optimized for SELECT queries (read-only operations)
- Maximum retry attempts are configurable (default: 3)
- Requires OpenAI API key for LLM functionality
- Schema information is cached across queries within the same agent instance for better performance
Improving the Agent
For a comprehensive guide on enhancing the agent's performance, accuracy, and features, see IMPROVEMENTS.md.
✅ Implemented Features:
- Intelligent Table Selection: Uses LLM to identify only relevant tables (skips for ≤3 tables)
- Foreign Key Relationships: Fetches FK info for better JOIN understanding
- Schema Caching: Caches schema information across queries (session-based)
- Chain-of-Thought Reasoning: Step-by-step query generation for better accuracy
- Confidence Scoring: Calculates confidence based on actual query execution results
- Auto-Refinement: Automatically improves queries when issues detected
- Intelligent Error Parsing: Extracts actionable information from error messages
- Performance Optimizations: Compiled regex, parallel execution, result reuse, state-based storage
Planned Improvements:
- Few-Shot Examples: Add example queries to guide better SQL generation patterns
- Query Explanation: Explain what generated SQL queries do
- Query History and Learning: Learn from past successful queries
See IMPROVEMENTS.md for detailed implementation examples and step-by-step instructions.
MySQL Features
- Database-level organization
- SHOW DATABASES and SHOW TABLES for performance
- DESCRIBE for table structure
- SHOW CREATE TABLE for detailed table information
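These correspond to plain MySQL statements, for example (mydatabase and users are placeholder names):

SHOW DATABASES;
SHOW TABLES FROM mydatabase;
DESCRIBE mydatabase.users;
SHOW CREATE TABLE mydatabase.users;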
Tools
The server provides the following MCP tools:
- server_info: Get server and database information (database type: MySQL, readonly status, MySQL connector version)
- db_identity: Get current database identity details (database type: MySQL, database name, user, host, port, server version)
- run_query: Execute SQL queries with typed input (returns markdown or JSON string)
- run_query_json: Execute SQL queries with typed input (returns JSON list)
- list_table_resources: List tables as MCP resource URIs (table://schema/table); returns a structured list
- read_table_resource: Read table data (rows) via the MCP resource protocol; returns a JSON list
- list_tables: List tables in a database; returns a markdown string (human-readable)
- describe_table: Get table structure (columns, types, constraints); returns a markdown string
- get_foreign_keys: Get foreign key relationships (via SHOW CREATE TABLE)
Calling MCP Tools
MultiServerMCPClient doesn't have a call_tool() method. Instead, you need to:
- Get tools using get_tools(), which returns LangChain StructuredTool objects
- Find the tool by name
- Invoke it using tool.ainvoke() with the appropriate arguments
Efficient Tool Lookup
For better performance (especially with many tools), create a dictionary for O(1) lookups:
# Get all tools
tools = await client.get_tools()
# Create dictionary for fast O(1) lookup (recommended)
tool_dict = {t.name: t for t in tools}
# Now you can call tools efficiently
server_info = await tool_dict["server_info"].ainvoke({})
tables = await tool_dict["list_tables"].ainvoke({"db_schema": None})
Get Server Information
# Get tools
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# Get server info and connection details
result = await tool_dict["server_info"].ainvoke({})
# Returns: {
# "name": "MySQL Database Explorer",
# "database_type": "MySQL", # Database type is explicitly included
# "readonly": False,
# "mysql_connector_version": "8.0.33"
# }
db_info = await tool_dict["db_identity"].ainvoke({})
# Returns: {
# "database_type": "MySQL", # Database type is explicitly included
# "database": "mydatabase",
# "user": "root@localhost",
# "host": "localhost",
# "port": 3306,
# "server_version": "8.0.33"
# }
List Tables
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# Lists tables in the current database
result = await tool_dict["list_tables"].ainvoke({"db_schema": "mydatabase"})
# Or get tables as MCP resources
resources = await tool_dict["list_table_resources"].ainvoke({"schema": "mydatabase"})
Explore Table Structure
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# Get table structure
result = await tool_dict["describe_table"].ainvoke({
"table_name": "users",
"db_schema": "mydatabase"
})
# Get foreign key relationships
fks = await tool_dict["get_foreign_keys"].ainvoke({
"table_name": "users",
"db_schema": "mydatabase"
})
Execute Query
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# Execute query with markdown output
result = await tool_dict["run_query"].ainvoke({
"input": {
"sql": "SELECT * FROM users LIMIT 10",
"format": "markdown"
}
})
# Execute query with JSON output
result = await tool_dict["run_query_json"].ainvoke({
"input": {
"sql": "SELECT * FROM users LIMIT 10",
"row_limit": 100
}
})
MCP Resource Tools
These tools implement the MCP (Model Context Protocol) resource pattern, which allows tables to be treated as discoverable resources that can be accessed via standardized URIs.
list_table_resources
Lists all tables in a schema and returns them as MCP resource URIs. This enables MCP clients to discover available tables dynamically.
What it does:
- Queries the database to get all table names from the specified schema
- Formats each table as a resource URI: table://schema/table_name
- Returns a list of these URIs
Example:
# Get all tables as resource URIs
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
resources = await tool_dict["list_table_resources"].ainvoke({"schema": "mydatabase"})
# Returns: ["table://mydatabase/users", "table://mydatabase/orders", "table://mydatabase/products"]
Use cases:
- Dynamic table discovery in MCP clients
- Building UI that lists available tables
- Integration with MCP resource-aware tools
read_table_resource
Reads data from a specific table using the MCP resource protocol. This provides a standardized way to access table data.
What it does:
- Executes SELECT * FROM schema.table with a row limit
- Returns table rows as a list of dictionaries (JSON format)
- Each dictionary represents one row with column names as keys
Example:
# Read table data via MCP resource protocol
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
data = await tool_dict["read_table_resource"].ainvoke({
    "schema": "mydatabase",
    "table": "users",
    "row_limit": 50  # Limits number of rows returned
})
# Returns: [
# {"id": 1, "name": "Alice", "email": "alice@example.com"},
# {"id": 2, "name": "Bob", "email": "bob@example.com"},
# ...
# ]
Use cases:
- Quick table previews without writing SQL
- MCP resource-aware clients that can fetch table data by URI
- Data exploration and inspection tools
Key differences from run_query:
- read_table_resource: Simple, standardized way to read entire tables (no SQL needed)
- run_query: Flexible, allows any SQL query with custom WHERE clauses, JOINs, etc.
Comparison: Resource Tools vs Regular Tools
list_table_resources vs list_tables
Both list tables, but serve different purposes:
| Feature | list_table_resources | list_tables |
|---|---|---|
| Return Type | List[str] (structured data) | str (markdown text) |
| Format | Resource URIs: ["table://schema/users", ...] | Human-readable markdown table |
| Use Case | Programmatic access, MCP resource protocol | Human viewing, documentation |
| Integration | Works with MCP resource-aware clients | General purpose, readable output |
Example comparison:
# Get tools once
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# list_table_resources - structured for programs
resources = await tool_dict["list_table_resources"].ainvoke({"schema": "mydb"})
# Returns: ["table://mydb/users", "table://mydb/orders"]
# list_tables - formatted for humans
tables = await tool_dict["list_tables"].ainvoke({"db_schema": "mydb"})
# Returns: "| Tables_in_mydb |\n|-----------------|\n| users |\n| orders |"
read_table_resource vs describe_table
These serve completely different purposes - they're complementary, not redundant:
| Feature | read_table_resource | describe_table |
|---|---|---|
| What it returns | Table data (rows) | Table structure (schema) |
| Return Type | List[Dict[str, Any]] (JSON) | str (markdown) |
| SQL Command | SELECT * FROM table | DESCRIBE table |
| Use Case | View actual data/rows | View column definitions, types, constraints |
| Example Output | [{"id": 1, "name": "Alice"}, ...] | Column names, types, nullability, keys |
Example comparison:
# Get tools once
tools = await client.get_tools()
tool_dict = {t.name: t for t in tools}
# read_table_resource - get the DATA
data = await tool_dict["read_table_resource"].ainvoke({
    "schema": "mydb", "table": "users", "row_limit": 10
})
# Returns: [{"id": 1, "name": "Alice", "email": "alice@example.com"}, ...]
# describe_table - get the STRUCTURE
structure = await tool_dict["describe_table"].ainvoke({
    "table_name": "users", "db_schema": "mydb"
})
# Returns: "| Field | Type | Null | Key | Default | Extra |\n|-------|------|------|-----|---------|-------|\n| id | int | NO | PRI | NULL | auto_increment |\n| name | varchar(100) | NO | | NULL | |"
Summary:
- Use read_table_resource when you want to see the data in a table
- Use describe_table when you want to see the schema/structure of a table
- They answer different questions: "What's in the table?" vs. "How is the table defined?"
Notes
- MySQL connection strings must start with mysql://
- The format is: mysql://user:password@host:port/database
- All database names are treated as schemas for compatibility
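If you need these pieces programmatically, Python's standard library parses the format as expected (illustrative values):

from urllib.parse import urlparse

url = urlparse("mysql://user:password@localhost:3306/mydatabase")
print(url.username, url.hostname, url.port, url.path.lstrip("/"))
# -> user localhost 3306 mydatabase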
Documentation
- AGENT_ARCHITECTURE.md: Detailed architecture explanation, workflow graph, and extension guide
  - How the agent works internally
  - Visual workflow diagrams
  - Step-by-step guides for adding nodes and tools
  - Code examples and patterns
- IMPROVEMENTS.md: Comprehensive improvement guide
  - Enhancement ideas with priority rankings (High/Medium/Low)
  - Detailed code examples with implementation snippets
  - Step-by-step integration instructions
  - Workflow graphs showing how the agent evolves with improvements
  - Testing strategies and validation approaches
  - Benefits analysis for each enhancement
Quick Start: Begin with high-priority items like foreign key relationships, better error parsing, and column name suggestions for immediate impact.
Troubleshooting
Connection Issues
- Ensure the MySQL server is running and accessible
- Check connection string format and credentials
- Verify network connectivity and firewall settings
MySQL-Specific Issues
- Ensure MySQL server supports the connection protocol
- Check user permissions for the specified database
- Verify MySQL connector version compatibility
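To rule out driver or network problems, a standalone check with mysql-connector-python (the same driver the server uses) can help; adjust the credentials to match your --conn string:

import mysql.connector
from mysql.connector import Error

try:
    cnx = mysql.connector.connect(
        host="localhost", port=3306,
        user="user", password="password",
        database="mydatabase",
        connection_timeout=5,  # fail fast on network problems
    )
    print("Connected; MySQL server version:", cnx.get_server_info())
    cnx.close()
except Error as exc:
    print("Connection failed:", exc)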