Stata MCP Server
<a href="https://cursor.com/en-US/install-mcp?name=mcp-stata&config=eyJjb21tYW5kIjoidXZ4IC0tZnJvbSBtY3Atc3RhdGEgbWNwLXN0YXRhIn0%3D"><img src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Install MCP Server" height="20"></a> <a href="https://pypi.org/project/mcp-stata/"><img src="https://img.shields.io/pypi/v/mcp-stata?style=flat&color=black" alt="PyPI - Version" height="20"></a>
A Model Context Protocol (MCP) server that connects AI agents to a local Stata installation.
If you'd like a fully integrated VS Code extension to run Stata code without leaving your IDE, and also allow AI agent interaction, check out my other project: <img src="https://raw.githubusercontent.com/tmonk/stata-workbench/refs/heads/main/img/icon.png" height="12px"> Stata Workbench.
Built by <a href="https://tdmonk.com">Thomas Monk</a>, London School of Economics. <!-- mcp-name: io.github.tmonk/mcp-stata -->
This server enables LLMs to:
- Execute Stata code: run any Stata command (e.g. `sysuse auto`, `regress price mpg`).
- Inspect data: retrieve dataset summaries and variable codebooks.
- Export graphics: generate and view Stata graphs (histograms, scatterplots).
- Streaming graph caching: automatically cache graphs during command execution for instant exports.
- Verify results: programmatically check stored results (`r()`, `e()`) for accurate validation.
Prerequisites
- Stata 17+ (required for `pystata` integration)
- Python 3.12+ (required)
- uv (recommended for install/run)
Installation
Run as a published tool with uvx
uvx --refresh --from mcp-stata@latest mcp-stata
uvx is an alias for uv tool run and runs the tool in an isolated, cached environment.
Configuration
This server attempts to automatically discover your Stata installation (supporting standard paths and StataNow).
If auto-discovery fails, set the STATA_PATH environment variable to your Stata executable:
# macOS example
export STATA_PATH="/Applications/StataNow/StataMP.app/Contents/MacOS/stata-mp"
# Windows example (cmd.exe)
set STATA_PATH="C:\Program Files\Stata18\StataMP-64.exe"
If you prefer, add STATA_PATH to your MCP config's env for any IDE shown below. It's optional and only needed when discovery cannot find Stata.
Optional env example (add inside your MCP server entry):
"env": {
"STATA_PATH": "/Applications/StataNow/StataMP.app/Contents/MacOS/stata-mp"
}
IDE Setup (MCP)
This MCP server uses the stdio transport (the IDE launches the process and communicates over stdin/stdout).
Claude Desktop
Open Claude Desktop → Settings → Developer → Edit Config. Config file locations include:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Published tool (uvx)
{
"mcpServers": {
"mcp-stata": {
"command": "uvx",
"args": [
"--refresh",
"--from",
"mcp-stata@latest",
"mcp-stata"
]
}
}
}
After editing, fully quit and restart Claude Desktop to reload MCP servers.
Cursor
Cursor supports MCP config at:
- Global: `~/.cursor/mcp.json`
- Project: `.cursor/mcp.json`
Published tool (uvx)
{
"mcpServers": {
"mcp-stata": {
"command": "uvx",
"args": [
"--refresh",
"--from",
"mcp-stata@latest",
"mcp-stata"
]
}
}
}
Windsurf
Windsurf supports MCP plugins and also allows manual editing of mcp_config.json. After adding/editing a server, use the UI’s refresh so it re-reads the config.
A common location is ~/.codeium/windsurf/mcp_config.json.
Published tool (uvx)
{
"mcpServers": {
"mcp-stata": {
"command": "uvx",
"args": [
"--refresh",
"--from",
"mcp-stata@latest",
"mcp-stata"
]
}
}
}
Google Antigravity
In Antigravity, MCP servers are managed from the MCP store/menu; you can open Manage MCP Servers and then View raw config to edit mcp_config.json.
Published tool (uvx)
{
"mcpServers": {
"mcp-stata": {
"command": "uvx",
"args": [
"--refresh",
"--from",
"mcp-stata@latest",
"mcp-stata"
]
}
}
}
Visual Studio Code
VS Code supports MCP servers via a .vscode/mcp.json file. The top-level key is servers (not mcpServers).
Create .vscode/mcp.json:
Published tool (uvx)
{
"servers": {
"mcp-stata": {
"type": "stdio",
"command": "uvx",
"args": [
"--refresh",
"--from",
"mcp-stata@latest",
"mcp-stata"
]
}
}
}
VS Code documents .vscode/mcp.json and the servers schema, including type and command/args.
Skills
- Skill file (for Claude/Codex): skill/SKILL.md
Tools Available (from server.py)
- `run_command(code, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Execute Stata syntax.
  - Always writes output to a temporary log file and emits a single `notifications/logMessage` containing `{"event":"log_path","path":"..."}` so the client can tail it locally.
  - May emit `notifications/progress` when the client provides a progress token/callback.
- `read_log(path, offset=0, max_bytes=65536)`: Read a slice of a previously-provided log file (JSON: `path`, `offset`, `next_offset`, `data`).
- `find_in_log(path, query, start_offset=0, max_bytes=5_000_000, before=2, after=2, case_sensitive=False, regex=False, max_matches=50)`: Search a log file for text and return context windows.
  - Returns JSON with `matches` (context lines, line indices), `next_offset`, and `truncated` if `max_matches` is hit.
  - Supports literal or regex search with bounded read window for large logs.
- `load_data(source, clear=True, as_json=True, raw=False, max_output_lines=None)`: Heuristic loader (sysuse/webuse/use/path/URL) with JSON envelope unless `raw=True`. Supports output truncation.
- `get_data(start=0, count=50)`: View dataset rows (JSON response, capped to 500 rows).
- `get_ui_channel()`: Return a short-lived localhost HTTP endpoint + bearer token for the UI-only data browser.
- `describe()`: View dataset structure via Stata `describe`.
- `list_graphs()`: See available graphs in memory (JSON list with an `active` flag).
- `export_graph(graph_name=None, format="pdf")`: Export a graph to a file path (default PDF; use `format="png"` for PNG).
- `export_graphs_all()`: Export all in-memory graphs. Returns file paths.
- `get_help(topic, plain_text=False)`: Markdown-rendered Stata help by default; `plain_text=True` strips formatting.
- `codebook(variable, as_json=True, trace=False, raw=False, max_output_lines=None)`: Variable-level metadata (JSON envelope by default; supports `trace=True` and output truncation).
- `run_do_file(path, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Execute a .do file.
  - Always writes output to a temporary log file and emits a single `notifications/logMessage` containing `{"event":"log_path","path":"..."}` so the client can tail it locally.
  - Emits incremental `notifications/progress` when the client provides a progress token/callback.
- `get_stored_results()`: Get `r()` and `e()` scalars/macros as JSON.
- `get_variable_list()`: JSON list of variables and labels.
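As a quick orientation, here is a minimal sketch of calling a couple of these tools from a standalone script, assuming the official `mcp` Python SDK and launching the server with the same `uvx` invocation as the IDE configs above. Tool names and arguments come from the list; everything else is illustrative.

```python
import asyncio
import json

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch mcp-stata over stdio, the same way the IDE configs above do.
    params = StdioServerParameters(
        command="uvx",
        args=["--refresh", "--from", "mcp-stata@latest", "mcp-stata"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Run a regression; the reply carries the JSON envelope described below.
            result = await session.call_tool(
                "run_command",
                {"code": "sysuse auto, clear\nregress price mpg"},
            )
            print(result.content[0].text)

            # Verify stored r()/e() results programmatically.
            stored = await session.call_tool("get_stored_results", {})
            print(json.loads(stored.content[0].text))


asyncio.run(main())
```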
Cancellation
- Clients may cancel an in-flight request by sending the MCP notification `notifications/cancelled` with `params.requestId` set to the original tool call ID.
- Client guidance:
  - Pass a `_meta.progressToken` when invoking the tool if you want progress updates (optional).
  - If you need to cancel, send `notifications/cancelled` with the same `requestId`. You may also stop tailing the log file path once you receive cancellation confirmation (the tool call will return an error indicating cancellation).
  - Be prepared for partial output in the log file; cancellation is best-effort and depends on Stata surfacing `BreakError`.
Resources exposed for MCP clients:
- `stata://data/summary` → `summarize`
- `stata://data/metadata` → `describe`
- `stata://graphs/list` → graph list (resource handler delegates to the `list_graphs` tool)
- `stata://variables/list` → variable list (resource wrapper)
- `stata://results/stored` → stored r()/e() results
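A minimal sketch of reading one of these resources, again assuming the `mcp` Python SDK; the helper name and the already-initialized session are illustrative.

```python
from mcp import ClientSession
from pydantic import AnyUrl


async def show_summary(session: ClientSession) -> None:
    # `session` is an initialized ClientSession (see the run_command sketch above).
    summary = await session.read_resource(AnyUrl("stata://data/summary"))
    # Text resources arrive as text contents; print the first one.
    print(summary.contents[0].text)
```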
UI-only Data Browser (Local HTTP API)
This server also hosts a localhost-only HTTP API intended for a VS Code extension UI to browse data at high volume (paging, filtering) without sending large payloads over MCP.
Important properties:
- Loopback only: binds to `127.0.0.1`.
- Bearer auth: every request requires an `Authorization: Bearer <token>` header.
- Short-lived tokens: clients should call `get_ui_channel()` to obtain a fresh token as needed.
- No Stata dataset mutation for browsing/filtering:
  - No generated variables.
  - Paging uses `sfi.Data.get`.
  - Filtering is evaluated in Python over chunked reads.
Discovery via MCP (get_ui_channel)
Call the MCP tool get_ui_channel() and parse the JSON:
{
"baseUrl": "http://127.0.0.1:53741",
"token": "...",
"expiresAt": 1730000000,
"capabilities": {
"dataBrowser": true,
"filtering": true,
"sorting": true,
"arrowStream": true
}
}
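As a sketch of how a UI client might consume this channel, assuming the third-party `requests` library (the helper name is illustrative; the `/v1/dataset` and `/v1/vars` endpoints are documented under Endpoints below):

```python
import requests


def list_variables(channel: dict):
    """Fetch dataset identity and variable metadata over the UI channel.

    `channel` is the parsed JSON returned by the get_ui_channel() tool.
    """
    headers = {"Authorization": f"Bearer {channel['token']}"}
    base = channel["baseUrl"]

    # Dataset identity and basic state; the id changes when Stata commands
    # modify the data, which also invalidates view handles.
    dataset = requests.get(f"{base}/v1/dataset", headers=headers, timeout=10).json()
    print(dataset)

    # Variable metadata (name, type, label, format).
    return requests.get(f"{base}/v1/vars", headers=headers, timeout=10).json()
```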
Server-enforced limits (current defaults):
- maxLimit: 500
- maxVars: 32,767
- maxChars: 500
- maxRequestBytes: 1,000,000
- maxArrowLimit: 1,000,000 (specific to `/v1/arrow`)
Endpoints
All endpoints are under baseUrl and require the bearer token.
- `GET /v1/dataset`: Returns dataset identity and basic state (`id`, `frame`, `n`, `k`).
- `GET /v1/vars`: Returns variable metadata (`name`, `type`, `label`, `format`).
- `POST /v1/page`: Returns a page of data for selected variables.
- `POST /v1/arrow`: Returns a binary Arrow IPC stream (same input as `/v1/page`).
- `POST /v1/views`: Creates a server-side filtered view (handle-based filtering).
- `POST /v1/views/:viewId/page`: Pages within a filtered view.
- `POST /v1/views/:viewId/arrow`: Returns a binary Arrow IPC stream from a filtered view.
- `DELETE /v1/views/:viewId`: Deletes a view handle.
- `POST /v1/filters/validate`: Validates a filter expression.
Paging request example
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"datasetId":"...","frame":"default","offset":0,"limit":50,"vars":["price","mpg"],"includeObsNo":true,"maxChars":200}' \
"$BASE_URL/v1/page"
Sorting
The /v1/page and /v1/views/:viewId/page endpoints support sorting via the optional sortBy parameter:
# Sort by price ascending
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["price"]}' \
"$BASE_URL/v1/page"
# Sort by price descending
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}' \
"$BASE_URL/v1/page"
# Multi-variable sort: foreign ascending, then price descending
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"datasetId":"...","offset":0,"limit":50,"vars":["foreign","price","mpg"],"sortBy":["foreign","-price"]}' \
"$BASE_URL/v1/page"
Sort specification format:
- `sortBy` is an array of strings (variable names with optional prefix)
- No prefix or `+` prefix = ascending order (e.g., `"price"` or `"+price"`)
- `-` prefix = descending order (e.g., `"-price"`)
- Multiple variables are supported for multi-level sorting
- Uses Stata's `gsort` command internally
Sorting with filtered views:
- Sorting is fully supported with filtered views
- The sort is applied to the entire dataset, then filtered indices are re-computed
- Example: Filter for `price < 5000`, then sort descending by price:
# Create a filtered view
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"datasetId":"...","frame":"default","filterExpr":"price < 5000"}' \
"$BASE_URL/v1/views"
# Returns: {"view": {"id": "view_abc123", "filteredN": 37}}
# Get sorted page from filtered view
curl -sS \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}' \
"$BASE_URL/v1/views/view_abc123/page"
Notes:
- `datasetId` is used for cache invalidation. If the dataset changes due to running Stata commands, the server will report a new dataset id and view handles become invalid.
- Filter expressions are evaluated in Python using values read from Stata via `sfi.Data.get`. Use boolean operators like `==`, `!=`, `<`, `>`, and `and`/`or` (Stata-style `&`/`|` are also accepted).
- Sorting modifies the dataset order in memory using `gsort`. When combined with views, the filtered indices are automatically re-computed after sorting.
License
This project is licensed under the GNU Affero General Public License v3.0 or later. See the LICENSE file for the full text.
Error reporting
- All tools that execute Stata commands support JSON envelopes (`as_json=true`) carrying: `rc` (from `r()`/`c(rc)`), `stdout`, `stderr`, `message`, optional `line` (when Stata reports it), `command`, optional `log_path` (for log-file streaming), and a `snippet` excerpt of error output.
- Stata-specific cues are preserved:
  - `r(XXX)` codes are parsed when present in output.
  - “Red text” is captured via stderr where available.
  - `trace=true` adds `set trace on` around the command/do-file to surface program-defined errors; the trace is turned off afterward.
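A minimal sketch of consuming that envelope on the client side (field names follow the list above; the helper is illustrative):

```python
import json


def report_stata_result(payload: str) -> None:
    """Inspect the JSON envelope returned by a tool called with as_json=True."""
    env = json.loads(payload)
    rc = env.get("rc", 0)
    if rc:
        # rc mirrors Stata's r(XXX) return code; message/snippet summarize the error.
        print(f"Stata error r({rc}): {env.get('message')}")
        print(env.get("snippet", ""))
        # env.get("log_path") can be passed to read_log()/find_in_log()
        # to pull more context from the full output.
    else:
        print(env.get("stdout", ""))
```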
Logging
Set MCP_STATA_LOGLEVEL (e.g., DEBUG, INFO) to control server logging. Logs include discovery details (edition/path) and command-init traces for easier troubleshooting.
Development & Contributing
For detailed information on building, testing, and contributing to this project, see CONTRIBUTING.md.
Quick setup:
# Install dependencies
uv sync --extra dev --no-install-project
# Run tests (requires Stata)
pytest
# Run tests without Stata
pytest -v -m "not requires_stata"
# Build the package
python -m build