StatFlow
Enables AI assistants to perform statistical analysis on MySQL databases, generating formatted Excel reports with t-tests and effect sizes, and creating thesis-quality Word documents with AI-powered insights.
🎯 About This Project
StatFlow is a personal learning project built to understand and explore the Model Context Protocol (MCP). This project demonstrates how to build an MCP server that provides AI assistants with tools to interact with databases and generate reports.
Project Purpose: Learn MCP architecture, implement MCP servers, and understand how to expose functionality to AI assistants through standardized protocols.
🔌 What is MCP?
Model Context Protocol (MCP) is an open protocol that enables AI assistants to securely access external tools and data sources. It provides a standardized way for AI applications to:
- Call Tools: Execute functions or operations (like database queries, file operations)
- Access Resources: Read-only access to data (like database tables, file contents)
- Interact Securely: Controlled access to external systems without exposing credentials
Why MCP?
- Standardized interface for AI-tool integration
- Secure and controlled access to resources
- Works with Claude Desktop, Cursor, and other MCP-compatible clients
- Enables AI assistants to perform complex workflows autonomously
📊 Project Overview
StatFlow - MCP Server for Statistical Analysis & Report Generation
This MCP server demonstrates how to expose database analysis capabilities through MCP tools. It provides AI assistants with the ability to:
- Extract data from multiple MySQL databases
- Generate statistical analysis tables (t-tests, effect sizes, p-values)
- Create formatted Excel reports with visual organization
- Generate thesis-quality Word documents with AI-powered insights
- Support unlimited databases through dynamic configuration
✨ MCP Server Features
MCP Tools (3 Tools)
StatFlow exposes three MCP tools that AI assistants can call:
- 🎯 run_complete_analysis - Complete workflow (DB → Excel → Report)
  - Handles the entire analysis pipeline
  - Returns success status and file paths
- 📊 generate_analysis_excel - Database → Excel only
  - Fetches data and creates analysis tables
  - Returns the Excel file path
- 📚 generate_thesis_report - Excel → thesis-quality report
  - Generates an AI-powered Word document
  - Uses OpenAI for content generation
MCP Resources (1 Resource)
- 📦 experimental_data - Read-only access to database participant data
  - Returns JSON data without modifying the database
  - Demonstrates the MCP resource pattern
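The read-only resource pattern can be sketched as a plain serializer: fetch rows, convert them to JSON, and return the payload without any writes. The function name and column names below are illustrative assumptions, not StatFlow's actual schema or API.

```python
import json

def participants_to_resource(rows):
    """Serialize participant rows into a read-only JSON resource payload.

    `rows` is a list of (id, group, score) tuples as a DB cursor might
    return them; the field names are hypothetical, not StatFlow's schema.
    """
    payload = [
        {"participant_id": pid, "group": group, "score": score}
        for pid, group, score in rows
    ]
    # A resource only returns data -- no INSERT/UPDATE side effects.
    return json.dumps(payload, indent=2)

sample = [(1, "control", 0.82), (2, "treatment", 0.91)]
print(participants_to_resource(sample))
```

The key design point is that a resource handler never mutates state: the same call can be repeated safely by any MCP client.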
Key MCP Concepts Demonstrated
- Tool Implementation: How to create MCP tools with parameters and return values
- Resource Pattern: Read-only data access without side effects
- Server Setup: Standard I/O communication with MCP clients
- Error Handling: Proper error responses in MCP format
- Dynamic Configuration: Loading database configs at runtime
Additional Features
- AI-Powered Report Generation: Customizable writing style and terminology
- Comprehensive Analysis: Statistical tables with t-tests, p-values, effect sizes
- Flexible Architecture: Support for unlimited databases without code changes
- Modular Design: Reusable query and analysis modules
🚀 Quick Start
Installation
# Clone the repository
git clone <repository-url>
cd statflow
# Install dependencies
pip install -r requirements.txt
MCP Server Setup
To use StatFlow as an MCP server with Cursor or Claude Desktop:
- Configure your MCP client (e.g., ~/.cursor/mcp.json):
{
  "mcpServers": {
    "statflow": {
      "command": "python",
      "args": ["-m", "statflow.server"],
      "cwd": "/path/to/statflow",
      "env": {
        "PYTHONPATH": "/path/to/statflow/src"
      }
    }
  }
}
- Restart your MCP client (Cursor/Claude Desktop)
- Use with AI: Ask your AI assistant to use StatFlow tools, e.g., "Run complete analysis using StatFlow"
Configuration
Edit config.json (this file is not tracked in git - create your own):
{
  "mysql_dump": {
    "host": "localhost",
    "port": 3306,
    "user": "root",
    "password": "",
    "database": "your_database",
    "prefix": "L1_"
  },
  "mysql_dump_2": {
    "host": "localhost",
    "port": 3306,
    "user": "root",
    "password": "",
    "database": "your_database_2",
    "prefix": "L2_"
  },
  "excel_output": {
    "default_path": "./results"
  },
  "openai": {
    "api_key": "your-api-key-here",
    "enabled": true,
    "model": "gpt-4o-mini"
  }
}
Note: You can add unlimited databases (mysql_dump_3, mysql_dump_4, etc.) - StatFlow will automatically detect and use them.
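The automatic detection described above could be implemented by scanning the config for keys matching the `mysql_dump` prefix. This is a hedged sketch of the idea, not StatFlow's actual loader:

```python
import re

def detect_database_configs(config: dict) -> dict:
    """Collect every mysql_dump / mysql_dump_N section from a config dict.

    Matches 'mysql_dump', 'mysql_dump_2', 'mysql_dump_3', ... so adding a
    new database section requires no code changes (illustrative sketch).
    """
    pattern = re.compile(r"^mysql_dump(_\d+)?$")
    return {key: value for key, value in config.items() if pattern.match(key)}

config = {
    "mysql_dump": {"database": "db1", "prefix": "L1_"},
    "mysql_dump_2": {"database": "db2", "prefix": "L2_"},
    "excel_output": {"default_path": "./results"},
}
print(sorted(detect_database_configs(config)))  # ['mysql_dump', 'mysql_dump_2']
```

Non-database sections like `excel_output` and `openai` fall through the regex and are left untouched.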
📊 Usage
Option 1: Via MCP Server (Recommended)
Once configured, use StatFlow through your MCP-compatible AI assistant:
"Use StatFlow to run complete analysis"
"Generate analysis Excel using StatFlow"
"Create thesis report from Excel using StatFlow"
The AI assistant will call the appropriate MCP tools automatically.
Option 2: Direct Script Execution
Generate Excel file directly:
python run_analysis.py
Output:
- ✅ Excel file with comprehensive analysis tables
- ✅ Statistical analysis tables (t-tests, averages, summaries)
- ✅ Color-coded sections for easy navigation
Option 3: Programmatic Usage
from statflow.server import (
    run_complete_analysis_workflow,
    generate_analysis_excel_only,
    generate_thesis_report_internal,
)
# Run complete workflow
result = run_complete_analysis_workflow("config.json", generate_report=True)
# Or step by step
excel_result = generate_analysis_excel_only("config.json")
report_result = generate_thesis_report_internal(excel_result["output_path"], config)
🔧 MCP Server Architecture
MCP Server Implementation
The server (src/statflow/server.py) implements:
- MCP Server Class: Uses mcp.server.Server from the MCP Python SDK
- Tool Handlers: Async functions that implement MCP tool logic
- Resource Handlers: Read-only data access patterns
- Standard I/O: Communication with MCP clients via stdio
MCP Tool Structure
Each tool follows the MCP pattern:
@server.list_tools()
async def list_tools() -> list[Tool]:
    """List available MCP tools."""
    return [
        Tool(
            name="tool_name",
            description="What the tool does",
            inputSchema={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "Parameter description"}
                }
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Handle tool execution."""
    # Tool implementation
    return [TextContent(type="text", text=result)]
Key MCP Patterns Used
- Tool Discovery: @server.list_tools() decorator
- Tool Execution: @server.call_tool() decorator
- Resource Access: @server.list_resources() and @server.read_resource()
- Error Handling: Proper error responses in MCP format
- Type Safety: Using MCP type definitions (Tool, Resource, TextContent)
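One way to get proper error responses is to wrap every tool handler so exceptions become an error payload instead of a crashed server. The dispatch table, tool names, and message format below are illustrative assumptions, not StatFlow's actual code:

```python
import asyncio

async def safe_call_tool(name, arguments, handlers):
    """Dispatch a tool call, converting exceptions into error text.

    `handlers` maps tool names to async callables; the names used in the
    demo below are hypothetical.
    """
    handler = handlers.get(name)
    if handler is None:
        return {"type": "text", "text": f"Error: unknown tool '{name}'"}
    try:
        result = await handler(**arguments)
        return {"type": "text", "text": str(result)}
    except Exception as exc:  # report the failure; keep the stdio loop alive
        return {"type": "text", "text": f"Error: {type(exc).__name__}: {exc}"}

async def demo_tool(path: str) -> str:
    return f"analyzed {path}"

out = asyncio.run(safe_call_tool("demo", {"path": "config.json"}, {"demo": demo_tool}))
print(out["text"])  # analyzed config.json
```

Because the wrapper always returns text content, the AI client receives a readable explanation of the failure rather than a dropped connection.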
📖 Report Structure
The generated thesis report includes customizable sections. By default, it generates comprehensive analysis sections:
- Time Analysis (~600-900 words)
  - Calculation methodology
  - Comparison across experimental conditions
  - Statistical significance testing
  - Overall patterns
- Accuracy Analysis (~600-900 words)
  - Accuracy computation method
  - Performance comparisons
  - T-test results and interpretations
  - Key findings
- Satisfaction Analysis (~600-900 words)
  - Satisfaction scoring methodology
  - Preference patterns
  - Statistical analysis
  - User experience insights
- Group Comparison Analysis (~900-1200 words)
  - Performance by participant groups
  - Statistical differences
  - Comparative insights
  - Recommendations by group type
- Overall Summary and Key Findings (~600-900 words)
  - Research question results
  - Key findings synthesized
  - Practical recommendations
  - Future directions
Total: ~3,000-5,500 words
Note: Section names and content are fully customizable via the prompts configuration file.
🔧 Customization
Main Configuration File
Edit: src/statflow/analysis/thesis_quality_prompts.py
This file contains all AI instructions in plain English. You can:
- Adjust word counts
- Change writing style
- Add custom instructions
- Modify section structure
- Update statistical reporting format
Example Customization
To change word count, edit line 32:
LENGTH: 600-900 words per section (concise and focused)
# Change to:
LENGTH: 800-1000 words per section
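Since the prompt file is plain-English instructions, the length line could also be built from a small template so the range stays configurable in one place. `LENGTH_TEMPLATE` and the defaults below are a hypothetical sketch, not the actual contents of thesis_quality_prompts.py:

```python
LENGTH_TEMPLATE = "LENGTH: {low}-{high} words per section ({style})"

def build_length_instruction(low: int = 600, high: int = 900,
                             style: str = "concise and focused") -> str:
    """Render the word-count instruction line for a section prompt."""
    return LENGTH_TEMPLATE.format(low=low, high=high, style=style)

print(build_length_instruction())           # the default 600-900 range
print(build_length_instruction(800, 1000))  # the customized range
```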
📁 Project Structure
statflow/
├── config.json # Configuration (not in git)
├── run_analysis.py # Main analysis script
├── requirements.txt # Dependencies
│
├── src/statflow/
│ ├── server.py # MCP server (3 tools)
│ ├── query_builder.py # Database queries
│ │
│ ├── analysis/
│ │ ├── thesis_quality_prompts.py # ⭐ CUSTOMIZE HERE
│ │ ├── thesis_report_generator.py # Report engine
│ │ ├── ai_insights.py # AI analysis
│ │ ├── statistical_analysis.py # Statistics
│ │ └── table_generators.py # Excel tables
│ │
│ └── queries/
│ ├── time_scores.py # Time analysis
│ ├── accuracy_scores.py # Accuracy analysis
│ ├── satisfaction_scores.py # Satisfaction analysis
│ └── graph_questions.py # Graph questions
📊 Output Files
Files are generated in the path specified in config.json (default: ./results)
| File | Description |
|---|---|
| experiment_analysis.xlsx | Comprehensive analysis tables with statistics |
| experiment_analysis_THESIS_QUALITY_Report.docx | 3,000-5,500 word thesis-quality report |
🔍 Excel File Contents
The Excel file includes:
Main Data Sheet
- Participant/experimental unit data
- Color-coded sections: User characteristics, Performance metrics, Satisfaction scores
- AVERAGE row with summary statistics
- Organized by experimental conditions/groups
Statistical Analysis Tables
- T-Test tables: Comparative analysis across conditions
- Average metrics: Performance comparisons by groups/categories
- Overall summaries: Statistical comparisons
- P-values and significance levels
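The t statistics and effect sizes in these tables correspond to standard formulas: Welch's t-test for two independent samples and Cohen's d with a pooled standard deviation. The pure-Python sketch below shows the math; StatFlow's own implementation may differ in details:

```python
import math

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    return (ma - mb) / math.sqrt(va / na + vb / nb)

def cohens_d(a, b):
    """Cohen's d effect size using the pooled standard deviation."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled

control = [12.1, 13.4, 11.8, 12.9, 13.0]    # made-up completion times
treatment = [10.2, 11.1, 10.8, 10.5, 11.0]
print(round(welch_t(control, treatment), 2))
print(round(cohens_d(control, treatment), 2))
```

The p-value then comes from the t distribution with Welch-Satterthwaite degrees of freedom, which libraries such as SciPy compute directly.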
AI Insights Sheet
- Automated insights from AI analysis
- Pattern identification
- Data-driven recommendations
🛠️ Requirements
- Python: 3.8+
- MySQL: Database server
- OpenAI API: For thesis report generation (gpt-4o-mini)
- Dependencies: Listed in requirements.txt
Key Dependencies
mysql-connector-python
openpyxl
pandas
python-docx
openai
mcp
📚 Documentation
- Query Modules: See src/statflow/queries/README.md for details on creating custom analysis modules
- Customization: Edit src/statflow/analysis/thesis_quality_prompts.py to customize report style and terminology
- MCP Server: Use the StatFlow MCP server tools for programmatic access
🔬 Example Use Cases
StatFlow can be used for various experimental data analysis scenarios:
- User Studies: Compare performance across different interfaces, conditions, or user groups
- A/B Testing: Analyze results from experimental and control groups
- Longitudinal Studies: Track changes over time across multiple measurement points
- Comparative Analysis: Evaluate differences between multiple experimental conditions
Customization for Your Study
You can fully customize StatFlow for your specific research:
- Update query modules in src/statflow/queries/ to match your data structure
- Modify analysis prompts in src/statflow/analysis/thesis_quality_prompts.py to use your terminology
- Adjust statistical analysis parameters to match your research design
🎉 Key Benefits
- Automation: Complete workflow from database to publication-ready reports
- Flexibility: Customizable analysis modules and report structure
- Scalability: Support for unlimited databases without code changes
- Efficiency: Automated generation in minutes instead of hours
- Quality: Thesis-level academic writing with AI-powered insights
- Reproducibility: Consistent analysis pipeline for all your studies
📞 Support
For questions or issues:
- Review src/statflow/queries/README.md for query module documentation
- Check src/statflow/analysis/thesis_quality_prompts.py for report customization
- Examine config.json for configuration options
📄 License
See LICENSE file for details.
🎓 Learning Resources
MCP Documentation
- MCP Specification: Model Context Protocol
- MCP Python SDK: mcp Python Package
- MCP Examples: Check the MCP repository for more examples
Key Learnings from This Project
This project demonstrates:
- ✅ How to structure an MCP server
- ✅ Implementing MCP tools with complex workflows
- ✅ Using MCP resources for read-only data access
- ✅ Error handling and validation in MCP servers
- ✅ Dynamic tool/resource discovery
- ✅ Integrating MCP servers with existing Python codebases
Example Use Case: CSU SSD Study
The codebase includes an example implementation customized for a research study (the "Improving the CSU Student Success Dashboard and Its Analysis" study). This demonstrates how StatFlow can be adapted for domain-specific needs while maintaining a flexible MCP architecture.
Note: This is a personal learning project, not affiliated with any institution.
📝 Project Status
Project Type: Personal Learning Project
Purpose: Learning and exploring Model Context Protocol (MCP)
Status: ✅ Active and fully functional
Last Updated: November 12, 2025
Version: 2.0 (Renamed to StatFlow)
Created by: Rucha D. Nandgirikar
Note: This is a personal project for learning MCP, not affiliated with any institution or organization.
👤 Author & Resources
Author: Rucha D. Nandgirikar
📚 Related Articles
- 📖 Medium Article - Coming Soon - Learn about building MCP servers with StatFlow
More articles and resources coming soon...