Scalene-MCP
A FastMCP v2 server providing LLMs with structured access to Scalene's comprehensive CPU, GPU, and memory profiling for Python code, including native C/C++ extensions. It enables automated performance analysis, bottleneck identification, and optimization suggestions through natural language interactions in supported IDEs.
Installation
Prerequisites
- Python 3.10+
- uv (recommended) or pip
From Source
git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync
As a Package
pip install scalene-mcp
Quick Start: Running the Server
Development Mode
# Using uv
uv run -m scalene_mcp.server
# Using pip
python -m scalene_mcp.server
Production Mode
python -m scalene_mcp.server
Native Integration with LLM Agents
Works seamlessly with:
- ✅ GitHub Copilot - Direct integration
- ✅ Claude Code - Claude Code and the Claude VSCode extension
- ✅ Cursor - All-in-one IDE
- ✅ Any MCP-compatible LLM client
Zero-Friction Setup (3 Steps)
1. Install

   pip install scalene-mcp

2. Configure - Choose one method:

   Automated (Recommended):

   python scripts/setup_vscode.py

   The interactive setup script auto-finds your editor and configures it.

   Manual - GitHub Copilot (.vscode/settings.json):

   {
     "github.copilot.chat.mcp.servers": {
       "scalene": {
         "command": "uv",
         "args": ["run", "-m", "scalene_mcp.server"]
       }
     }
   }

   Manual - Claude Code / Cursor: See the editor-specific setup guides.

3. Restart VSCode/Cursor and start profiling!
Start Profiling Immediately
Open any Python project and ask your LLM:
"Profile main.py and show me the bottlenecks"
The LLM automatically:
- Detects your project structure
- Finds and profiles your code
- Analyzes CPU, memory, and GPU usage
- Suggests optimizations
No fiddling with paths. No manual configuration. Zero friction.
Editor-specific setup and full docs: SETUP_VSCODE.md | QUICKSTART.md | TOOLS_REFERENCE.md
Available Serving Methods (FastMCP)
Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:
1. Standard Server (Default)
# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server
2. With Claude Desktop
Configure in your claude_desktop_config.json:
{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}
Then restart Claude Desktop.
3. With HTTP/SSE Endpoint
# FastMCP supports HTTP/SSE transports in addition to stdio;
# see the FastMCP documentation for the serving options in your version
fastmcp run src/scalene_mcp/server.py
4. With Environment Variables
# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server
5. Programmatically
from scalene_mcp.server import create_scalene_server  # factory assumed to be exported by server.py

# Create and run the server programmatically (FastMCP servers expose run())
server = create_scalene_server()
server.run()
Programmatic Usage
Use Scalene-MCP directly in your Python code:
from scalene_mcp.profiler import ScaleneProfiler
import asyncio
async def main():
    profiler = ScaleneProfiler()

    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False,
    )

    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')}MB")

asyncio.run(main())
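Profiles captured this way can also be compared in-process. The sketch below is illustrative only: it assumes ProfileComparator (comparator.py, listed under Architecture below) exposes a compare() method over two profile IDs; check the module for the actual signature.

from scalene_mcp.comparator import ProfileComparator  # module path per the project structure

async def compare_runs(profiler: ScaleneProfiler):
    # Profile the same script before and after an optimization
    before = await profiler.profile(type="script", script_path="fibonacci.py")
    after = await profiler.profile(type="script", script_path="fibonacci.py")

    comparator = ProfileComparator()
    # compare() and its argument types are assumptions, not the documented API
    return comparator.compare(before["profile_id"], after["profile_id"])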
Overview
Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.
What Scalene-MCP Does
- ✅ Profile Python scripts with the full Scalene feature set
- ✅ Analyze profiles for hotspots, bottlenecks, and memory leaks
- ✅ Compare profiles to detect regressions
- ✅ Pass arguments to profiled scripts
- ✅ Structured JSON output for LLMs
- ✅ Async execution for non-blocking profiling
What Scalene-MCP Doesn't Do
- ❌ In-process profiling (Scalene.start()/stop()) - uses a subprocess instead, for isolation
- ❌ Process attachment (--pid-based profiling) - profiles scripts, not running processes
- ❌ Single-function profiling - designed for complete script analysis
Note: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See SCALENE_MODES_ANALYSIS.md for detailed scope analysis.
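For intuition, the subprocess approach reduces to something like the sketch below (illustrative, not the actual implementation). --json and --outfile are real Scalene CLI flags; everything else here is a simplification.

import asyncio
import json
import tempfile

async def run_scalene(script_path: str) -> dict:
    # Reserve a temporary file for Scalene's JSON output
    with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
        outfile = tmp.name

    # Launch Scalene as an isolated child process
    proc = await asyncio.create_subprocess_exec(
        "python", "-m", "scalene", "--json", "--outfile", outfile, script_path
    )
    await proc.wait()

    # Read back the structured profile for parsing
    with open(outfile) as f:
        return json.load(f)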
Key Features
- Complete CPU profiling: Line-by-line Python/C time, system time, CPU utilization
- Memory profiling: Peak/average memory per line, leak detection with velocity metrics
- GPU profiling: NVIDIA and Apple GPU support with per-line attribution
- Advanced analysis: Stack traces, bottleneck identification, performance recommendations
- Profile comparison: Track performance changes across runs
- LLM-optimized: Structured JSON output, summaries before details, context-aware formatting
Available Tools (7 Consolidated Tools)
Scalene-MCP provides a clean, LLM-optimized set of 7 tools:
Discovery (3 tools)
- get_project_root() - Auto-detect project structure
- list_project_files(pattern, max_depth) - Find files by glob pattern
- set_project_context(project_root) - Override auto-detection
Profiling (1 unified tool)
- profile(type, script_path/code, ...) - Profile scripts or code snippets
  - type="script" for script profiling
  - type="code" for code snippet profiling
Analysis (1 mega tool)
- analyze(profile_id, metric_type, ...) - 9 analysis modes in one tool:
  - metric_type="all" - Comprehensive analysis
  - metric_type="cpu" - CPU hotspots
  - metric_type="memory" - Memory hotspots
  - metric_type="gpu" - GPU hotspots
  - metric_type="bottlenecks" - Performance bottlenecks
  - metric_type="leaks" - Memory leak detection
  - metric_type="file" - File-level metrics
  - metric_type="functions" - Function-level metrics
  - metric_type="recommendations" - Optimization suggestions
Comparison & Storage (2 tools)
- compare_profiles(before_id, after_id) - Compare two profiles
- list_profiles() - View all captured profiles
Full reference: See TOOLS_REFERENCE.md
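Taken together, a typical session chains these tools. The sketch below mirrors the call style of the Usage examples later in this README; client setup details vary by MCP client, so treat it as the shape of a session rather than a recipe.

# Hypothetical end-to-end session through an MCP client
root = await client.call_tool("get_project_root")
files = await client.call_tool("list_project_files", arguments={"pattern": "*.py"})

result = await client.call_tool(
    "profile",
    arguments={"type": "script", "script_path": "main.py"},
)
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"], "metric_type": "bottlenecks"},
)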
Configuration
Profiling Options
The unified profile() tool supports these options:
| Option | Type | Default | Description |
|---|---|---|---|
| type | str | required | "script" or "code" |
| script_path | str | None | Required if type="script" |
| code | str | None | Required if type="code" |
| include_memory | bool | true | Profile memory |
| include_gpu | bool | false | Profile GPU usage |
| cpu_only | bool | false | Skip memory/GPU profiling |
| reduced_profile | bool | false | Only report high-activity lines |
| cpu_percent_threshold | float | 1.0 | Minimum CPU% to report |
| malloc_threshold | int | 100 | Minimum allocation size (bytes) |
| profile_only | str | "" | Profile only paths containing this |
| profile_exclude | str | "" | Exclude paths containing this |
| use_virtual_time | bool | false | Use virtual time instead of wall time |
| script_args | list | [] | Command-line arguments for the script |
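For example, a trimmed-down run that reports only hot lines, restricts profiling to project code, and forwards arguments to the script might look like this (option names come from the table above; train.py is a hypothetical script, and profiler is a ScaleneProfiler as in Programmatic Usage):

result = await profiler.profile(
    type="script",
    script_path="train.py",
    reduced_profile=True,           # only report high-activity lines
    cpu_percent_threshold=5.0,      # ignore lines below 5% CPU
    profile_only="src/",            # profile only paths containing "src/"
    script_args=["--epochs", "2"],  # forwarded to train.py
)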
Environment Variables
- SCALENE_CPU_PERCENT_THRESHOLD: Override the default CPU threshold
- SCALENE_MALLOC_THRESHOLD: Override the default malloc threshold
Architecture
Components
- ScaleneProfiler: Async wrapper around Scalene CLI
- ProfileParser: Converts Scalene JSON to structured models
- ProfileAnalyzer: Extracts insights and hotspots
- ProfileComparator: Compares profiles for regressions
- FastMCP Server: Exposes tools via MCP protocol
Data Flow
Python Script
    ↓
ScaleneProfiler (subprocess)
    ↓
Scalene CLI (--json)
    ↓
Temp JSON File
    ↓
ProfileParser
    ↓
Pydantic Models (ProfileResult)
    ↓
Analyzer / Comparator
    ↓
MCP Tools
    ↓
LLM Client
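To make the Pydantic stage concrete, here is a deliberately simplified sketch of what a per-line record and profile result could look like. The field names are illustrative assumptions, not the actual models.py definitions.

from pydantic import BaseModel

class LineMetrics(BaseModel):
    # Illustrative fields only; see models.py for the real schema
    lineno: int
    cpu_python_percent: float
    cpu_native_percent: float
    memory_peak_mb: float | None = None

class ProfileResult(BaseModel):
    profile_id: str
    elapsed_time_s: float
    lines: list[LineMetrics]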
Troubleshooting
GPU Permission Error
If you see PermissionError when profiling with GPU:
# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False,
)
Profile Not Found
Profiles are stored in memory during the server session. For persistence, implement the storage interface.
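The storage interface lives in storage.py (see Project Structure below). As a rough illustration, a persistent backend might look like the following; the save/load method names are assumptions, not the actual interface.

import json
from pathlib import Path

class FileProfileStorage:
    """Illustrative JSON-on-disk store; method names are assumed."""

    def __init__(self, directory: str = ".scalene_profiles") -> None:
        self.dir = Path(directory)
        self.dir.mkdir(exist_ok=True)

    def save(self, profile_id: str, data: dict) -> None:
        (self.dir / f"{profile_id}.json").write_text(json.dumps(data))

    def load(self, profile_id: str) -> dict:
        return json.loads((self.dir / f"{profile_id}.json").read_text())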
Timeout Issues
Adjust the timeout parameter (if using the profiler directly):

result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=120,  # assumed to be in seconds, matching SCALENE_TIMEOUT above
)
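If your version does not expose a timeout parameter, the call can be bounded externally with the standard library (plain asyncio, not a Scalene-MCP feature):

import asyncio

result = await asyncio.wait_for(
    profiler.profile(type="script", script_path="slow_script.py"),
    timeout=120,  # seconds
)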
Development
Running Tests
# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp
# Specific test file
uv run pytest tests/test_profiler.py -v
# With coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html
Code Quality
# Type checking
uv run mypy src/
# Linting
uv run ruff check src/
# Formatting
uv run ruff format src/
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass and coverage ≥ 85%
- Submit a pull request
License
MIT License - see LICENSE file for details.
Citation
If you use Scalene-MCP in research, please cite both this project and Scalene:
@software{scalene_mcp,
  title={Scalene-MCP: LLM-Friendly Profiling Server},
  year={2026}
}

@inproceedings{berger2020scalene,
  title={Scalene: Scripting-Language Aware Profiling for Python},
  author={Berger, Emery},
  year={2020}
}
Support
- Issues: GitHub Issues for bug reports and feature requests
- Discussions: GitHub Discussions for questions and ideas
- Documentation: See the docs/ directory
Made with ❤️ for the Python performance community.
Manual Installation
pip install -e .
Development
Prerequisites
- Python 3.10+
- uv (recommended) or pip
Setup
# Install dependencies
uv sync
# Run tests
just test
# Run tests with coverage
just test-cov
# Lint and format
just lint
just format
# Type check
just typecheck
# Full build (sync + lint + typecheck + test)
just build
Project Structure
scalene-mcp/
├── src/scalene_mcp/      # Main package
│   ├── server.py         # FastMCP server with tools/resources/prompts
│   ├── models.py         # Pydantic data models
│   ├── profiler.py       # Scalene execution wrapper
│   ├── parser.py         # JSON output parser
│   ├── analyzer.py       # Analysis engine
│   ├── comparator.py     # Profile comparison
│   ├── recommender.py    # Optimization recommendations
│   ├── storage.py        # Profile persistence
│   └── utils.py          # Shared utilities
├── tests/                # Test suite (100% coverage goal)
│   ├── fixtures/         # Test data
│   │   ├── profiles/     # Sample profile outputs
│   │   └── scripts/      # Test Python scripts
│   └── conftest.py       # Shared test fixtures
├── examples/             # Usage examples
├── docs/                 # Documentation
├── pyproject.toml        # Project configuration
├── justfile              # Task runner commands
└── README.md             # This file
Usage
Running the Server
# Development mode with auto-reload
fastmcp dev src/scalene_mcp/server.py
# Production mode
fastmcp run src/scalene_mcp/server.py
# Install to MCP config
fastmcp install src/scalene_mcp/server.py
Example: Profile a Script
# Through MCP client
# Through an MCP client
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    },
)
Example: Analyze Results
# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"], "metric_type": "all"},
)
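Comparing two runs works the same way; compare_profiles takes the before_id/after_id documented in the tools list. baseline below stands for an earlier profile result.

# Detect regressions between two profiling runs
comparison = await client.call_tool(
    "compare_profiles",
    arguments={
        "before_id": baseline["profile_id"],
        "after_id": result["profile_id"],
    },
)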
Testing
The project maintains 100% test coverage with comprehensive test suites:
# Run all tests
uv run pytest
# Run with coverage report
uv run pytest --cov=src --cov-report=html
# Run specific test file
uv run pytest tests/test_server.py
# Run with verbose output
uv run pytest -v
Test fixtures include:
- Sample profiling scripts (fibonacci, memory-intensive, leaky)
- Realistic Scalene JSON outputs
- Edge cases and error conditions
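A test against the bundled fixtures might look roughly like this sketch. The fixture path is inferred from the project structure, and the async test style assumes pytest-asyncio is installed; adjust to the suite's actual conventions.

import pytest
from scalene_mcp.profiler import ScaleneProfiler

@pytest.mark.asyncio
async def test_profile_fibonacci():
    profiler = ScaleneProfiler()
    # Fixture path assumed from tests/fixtures/scripts/ in the project tree
    result = await profiler.profile(
        type="script",
        script_path="tests/fixtures/scripts/fibonacci.py",
    )
    assert "profile_id" in result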
Code Quality
This project follows strict code quality standards:
- Type Safety: 100% mypy strict mode compliance
- Linting: ruff with comprehensive rules
- Testing: 100% coverage requirement
- Style: Sleek-modern documentation, minimal functional emoji usage
- Patterns: FastMCP best practices throughout
Development Phases
Current Status: Phase 1.1 - Project Setup ✅
Documentation
Editor Setup Guides:
- GitHub Copilot Setup - Using Copilot Chat with VSCode
- Claude Code Setup - Using Claude Code VSCode extension
- Cursor Setup - Using the Cursor IDE
- General VSCode Setup - General VSCode configuration
API & Usage:
- Tools Reference - Complete API documentation (7 tools)
- Quick Start - 3-step setup and basic workflows
- Examples - Real-world profiling examples
Development Roadmap
- Phase 1: Project Setup & Infrastructure ✅
- Phase 2: Core Data Models (In Progress)
- Phase 3: Profiler Integration
- Phase 4: Analysis & Insights
- Phase 5: Comparison Features
- Phase 6: Resources Implementation
- Phase 7: Prompts & Workflows
- Phase 8: Testing & Quality
- Phase 9: Documentation
- Phase 10: Polish & Release
See development-plan.md for detailed roadmap.
Contributing
Contributions are welcome! Please ensure:
- All tests pass (just test)
- Linting passes (just lint)
- Type checking passes (just typecheck)
- Code coverage remains at 100%
License
[License TBD]