CellTypist MCP
Enables automated cell type annotation in scRNA-seq analysis using CellTypist models, with tools for listing, downloading, training, and annotating cell types via natural language.
CellTypist MCP Server
An MCP (Model Context Protocol) server for automated cell type annotation in scRNA-seq analysis, letting you drive CellTypist through natural language.
What can it do?
- Automatic cell type annotation using pre-trained CellTypist models
- List available models with descriptions and metadata
- Download models from CellTypist repository
- Train custom models on your own annotated data
- Extract marker genes for specific cell types
- Majority voting for robust predictions based on local subclusters
- Visualization with dotplot comparing predictions to reference labels
About CellTypist
CellTypist is an automated cell type annotation tool for scRNA-seq datasets based on logistic regression classifiers. It provides:
- Fast and accurate predictions using regularized linear models
- Pre-trained models for various tissues and cell types
- Majority voting approach to refine predictions
- Custom model training capabilities
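The majority voting idea can be sketched in a few lines. This is a simplified illustration, not CellTypist's implementation: it assumes each cell already has a predicted label and a subcluster ID, and it assigns every cell the most common label in its subcluster. The function name and the example labels are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(predicted_labels, clusters):
    """Give every cell the most common predicted label in its cluster."""
    labels_by_cluster = defaultdict(list)
    for label, cluster in zip(predicted_labels, clusters):
        labels_by_cluster[cluster].append(label)
    # Dominant label per cluster; ties go to the label seen first.
    dominant = {
        cluster: Counter(labels).most_common(1)[0][0]
        for cluster, labels in labels_by_cluster.items()
    }
    return [dominant[cluster] for cluster in clusters]


# Hypothetical per-cell predictions and subcluster assignments.
predicted = ["T cells", "T cells", "B cells", "B cells", "B cells", "T cells"]
clusters = ["0", "0", "0", "1", "1", "1"]
refined = majority_vote(predicted, clusters)
print(refined)
```

The single "B cells" call in cluster 0 and the single "T cells" call in cluster 1 are overruled by their neighbours, which is the sense in which voting refines per-cell predictions.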
Installation
From source
git clone <repository-url>
cd celltypist-mcp
pip install -e .
Quick Start
Run locally with stdio transport
celltypist-mcp run
Run with a pre-loaded dataset
celltypist-mcp run --data /path/to/your/data.h5ad
Run with SSE transport (for remote access)
celltypist-mcp run --transport sse --port 8000 --host 0.0.0.0
Configuration
For AI Clients (e.g., Claude Desktop, Cherry Studio)
Add to your MCP client configuration:
{
  "mcpServers": {
    "celltypist": {
      "command": "celltypist-mcp",
      "args": ["run"]
    }
  }
}
With pre-loaded data
{
  "mcpServers": {
    "celltypist": {
      "command": "celltypist-mcp",
      "args": ["run", "--data", "/path/to/your/data.h5ad"]
    }
  }
}
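If you manage client configuration from a script, the two variants above can be generated rather than hand-edited. The helper below is a hypothetical convenience, not part of celltypist-mcp; it only reproduces the JSON structure shown in this section.

```python
import json


def build_client_config(data_path=None, server_name="celltypist"):
    """Build the mcpServers entry, optionally with a pre-loaded dataset."""
    args = ["run"]
    if data_path is not None:
        args += ["--data", data_path]
    return {
        "mcpServers": {
            server_name: {"command": "celltypist-mcp", "args": args}
        }
    }


config = build_client_config("/path/to/your/data.h5ad")
config_text = json.dumps(config, indent=2)
print(config_text)
```

Where the resulting JSON has to be saved depends on the MCP client, so the sketch stops at producing the text.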
Remote SSE connection
First, run the server on your machine:
celltypist-mcp run --transport sse --port 8000
Then point your MCP client at the server's SSE endpoint:
http://localhost:8000/sse
Available Tools
1. celltypist_list_models
List all available CellTypist models with descriptions.
Example usage:
"Show me available CellTypist models"
"List all immune cell type models"
2. celltypist_annotate
Annotate cell types in your scRNA-seq data.
Parameters:
- model: Model name (e.g., "Immune_All_High.pkl")
- majority_voting: Enable majority voting (default: False)
- over_clustering: Column in adata.obs for clustering (optional)
- mode: "best match" or "prob match" (default: "best match")
- p_thres: Probability threshold for multi-label (default: 0.5)
Example usage:
"Annotate my cells using the Immune_All_High model"
"Run CellTypist with majority voting on my data"
"Use the Immune_All_Low model with leiden clustering for majority voting"
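The difference between the two mode values is easiest to see on a single cell. The sketch below is an illustration under assumptions, not the server's code: it takes per-cell-type probabilities as a plain dict, and the "Unassigned" fallback and "|" separator reflect our understanding of CellTypist's multi-label output rather than anything stated in this README. The probabilities are made up.

```python
def assign_labels(probabilities, mode="best match", p_thres=0.5):
    """Pick label(s) for one cell from its per-cell-type probabilities."""
    if mode == "best match":
        # Always exactly one label: the highest-scoring cell type.
        return max(probabilities, key=probabilities.get)
    if mode == "prob match":
        # Zero, one, or several labels: everything above the threshold.
        passing = [ct for ct, p in probabilities.items() if p > p_thres]
        return "|".join(passing) if passing else "Unassigned"
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up probabilities for a single cell.
cell = {"T cells": 0.7, "NK cells": 0.6, "B cells": 0.1}
print(assign_labels(cell))                             # one label
print(assign_labels(cell, mode="prob match"))          # possibly several
print(assign_labels(cell, mode="prob match", p_thres=0.9))
```

In this sketch p_thres only matters in "prob match" mode, since "best match" always returns the top-scoring type.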
3. celltypist_download_model
Download CellTypist models.
Parameters:
- model: Model name or list of names (None downloads all)
- force_update: Force update to latest version (default: False)
Example usage:
"Download the Immune_All_High model"
"Download all available CellTypist models"
"Update the Immune_All_Low model to the latest version"
4. celltypist_get_model_info
Get detailed information about a specific model.
Parameters:
- model: Model name
Example usage:
"What cell types are in the Immune_All_High model?"
"Show me information about the Immune_All_Low model"
"How many features does the Immune_All_High model use?"
5. celltypist_extract_markers
Extract top marker genes for a specific cell type.
Parameters:
- model: Model name
- cell_type: Cell type name
- top_n: Number of top markers (default: 10)
Example usage:
"What are the top marker genes for T cells in Immune_All_High?"
"Show me 20 marker genes for macrophages"
"Extract markers for B cells from the Immune_All_Low model"
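Because CellTypist models are logistic regression classifiers, one plausible reading of "top markers" is the genes with the largest positive coefficients for the chosen cell type. The sketch below illustrates that ranking on a plain dict; it is an assumption about the approach, not the tool's actual code, and the gene weights are invented for the example.

```python
import heapq


def top_markers(coefficients, top_n=10):
    """Return the top_n genes with the largest positive weights."""
    positive = {gene: w for gene, w in coefficients.items() if w > 0}
    return heapq.nlargest(top_n, positive, key=positive.get)


# Invented weights for a hypothetical "T cells" classifier.
t_cell_weights = {
    "CD3D": 2.1,
    "CD3E": 1.8,
    "MS4A1": -1.2,
    "IL7R": 0.9,
    "LYZ": -0.4,
}
markers = top_markers(t_cell_weights, top_n=2)
print(markers)
```

Genes with negative weights argue against the cell type, so they are dropped before ranking.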
6. celltypist_train
Train a custom CellTypist model.
Parameters:
- labels: Column in adata.obs with cell type labels
- model_name: Filename to save the model
- use_SGD: Use SGD learning for large datasets (default: False)
- C: Inverse of L2 regularization strength; smaller values regularize more strongly (default: 1.0)
- max_iter: Maximum iterations (optional)
- feature_selection: Enable feature selection (default: False)
- top_genes: Number of top genes to select (default: 300)
Example usage:
"Train a CellTypist model using the 'cell_type' column and save it as 'my_model.pkl'"
"Create a custom model with SGD learning and feature selection"
7. celltypist_dotplot
Generate a dotplot comparing predictions with reference labels.
Parameters:
- use_as_reference: Column in adata.obs with reference labels
- use_as_prediction: "predicted_labels" or "majority_voting" (default: "majority_voting")
- save: Filename to save figure (optional)
Example usage:
"Create a dotplot comparing CellTypist predictions with my cell_type labels"
"Visualize the majority voting results against leiden clusters"
"Generate a dotplot and save it as 'results.png'"
Typical Workflow
1. List available models
   "What CellTypist models are available?"
2. Download a model (if not already downloaded)
   "Download the Immune_All_High model"
3. Annotate your cells
   "Annotate my cells using Immune_All_High with majority voting"
4. Visualize results
   "Create a dotplot comparing predictions with my manual annotations"
5. Extract markers (optional)
   "What are the marker genes for T cells in this model?"
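Behind each natural-language step, the MCP client issues a tools/call request to the server. The sketch below builds the JSON-RPC payloads for the workflow above, using the tool and parameter names documented in this README. The argument values ("cell_type" as the reference column, "T cells" as the cell type) are placeholders, and a real client SDK also handles the initialization handshake and transport, which are omitted here.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name, arguments=None):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


model = "Immune_All_High.pkl"
workflow = [
    tool_call("celltypist_list_models"),
    tool_call("celltypist_download_model", {"model": model}),
    tool_call("celltypist_annotate", {"model": model, "majority_voting": True}),
    tool_call(
        "celltypist_dotplot",
        # Placeholder column name for the manual annotations.
        {"use_as_reference": "cell_type", "use_as_prediction": "majority_voting"},
    ),
    tool_call(
        "celltypist_extract_markers",
        # Placeholder cell type; use a name that exists in the model.
        {"model": model, "cell_type": "T cells", "top_n": 10},
    ),
]

for request in workflow:
    print(json.dumps(request))
```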
Example Conversations
Example 1: Quick annotation
User: "I have scRNA-seq data loaded. Can you annotate the cell types?"
Assistant: [Lists available models]
User: "Use the Immune_All_High model"
Assistant: [Runs celltypist_annotate and shows results]
Example 2: Custom model training
User: "I want to train my own CellTypist model"
Assistant: "What column contains your cell type labels?"
User: "The 'cell_type' column"
Assistant: [Runs celltypist_train and saves the model]
Data Requirements
- Input data should be in AnnData format (.h5ad)
- The expression matrix should be normalized to 10,000 counts per cell and then log1p-transformed
- For training: cell type labels should be in adata.obs
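The normalization requirement can be checked numerically: undoing the log1p transform on a correctly prepared cell should give values that sum to roughly 10,000. The pure-Python sketch below shows both the transform and the check on a single cell with made-up counts; the function names and the tolerance are illustrative assumptions. With Scanpy, the same preparation is typically done with sc.pp.normalize_total(adata, target_sum=1e4) followed by sc.pp.log1p(adata).

```python
import math


def normalize_log1p(counts, target_sum=10_000):
    """Scale one cell's counts to target_sum, then apply log1p."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counts; filter it out before normalizing")
    return [math.log1p(c / total * target_sum) for c in counts]


def looks_log1p_normalized(values, target_sum=10_000, tol=1.0):
    """Check that expm1 of the values sums to about target_sum."""
    return abs(sum(math.expm1(v) for v in values) - target_sum) <= tol


raw_cell = [3, 0, 12, 5]  # made-up raw counts for one cell
normalized = normalize_log1p(raw_cell)

print(looks_log1p_normalized(normalized))  # prepared data passes
print(looks_log1p_normalized(raw_cell))    # raw counts do not
```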
Notes
- The first time you use a model, it will be downloaded automatically
- Majority voting uses an existing clustering if you provide one; otherwise the data are clustered automatically
- Trained models are saved locally and can be reused
- All results are saved to adata.obs columns prefixed with celltypist_
Related Projects
- CellTypist - The original CellTypist tool
- MCP - Model Context Protocol
- Scanpy - Single-cell analysis in Python