microbeFunction_mcp
Supports querying and downloading MGnify genome data and performing KEGG functional annotation and module completeness analysis for microbial genomes.
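An MCP toolkit for microbial functional analysis.
Supports downloading and querying genome information for every species in the mgnify_genomes database (human gut, oral, and vaginal; mouse gut; marine sediment; soil; rhizosphere; cow and sheep rumen; and others). Supports queries against all KEGG databases, functional annotation, and metabolic module completeness analysis. It also builds an allowlist of KEGG pathways/modules found in bacteria and archaea, plus a blocklist and keyword list for terms that occur only in eukaryotes.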
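Overview
| Toolkit | Function | Port |
|---|---|---|
| tools/kegg | KEGG REST API + metabolic module completeness analysis + COG database (depends on the kegg-pathways-completeness CLI) | 8791 |
| tools/mgnify | MGnify species index search + on-demand annotation file download | 8792 |
| deploy/ | Unified FastAPI deployment layer that mounts both MCPs | 8788 |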
Deployment
1. Install dependencies
pip install uv
uv venv .venv --python=3.13
# Windows: .venv\Scripts\activate
# Linux/macOS: source .venv/bin/activate
uv sync
2. Configure
cp default.conf.toml local.conf.toml
# Edit ports and other settings in local.conf.toml as needed
3. Start the unified service
uv run -m deploy.web
Once the service is running:
- KEGG MCP: http://127.0.0.1:8788/kegg_mcp/mcp/
- MGnify MCP: http://127.0.0.1:8788/mgnify_mcp/mcp/
- Service list: http://127.0.0.1:8788/api/list_mcps
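The endpoint layout above can be captured in a small helper, which is handy when the port in local.conf.toml has been changed. This is a minimal sketch: `mcp_endpoints` and `fetch_service_list` are hypothetical helper names, not part of the project, and the response format of `/api/list_mcps` is not documented here, so the body is returned as raw text.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Default address of the unified service, as listed above.
BASE_URL = "http://127.0.0.1:8788/"


def mcp_endpoints(base_url: str = BASE_URL) -> dict[str, str]:
    """Build the three endpoint URLs from a base address (must end with '/')."""
    return {
        "kegg_mcp": urljoin(base_url, "kegg_mcp/mcp/"),
        "mgnify_mcp": urljoin(base_url, "mgnify_mcp/mcp/"),
        "list_mcps": urljoin(base_url, "api/list_mcps"),
    }


def fetch_service_list(base_url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Return the raw body of the service list endpoint.

    The response schema is an unknown here, so no parsing is attempted.
    Requires the unified service to be running.
    """
    with urlopen(mcp_endpoints(base_url)["list_mcps"], timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `mcp_endpoints("http://127.0.0.1:9000/")` gives the same paths under a different port.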
4. Start a single MCP on its own
# KEGG (port 8791)
uv run -m tools.kegg.deploy
# MGnify (port 8792)
uv run -m tools.mgnify.deploy
Typical workflow
search_species (mgnify) → fetch_annotations (mgnify) → kegg_module_completeness (kegg)
Find the species MAG → download eggNOG/GFF annotations → analyze metabolic module completeness
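The same three stages can be written out as CLI invocations. The sketch below only assembles the argument lists, using the subcommands and flags that appear in this README's usage example; `workflow_commands` is a hypothetical helper. In practice the `species_rep` value comes from the search output, so the stages are not fully automatic.

```python
import shlex


def workflow_commands(species_rep: str, biome: str, query: str) -> list[list[str]]:
    """Assemble the search, fetch, and analyze commands for one species."""
    genome_dir = f"downloads/{species_rep}/genome"
    search = ["uv", "run", "-m", "tools.mgnify", "search",
              "--biome", biome, "--query", query]
    fetch = ["uv", "run", "-m", "tools.mgnify", "fetch",
             "--species-rep", species_rep, "--biome", biome,
             "--roles", "eggnog_tsv,gff"]
    analyze = ["uv", "run", "-m", "tools.kegg", "analyze",
               "--annotation-file", f"{genome_dir}/{species_rep}_eggNOG.tsv",
               "--kegg-column", "KEGG_ko",
               "--output", f"{genome_dir}/{species_rep}_module_completeness.tsv"]
    return [search, fetch, analyze]


if __name__ == "__main__":
    # Print shell-safe command lines; pass each list to subprocess.run to execute.
    for cmd in workflow_commands("MGYG000000064", "human-gut", "Clostridium baratii"):
        print(shlex.join(cmd))
```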
Usage example: building a functional profile of Clostridium baratii
Step 1: Search for the species
uv run -m tools.mgnify search --biome human-gut --query "Clostridium baratii"
Example output:
{
  "items": [
    {
      "species_rep": "MGYG000000064",
      "species_name": "Clostridium baratii",
      "completeness": 99.19,
      "contamination": 1.61,
      "genome_count": 12
    }
  ]
}
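When a search returns several hits, the `completeness` and `contamination` fields can be used to pick a representative. The sketch below parses the JSON shown above and keeps entries that pass a quality filter. The thresholds (at least 90% complete, at most 5% contamination) are an assumption borrowed from common MAG quality conventions, not something the tool enforces, and `high_quality_reps` is a hypothetical helper.

```python
import json

# The example search output from above.
SEARCH_OUTPUT = """
{
  "items": [
    {
      "species_rep": "MGYG000000064",
      "species_name": "Clostridium baratii",
      "completeness": 99.19,
      "contamination": 1.61,
      "genome_count": 12
    }
  ]
}
"""


def high_quality_reps(payload: str,
                      min_completeness: float = 90.0,
                      max_contamination: float = 5.0) -> list[str]:
    """Return species_rep IDs whose genome passes the quality thresholds."""
    items = json.loads(payload)["items"]
    return [
        item["species_rep"]
        for item in items
        if item["completeness"] >= min_completeness
        and item["contamination"] <= max_contamination
    ]


if __name__ == "__main__":
    print(high_quality_reps(SEARCH_OUTPUT))
```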
Step 2: Download the annotations
uv run -m tools.mgnify fetch --species-rep MGYG000000064 --biome human-gut --roles eggnog_tsv,gff
The annotation files are saved under the downloads/MGYG000000064/genome/ directory.
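Before running the analysis, it can be useful to check how many distinct KOs the downloaded annotation file contains. The sketch below assumes the eggNOG TSV follows the usual eggNOG-mapper layout: optional `##` metadata lines, a header row, and a `KEGG_ko` column holding comma-separated `ko:Kxxxxx` entries with `-` for genes without a KO. If the MGnify file differs, the parsing will need adjusting. `unique_kos` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io


def unique_kos(tsv_text: str, column: str = "KEGG_ko") -> set[str]:
    """Collect the distinct KO identifiers found in one column of a TSV."""
    # Drop '##' metadata lines; the header row (which may start with '#') is kept.
    lines = [ln for ln in tsv_text.splitlines() if ln and not ln.startswith("##")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    kos: set[str] = set()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value in ("", "-"):
            continue  # gene without a KO assignment
        for token in value.split(","):
            token = token.strip().removeprefix("ko:")
            if token:
                kos.add(token)
    return kos


# Made-up rows for illustration only.
SAMPLE = "\n".join([
    "## metadata line",
    "#query\tKEGG_ko",
    "gene1\tko:K00001,ko:K00002",
    "gene2\t-",
    "gene3\tko:K00001",
])

if __name__ == "__main__":
    print(len(unique_kos(SAMPLE)))
```

To run it on a real file, pass `Path(...).read_text()` of the downloaded TSV as `tsv_text`.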
Step 3: Analyze metabolic module completeness
uv run -m tools.kegg analyze \
--annotation-file downloads/MGYG000000064/genome/MGYG000000064_eggNOG.tsv \
--kegg-column KEGG_ko \
--output downloads/MGYG000000064/genome/MGYG000000064_module_completeness.tsv
Example output:
output=downloads/MGYG000000064/genome/MGYG000000064_module_completeness.tsv
unique_ko_count=1384 modules_with_any_hit=180 modules_above_threshold=180
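The summary is printed as whitespace-separated `key=value` pairs, which makes it easy to pick up in a script. A minimal sketch, assuming the format stays exactly as shown and that the output path contains no spaces; `parse_summary` is a hypothetical helper.

```python
def parse_summary(text: str) -> dict[str, str]:
    """Parse whitespace-separated key=value tokens into a dictionary."""
    result: dict[str, str] = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            result[key] = value
    return result


# The example output from above.
SUMMARY = """\
output=downloads/MGYG000000064/genome/MGYG000000064_module_completeness.tsv
unique_ko_count=1384 modules_with_any_hit=180 modules_above_threshold=180
"""

if __name__ == "__main__":
    summary = parse_summary(SUMMARY)
    print(int(summary["unique_ko_count"]), summary["output"])
```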
Step 4: Interpret the results
| Functional category | Complete modules / total modules |
|---|---|
| Carbohydrate metabolism | 8/30 |
| Amino acid metabolism | 12/39 |
| Energy metabolism | 4/23 |
| Nucleotide metabolism | 6/10 |
| Glycan metabolism | 6/15 |
| Cofactor and vitamin metabolism | 7/38 |
| Lipid metabolism | 3/9 |
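The counts in the table are easier to compare as fractions. The sketch below hard-codes the table values and ranks the categories by the share of complete modules; it is an illustration of how to read the table, not part of the tool's output.

```python
# (complete modules, total modules) per category, copied from the table above.
COUNTS = {
    "Carbohydrate metabolism": (8, 30),
    "Amino acid metabolism": (12, 39),
    "Energy metabolism": (4, 23),
    "Nucleotide metabolism": (6, 10),
    "Glycan metabolism": (6, 15),
    "Cofactor and vitamin metabolism": (7, 38),
    "Lipid metabolism": (3, 9),
}


def complete_fraction(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Return the fraction of complete modules for each category."""
    return {name: done / total for name, (done, total) in counts.items()}


if __name__ == "__main__":
    fractions = complete_fraction(COUNTS)
    # List categories from the highest to the lowest share of complete modules.
    for name, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {frac:.0%}")
```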
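Representative complete modules:
- M00001: Glycolysis (Embden-Meyerhof pathway)
- M00632: Galactose degradation (Leloir pathway)
- M00579: Acetate production (phosphate acetyltransferase-acetate kinase)
- M00651: Vancomycin resistance (D-Ala-D-Lac type)
- M00122: Cobalamin biosynthesis (vitamin B12)
- M00924: Cobalamin biosynthesis (anaerobic pathway)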
Partially complete modules:
- M00003: Glycolysis (78.57%)
- M00010: Ethanol fermentation (66.67%)
- M00009: Lactate fermentation (50.0%)
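To give an intuition for the percentages above, here is a deliberately simplified model of module completeness: a module is a list of steps, each step is satisfied if any of its alternative KOs is present, and completeness is the share of satisfied steps. This is an assumption for illustration only; the kegg-pathways-completeness CLI may score modules differently. The step definitions and KO labels below are hypothetical.

```python
def step_completeness(steps: list[set[str]], present: set[str]) -> float:
    """Percent of steps that have at least one alternative KO present."""
    if not steps:
        return 0.0
    satisfied = sum(1 for alternatives in steps if alternatives & present)
    return round(100.0 * satisfied / len(steps), 2)


# Hypothetical three-step module; each set lists interchangeable KOs for a step.
TOY_MODULE = [{"KO_a", "KO_b"}, {"KO_c"}, {"KO_d", "KO_e"}]

if __name__ == "__main__":
    # Steps 1 and 3 are covered, step 2 is missing.
    print(step_completeness(TOY_MODULE, {"KO_b", "KO_e"}))
```

Under this model, a three-step module with two steps covered scores 66.67, and a two-step module with one step covered scores 50.0.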
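Genome overview:
- Total genes: 2,893
- Genes with a KO annotation: 1,647 (56%)
- COG categories: carbohydrate metabolism (229), amino acid metabolism (191), energy metabolism (171)
- CAZy enzymes: 22 glycosyltransferases (GT), 16 glycoside hydrolases (GH)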
CLI usage
# KEGG module completeness analysis
uv run -m tools.kegg analyze --annotation-file downloads/MGYG000000238/genome/MGYG000000238_gene_annotations.tsv
# KEGG batch processing
uv run -m tools.kegg batch --manifest manifest.tsv --jobs 4
# MGnify species search
uv run -m tools.mgnify search --biome human-gut --query "Enterobacter kobei"
# MGnify annotation download
uv run -m tools.mgnify fetch --species-rep MGYG000000238 --biome human-gut --roles eggnog_tsv
Data paths
- MGnify index: data/mgnify/
- MGnify annotation cache: downloads/{species_rep}/genome/
- KEGG allowlist data: tools/kegg/data/
- COG catalog: tools/kegg/data/COG.csv
The MGnify index directory can be overridden with the MGNIFY_DATA_DIR environment variable.
Project structure
microbeFunction_mcp/
├── tools/
│   ├── kegg/                 # KEGG MCP
│   │   ├── server.py         # FastMCP server
│   │   ├── kegg_api.py       # KEGG REST API wrapper
│   │   ├── module_completeness.py
│   │   ├── cog_store.py
│   │   ├── ko_input.py
│   │   ├── allowlist_sources.py
│   │   ├── cli.py
│   │   ├── deploy.py         # Standalone deployment entry point
│   │   └── data/             # Allowlist and other data files
│   └── mgnify/               # MGnify MCP
│       ├── server.py
│       ├── fetch.py
│       ├── index_store.py
│       ├── cli.py
│       ├── deploy.py
│       ├── download_species_annotations.py
│       ├── download_human_gut_metadata.py
│       ├── list_mgnify_folders.py
│       └── build_mgnify_index.py
├── deploy/
│   ├── web.py                # Unified FastAPI entry point
│   └── config.py             # Configuration loading
├── tests/
├── data/
│   └── mgnify/               # Shared data
├── pyproject.toml
├── default.conf.toml
└── .gitignore
Cursor / Claude Desktop configuration
{
  "mcpServers": {
    "kegg_mcp": {
      "url": "http://127.0.0.1:8788/kegg_mcp/mcp/",
      "transport": "streamable_http"
    },
    "mgnify_mcp": {
      "url": "http://127.0.0.1:8788/mgnify_mcp/mcp/",
      "transport": "streamable_http"
    }
  }
}