NCBI Datasets MCP Server
Provides comprehensive access to the NCBI Datasets API, enabling genomic, taxonomic, and biological data operations through 31 specialized tools.

Unofficial NCBI Datasets MCP Server
A Model Context Protocol (MCP) server that provides comprehensive access to the NCBI Datasets API. This server enables seamless integration with NCBI's vast collection of genomic, taxonomic, and biological data through 31 specialized tools.
Developed by Augmented Nature
Features
- 31 comprehensive tools covering all major NCBI Datasets functionality
- 12 organized categories of biological data operations
- Resource templates for direct URI-based data access
- Full TypeScript implementation with proper error handling
- Rate limiting and caching for optimal performance
- Environment variable configuration for API keys
Installation
npm install
npm run build
Configuration
Environment Variables
- NCBI_API_KEY (optional): Your NCBI API key for higher rate limits and priority access
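As a rough sketch of how an optional key can be handled, the snippet below reads NCBI_API_KEY from an environment-like object and adds it to the request headers only when it is set. The helper name buildNcbiHeaders is hypothetical and not taken from this server's source. The api-key header name is an assumption based on NCBI's Datasets API documentation and should be checked against the current docs.

```typescript
// Hypothetical helper: illustrates optional API-key handling, not this server's code.
type Env = Record<string, string | undefined>;

function buildNcbiHeaders(env: Env): Record<string, string> {
  const headers: Record<string, string> = { Accept: "application/json" };
  const key = (env["NCBI_API_KEY"] ?? "").trim();
  // Send the header only when a non-empty key is configured.
  // Assumption: the Datasets API expects the key in an "api-key" header.
  if (key.length > 0) {
    headers["api-key"] = key;
  }
  return headers;
}
```

In Node.js this would typically be called as buildNcbiHeaders(process.env), so an unset variable simply falls back to unauthenticated access.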
MCP Configuration
Add to your MCP settings:
{
  "mcpServers": {
    "ncbi-datasets-server": {
      "command": "node",
      "args": ["/path/to/ncbi-datasets-server/build/index.js"],
      "env": {
        "NCBI_API_KEY": "your_api_key_here"
      }
    }
  }
}
Available Tools
Genome Operations
- search_genomes: Search genome assemblies by organism, keywords, or criteria
- get_genome_info: Get detailed information for a specific genome assembly
- get_genome_summary: Get summary statistics for a genome assembly
Gene Operations
- search_genes: Search genes by symbol, name, organism, or location
- get_gene_info: Get detailed information for a specific gene
- get_gene_sequences: Retrieve sequences for a specific gene
Taxonomy Operations
- search_taxonomy: Search taxonomic information by organism name
- get_taxonomy_info: Get detailed taxonomic information for a taxon
- get_organism_info: Get organism-specific information and datasets
Assembly Operations
- search_assemblies: Search genome assemblies with detailed filtering
- get_assembly_info: Get detailed metadata and statistics for assemblies
- get_assembly_reports: Get assembly quality reports and validation info
- download_genome_data: Get download URLs for genome data files
- batch_assembly_info: Get information for multiple assemblies
Virus Operations
- search_virus_genomes: Search viral genome assemblies
- get_virus_info: Get detailed information for viral genomes
Protein Operations
- search_proteins: Search protein sequences by name or function
- get_protein_info: Get detailed information for specific proteins
Annotation Operations
- get_genome_annotation: Get annotation information for assemblies
- search_genome_features: Search for specific genomic features
Comparative Genomics
- compare_genomes: Compare two or more genome assemblies
- find_orthologs: Find orthologous genes across organisms
Sequence Operations
- get_sequence_data: Retrieve sequence data for genomes, genes, or proteins
- blast_search: Perform a BLAST search against NCBI databases
Phylogenetic Operations
- get_phylogenetic_tree: Get phylogenetic tree data for organisms
- get_taxonomic_lineage: Get complete taxonomic lineage
Statistics Operations
- get_database_stats: Get statistics about NCBI Datasets content
- search_by_bioproject: Search datasets by BioProject accession
- search_by_biosample: Search datasets by BioSample accession
Quality Control
- get_assembly_quality: Get quality metrics for genome assemblies
- validate_sequences: Validate sequence data and check for issues
Usage Examples
Genome Analysis
// Search for E. coli genomes
{
  "tool": "search_genomes",
  "arguments": {
    "tax_id": 511145,
    "assembly_level": "complete",
    "max_results": 10
  }
}

// Get detailed genome information
{
  "tool": "get_genome_info",
  "arguments": {
    "accession": "GCF_000005845.2",
    "include_annotation": true
  }
}

// Get genome summary statistics
{
  "tool": "get_genome_summary",
  "arguments": {
    "accession": "GCF_000005845.2"
  }
}
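The tool/arguments pairs in these examples map onto the MCP tools/call method. Most MCP clients build the JSON-RPC envelope for you, so the sketch below is only meant to show that mapping. The function name toToolCallRequest and the ToolExample type are illustrative and not part of this server.

```typescript
// Shape used by the usage examples in this README.
interface ToolExample {
  tool: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Illustrative helper: wraps a README-style example in a tools/call request.
function toToolCallRequest(example: ToolExample, id: number): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: example.tool, arguments: example.arguments },
  };
}

const summaryRequest = toToolCallRequest(
  { tool: "get_genome_summary", arguments: { accession: "GCF_000005845.2" } },
  1,
);
```

The resulting object can be serialized with JSON.stringify and sent over whichever transport the client uses.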
Gene Research
// Search for BRCA1 gene
{
  "tool": "search_genes",
  "arguments": {
    "gene_symbol": "BRCA1",
    "organism": "Homo sapiens",
    "max_results": 5
  }
}

// Get detailed gene information
{
  "tool": "get_gene_info",
  "arguments": {
    "gene_id": 672,
    "include_sequences": true
  }
}

// Get gene sequences
{
  "tool": "get_gene_sequences",
  "arguments": {
    "gene_id": 672,
    "sequence_type": "transcript"
  }
}
Taxonomic Analysis
// Search taxonomy by organism name
{
  "tool": "search_taxonomy",
  "arguments": {
    "query": "Escherichia coli",
    "max_results": 10
  }
}

// Get detailed taxonomic information
{
  "tool": "get_taxonomy_info",
  "arguments": {
    "tax_id": 511145,
    "include_lineage": true
  }
}

// Get organism information
{
  "tool": "get_organism_info",
  "arguments": {
    "organism": "Escherichia coli"
  }
}
Assembly Operations
// Search assemblies with filtering
{
  "tool": "search_assemblies",
  "arguments": {
    "query": "human",
    "assembly_level": "chromosome",
    "assembly_source": "refseq",
    "max_results": 20
  }
}

// Get assembly information
{
  "tool": "get_assembly_info",
  "arguments": {
    "assembly_accession": "GCF_000001405.40",
    "include_annotation": true
  }
}

// Batch assembly lookup
{
  "tool": "batch_assembly_info",
  "arguments": {
    "accessions": ["GCF_000001405.40", "GCF_000005825.2", "GCF_000002305.1"]
  }
}
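Before sending a batch, it can help to catch malformed accessions on the client side. The sketch below assumes the standard assembly accession shape (GCA_ or GCF_, nine digits, optional version suffix). The helper partitionAccessions is hypothetical, and the server's own validation rules may differ.

```typescript
// GenBank (GCA_) and RefSeq (GCF_) assembly accessions:
// prefix, nine digits, and an optional ".version" suffix.
const ASSEMBLY_ACCESSION = /^GC[AF]_\d{9}(\.\d+)?$/;

// Hypothetical helper: splits input into well-formed and malformed accessions.
function partitionAccessions(accessions: string[]): { valid: string[]; invalid: string[] } {
  const valid: string[] = [];
  const invalid: string[] = [];
  for (const raw of accessions) {
    const accession = raw.trim();
    if (ASSEMBLY_ACCESSION.test(accession)) {
      valid.push(accession);
    } else {
      invalid.push(accession);
    }
  }
  return { valid, invalid };
}
```

Only the valid list would then be passed as the accessions argument of batch_assembly_info. A pattern check like this confirms the format only, not that the assembly exists.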
Comparative Genomics
// Compare multiple genomes
{
  "tool": "compare_genomes",
  "arguments": {
    "accessions": ["GCF_000005845.2", "GCF_000001405.40"],
    "comparison_type": "basic_stats",
    "include_orthologs": true
  }
}

// Find orthologous genes
{
  "tool": "find_orthologs",
  "arguments": {
    "gene_symbol": "BRCA1",
    "source_organism": "Homo sapiens",
    "target_organisms": ["Mus musculus", "Rattus norvegicus"],
    "similarity_threshold": 80
  }
}
Virus Research
// Search viral genomes
{
  "tool": "search_virus_genomes",
  "arguments": {
    "virus_name": "SARS-CoV-2",
    "host": "Homo sapiens",
    "max_results": 50
  }
}

// Get viral genome information
{
  "tool": "get_virus_info",
  "arguments": {
    "accession": "NC_045512.2",
    "include_proteins": true,
    "include_metadata": true
  }
}
Resource Templates
The server provides resource templates for direct data access:
- ncbi://genome/{accession}: Complete genome assembly information
- ncbi://gene/{gene_id}: Gene information with annotations
- ncbi://taxonomy/{tax_id}: Taxonomic classification and lineage
- ncbi://assembly/{assembly_accession}: Assembly metadata and statistics
- ncbi://search/{data_type}/{query}: Search results for specified queries
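To make the template shapes concrete, here is a minimal sketch of how a client or server could split an ncbi:// URI into its template and parameters. The names parseNcbiUri, safeDecode, and ResourceRef are hypothetical. The percent-decoding and the choice to allow slashes inside a search query are assumptions, and this server's real routing may differ.

```typescript
// Hypothetical parser for the five ncbi:// templates listed above.
type ResourceRef =
  | { kind: "genome"; accession: string }
  | { kind: "gene"; geneId: string }
  | { kind: "taxonomy"; taxId: string }
  | { kind: "assembly"; assemblyAccession: string }
  | { kind: "search"; dataType: string; query: string };

// Percent-decode a path segment; returns null on malformed escapes.
function safeDecode(segment: string): string | null {
  try {
    return decodeURIComponent(segment);
  } catch {
    return null;
  }
}

function parseNcbiUri(uri: string): ResourceRef | null {
  const match = /^ncbi:\/\/([a-z]+)\/(.+)$/.exec(uri);
  if (!match) return null;
  const kind = match[1] ?? "";
  const segments = (match[2] ?? "").split("/");

  if (kind === "search") {
    // ncbi://search/{data_type}/{query}: the query may itself contain slashes.
    if (segments.length < 2) return null;
    const dataType = segments[0] ?? "";
    const query = safeDecode(segments.slice(1).join("/"));
    return dataType && query ? { kind: "search", dataType, query } : null;
  }

  // Every other template takes exactly one path segment.
  if (segments.length !== 1) return null;
  const value = safeDecode(segments[0] ?? "");
  if (!value) return null;
  switch (kind) {
    case "genome":
      return { kind: "genome", accession: value };
    case "gene":
      return { kind: "gene", geneId: value };
    case "taxonomy":
      return { kind: "taxonomy", taxId: value };
    case "assembly":
      return { kind: "assembly", assemblyAccession: value };
    default:
      return null;
  }
}
```

For example, parseNcbiUri("ncbi://gene/672") yields a gene reference with geneId "672", while an unknown scheme or template returns null.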
API Rate Limits
- Without API key: 3 requests per second
- With API key: 10 requests per second with priority access
To obtain an API key, visit: https://www.ncbi.nlm.nih.gov/account/settings/
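One simple way to stay inside these limits is to space requests by a minimum interval. The sketch below is an assumption about how that can be done, not a description of this server's internal limiter. RequestThrottle and throttleFor are hypothetical names, and the 3 and 10 requests-per-second figures are taken from the list above.

```typescript
// Hypothetical minimum-interval throttle; the clock is passed in so the
// logic can be exercised without real timers.
class RequestThrottle {
  private nextSlotMs = 0;
  private readonly intervalMs: number;

  constructor(requestsPerSecond: number) {
    this.intervalMs = 1000 / requestsPerSecond;
  }

  // Reserves the next free slot and returns how long (in ms) the caller
  // should wait before sending its request.
  reserve(nowMs: number): number {
    const startMs = Math.max(nowMs, this.nextSlotMs);
    this.nextSlotMs = startMs + this.intervalMs;
    return startMs - nowMs;
  }
}

// 10 requests per second with an API key, 3 without (figures from this README).
function throttleFor(apiKey: string | undefined): RequestThrottle {
  return new RequestThrottle(apiKey ? 10 : 3);
}
```

A caller would typically wait for throttle.reserve(Date.now()) milliseconds, for instance with setTimeout wrapped in a Promise, before issuing each request.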
Error Handling
The server implements comprehensive error handling:
- Network errors: Automatic retry with exponential backoff
- Rate limiting: Intelligent request queuing and throttling
- Invalid parameters: Clear validation error messages
- API errors: Detailed error reporting with context
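The retry behaviour for network errors can be pictured with a short sketch of exponential backoff. The base delay, cap, and retry count below are illustrative assumptions, and backoffDelayMs and withRetry are hypothetical names. The server's actual retry policy is not documented in this README.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
// The 500 ms base and 8000 ms cap are illustrative values only.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Runs `operation`, retrying on any thrown error up to `maxRetries` times.
// `sleep` is injectable so the loop can be tested without real delays.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retry budget is spent and surface the last error.
      if (attempt >= maxRetries) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

A production version would usually retry only on transient failures (timeouts, HTTP 429 or 5xx) and add random jitter so that concurrent clients do not retry in lockstep.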
Data Sources
This server accesses data from:
- NCBI Datasets API v2: Primary genomic and assembly data
- NCBI Taxonomy: Taxonomic classifications and lineages
- NCBI Gene: Gene annotations and sequences
- NCBI Assembly: Assembly metadata and quality metrics
- NCBI BioProject/BioSample: Project and sample information
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
Support
For issues related to:
- Server functionality: Open an issue in this repository
- NCBI data: Consult NCBI Datasets documentation
- API access: Contact NCBI support for API-related questions