Bio-MCP FastQC Server

Enables AI assistants to perform quality control analysis on high-throughput sequencing data using FastQC and MultiQC. It supports single-file and batch processing of FASTQ/FASTA files and generates comprehensive, interactive summary reports.

Bio-MCP FastQC Server 🔬

Quality Control Analysis via Model Context Protocol

An MCP server that enables AI assistants to run FastQC and MultiQC quality control analysis on sequencing data. Part of the Bio-MCP ecosystem.

🎯 Purpose

FastQC is essential for quality assessment of high-throughput sequencing data. This MCP server allows AI assistants to:

  • Analyze single files - Get detailed QC reports for individual FASTQ/FASTA files
  • Batch process - Run QC on multiple files simultaneously
  • Generate summary reports - Create MultiQC reports combining multiple analyses
  • Handle large datasets - Queue system support for computationally intensive jobs

🚀 Quick Start

Prerequisites

Install FastQC and MultiQC:

# Via conda (recommended)
conda install -c bioconda fastqc multiqc

# Via package managers
# Ubuntu/Debian
sudo apt-get install fastqc
pip install multiqc

# macOS
brew install fastqc
pip install multiqc

Installation

# Clone and install
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e .

# Or install directly
pip install git+https://github.com/bio-mcp/bio-mcp-fastqc.git

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "bio-fastqc": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/bio-mcp-fastqc"
    }
  }
}
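
If the configuration file already lists other servers, the entry above has to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the file has already been loaded into a dictionary; the helper name `add_fastqc_server` is hypothetical and not part of this project:

```python
import json
from copy import deepcopy


def add_fastqc_server(config: dict, repo_path: str, name: str = "bio-fastqc") -> dict:
    """Return a copy of a Claude Desktop config with the FastQC entry added.

    Hypothetical helper: it only mirrors the JSON snippet shown above.
    """
    updated = deepcopy(config)  # leave the caller's dictionary untouched
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {
        "command": "python",
        "args": ["-m", "src.server"],
        "cwd": repo_path,
    }
    return updated


# Example: an existing config that already contains some other server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["index.js"]}}}
print(json.dumps(add_fastqc_server(existing, "/path/to/bio-mcp-fastqc"), indent=2))
```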

🔧 Available Tools

Core Analysis Tools

fastqc_single

Run FastQC on a single FASTQ/FASTA file.

Parameters:

  • input_file (required): Path to FASTQ or FASTA file
  • threads (optional): Number of threads (default: 1)
  • contaminants (optional): Path to custom contaminants file
  • adapters (optional): Path to custom adapters file
  • limits (optional): Path to custom limits file

Example:

User: "Run quality control on my_sample.fastq.gz"
AI: [calls fastqc_single] → Returns detailed QC report with pass/warn/fail status for each module
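
The parameters above correspond to options of the FastQC command line. The following is a rough sketch of how such a call could be assembled, not the server's actual implementation: the function name `build_fastqc_command` and the `output_dir` argument are assumptions, while `--outdir`, `--threads`, `--contaminants`, `--adapters` and `--limits` are FastQC's own flags.

```python
from typing import List, Optional


def build_fastqc_command(
    input_file: str,
    output_dir: str,
    threads: int = 1,
    contaminants: Optional[str] = None,
    adapters: Optional[str] = None,
    limits: Optional[str] = None,
    fastqc_path: str = "fastqc",
) -> List[str]:
    """Assemble a FastQC argument list (hypothetical helper, for illustration)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    cmd = [fastqc_path, "--outdir", output_dir, "--threads", str(threads)]
    # Optional configuration files are passed through only when provided.
    for flag, value in (
        ("--contaminants", contaminants),
        ("--adapters", adapters),
        ("--limits", limits),
    ):
        if value is not None:
            cmd.extend([flag, value])
    cmd.append(input_file)
    return cmd


print(build_fastqc_command("my_sample.fastq.gz", "qc_out", threads=2))
```

Returning a list instead of a shell string means the result can be handed to `subprocess.run` without shell quoting issues.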

fastqc_batch

Run FastQC on multiple files in a directory.

Parameters:

  • input_dir (required): Directory containing FASTQ/FASTA files
  • file_pattern (optional): Glob pattern for files to match (default: "*.fastq*")
  • threads (optional): Number of threads (default: 4)

Example:

User: "Analyze all fastq files in the data/ directory"
AI: [calls fastqc_batch] → Processes all files and returns summary statistics
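
Batch mode first has to decide which files in input_dir count as inputs. A minimal sketch of that selection step, assuming file_pattern is treated as a shell-style glob matched against file names; the helper `find_input_files` and the pattern "*.fastq*" used in the demo are assumptions, not taken from the server code:

```python
import fnmatch
import tempfile
from pathlib import Path
from typing import List


def find_input_files(input_dir: str, file_pattern: str) -> List[Path]:
    """Return the regular files in input_dir whose names match a glob pattern."""
    directory = Path(input_dir)
    if not directory.is_dir():
        raise NotADirectoryError(input_dir)
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and fnmatch.fnmatch(path.name, file_pattern)
    )


# Demo on a throwaway directory with two FASTQ files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("s1_R1.fastq.gz", "s1_R2.fastq.gz", "notes.txt"):
        (Path(tmp) / name).touch()
    print([p.name for p in find_input_files(tmp, "*.fastq*")])
```

Sorting the matches keeps paired files such as R1 and R2 next to each other in the output.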

multiqc_report

Generate MultiQC report from FastQC results.

Parameters:

  • input_dir (required): Directory containing FastQC and other analysis results
  • title (optional): Custom title for the report
  • comment (optional): Comment to add to the report
  • template (optional): Report template (default, simple, sections, gathered)

Example:

User: "Create a summary report from all the QC results"
AI: [calls multiqc_report] → Generates interactive HTML report combining all analyses
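
As with FastQC, these parameters line up with MultiQC command-line options (`--title`, `--comment`, `--template`, `--outdir`). The sketch below shows one way the call could be built; the function name, the `output_dir` argument and the template check are assumptions made for illustration.

```python
from typing import List, Optional

# Template names as listed in the parameter description above.
KNOWN_TEMPLATES = {"default", "simple", "sections", "gathered"}


def build_multiqc_command(
    input_dir: str,
    output_dir: str,
    title: Optional[str] = None,
    comment: Optional[str] = None,
    template: Optional[str] = None,
    multiqc_path: str = "multiqc",
) -> List[str]:
    """Assemble a MultiQC argument list (hypothetical helper, for illustration)."""
    cmd = [multiqc_path, input_dir, "--outdir", output_dir]
    if title:
        cmd += ["--title", title]
    if comment:
        cmd += ["--comment", comment]
    if template:
        # Reject unknown names early instead of letting MultiQC fail later.
        if template not in KNOWN_TEMPLATES:
            raise ValueError(f"unknown template: {template}")
        cmd += ["--template", template]
    return cmd


print(build_multiqc_command("qc_out", "report", title="Run 42 QC"))
```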

Queue System Tools (when queue enabled)

For large datasets or batch processing:

  • fastqc_single_async - Queue single file analysis
  • fastqc_batch_async - Queue batch analysis
  • multiqc_report_async - Queue report generation
  • get_job_status - Check job progress
  • get_job_result - Retrieve completed results
  • cancel_job - Cancel running jobs
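
A client of the async tools typically submits a job and then polls get_job_status until the job reaches a final state. The sketch below shows that loop in isolation. The state names, the shape of the status dictionary and the `get_status` callable are all assumptions, since the queue API is not documented here.

```python
import time
from typing import Callable, Dict

# Assumed names for states in which polling should stop.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[str], Dict],
    job_id: str,
    poll_interval: float = 5.0,
    timeout: float = 1800.0,
) -> Dict:
    """Poll get_status(job_id) until a terminal state or the timeout is reached."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        time.sleep(poll_interval)


# Stand-in for a real status call: reports queued, running, then completed.
_states = iter(["queued", "running", "completed"])


def fake_status(job_id: str) -> Dict:
    return {"job_id": job_id, "state": next(_states)}


print(wait_for_job(fake_status, "job-1", poll_interval=0.0))
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system clock is adjusted while the job runs.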

💡 Usage Examples

Basic Quality Assessment

User: "What's the quality of my sequencing data in sample1.fastq.gz?"

AI Response:
✅ FastQC Analysis Complete for sample1

Module Status Summary:
✅ Basic Statistics: PASS
✅ Per base sequence quality: PASS
⚠️ Per sequence quality scores: WARN
✅ Per base sequence content: PASS
✅ Per sequence GC content: PASS
✅ Per base N content: PASS
✅ Sequence Length Distribution: PASS
❌ Sequence Duplication Levels: FAIL
✅ Overrepresented sequences: PASS
✅ Adapter Content: PASS

Basic Statistics:
• Total Sequences: 2,450,000
• Sequences flagged as poor quality: 0
• Sequence length: 150
• %GC: 47
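
The Basic Statistics values in a response like this come from the fastqc_data.txt file, where each module starts with a `>>Module name` line and ends with `>>END_MODULE`, with tab-separated rows in between. A minimal parsing sketch follows; the function name is hypothetical and the sample text is made up to match the example above.

```python
from typing import Dict, Iterable


def parse_basic_statistics(lines: Iterable[str]) -> Dict[str, str]:
    """Extract the Basic Statistics table from fastqc_data.txt lines."""
    stats: Dict[str, str] = {}
    in_module = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">>Basic Statistics"):
            in_module = True
            continue
        if not in_module:
            continue
        if line.startswith(">>END_MODULE"):
            break
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "#Measure  Value" header row
        key, _, value = line.partition("\t")
        stats[key] = value
    return stats


# Illustrative input only, shaped like the start of a fastqc_data.txt file.
SAMPLE = (
    "##FastQC\t0.12.1\n"
    ">>Basic Statistics\tpass\n"
    "#Measure\tValue\n"
    "Filename\tsample1.fastq.gz\n"
    "Total Sequences\t2450000\n"
    "Sequences flagged as poor quality\t0\n"
    "Sequence length\t150\n"
    "%GC\t47\n"
    ">>END_MODULE\n"
)

print(parse_basic_statistics(SAMPLE.splitlines()))
```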

Batch Analysis

User: "Analyze all FASTQ files in my project/raw_data/ folder"

AI Response:
FastQC Batch Analysis Complete
========================================

Processed 12 files:

✅ sample01_R1.fastq.gz: 8P/2W/0F
✅ sample01_R2.fastq.gz: 8P/2W/0F
⚠️ sample02_R1.fastq.gz: 7P/2W/1F
⚠️ sample02_R2.fastq.gz: 7P/2W/1F
...

Overall Summary:
• Total PASS: 94
• Total WARN: 24
• Total FAIL: 2

Tip: Run multiqc_report on this directory to generate a combined report!

Complete Workflow

User: "I have a directory of paired-end FASTQ files. Can you run quality control and create a summary report?"

AI: I'll run a complete QC workflow on your paired-end data:

1. First, let me analyze all FASTQ files in batch:
   [runs fastqc_batch on directory]

2. Now I'll generate a MultiQC summary report:
   [runs multiqc_report on results]

✅ Complete QC workflow finished!

Summary:
- 24 FASTQ files processed (12 samples, paired-end)
- Average quality score: 32.5
- 2 samples have adapter contamination warnings
- 1 sample shows high duplication levels
- Interactive HTML report generated: multiqc_report.html

The MultiQC report provides detailed visualizations of:
- Quality score distributions across all samples
- GC content comparison
- Sequence length distributions
- Adapter content analysis
- Sample correlation analysis

🐳 Docker Usage

Build and Run

# Build the image
docker build -t bio-mcp-fastqc .

# Run with data mounting
docker run -v /path/to/data:/data bio-mcp-fastqc

Docker Compose (with Queue System)

services:
  fastqc-server:
    build: .
    volumes:
      - ./data:/data
    environment:
      - BIO_MCP_QUEUE_URL=http://queue-api:8000
    depends_on:
      - queue-api

⚙️ Configuration

Environment Variables

  • BIO_MCP_FASTQC_PATH - Path to FastQC executable (default: "fastqc")
  • BIO_MCP_MULTIQC_PATH - Path to MultiQC executable (default: "multiqc")
  • BIO_MCP_MAX_FILE_SIZE - Maximum file size in bytes (default: 10GB)
  • BIO_MCP_TIMEOUT - Command timeout in seconds (default: 1800)
  • BIO_MCP_TEMP_DIR - Temporary directory for processing
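
One way a server could read these variables with their documented defaults is sketched below. The `FastQCSettings` class and `load_settings` function are hypothetical. Reading "10GB" as 10 * 1024**3 bytes and falling back to the system temporary directory for BIO_MCP_TEMP_DIR are assumptions.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Mapping

# Assumption: the documented "10GB" default means 10 GiB expressed in bytes.
DEFAULT_MAX_FILE_SIZE = 10 * 1024**3


@dataclass(frozen=True)
class FastQCSettings:
    fastqc_path: str
    multiqc_path: str
    max_file_size: int  # bytes
    timeout: int  # seconds
    temp_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> FastQCSettings:
    """Build settings from environment variables, applying documented defaults."""
    return FastQCSettings(
        fastqc_path=env.get("BIO_MCP_FASTQC_PATH", "fastqc"),
        multiqc_path=env.get("BIO_MCP_MULTIQC_PATH", "multiqc"),
        max_file_size=int(env.get("BIO_MCP_MAX_FILE_SIZE", str(DEFAULT_MAX_FILE_SIZE))),
        timeout=int(env.get("BIO_MCP_TIMEOUT", "1800")),
        temp_dir=env.get("BIO_MCP_TEMP_DIR", tempfile.gettempdir()),
    )


# Passing a plain dict instead of os.environ makes the behaviour easy to check.
print(load_settings({"BIO_MCP_TIMEOUT": "600"}))
```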

Queue System Integration

To enable async processing for large datasets:

from src.server_with_queue import FastQCServerWithQueue

server = FastQCServerWithQueue(queue_url="http://localhost:8000")

📊 Output Files

FastQC generates several output files:

  • HTML Report (*_fastqc.html) - Interactive quality report
  • Data File (fastqc_data.txt) - Raw metrics and statistics
  • Summary File (summary.txt) - Pass/warn/fail status for each module
  • Plots - Various quality plots and charts

MultiQC combines these into:

  • MultiQC Report (multiqc_report.html) - Combined interactive report
  • Data Directory (multiqc_data/) - Processed data and statistics
  • General Stats (multiqc_general_stats.txt) - Summary table
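
Of these files, summary.txt is the simplest to consume programmatically: each line holds a status, a module name and the input file name, separated by tabs. A small sketch of reading it into a dictionary, with a hypothetical function name and made-up sample lines:

```python
from typing import Dict, Iterable


def parse_summary(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FastQC module name to its PASS/WARN/FAIL status."""
    statuses: Dict[str, str] = {}
    for raw in lines:
        parts = raw.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # ignore blank or malformed lines
        status, module = parts[0], parts[1]
        statuses[module] = status
    return statuses


# Illustrative content shaped like a summary.txt file.
SAMPLE_SUMMARY = [
    "PASS\tBasic Statistics\tsample1.fastq.gz\n",
    "WARN\tPer sequence quality scores\tsample1.fastq.gz\n",
    "FAIL\tSequence Duplication Levels\tsample1.fastq.gz\n",
]

for module, status in parse_summary(SAMPLE_SUMMARY).items():
    print(f"{module}: {status}")
```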

🔍 Quality Metrics Explained

FastQC analyzes multiple quality aspects:

Key Modules

  • Per base sequence quality - Quality scores across read positions
  • Per sequence quality scores - Distribution of mean quality scores
  • Per base sequence content - A/T/G/C content across positions
  • Per sequence GC content - GC% distribution vs expected
  • Sequence duplication levels - PCR duplication assessment
  • Adapter content - Contaminating adapter sequences

Status Interpretation

  • ✅ PASS - Analysis indicates no problems
  • ⚠️ WARN - Slightly unusual, may not be problematic
  • ❌ FAIL - Likely problematic, requires attention
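
Per-file tallies such as 8P/2W/0F and an overall verdict can be derived from these three statuses. The sketch below assumes the verdict for a file is simply its most severe module status; both helper names are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Higher number means more severe.
SEVERITY = {"PASS": 0, "WARN": 1, "FAIL": 2}


def tally(statuses: Iterable[str]) -> str:
    """Format module statuses as a compact 'xP/yW/zF' string."""
    counts = Counter(status.upper() for status in statuses)
    return f"{counts['PASS']}P/{counts['WARN']}W/{counts['FAIL']}F"


def worst_status(statuses: Iterable[str]) -> str:
    """Return the most severe status, or PASS when there are no modules."""
    return max(
        (status.upper() for status in statuses),
        key=SEVERITY.__getitem__,
        default="PASS",
    )


# Ten modules: eight PASS, one WARN, one FAIL.
modules = ["PASS"] * 8 + ["WARN", "FAIL"]
print(tally(modules), worst_status(modules))  # prints: 8P/1W/1F FAIL
```

An unrecognised status raises a KeyError in `worst_status`, which surfaces malformed input instead of silently treating it as a pass.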

🧬 Integration with Bio-MCP Ecosystem

The FastQC server can be chained with other Bio-MCP tools, for example to compare quality before and after trimming:

User: "Run the complete preprocessing pipeline on my samples"

AI Workflow:
1. fastqc_batch → Initial quality assessment
2. trimmomatic → Trim low-quality bases and adapters  
3. fastqc_batch → Post-trimming QC
4. multiqc_report → Combined before/after report

🤝 Contributing

We welcome contributions! See the Bio-MCP contributing guide.

Development Setup

git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e ".[dev]"
pytest

📄 License

MIT License - see LICENSE file.

🙏 Acknowledgments

  • FastQC by Simon Andrews at Babraham Bioinformatics
  • MultiQC by Phil Ewels and the MultiQC community
  • Bio-MCP project and contributors

Part of the Bio-MCP ecosystem - Making bioinformatics accessible to AI assistants.

For more tools: Bio-MCP Organization
