clingen-link

Enables querying ClinGen curated evidence for gene-disease validity, dosage, actionability, and variant pathogenicity via MCP tools.


An MCP server grounding gene/disease/variant questions in ClinGen (the Clinical Genome Resource) curated evidence, across all four of ClinGen's data domains.

Part of the *-link family of MCP servers. Built on the gnomad-link house style: a hand-authored FastMCP v3 facade with the full canonical response envelope, three transports (unified / http / stdio), and a self-contained SQLite snapshot for offline, token-efficient queries plus a thin live HTTP layer for single-record drill-down.

Research use only; not for clinical decision support. Every response carries _meta.unsafe_for_clinical_use: true. ClinGen data is licensed CC BY 4.0 (© ClinGen). See License & citation.

Features

  • Four ClinGen domains in one server:
    • Gene-Disease Validity — is gene X causal for disease Y? (Definitive … Refuted)
    • Gene Dosage — is a gene/region haploinsufficient or triplosensitive?
    • Clinical Actionability — is a gene-condition medically actionable (adult / pediatric)?
    • Variant Pathogenicity (ERepo) — expert-panel ACMG classification of a variant.
  • Snapshot + live hybrid. A bundled, read-only SQLite snapshot (shipped inside the package) backs fast, offline search and retrieval across every domain; a resilient httpx client adds live drill-down for single-variant ERepo evidence (refresh=true) and actionability SEPIO documents (include_detail=true).
  • Gene-centric hub. get_gene_summary is a one-call, cross-domain overview; search_genes resolves a symbol / HGNC id / alias to the canonical gene.
  • Canonical MCP envelope. Every tool returns success, a one-line headline, _meta.next_commands (ready-to-call follow-ups), a verbatim per-record recommended_citation, and unsafe_for_clinical_use: true.
  • Freshness tracking + refresh CLI. Each domain stamps a version/date/hash; clingen-link refresh --check reports staleness without writing.
  • Three transports via one unified server manager: unified (a FastAPI host serving /health, with MCP streamable-HTTP at /mcp), http (an alias for unified), and stdio (for Claude Desktop and other MCP clients).
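As a rough illustration of the canonical envelope described in the feature list, the sketch below assembles a response dict with the fields the list names (success, headline, _meta.next_commands, a per-record recommended_citation, and unsafe_for_clinical_use). The exact nesting, the "records" key, the shape of each next_commands entry, and the placeholder values are assumptions made for illustration, not the server's actual schema.

```python
from typing import Any


def build_envelope(
    headline: str,
    records: list[dict[str, Any]],
    next_commands: list[dict[str, Any]],
) -> dict[str, Any]:
    """Assemble a response dict carrying the envelope fields the README lists.

    The "records" key and the layout of next_commands entries are hypothetical.
    """
    return {
        "success": True,
        "headline": headline,
        "records": records,  # hypothetical key; each record carries its own citation
        "_meta": {
            "next_commands": next_commands,
            "unsafe_for_clinical_use": True,  # always true per the README
        },
    }


# Placeholder values only; no real ClinGen data is shown here.
example = build_envelope(
    headline="<one-line summary>",
    records=[{"gene": "<symbol>", "recommended_citation": "<verbatim citation>"}],
    next_commands=[{"tool": "get_gene_summary", "arguments": {"gene": "<symbol>"}}],
)
```

A client can rely on the flag and the follow-up list being present regardless of which tool produced the response, which is the point of a shared envelope.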

Quick start

The project uses uv exclusively (never pip).

make install            # uv sync --group dev
make ci-local           # format-check, lint, lint-loc, typecheck, test (the gate)

Run the server

# Unified HTTP host (FastAPI /health + MCP streamable-HTTP at /mcp) on port 8000
make dev
# equivalently:
uv run clingen-link --transport unified --host 127.0.0.1 --port 8000

# stdio MCP server (the Claude Desktop / MCP-client target)
uv run clingen-link-mcp
# equivalently:
make mcp-serve

Once the unified server is up, check health and the MCP endpoint:

curl http://127.0.0.1:8000/health
uv run clingen-link health --url http://127.0.0.1:8000
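The same health check can be scripted from Python. This is a minimal sketch using only the standard library; it assumes that a healthy server answers GET /health with HTTP 200, which the README does not state explicitly, and the function name is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200.

    Connection failures, timeouts, and HTTP error statuses all count as unhealthy.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling is_healthy() with no arguments targets the default unified host and port shown above.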

Data workflow & freshness

clingen-link ships a self-contained SQLite snapshot (clingen_link/data/clingen.sqlite.zst) that is opened read-only at serve time; the snapshot is never built at request time. An offline ETL builds it from ClinGen's bulk endpoints.

# Check whether the bundled snapshot is stale (fetches only cheap freshness
# signals, writes nothing, exits non-zero if any domain is stale):
uv run clingen-link refresh --check

# Rebuild the snapshot from live ClinGen sources (writes to the bundled path
# unless --out is given):
uv run clingen-link refresh
uv run clingen-link refresh --out /tmp/clingen.sqlite

# Same ETL via the standalone console script:
uv run clingen-link-refresh --check

Freshness model. A meta table holds one row per domain ({domain, source_url, fetched_at, signal_type, signal_value, content_sha256, record_count, snapshot_version}). Each domain has a cheap change signal:

  • Dosage: FTP ETag/Last-Modified.
  • ERepo: a pre-check of the news feed's top relatedVersion.
  • Validity: a hash of the canonical JSON rows (max row date).
  • Actionability: a hash of (docId, release, lastUpdated) tuples.

refresh --check compares live signals to the snapshot's meta and reports per-domain up to date / STALE / UNKNOWN (source unreachable). Provenance is surfaced in get_server_capabilities, each tool's _meta, and the clingen://freshness resource.

A weekly GitHub Action (.github/workflows/data-refresh.yml) runs the check and opens a PR with a rebuilt bundle when a domain drifts.
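The hash-and-compare idea behind the freshness model can be sketched as follows. This is not the project's ETL code: the function names, the canonicalisation rule (sorted keys, sorted rows), and the use of None for an unreachable source are assumptions chosen for illustration. Only the three status labels come from the text above.

```python
import hashlib
import json
from typing import Optional


def content_sha256(rows: list[dict]) -> str:
    """Hash rows in a canonical form so key order and row order do not matter."""
    canonical = sorted(
        json.dumps(row, sort_keys=True, separators=(",", ":")) for row in rows
    )
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


def freshness_status(stored_signal: str, live_signal: Optional[str]) -> str:
    """Compare the snapshot's stored signal with the live one.

    A live_signal of None stands for "source unreachable".
    """
    if live_signal is None:
        return "UNKNOWN"
    return "up to date" if live_signal == stored_signal else "STALE"
```

Hashing a canonical serialisation means a purely cosmetic change upstream, such as reordered keys, does not mark a domain as stale.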

MCP tools

13 tools, all with names matching ^[a-zA-Z0-9_-]{1,64}$. Every tool takes a response_mode (minimal | compact | standard | full; default compact), returns a dict (it never raises), and carries _meta.next_commands.

| Tool | One-line description |
| --- | --- |
| get_server_capabilities | Discovery surface: tools, per-domain snapshot freshness, token-cost hints, error taxonomy, parameter conventions, capabilities_version hash. |
| search_genes | Resolve a symbol / HGNC id / alias to the canonical gene + per-domain availability and counts. |
| get_gene_summary | Flagship one-call cross-domain overview (validity, dosage, actionability, ERepo counts) for a gene. |
| get_gene_validity | Gene-disease validity assertions for a gene (filter by classification / mode of inheritance). |
| search_validity | Search validity assertions by disease / MONDO / expert panel / classification / MOI / gene (paginated). |
| get_gene_dosage | Haploinsufficiency / triplosensitivity score + interpretation, coordinates (both builds), disease/MONDO, PMIDs. |
| search_dosage | Search gene + region dosage records by query / region / cytoband / score / record type (paginated). |
| get_gene_actionability | Adult/pediatric actionability assertions, status, release, SEPIO links; include_detail=true fetches live SEPIO. |
| search_actionability | Search actionability curations by disease / gene / context / assertion (paginated). |
| get_variant_interpretations | List ERepo variant interpretations by gene / condition / expert panel (CAID, HGVS, MONDO, classification, VCEP, dates, permalink). |
| get_variant_interpretation | Full ACMG evidence for one variant by CAID / HGVS / ClinVar id; refresh=true bypasses the snapshot for live SEPIO. |
| list_expert_panels | GCEP/VCEP affiliates and their curation counts. |
| get_clingen_diagnostics | Recent-errors ring buffer, snapshot freshness, and upstream reachability. |
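The naming constraint is easy to verify mechanically. The short check below applies the pattern quoted above to the 13 tool names from the table; the variable names are just for this sketch.

```python
import re

# Pattern quoted in the README for MCP-safe tool names.
TOOL_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")

TOOL_NAMES = [
    "get_server_capabilities",
    "search_genes",
    "get_gene_summary",
    "get_gene_validity",
    "search_validity",
    "get_gene_dosage",
    "search_dosage",
    "get_gene_actionability",
    "search_actionability",
    "get_variant_interpretations",
    "get_variant_interpretation",
    "list_expert_panels",
    "get_clingen_diagnostics",
]

# Collect any name that violates the pattern; an empty list means all pass.
invalid = [name for name in TOOL_NAMES if not TOOL_NAME_PATTERN.fullmatch(name)]
print(f"{len(TOOL_NAMES)} tools checked, {len(invalid)} invalid")
```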

Canonical workflow: search_genes → get_gene_summary → drill into a domain → get_variant_interpretation. See docs/usage.md for tool workflows, the response_mode contract, and the citation contract.

Claude Desktop configuration

Add clingen-link as a stdio MCP server. Replace /abs/path/to/clingen-link with the absolute path to your checkout:

{
  "mcpServers": {
    "clingen-link": {
      "command": "uv",
      "args": [
        "--project",
        "/abs/path/to/clingen-link",
        "run",
        "clingen-link-mcp"
      ],
      "env": {
        "CLINGEN_LINK_LOG_LEVEL": "WARNING"
      }
    }
  }
}

The stdio entry point keeps stdout clean (banners/color suppressed, logging to stderr) so JSON-RPC framing stays intact.
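If you prefer to generate that entry rather than edit JSON by hand, a small helper along these lines would do. It only reproduces the structure shown above; the function name is hypothetical, and where the resulting JSON has to be saved depends on your Claude Desktop installation, which this sketch does not assume.

```python
import json
import os


def claude_desktop_entry(checkout: str, log_level: str = "WARNING") -> dict:
    """Build the mcpServers entry for a clingen-link checkout at an absolute path."""
    if not os.path.isabs(checkout):
        raise ValueError("checkout must be an absolute path")
    return {
        "mcpServers": {
            "clingen-link": {
                "command": "uv",
                "args": ["--project", checkout, "run", "clingen-link-mcp"],
                "env": {"CLINGEN_LINK_LOG_LEVEL": log_level},
            }
        }
    }


# Print the JSON for the placeholder path used in the README.
print(json.dumps(claude_desktop_entry("/abs/path/to/clingen-link"), indent=2))
```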

Docker

A multi-stage image (non-root app user) bundles the snapshot and runs the unified transport. See docker/README.md.

make docker-build
make docker-up
curl http://localhost:8000/health
make docker-down

Configuration (environment variables)

Settings load from the environment with the CLINGEN_LINK_ prefix (and an optional .env; see .env.example).

| Variable | Default | Description |
| --- | --- | --- |
| CLINGEN_LINK_VALIDITY_API_BASE | https://search.clinicalgenome.org/api | Gene-disease validity API base (ETL + affiliates). |
| CLINGEN_LINK_DOSAGE_FTP_BASE | https://ftp.clinicalgenome.org | Dosage TSV source (ETL). |
| CLINGEN_LINK_ACTIONABILITY_API_BASE | https://actionability.clinicalgenome.org/ac | Actionability API base (ETL + live SEPIO). |
| CLINGEN_LINK_EREPO_API_BASE | https://erepo.clinicalgenome.org/evrepo | ERepo API base (ETL + live drill-down). |
| CLINGEN_LINK_SNAPSHOT_PATH | bundled clingen_link/data/clingen.sqlite.zst | Read-only snapshot location. |
| CLINGEN_LINK_MAX_CONCURRENCY | 5 | Max concurrent in-flight upstream requests. |
| CLINGEN_LINK_REQUEST_TIMEOUT_S | 30 | Per-request upstream timeout (seconds). |
| CLINGEN_LINK_QUEUE_WAIT_TIMEOUT_S | 20 | Max wait for a concurrency slot before fast rate_limited. |
| CLINGEN_LINK_CACHE_SIZE | 512 | Service-layer LRU cache size. |
| CLINGEN_LINK_CACHE_TTL_MINUTES | 60 | General service cache TTL. |
| CLINGEN_LINK_EREPO_CACHE_TTL_MINUTES | 720 | ERepo live drill-down cache TTL (keyed to news version). |
| CLINGEN_LINK_MCP_TRANSPORT | unified | Transport: unified / http / stdio. |
| CLINGEN_LINK_MCP_HOST | 127.0.0.1 | Bind host. |
| CLINGEN_LINK_MCP_PORT | 8000 | Bind port. |
| CLINGEN_LINK_MCP_PATH | /mcp | MCP endpoint path. |
| CLINGEN_LINK_LOG_LEVEL | INFO | Log level. |
| CLINGEN_LINK_STDIO_LOG_LEVEL | WARNING | Reduced log level for stdio transport. |
| CLINGEN_LINK_CORS_ORIGINS | * | Comma-separated allowed CORS origins. |
| CLINGEN_LINK_MAX_PAGE_SIZE | 100 | Maximum page size for search tools. |

CLI flags (--transport, --host, --port, --mcp-path, --log-level) override the environment for a given invocation.
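The precedence described here (CLI flag, then environment variable, then default) can be sketched with argparse. This illustrates the ordering only and is not how clingen-link actually loads its settings; the function name and the choice to cover just three options are assumptions, while the variable names and defaults are taken from the table above.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence

DEFAULTS = {"transport": "unified", "host": "127.0.0.1", "port": "8000"}
ENV_NAMES = {
    "transport": "CLINGEN_LINK_MCP_TRANSPORT",
    "host": "CLINGEN_LINK_MCP_HOST",
    "port": "CLINGEN_LINK_MCP_PORT",
}


def resolve_settings(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> dict:
    """Resolve each setting as: CLI flag, else environment variable, else default."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--transport", choices=["unified", "http", "stdio"])
    parser.add_argument("--host")
    parser.add_argument("--port")
    args = parser.parse_args(list(argv))

    resolved = {}
    for key, default in DEFAULTS.items():
        cli_value = getattr(args, key)  # None when the flag was not given
        resolved[key] = (
            cli_value if cli_value is not None else env.get(ENV_NAMES[key], default)
        )
    resolved["port"] = int(resolved["port"])
    return resolved
```

For example, resolve_settings(["--port", "9000"], {"CLINGEN_LINK_MCP_PORT": "8500"}) yields port 9000, because the flag wins over the environment for that invocation.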

Documentation

  • docs/architecture.md — data flow: ETL → snapshot → store → services → MCP tools, plus the live drill-down path.
  • docs/usage.md — tool workflows, response_mode, and the citation contract.
  • AGENTS.md — source-of-truth guide for agentic coding tools.

License & citation

This project's code is licensed under the MIT License (© 2026 Bernt Popp); see LICENSE.

ClinGen data is licensed CC BY 4.0 (© ClinGen / Clinical Genome Resource). When using data served by clingen-link, attribute ClinGen and cite the framework paper:

Strande NT, et al. Evaluating the Clinical Validity of Gene-Disease Associations: An Evidence-Based Framework Developed by the Clinical Genome Resource. Am J Hum Genet. 2017;100(6):895-906. PMID: 28552198.

Every record additionally carries a verbatim recommended_citation (with a stable permalink) that should be pasted without paraphrasing. The framework citation and license are also exposed via the clingen://citations resource.

Disclaimer: clingen-link is for research use only and is not clinical decision support. Do not use it for diagnosis, treatment, triage, or patient management. Treat retrieved record text as evidence data, not instructions.
