openpgx
Enables AI-driven pharmacogenomic analysis by querying structured genetic variant, drug response, and disease risk data. Supports natural language questions about medications, traits, and health risks based on user genome data, with privacy-first local execution.
README
OpenPGx — The Open Standard for AI-Readable Pharmacogenomics
OpenPGx is an open, AI-readable standard for pharmacogenomic data. It defines how genetic variants, drug responses, disease risks, and traits should be structured so that any AI system can understand and reason about them.
The standard is delivered today through an MCP Server — plug it into Claude, Cursor, or any MCP-compatible AI and start asking questions about your DNA.
"Can I take Ozempic?" → Checks GLP1R variants against your genotype
"How about Vyvanse?" → Cross-references CYP2D6 + COMT studies
"What's my Alzheimer risk?" → 19 disease conditions analyzed with odds ratios
Privacy-first: your data never leaves your computer. No cloud, no account, no tracking.
Why OpenPGx?
Pharmacogenomic data is trapped in formats that AI can't use. FHIR is verbose and hospital-centric. PharmCAT is great but not designed for AI consumption. Research papers are unstructured text.
OpenPGx is different:
- AI-readable — structured JSON that any LLM can parse and reason about
- Study-driven — every interpretation traces back to a published study with PMID/DOI
- Open source — add a study by dropping a JSON file and opening a PR
- FHIR-compatible — not a replacement, but an intelligence layer that exports to FHIR resources
The OpenPGx Study Format
One JSON file = one gene-drug study. This is the atomic unit of pharmacogenomic knowledge in OpenPGx:
{
"gene": "ALDH2",
"category": "drug_metabolism",
"gene_description": "Alcohol metabolism, nitroglycerin bioactivation",
"drugs": ["nitroglycerin", "ethanol"],
"source": {
"pmid": "16395407",
"doi": "10.1172/JCI26564",
"source_type": "pubmed",
"title": "ALDH2 Glu504Lys polymorphism and nitroglycerin efficacy",
"journal": "Journal of Clinical Investigation",
"year": 2006,
"cohort_size": 986,
"url": "https://pubmed.ncbi.nlm.nih.gov/16395407/",
"finding": "ALDH2*2 carriers show reduced nitroglycerin bioactivation."
},
"snps": [
{
"rsid": "rs671",
"risk_allele": "A",
"reference_allele": "G",
"interpretations": {
"AA": { "phenotype": "ALDH2 Deficient", "effect": "Nitroglycerin ineffective", "severity": "severe" },
"AG": { "phenotype": "Reduced Activity", "effect": "33-40% less efficacy", "severity": "moderate" },
"GG": { "phenotype": "Normal", "effect": "Standard response", "severity": "info" }
}
}
],
"evidence_level": "established"
}
Want to contribute a new drug-gene study? Create a file like this in data/pgx/studies/ and open a PR. No code changes needed.
Install in 30 seconds
Claude Desktop
Add to your claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/claude/claude_desktop_config.json
{
"mcpServers": {
"openpgx": {
"command": "npx",
"args": ["-y", "openpgx"]
}
}
}
Restart Claude Desktop. Done.
Claude Code (CLI)
claude mcp add openpgx -- npx -y openpgx
Cursor
Open Settings > MCP > Add new MCP Server:
{
"mcpServers": {
"openpgx": {
"command": "npx",
"args": ["-y", "openpgx"]
}
}
}
Windsurf / Cline / Any MCP client
Same configuration. OpenPGx uses stdio transport — any MCP-compatible client works:
npx openpgx
Remote server (no install)
Don't want to install anything? Use the hosted server directly:
{
"mcpServers": {
"openpgx": {
"type": "streamable-http",
"url": "https://mcp.openpgx.ai/mcp"
}
}
}
This connects to our remote MCP server via Streamable HTTP — same 9 tools, zero local setup. Your genome is parsed server-side and stored temporarily in memory (30-minute session TTL).
Privacy note: For maximum privacy, prefer the
npxinstall — everything stays on your machine. The remote server processes your data but does not store it permanently.
What can you ask?
Medications (Pharmacogenomics)
"Upload my genome" → parse your raw DNA file
"Can I take Ozempic?" → check semaglutide + GLP1R
"What about Venvanse?" → brand names work (60+ brands mapped)
"Is modafinil right for me?" → checks COMT + CYP2C19 interactions
"Compare sertraline vs escitalopram" → head-to-head comparison
"Weight loss medications" → search by category
"antidepressivo" → Portuguese works too
"ozmpic" → typos are auto-corrected
Disease Risks
"What's my cancer risk?" → check specific conditions
"Full risk report" → all 19 conditions analyzed
"Do I have the Alzheimer gene?" → APOE status
"Am I at risk for blood clots?" → Factor V Leiden check
Traits
"Trait report" → all 25+ traits
"Am I lactose intolerant?" → lactose persistence check
"Am I a morning person?" → chronotype analysis
Supplements
"Supplement protocol" → MTHFR, COMT, VDR, BCMO1, FUT2, CBS
"Should I take methylfolate?" → based on your MTHFR status
What's in the knowledge base
118 studies · 109 genes · 219 drugs · 19 disease risks · 31 traits — all backed by published research with PMID/DOI.
Genes with study data (109)
Every gene is backed by at least one published study with interpretations per genotype. The authoritative list is the set of gene fields in data/pgx/studies/*.json; the table below groups major clinical themes (some genes appear in more than one theme in the data).
| Category | Genes (representative) |
|---|---|
| Drug Metabolism | CYP2D6, CYP2C19, CYP2C9, CYP2B6, CYP3A4, CYP3A5, CYP1A2, CYP2E1, ALDH2, BCHE, CES1, FAAH, NAT2, UGT1A1, UGT1A4, UGT1A9, GSTP1, ACE, ADRB1, ADRB2, AGTR1, HMGCR, PCSK9 |
| Drug Targets & Response | VKORC1, DRD2, DRD3, DRD4, HTR1A, HTR2A, HTR2C, GABRA2, GABRA6, GLP1R, GRIK4, SCN1A, SCN9A, MC1R, RYR1, TPMT, NUDT15, DPYD, FKBP5, ESR1, SHBG, COMT, OPRM1, TCF7L2, CLOCK |
| ADHD / Stimulants | ADRA2A, SLC6A2, SLC6A3, SLC6A4, SNAP25 |
| Drug Transport | SLCO1B1, ABCB1, ABCG2, SLC22A1, SLC22A2 |
| Immune / HLA | HLA-B, HLA-A, HLA-C, HLA-DQB1, IL23R |
| GLP-1 / Incretin | GLP1R, GIPR, MC4R, PCSK1 |
| Methylation & Vitamins | MTHFR, COMT, VDR, BCMO1, FUT2, CBS, DHCR7, SLC23A1, GC, TCN2, NBPF3 |
| Lipid Metabolism | PNPLA3, TM6SF2, APOE, APOA5, APOB, LDLR, LIPC, LPL, SORT1, PPARG |
| Energy Balance / Obesity | FTO, MC4R, GHSR, BDNF, PCSK1, TMEM18, FABP2, LYPLAL1 |
| Glucose / Insulin | GCKR, SLC2A2, PPM1K, MTNR1B, IL6, C11ORF65 |
| Circadian Rhythm | PER2, CRY2, NR1D1, CLOCK |
| Coagulation | F2, F5 |
| Cannabinoid / Related | CNR1, AKT1, CYP2C9 |
| Salt Sensitivity | GRK4, CLCNKA |
127 Medications & Compounds
<details> <summary>Full list of supported drugs (click to expand)</summary>
| Therapeutic Area | Medications |
|---|---|
| Cardiology & Anticoagulation | warfarin, clopidogrel, rivaroxaban, apixaban, dabigatran, enoxaparin, heparin, nitroglycerin, isosorbide dinitrate, digoxin, amlodipine, losartan, hydrochlorothiazide, chlorthalidone, furosemide |
| Statins & Lipid-Lowering | simvastatin, atorvastatin, rosuvastatin, pravastatin, ezetimibe, fenofibrate, gemfibrozil, niacin, evolocumab, alirocumab, mipomersen |
| Psychiatry & Neurology | clozapine, olanzapine, risperidone, haloperidol, aripiprazole, quetiapine, fluoxetine, paroxetine, escitalopram, citalopram, venlafaxine, bupropion, carbamazepine, oxcarbazepine, eslicarbazepine, phenytoin, donepezil, memantine, lecanemab |
| ADHD & Wakefulness | modafinil, armodafinil, methylphenidate, lisdexamfetamine, amphetamine, atomoxetine, caffeine |
| Pain & Opioids | codeine, tramadol, fentanyl, morphine, methadone |
| Oncology | capecitabine, fluorouracil, cisplatin, carboplatin, oxaliplatin, paclitaxel, topotecan, methotrexate, tamoxifen |
| Immunosuppressants | tacrolimus, azathioprine, mercaptopurine, thioguanine, sulfasalazine |
| Metabolic & GLP-1 | semaglutide, liraglutide, tirzepatide, dulaglutide, setmelanotide, metformin, pioglitazone, rosiglitazone, insulin, orlistat, empagliflozin, dapagliflozin |
| Infectious Disease | abacavir, efavirenz |
| Supplements & Vitamins | omega-3/fish oil/EPA/DHA, vitamin D (cholecalciferol, ergocalciferol, calcitriol), vitamin C (ascorbic acid), folic acid, methylfolate, methylcobalamin, beta-carotene, retinol, resveratrol, nicotinamide riboside, NMN, melatonin, pyridoxine |
| Other | allopurinol, febuxostat, theophylline, tizanidine, cannabidiol, ethanol, disulfiram, celecoxib, ibuprofen, naproxen |
</details>
19 Disease Risk Conditions
| Category | Conditions |
|---|---|
| Oncology | Prostate Cancer, Breast Cancer (BRCA), Colorectal Cancer, Melanoma |
| Cardiovascular | Coronary Artery Disease, Atrial Fibrillation |
| Neurological | Alzheimer's Disease, Parkinson's Disease |
| Metabolic | Type 2 Diabetes, Hereditary Hemochromatosis, Gout |
| Autoimmune | Celiac Disease, Psoriasis, Rheumatoid Arthritis, Lupus (SLE) |
| Hematological | Venous Thromboembolism (Factor V Leiden) |
| Musculoskeletal | Osteoporosis |
| Respiratory | Asthma |
| Ophthalmological | Age-Related Macular Degeneration |
31 Traits
Caffeine metabolism, alcohol flush, lactose tolerance, muscle composition, chronotype, eye color, and more.
Clinical Conditions Covered by Gene Studies
Beyond disease risk SNPs, the gene studies provide pharmacogenomic guidance across these clinical areas:
| Area | What OpenPGx covers |
|---|---|
| Blood Clotting | Factor V Leiden (F5), Prothrombin mutation (F2), warfarin sensitivity (VKORC1, CYP2C9) |
| Drug Hypersensitivity | HLA-B57:01 → abacavir, HLA-A31:01 → carbamazepine DRESS, HLA-B*15:02 → SJS/TEN |
| Statin Myopathy | SLCO1B1 poor transport → simvastatin muscle toxicity |
| Chemotherapy Toxicity | DPYD deficiency → fluorouracil/capecitabine, TPMT/NUDT15 → thiopurines, GSTP1 → platinum agents |
| Opioid Response | CYP2D6 poor/ultrarapid → codeine, tramadol, fentanyl dosing |
| Antipsychotic Side Effects | HTR2C → weight gain risk, DRD2 → efficacy, CYP2D6 → metabolism |
| Obesity & Weight Loss | FTO, MC4R, BDNF, PCSK1, GHSR, CLOCK, SIRT1, LYPLAL1 — 9 genes affecting appetite, metabolism, fat distribution |
| Cardiovascular Lipids | APOE, APOA5, APOB, LDLR, LIPC, LPL, SORT1, PNPLA3, TM6SF2 — LDL, HDL, triglycerides, fatty liver |
| Diabetes & Glucose | GCKR, SLC2A2, MTNR1B, PPM1K, PPARG, IL6 — insulin secretion, glucose sensing, BCAA metabolism |
| Salt Sensitivity & Hypertension | GRK4, CLCNKA — sodium handling, diuretic response |
| Circadian & Sleep | PER2, CRY2, NR1D1, CLOCK — chronotype, shift work risk, melatonin response |
| Vitamin Metabolism | MTHFR → folate, DHCR7 → vitamin D synthesis, BCMO1 → beta-carotene, VDR → vitamin D receptor, SLC23A1 → vitamin C, FUT2 → B12 |
| Omega-3 & Fat Absorption | FADS1 → DHA/EPA conversion, FABP2 → dietary fat absorption |
| GLP-1 Drug Response | GLP1R, GIPR → semaglutide/tirzepatide efficacy prediction |
9 MCP Tools
| Tool | Description |
|---|---|
upload_genome |
Parse raw DNA data (23andMe, Genera) |
check_medication |
Smart drug lookup — brand names, generics, typos, categories |
full_pgx_report |
Complete pharmacogenomic report |
supplement_protocol |
Supplement optimization based on gene variants |
compare_medications |
Head-to-head drug comparison |
check_risk |
Check genetic risk for a specific disease |
full_risk_report |
Comprehensive disease risk report |
trait_report |
All genetic traits analysis |
full_report |
Everything combined: medications + risks + traits |
Contributing studies
OpenPGx is designed for open source contribution. Adding a new drug-gene study requires zero code changes — just a JSON file.
How to contribute
- Find a pharmacogenomic study (PubMed, CPIC guidelines, PharmGKB)
- Create a JSON file in
data/pgx/studies/following the naming pattern:{gene}_{year}_{slug}.json - Open a PR
The system automatically:
- Creates the gene if it doesn't exist in the catalog
- Registers all rsIDs for DNA parsing
- Builds the drug-to-gene index
- Makes the interpretations available in all reports
Study file template
{
"gene": "GENE_SYMBOL",
"category": "drug_metabolism",
"gene_description": "What this gene does",
"drugs": ["generic_drug_name"],
"source": {
"pmid": "12345678",
"doi": "10.xxxx/xxxxx",
"source_type": "pubmed",
"title": "Study title",
"journal": "Journal name",
"year": 2024,
"cohort_size": 1000,
"url": "https://pubmed.ncbi.nlm.nih.gov/12345678/",
"finding": "One-line summary of the key finding"
},
"snps": [
{
"rsid": "rs12345",
"risk_allele": "A",
"reference_allele": "G",
"interpretations": {
"AA": { "phenotype": "Poor Metabolizer", "effect": "...", "recommendation": "...", "severity": "severe" },
"AG": { "phenotype": "Intermediate", "effect": "...", "recommendation": "...", "severity": "moderate" },
"GG": { "phenotype": "Normal", "effect": "...", "recommendation": "...", "severity": "info" }
}
}
],
"evidence_level": "established"
}
Full schema: openpgx.schema.json (see $defs/study_contribution)
The OpenPGx Output Standard (v0.4.0)
Patient-facing OpenPGx files are a single JSON object. The only required top-level field is openpgx_version (must be "0.4.0"). Everything else follows openpgx.schema.json.
Top-level structure:
| Field | Required | Role |
|---|---|---|
openpgx_version |
yes | Specification version; const 0.4.0 |
metadata |
no | generated_at, generator, sources; optional last_updated |
provenance |
no | Audit trail: version, previous_version_hash, changelog[] (date, reason, optional description, affected_sections) |
patient |
no | Profile only (no genotypes): id, raw_data_source, raw_data_format, extraction_date, ancestry |
observations |
no | Raw measurements: each item has gene, rsid, genotype; optional chromosome, position, diplotype, activity_score |
medications |
no | Per-drug blocks: drug (name, class, optional brand_names, atc_code, drugbank_id), pgx_associations[], optional interactions[], optional dosing, plus parsed_at, parse_source, confidence (score, evidence_level, …) |
risks |
no | Disease risk: condition, category (enum incl. oncology, cardiovascular, …), overall_risk, risk_snps[], evidence, actionable, recommendation, studies; optional icd10, polygenic_score, lifetime_risk |
traits |
no | trait, category, snps[], your_phenotype, description, evidence, practical_advice, studies |
fhir_mapping |
no | Hints for FHIR: patient_reference, service_request_id, resource_mappings, terminology_systems |
Minimal valid skeleton (only the required field):
{
"openpgx_version": "0.4.0"
}
Example with the main optional sections (names shortened; see schema for full $defs):
{
"openpgx_version": "0.4.0",
"metadata": {
"generated_at": "2026-04-12T12:00:00Z",
"generator": "openpgx-mcp/1.x",
"sources": ["CPIC", "PharmGKB"]
},
"patient": {
"raw_data_source": "23andMe",
"ancestry": "european"
},
"observations": [
{
"gene": "CYP2C19",
"rsid": "rs4244285",
"genotype": "AG"
}
],
"medications": [
{
"drug": { "name": "clopidogrel", "class": "antiplatelet" },
"pgx_associations": [
{
"gene": "CYP2C19",
"rsid": "rs4244285",
"effect": "Reduced active metabolite formation",
"evidence": { "level": "established", "sources": [] },
"clinical_recommendation": "Consider alternative antiplatelet per guideline"
}
],
"parsed_at": "2026-04-12T12:00:00Z",
"parse_source": "cpic_guideline",
"confidence": { "score": 0.9, "evidence_level": "established" }
}
],
"fhir_mapping": {
"resource_mappings": {
"observations": "Observation (code: LOINC 81247-9 'Master HL7 genetic variant reporting panel')",
"medications": "DiagnosticReport (code: LOINC 51969-4 'Genetic analysis report') + MolecularSequence",
"risks": "RiskAssessment (method: LOINC 75321-0 'Clinical genomics report')",
"traits": "Observation (category: genomics)"
}
}
}
Full schema: openpgx.schema.json
Architecture
Raw DNA file (23andMe .txt / Genera .csv)
|
v
+------------------+
| Parser | --> Extracts relevant SNPs from 600K+ raw data
| (local, private) |
+------------------+
|
v
+------------------+
| Study Catalog | --> 118 studies x 109 genes x 219 drugs
| (data/pgx/ | Auto-creates gene definitions
| studies/*.json) | Builds drug-gene index at runtime
+------------------+
|
v
+------------------+
| MCP Tools | --> 9 tools for AI interaction
| (server-core.ts) | Falls back to AI web search when needed
+------------------+
|
v
AI Assistant (Claude, Cursor, ChatGPT, etc.)
All processing happens locally. No data is sent to any server.
Supported raw data formats
- 23andMe (.txt) — fully supported
- Genera (.csv) — fully supported
- AncestryDNA — coming soon
- VCF — coming soon
Smart Drug Resolution
OpenPGx resolves drug names through 6 layers:
- Brand > Generic — "Ozempic" > semaglutide (60+ brands including Venvanse, Rivotril, Marevan, Provigil, Stavigile)
- Generic exact match — "semaglutide" > found
- Fuzzy brand match — "ozmpic" > Ozempic > semaglutide
- Fuzzy generic match — "sertralina" > sertraline
- Category search — "weight loss", "antidepressant", "antidepressivo" > list of drugs
- Semantic search — pre-computed TF-IDF embeddings (289KB)
Development
git clone https://github.com/open-pgx/open-pgx.git
cd open-pgx/mcp-server
npm install
npm run build
npm start
Project structure
mcp-server/
├── src/
│ ├── index.ts # Entry point (stdio transport)
│ ├── server-core.ts # MCP tools (11 tools)
│ ├── pgx-catalog.ts # Study loader, gene index, phenotype inference
│ ├── parsers.ts # 23andMe & Genera raw data parsers
│ ├── pharmacogenes.ts # Barrel re-exports
│ ├── drug-resolver.ts # 6-layer smart drug resolution
│ ├── risk-catalog.ts # Disease risk definitions
│ ├── trait-catalog.ts # Trait definitions
│ └── types.ts # Shared TypeScript interfaces
├── data/
│ ├── pgx/studies/ # 118 study files (the knowledge base)
│ ├── risks/ # Disease risk condition definitions
│ ├── traits/ # Trait definitions
│ ├── drugs_embeddings.json # TF-IDF vectors for semantic search
│ └── tfidf_vocab.json # TF-IDF vocabulary
├── genomes/ # Safe gitignored dir for your DNA files
└── package.json
Privacy & Security
- Your genetic data never leaves your computer
- No cloud upload, no account, no tracking
- Pre-commit hook prevents accidentally committing DNA files
.gitignoreblocks all genetic file patterns- The code is open source — audit it yourself
License
- Specification (OpenPGx format): Apache License 2.0
- Code (MCP server, parsers): MIT License
Disclaimer
OpenPGx is for educational and research purposes only. It does not replace professional medical advice. Always consult your physician before making decisions about medications or supplements. Genetic risk is probabilistic, not deterministic. Evidence levels and source publications are provided for every association.
OpenPGx — The open standard for AI-readable pharmacogenomics.
openpgx.ai · GitHub · npm
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.