MedSci Agent
An open source biomedical research agent that provides LLMs with 28 tools for drug discovery, protein analysis, literature search, medical imaging, omics, and sandbox execution, all running locally via Ollama and MedGemma.
<p align="center"> <img src="docs/logo.png" alt="MedSci Agent" width="400" /> </p>
<p align="center">The open source biomedical research agent.</p>
<p align="center"> <a href="https://modelcontextprotocol.io"><img alt="MCP Server" src="https://badge.mcpx.dev?type=server&features=tools" /></a> <a href="https://huggingface.co/google/medgemma-4b-it"><img alt="MedGemma" src="https://img.shields.io/badge/MedGemma-4B-4285F4?style=flat&logo=google&logoColor=white" /></a> <a href="https://huggingface.co/google/txgemma-2b-predict"><img alt="TxGemma" src="https://img.shields.io/badge/TxGemma-2B--predict-4285F4?style=flat&logo=google&logoColor=white" /></a> <a href="https://ollama.com"><img alt="Ollama" src="https://img.shields.io/badge/Ollama-local%20inference-ffffff?style=flat&logo=ollama&logoColor=000000" /></a> <a href="https://opencode.ai"><img alt="OpenCode" src="https://img.shields.io/badge/OpenCode-compatible-000000?style=flat" /></a> <a href="https://bun.sh"><img alt="Bun" src="https://img.shields.io/badge/Bun-%3E%3D1.1-fbf0df?style=flat&logo=bun&logoColor=000000" /></a> <a href="LICENSE"><img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-yellow?style=flat" /></a> </p>
MedSci Agent gives any LLM access to 28 biomedical and execution tools: drug ADMET prediction, protein structure search, single-cell RNA-seq analysis, medical image interpretation, literature search, policy-controlled document acquisition, and isolated sandbox execution. Interpretation and ADMET prediction run on MedGemma and TxGemma locally via Ollama, with OpenCode as the client. MedGemma and TxGemma inference stays on your machine.
Built for the MedGemma Impact Challenge.
Quick Start
Once setup is complete and Ollama is running, open the project in OpenCode and try:
Drug discovery:
Analyze the drug-likeness of ibuprofen (CC(C)Cc1ccc(cc1)C(C)C(=O)O) and predict its ADMET properties.

Single-cell omics:
Read my H5AD file, preprocess it, cluster with Leiden at resolution 0.5, and run differential expression.

Literature search:
Search PubMed for recent papers on CRISPR-Cas9 gene therapy for sickle cell disease.

The agent selects the right tools, calls MedGemma for interpretation, and returns a synthesized answer.
Architecture
You bring your own LLM. Configure any model in OpenCode (via /model) and it becomes the orchestrator — it reads your query, selects the right tools, calls them through MCP, and synthesizes the results. The MCP servers handle the domain logic underneath.
Cloud LLM (user's choice via OpenCode)
|
| tool calls via MCP
v
+--- MCP Servers (Bun / TypeScript) ---+
| |
| server-drug 5 tools |
| server-protein 5 tools |
| server-literature 4 tools |
| server-acquisition 2 tools |
| server-imaging 1 tool |
| server-omics 5 tools |
| server-paperqa 1 tool |
| server-sandbox 5 tools |
| |
+-------+-------------------+-----------+
| |
v v
Ollama (local) Python Sidecar
- MedGemma 4B - RDKit
- TxGemma 2B - BioPython
- Scanpy
MedGemma interprets tool outputs — it reads raw data from APIs and computational tools, then provides clinically relevant summaries. Every tool that calls MedGemma returns a model_used flag and degrades gracefully if the model is unavailable.
TxGemma predicts ADMET properties (absorption, distribution, metabolism, excretion, toxicity). It runs exact prompt templates from the Therapeutics Data Commons and outputs binary classifications for six safety endpoints.
The Python sidecar is a long-running process that pre-imports scientific libraries and handles requests over stdin/stdout via JSON-RPC. This avoids the 2–5 second startup cost of importing RDKit or Scanpy on every call.
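
The request loop of a sidecar like this can be sketched in a few lines. This is a minimal illustration and not the project's `sidecar.py`: the `ping` method, the `HANDLERS` registry, and the one-JSON-object-per-line framing are all assumptions made for the example.

```python
import json
import sys
from typing import Any, Callable, Dict, Optional

# Hypothetical handler registry. A real sidecar would import heavy libraries
# (RDKit, Scanpy, ...) once at startup and register handlers that use them.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "ping": lambda params: {"ok": True, "echo": params},
}


def _error(request_id: Optional[Any], code: int, message: str) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


def handle_line(line: str) -> str:
    """Turn one JSON-RPC 2.0 request line into one response line."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError as exc:
        return json.dumps(_error(None, -32700, f"Parse error: {exc}"))
    if not isinstance(request, dict) or "method" not in request:
        return json.dumps(_error(None, -32600, "Invalid request"))
    handler = HANDLERS.get(request["method"])
    if handler is None:
        return json.dumps(_error(request.get("id"), -32601, "Method not found"))
    result = handler(request.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})


def serve() -> None:
    """Answer requests from stdin until it closes; imports are paid for only once."""
    for line in sys.stdin:
        if line.strip():
            print(handle_line(line), flush=True)
```

Calling `serve()` at the end of the script keeps the process alive between requests, which is what makes the one-time import cost worthwhile.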
synthesize parameter: Every drug, protein, and omics tool accepts an optional "synthesize": false input field. When false, the tool skips the MedGemma interpretation step and returns raw data immediately. This is useful when you want fast data-only results, or when Ollama is unavailable. Default is true (synthesis enabled).
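
The interaction between `synthesize`, `model_used`, and graceful degradation can be sketched as below. Only the `synthesize` input and the `model_used` flag come from this README; the `run_tool` helper, the `interpret` callback, and the `data` / `interpretation` keys are hypothetical names chosen for the illustration.

```python
from typing import Any, Callable, Dict, Optional


def run_tool(
    raw: Dict[str, Any],
    synthesize: bool = True,
    interpret: Optional[Callable[[Dict[str, Any]], str]] = None,
) -> Dict[str, Any]:
    """Return raw data, plus an interpretation when synthesis is on and the model answers."""
    result: Dict[str, Any] = {"data": raw, "model_used": False}
    if not synthesize or interpret is None:
        return result  # data-only path: no model call at all
    try:
        result["interpretation"] = interpret(raw)
        result["model_used"] = True
    except Exception:
        # Model unreachable or failing: fall back to raw data instead of failing the tool.
        pass
    return result
```

With this shape, a caller can always rely on `data` being present and check `model_used` to learn whether an interpretation was attached.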
Tools
Drug Discovery (server-drug)
| Tool | Description | Backend |
|---|---|---|
| `analyze_molecule` | Physicochemical properties from SMILES (MW, LogP, TPSA, HBD/HBA, rings, formula) | RDKit + MedGemma |
| `lipinski_filter` | Lipinski Rule of Five drug-likeness check | RDKit |
| `molecular_similarity` | Tanimoto similarity between two molecules using Morgan fingerprints | RDKit |
| `predict_admet` | BBB penetration, intestinal absorption, hERG blocking, CYP3A4 inhibition, Ames mutagenicity, DILI risk | TxGemma + RDKit + MedGemma |
| `search_chembl` | Search ChEMBL for bioactive molecules and targets | ChEMBL API + MedGemma |
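
For readers unfamiliar with the two checks named in the table, here is a dependency-free sketch of the Rule of Five and of Tanimoto similarity. It is not the server's code: the real tools compute descriptors and Morgan fingerprints with RDKit, whereas this example takes precomputed descriptor values and plain sets of "on" bit indices. The `Descriptors` class and both function names are hypothetical, and the ibuprofen values in the usage note are approximate.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Descriptors:
    mol_weight: float       # molecular weight in daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> List[str]:
    """List which of Lipinski's four thresholds a molecule exceeds."""
    violations = []
    if d.mol_weight > 500:
        violations.append("MW > 500")
    if d.logp > 5:
        violations.append("LogP > 5")
    if d.h_bond_donors > 5:
        violations.append("HBD > 5")
    if d.h_bond_acceptors > 10:
        violations.append("HBA > 10")
    return violations


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of set-bit indices."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)
```

For example, `lipinski_violations(Descriptors(206.28, 3.1, 1, 2))` (roughly ibuprofen) returns an empty list. The rule is usually read as "drug-like if at most one threshold is exceeded".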
Protein Analysis (server-protein)
| Tool | Description | Backend |
|---|---|---|
| `parse_fasta` | Parse FASTA files, return sequence metadata | BioPython |
| `analyze_sequence` | Sequence length, composition, molecular weight | BioPython + MedGemma |
| `search_uniprot` | Search UniProt by gene, protein name, or accession | UniProt API + MedGemma |
| `search_pdb` | Search PDB for 3D structures by protein or PDB ID | RCSB PDB API + MedGemma |
| `predict_structure` | Retrieve AlphaFold predicted structure and confidence scores | AlphaFold DB API + MedGemma |
Literature, Acquisition & Synthesis (server-literature, server-acquisition & server-paperqa)
| Tool | Description | Backend |
|---|---|---|
| `search_pubmed` | Search PubMed with Boolean and MeSH queries | NCBI E-utilities + MedGemma |
| `fetch_abstract` | Fetch full abstract and metadata by PMID | NCBI E-utilities + MedGemma |
| `search_openalex` | Search OpenAlex for scholarly works, citations, open access status | OpenAlex API + MedGemma |
| `search_clinical_trials` | Search ClinicalTrials.gov by condition, drug, or intervention | ClinicalTrials.gov API + MedGemma |
| `resolve_identifier_to_sources` | Resolve DOI/PMID/PMCID into candidate source URLs with provenance and confidence | NCBI ID Converter + deterministic transforms |
| `acquire_documents` | Policy-controlled document retrieval from DOI/PMID/PMCID/URL with content-level labeling and explicit extraction backend metadata (pmid/pmcid: BioC-first with Scrapling fallback, doi/url: Scrapling-primary) | NCBI BioC + Scrapling sidecar + safety policy engine |
| `search_and_analyze` | Deep semantic synthesis of up to 10 identifiers/documents (internal NCBI acquisition or pre-acquired documents) using contextual LLM re-ranking | PaperQA2 + Tantivy |
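
To make "deterministic transforms" concrete, the sketch below classifies an identifier string and maps it to a landing-page URL. It is an illustration under assumptions, not the behaviour of `resolve_identifier_to_sources`: the regular expressions, the function names, and the choice of URL patterns are mine, and the identifiers in the usage note are made up.

```python
import re
from typing import List, Tuple

_DOI = re.compile(r"^10\.\d{4,9}/\S+$")
_PMCID = re.compile(r"^PMC\d+$", re.IGNORECASE)
_PMID = re.compile(r"^\d{1,8}$")


def classify_identifier(raw: str) -> Tuple[str, str]:
    """Return (kind, normalized value) for a DOI, PMCID, or PMID string."""
    value = raw.strip()
    lowered = value.lower()
    # Strip common DOI prefixes so "doi:10.x/y" and "https://doi.org/10.x/y" normalize alike.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    if _DOI.match(value):
        return "doi", value
    if _PMCID.match(value):
        return "pmcid", value.upper()
    if _PMID.match(value):
        return "pmid", value
    raise ValueError(f"unrecognized identifier: {raw!r}")


def candidate_urls(kind: str, value: str) -> List[str]:
    """Map a classified identifier to a public landing-page URL."""
    if kind == "doi":
        return [f"https://doi.org/{value}"]
    if kind == "pmcid":
        return [f"https://www.ncbi.nlm.nih.gov/pmc/articles/{value}/"]
    if kind == "pmid":
        return [f"https://pubmed.ncbi.nlm.nih.gov/{value}/"]
    raise ValueError(f"unknown identifier kind: {kind!r}")
```

For instance, `classify_identifier("pmc1234567")` yields `("pmcid", "PMC1234567")`, which `candidate_urls` turns into a single PMC article URL. Cross-identifier lookups (PMID to PMCID, for example) need the NCBI ID Converter and are outside this sketch.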
Medical Imaging (server-imaging)
| Tool | Description | Backend |
|---|---|---|
| `analyze_medical_image` | Analyze X-ray, CT, pathology, or dermatology images (PNG/JPEG, max 50 MB) | MedGemma (multimodal) |
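
A pre-flight check for the stated input constraints (PNG or JPEG, at most 50 MB) might look like the following. This is a sketch, not the server's validation code: the function name is hypothetical, and reading "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
MAX_IMAGE_BYTES = 50 * 1024 * 1024   # "max 50 MB"; binary megabytes assumed
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"     # fixed 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"         # JPEG start-of-image marker plus next marker prefix


def detect_image_format(data: bytes) -> str:
    """Return 'png' or 'jpeg' based on magic bytes, rejecting oversized or other input."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 50 MB limit")
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    raise ValueError("unsupported image format (expected PNG or JPEG)")
```

Checking magic bytes rather than the file extension avoids sending a mislabeled file to the multimodal model.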
Omics (server-omics)
| Tool | Description | Backend |
|---|---|---|
| `read_h5ad` | Load H5AD file, return observation and variable metadata | Scanpy |
| `preprocess_omics` | Filter, normalize, log-transform, find highly variable genes | Scanpy |
| `cluster_cells` | Leiden or Louvain clustering with UMAP coordinates | Scanpy |
| `differential_expression` | Differential expression between groups (Wilcoxon, t-test, logreg) | Scanpy + MedGemma |
| `gene_set_enrichment` | Pathway enrichment against MSigDB, GO, KEGG via Enrichr | Enrichr API + MedGemma |
Sandbox Execution (server-sandbox)
| Tool | Description | Backend |
|---|---|---|
| `sandbox_prepare` | Create/reuse a Docker sandbox with default `network_policy=deny` | Docker Sandbox CLI |
| `sandbox_run_job` | Execute command in sandbox with deterministic timeout + logs | Docker Sandbox CLI |
| `sandbox_status` | Check sandbox state (running/stopped/unknown) with retry/backoff | Docker Sandbox CLI |
| `sandbox_fetch_artifact` | Read artifact/log content with size/path safety constraints | Host file access + safety checks |
| `sandbox_teardown` | Stop or remove sandbox | Docker Sandbox CLI |
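
The "size/path safety constraints" on artifact reads can be illustrated with a containment check. This is a sketch of the general technique, not the implementation of `sandbox_fetch_artifact`: the function name and the 1,000,000-byte cap are hypothetical.

```python
from pathlib import Path

MAX_ARTIFACT_BYTES = 1_000_000  # hypothetical cap, not a documented default


def read_artifact(root: Path, relative: str, max_bytes: int = MAX_ARTIFACT_BYTES) -> str:
    """Read a file under root, refusing paths that escape it or files that are too large."""
    root = root.resolve()
    target = (root / relative).resolve()   # resolve() collapses ".." and follows symlinks
    try:
        target.relative_to(root)            # raises ValueError if target is outside root
    except ValueError:
        raise PermissionError(f"path escapes artifact root: {relative}") from None
    if target.stat().st_size > max_bytes:
        raise ValueError(f"artifact larger than {max_bytes} bytes: {relative}")
    return target.read_text(encoding="utf-8", errors="replace")
```

Resolving both paths before comparing them is the important step: a plain string-prefix check would accept `job-1/../../secret`.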
Setup
Fresh install — complete sequence
# 1. Install Bun and uv (if not already installed)
curl -fsSL https://bun.sh/install | bash
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Clone and install JS/TS dependencies
git clone https://github.com/omar-A-hassan/medsci-agent.git
cd medsci-agent
bun install
# 3. Create Python environments
bun run setup:py:core # RDKit + BioPython + Scanpy → .venv/
bun run setup:py:paperqa # PaperQA + Scrapling + ACE → packages/server-paperqa/.venv-paperqa/
# 4. Pull Ollama models (Ollama must be installed first: https://ollama.com)
ollama pull medgemma:latest
ollama pull mxbai-embed-large
ollama pull hf.co/matrixportalx/txgemma-2b-predict-GGUF:Q4_K_M
# 5. (Optional) Export API keys for cloud features — do this BEFORE starting OpenCode
export OPENROUTER_API_KEY="sk-or-..." # needed for ACE MCP and cloud PaperQA
export ANTHROPIC_API_KEY="sk-ant-..." # only if using Anthropic directly
# 6. Verify setup
bun run typecheck
bun test
# 7. Open in OpenCode
opencode # MCP servers start automatically via opencode.json
The opencode.json is pre-configured for local-only use (Ollama + local Python). No additional configuration is needed for the core tools. See the sections below for cloud model setup and optional features.
Prerequisites
- Bun >= 1.1
- uv (Python package and environment manager)
- Ollama
- OpenCode
- Docker Desktop 4.40+ with the Docker AI Sandbox feature enabled (required for sandbox tools only)
  - The `sandbox_*` tools use `docker sandbox` subcommands, a first-party feature in Docker Desktop 4.40+. Enable it under Settings → Features in Development → Docker AI Sandbox.
  - All other tools work without Docker.

Sandbox not available? If you don't have Docker Desktop 4.40+ you can still use all 23 non-sandbox tools. Leave the `medsci-sandbox` MCP server enabled; calls to sandbox tools will return clear errors rather than crashing.
1. Clone and install
git clone https://github.com/omar-A-hassan/medsci-agent.git
cd medsci-agent
bun install
2. Python environments
The system uses two strictly decoupled Python virtual environments, both managed with uv.
Core Environment (Required):
bun run setup:py:core
PaperQA + Acquisition Environment (Required for strict full-text retrieval and deep synthesis):
bun run setup:py:paperqa
Or run both in one command:
bun run setup:py:all
Important: Set `MEDSCI_PYTHON` to `.venv/bin/python3` for core tools. Both `medsci-paperqa` and `medsci-acquisition` should run with `packages/server-paperqa/.venv-paperqa/bin/python3` to keep Scrapling and PaperQA dependencies in one environment.

Dependency ownership:
- `.venv` (uv groups: `scientific`, `test`): core science sidecars (RDKit/BioPython/Scanpy and related tool stacks).
- `packages/server-paperqa/.venv-paperqa` (uv groups: `paperqa`, `ace`, `test`): PaperQA + acquisition parsing + ACE stack.
3. Pull Ollama models
ollama pull medgemma:latest # Biomedical interpretation
ollama pull mxbai-embed-large # Document embeddings for PaperQA
ollama pull hf.co/matrixportalx/txgemma-2b-predict-GGUF:Q4_K_M # ADMET prediction
If medgemma:latest is not available directly, pull the GGUF from HuggingFace and alias it:
ollama pull hf.co/unsloth/medgemma-4b-it-GGUF:Q4_K_M
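# Make sure the target manifest directory exists, then copy the manifest as medgemma:latest
mkdir -p ~/.ollama/models/manifests/registry.ollama.ai/library/medgemma
cp ~/.ollama/models/manifests/hf.co/unsloth/medgemma-4b-it-GGUF/Q4_K_M \
~/.ollama/models/manifests/registry.ollama.ai/library/medgemma/latest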
Note: The `cp` command creates an alias so the code can reference the model as `medgemma:latest` regardless of how it was downloaded.
4. Configure OpenCode
The included opencode.json is pre-configured. Set the model field to your preferred cloud LLM:
{
"model": "openai/gpt-4o"
}
Update the MEDSCI_PYTHON path in each server's environment block if your virtual environment is in a different location.
4.1 Optional: Enable ACE MCP (self-improvement layer)
MedSci now supports optional ACE MCP integration for adaptive strategy learning.
Install ACE MCP in the PaperQA/acquisition virtual environment (recommended):
bun run setup:py:paperqa
Configured MCP server name: ace-mcp (in opencode.json)
ACE model default: openrouter/arcee-ai/trinity-large-preview:free (set via ACE_MCP_DEFAULT_MODEL in opencode.json). This is a free OpenRouter model that requires OPENROUTER_API_KEY. To change:
export ACE_MCP_DEFAULT_MODEL="your/provider-model"
Credential model for ACE:
- ACE runs as a separate MCP process and needs provider credentials available in that process environment.
- If you use OpenAI for ACE, export `OPENAI_API_KEY` before starting OpenCode.
- If you use Anthropic/Gemini/Mistral/etc for ACE, set the corresponding provider key env var.
- ACE can use the same cloud provider/model family you use in OpenCode, but it does not automatically inherit the currently selected chat model unless you configure `ACE_MCP_DEFAULT_MODEL` accordingly.
- ACE in this setup is intentionally decoupled from local MedSci models (`medgemma`, `txgemma`, PaperQA embedding model) and should not use them unless you explicitly reconfigure it.
Default behavior in MedSci:
- in-run reflection is read-only (`ace.ask` / `ace.skillbook.get`)
- automatic post-run learning is triggered by plugin guardrails (gated by evidence quality)
- manual fallback remains available via `/ace-learn`
Skillbooks are persisted under .opencode/ace/skillbooks/.
Why ACE is installed in packages/server-paperqa/.venv-paperqa (not the core .venv):
- ACE learning is orchestration/synthesis logic, closest to the PaperQA + acquisition stack.
- Keeping ACE out of the core scientific sidecar avoids dependency drift for RDKit/Scanpy/BioPython workflows.
- This preserves strict environment separation and reduces breakage risk in domain toolchains.
Example override:
export ACE_MCP_DEFAULT_MODEL="anthropic/claude-3-5-sonnet-latest"
export ANTHROPIC_API_KEY="..."
Key export timing: API keys must be exported in the shell before starting OpenCode. MCP server processes are spawned at startup and inherit env from that moment — changing env vars afterwards has no effect until OpenCode is restarted.
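
The timing rule follows from how operating systems hand environment variables to child processes: the child receives a copy at spawn time. The short demonstration below shows the effect with a throwaway variable. `DEMO_API_KEY` and the helper name are invented for the example and have nothing to do with OpenCode itself.

```python
import os
import subprocess
import sys

# Child process: wait for one line on stdin, then report what its environment contains.
CHILD_CODE = (
    "import os, sys; sys.stdin.readline(); "
    "print(os.environ.get('DEMO_API_KEY', '<unset>'))"
)


def key_seen_by_already_running_child() -> str:
    """Start a child, set the variable in the parent afterwards, and ask the child what it sees."""
    os.environ.pop("DEMO_API_KEY", None)          # the key is not exported yet
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )                                              # the child copies the environment here
    os.environ["DEMO_API_KEY"] = "sk-demo"         # "export" after the child has started
    try:
        output, _ = child.communicate("\n", timeout=30)
    finally:
        os.environ.pop("DEMO_API_KEY", None)       # leave the parent environment clean
    return output.strip()
```

The child reports the variable as unset even though the parent set it before asking, which is the same situation as exporting a key after the MCP servers have been spawned.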
5. Run tests
bun run test:all
This runs:
- all Bun tests across MCP servers (including sidecar-in-loop tests when `packages/server-paperqa/.venv-paperqa` exists),
- and PaperQA Python tests using the PaperQA venv interpreter.
6. Start
Make sure Ollama is running, then open the project directory in OpenCode. The MCP servers start automatically.
OpenCode Agent Usage
Agent roles
- `medsci` is the primary orchestrator for cross-domain workflows.
- `drug`, `protein`, `omics`, and `imaging` are focused subagents for domain-deep tasks.
- All agents follow a planning-first contract (brief plan before first tool call, then strict sequential execution).
Built-in project commands
This repo ships OpenCode slash commands under .opencode/commands:
- `/triage`: classify request and propose minimal sequential MCP plan
- `/lit-deep`: literature discovery + PaperQA deep synthesis workflow
- `/sandbox-job`: sandbox lifecycle run (prepare → run_job → status (advisory) → fetch → teardown)
- `/qc-check`: maker-checker quality gate on scientific outputs
- `/handoff-report`: compact handoff for next agent/human
- `/ace-learn`: manual ACE learning fallback/override for a completed task
- `/ace-strategies`: inspect ACE strategy state for a domain/session
Sandbox command default
For inline script execution, prefer:
python3 -c "print('sandbox smoke ok')"
instead of python -c ... to avoid environment-path variability across sandbox templates.
Determinism and status policy
- Treat `sandbox_run_job` success/failure as source-of-truth for execution outcome.
- Treat `sandbox_status` as advisory state signal.
- Backend applies status retry/backoff before returning `unknown`.
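
The retry-before-`unknown` policy can be sketched as follows. This is an illustration, not the backend's code: the `probe` callback stands in for whatever queries Docker, the defaults mirror `MEDSCI_SANDBOX_STATUS_RETRY_ATTEMPTS=2` and `MEDSCI_SANDBOX_STATUS_RETRY_BACKOFF_MS=1000`, and treating "2 retries" as two extra attempts after the first one, with a fixed delay between them, is an assumption.

```python
import time
from typing import Callable


def status_with_retry(
    probe: Callable[[], str],
    retries: int = 2,
    backoff_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call probe up to 1 + retries times; report 'unknown' only after every attempt fails."""
    for attempt in range(retries + 1):
        try:
            state = probe()
        except Exception:
            state = "unknown"          # a failing probe counts as an inconclusive answer
        if state in ("running", "stopped"):
            return state
        if attempt < retries:
            sleep(backoff_s)           # wait before the next attempt
    return "unknown"
```

Because the result is advisory, a caller would still rely on the outcome of `sandbox_run_job` to decide whether a job succeeded.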
Configuration
Environment variables (set in opencode.json under each server's environment):
| Variable | Default | Description |
|---|---|---|
| `MEDSCI_PROFILE` | `standard` | Hardware profile: `lite`, `standard`, or `full` |
| `MEDSCI_PYTHON` | `python3` | Path to Python binary (use `.venv/bin/python3` for the virtual environment) |
| `MEDSCI_OLLAMA_URL` | `http://127.0.0.1:11434` | Ollama API endpoint |
| `MEDSCI_OLLAMA_MODEL` | `medgemma:latest` | Default Ollama model for interpretation |
| `MEDSCI_OLLAMA_TIMEOUT` | `120000` | Ollama request timeout in milliseconds |
| `MEDSCI_PYTHON_TIMEOUT` | `60000` | Python sidecar request timeout in milliseconds |
| `PQA_LLM_MODEL` | `ollama/medgemma:latest` | LLM model PaperQA uses for summarization/answering (LiteLLM format) |
| `PQA_EMBEDDING_MODEL` | `ollama/mxbai-embed-large` | Embedding model PaperQA uses for document indexing |
| `PQA_OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint for PaperQA (separate from core) |
| `PQA_EMAIL` | `medsci-agent@localhost` | Email for NCBI API access (required by NCBI usage policy) |
| `PQA_USE_DOC_DETAILS` | `false` | If `true`, enables PaperQA metadata inference during indexing (less reliable with local models) |
| `PQA_CHUNK_CHARS` | `1200` | Default chunk size (characters) for PaperQA document reader before embedding |
| `PQA_CHUNK_OVERLAP` | `100` | Overlap (characters) between adjacent chunks |
| `PQA_CHUNK_MIN_CHARS` | `400` | Lower bound for automatic chunk-size backoff retries on embedding context errors |
| `PQA_CHUNK_BACKOFF_RETRIES` | `3` | Number of times to retry indexing with smaller chunks on embed context-limit errors |
| `PQA_ACQUIRE_CONCURRENCY` | `3` | Max concurrent NCBI acquisition requests in PaperQA |
| `PQA_MAX_TEXT_CHARS` | `1500000` | Hard cap on acquired paper text size (chars) to avoid indexing blowups |
| `PQA_NEGATIVE_CACHE_TTL_HOURS` | `24` | TTL for acquisition negative-cache entries (failed sources) |
| `PQA_LLM_TIMEOUT_SECONDS` | `180` | Timeout (seconds) for LLM and summary LLM requests via LiteLLM. Sidecar timeout auto-adjusts above this. |
| `PQA_ANSWER_MAX_SOURCES` | `5` | Maximum number of sources PaperQA includes in a synthesized answer |
| `PQA_EVIDENCE_K` | `10` | Number of evidence chunks PaperQA gathers before answering |
| `PQA_DOCSET_CACHE_MAX_ENTRIES` | `8` | Max number of in-memory docset cache entries per workspace |
| `PQA_DOCSET_CACHE_MAX_BYTES` | `209715200` | Max total bytes for in-memory docset cache (default 200 MB) |
| `PQA_PREFLIGHT_CACHE_TTL_SECONDS` | `300` | TTL for cached Ollama preflight checks to avoid repeated startup latency |
| `PQA_SKIP_PREFLIGHT` | `false` | If `true`, skip Ollama model reachability/model-presence preflight checks |
| `PQA_ENABLE_DOCUMENT_INPUT` | `true` | If `false`, rejects `documents` payloads and only allows identifier-based acquisition |
| `PQA_LLM_BACKEND` | `ollama` | LLM backend for PaperQA: `ollama` (local), `openrouter`, or `anthropic` (cloud). Cloud backends skip Ollama preflight entirely. |
| `PQA_CLOUD_MODEL` | (backend default) | Cloud model name in LiteLLM format when `PQA_LLM_BACKEND` is `openrouter` or `anthropic` (e.g. `openrouter/anthropic/claude-3-5-sonnet`, `anthropic/claude-opus-4-5`) |
| `PQA_QUERY_TIMEOUT_MS` | (auto-calculated) | Direct override for PaperQA total query timeout in milliseconds (caps at 540,000 ms). Overrides the automatic per-paper calculation. |
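
As an illustration of how the core `MEDSCI_*` variables and their documented defaults fit together, the sketch below reads them from an environment mapping. The variable names and defaults come from the table; the `CoreConfig` class and `load_core_config` function are hypothetical and do not describe how the TypeScript servers actually load configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CoreConfig:
    profile: str
    python: str
    ollama_url: str
    ollama_model: str
    ollama_timeout_ms: int
    python_timeout_ms: int


def load_core_config(env: Mapping[str, str] = os.environ) -> CoreConfig:
    """Read the core settings from env, falling back to the documented defaults."""
    profile = env.get("MEDSCI_PROFILE", "standard")
    if profile not in ("lite", "standard", "full"):
        raise ValueError(f"MEDSCI_PROFILE must be lite, standard, or full, got {profile!r}")
    return CoreConfig(
        profile=profile,
        python=env.get("MEDSCI_PYTHON", "python3"),
        ollama_url=env.get("MEDSCI_OLLAMA_URL", "http://127.0.0.1:11434"),
        ollama_model=env.get("MEDSCI_OLLAMA_MODEL", "medgemma:latest"),
        ollama_timeout_ms=int(env.get("MEDSCI_OLLAMA_TIMEOUT", "120000")),
        python_timeout_ms=int(env.get("MEDSCI_PYTHON_TIMEOUT", "60000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests, for example `load_core_config({"MEDSCI_PROFILE": "lite"})`.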
Acquisition-related environment variables (optional, server-acquisition):
| Variable | Default | Description |
|---|---|---|
| `ACQ_REQUIRE_SCRAPLING` | `false` | If `true`, acquisition sidecar health check and HTML extraction hard-fail when Scrapling is unavailable. Default `false` enables graceful fallback to BeautifulSoup/regex. |
| `ACQ_PYTHON` | unset | Optional explicit Python binary for acquisition sidecar (defaults to PaperQA venv path) |
| `ACQ_CACHE_DIR` | `.opencode/acquired_docs` | Cache location for acquired document payloads |
| `ACQ_MAX_BYTES_HARD_CAP` | `10000000` | Hard upper bound on response body bytes regardless of per-call options |
Sandbox-related environment variables (optional, server-sandbox):
| Variable | Default | Description |
|---|---|---|
| `MEDSCI_SANDBOX_DEFAULT_TEMPLATE` | unset | Optional default Docker sandbox template |
| `MEDSCI_SANDBOX_PULL_TEMPLATE` | `missing` | Template pull policy: `missing`, `always`, `never` |
| `MEDSCI_SANDBOX_ARTIFACT_ROOT` | `sandbox-artifacts` | Artifact root directory for run logs/metadata |
| `MEDSCI_SANDBOX_DEFAULT_TIMEOUT_SEC` | `600` | Default run timeout (seconds) |
| `MEDSCI_SANDBOX_MAX_TIMEOUT_SEC` | `3600` | Max allowed run timeout (seconds) |
| `MEDSCI_SANDBOX_STATUS_RETRY_ATTEMPTS` | `2` | Number of status retries before returning `unknown` |
| `MEDSCI_SANDBOX_STATUS_RETRY_BACKOFF_MS` | `1000` | Milliseconds between status retries |
OpenCode policy hardening (included)
opencode.json includes baseline hardening:
- `edit`, `task`, `skill`, `webfetch`, `external_directory`, `doom_loop` default to `ask`
- `bash` allowlist for safe read-only/dev commands
- `bash` denylist for destructive actions (`rm *`, `git push*`)
- agent-specific task/skill permission scoping for `medsci`
Plugin guardrails (included)
Project plugin: .opencode/plugins/medsci-guardrails.ts
- Blocks reading sensitive `.env` files via the `read` tool (allows `.env.example`)
- Logs tool telemetry (`tool`, `session_id`, `duration_ms`, `failed`) via `client.app.log`
- Adds ACE telemetry tags (`ace_tool`, `ace_learning_write`) for auditability
- Triggers automatic post-run `/ace-learn` command execution when high-signal tool evidence is present
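
The `.env` rule in the first bullet can be expressed as a small predicate. The actual plugin is TypeScript and its matching logic is not shown in this README, so the sketch below is only one plausible reading: it blocks `.env` and any `.env.*` variant except the explicitly allowed `.env.example`. The function name and the treatment of variants such as `.env.local` are assumptions.

```python
from pathlib import PurePosixPath

ALLOWED_ENV_FILES = {".env.example"}


def is_blocked_read(path: str) -> bool:
    """Return True when a read should be refused because it targets a sensitive .env file."""
    name = PurePosixPath(path.replace("\\", "/")).name   # compare on the file name only
    if name in ALLOWED_ENV_FILES:
        return False
    return name == ".env" or name.startswith(".env.")
```

With this predicate, `is_blocked_read("config/.env.local")` is true while `is_blocked_read(".env.example")` is false.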
The MEDSCI_PROFILE setting controls which Python libraries are pre-imported when the sidecar starts. All tools work regardless of profile — the sidecar imports libraries lazily on first use — but pre-importing avoids a cold-start delay on the first call.
| Profile | Pre-imported | Use case |
|---|---|---|
| `lite` | RDKit | Drug discovery tools only, lower memory usage |
| `standard` | RDKit, Scanpy, BioPython, leidenalg, igraph, pynndescent | Most workflows |
| `full` | All available | Fastest first-call latency across all tools |
PaperQA Data Nuances
When querying papers via server-paperqa, the tool performs a multi-step pipeline:
- Text Acquisition: Full-text articles are acquired via the NCBI BioC PMC API (DOIs, PMIDs, or PMCIDs). Papers not in PMC Open Access fall back to abstract-only via the BioC PubMed API. Acquired text is cached as `.txt` files in `.opencode/pqa_papers/`.
  - `search_and_analyze` also accepts pre-acquired `documents` payloads. When provided, PaperQA skips internal acquisition and indexes those documents directly.
- Indexing: PaperQA2 indexes the text files and builds a search index in `.opencode/pqa_index/`.
- RAG Synthesis: The query runs against indexed chunks using Ollama for both summarization and final answer generation.
- Models (local): PaperQA uses `PQA_LLM_MODEL` (default: `ollama/medgemma:latest`) for summarization and answering, and `PQA_EMBEDDING_MODEL` (default: `ollama/mxbai-embed-large`) for document embeddings. Both run locally via Ollama.
- Models (cloud): Set `PQA_LLM_BACKEND=openrouter` or `PQA_LLM_BACKEND=anthropic` to route the LLM calls to a cloud provider instead of Ollama. The embedding model always stays local. Export the corresponding API key before starting OpenCode:

      # OpenRouter (supports many models)
      export PQA_LLM_BACKEND=openrouter
      export PQA_CLOUD_MODEL="openrouter/anthropic/claude-3-5-sonnet"
      export OPENROUTER_API_KEY="sk-or-..."
      # Anthropic direct
      export PQA_LLM_BACKEND=anthropic
      export PQA_CLOUD_MODEL="anthropic/claude-opus-4-5"
      export ANTHROPIC_API_KEY="sk-ant-..."

  Cloud backends skip the Ollama LLM preflight check and tend to be significantly faster for synthesis. The acquisition and embedding steps still run locally.
- Preflight: The sidecar verifies Ollama reachability and required model presence before indexing/query. Failures return structured codes (`OLLAMA_UNREACHABLE`, `MODEL_NOT_FOUND`).
- Agent Type: Uses `"fake"` agent mode (deterministic search → gather evidence → answer path) rather than an LLM-driven agent, which reduces token usage.
- Paper Limits: The `search_and_analyze` schema strictly bounds processing to 10 inputs per call (papers or documents) to avoid Out-of-Memory crashes.
- Document Provenance Contract: Pre-acquired documents can include `extraction_backend` (`scrapling`, `beautifulsoup`, `regex`, `pdf_text`, `plain_text`) and `fallback_used` so downstream synthesis can reason about evidence quality explicitly.
- PMC Open Access Coverage: Full-text retrieval requires papers to be in the PMC Open Access subset (~3.5M articles). Papers outside this subset get abstract-only indexing; the response includes an `acquisition_summary` showing which papers were full-text vs abstract-only.
- Stage-Gated Responses: `search_and_analyze` now returns `stage_status` (`acquire`, `index`, `query`) and fail-soft terminal codes (`ACQUIRE_NONE_SUCCESS`, `INDEX_ZERO_SUCCESS`, `API_AUTH_FAILED`) instead of opaque failures. `API_AUTH_FAILED` means the cloud LLM API key was missing or rejected; check `OPENROUTER_API_KEY` / `ANTHROPIC_API_KEY`.
- Embedding Context Guardrail: Indexing now uses conservative chunk defaults and automatic chunk-size backoff retries when Ollama embedding rejects large inputs (`/api/embed` context-limit errors).
- Caching: Paper acquisition uses canonicalized identifier hashing + manifests. Index/query path includes in-memory docset cache and persisted manifests under `.opencode/pqa_index/manifest.json`.
- Stateful Indexes: Do not manually delete `.opencode/pqa_index/` or `.opencode/pqa_papers/` while a PaperQA query is in progress.
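
The embedding context guardrail combines overlapping chunks with a shrinking chunk size. The sketch below shows one way such a backoff could work, using the documented defaults (`PQA_CHUNK_CHARS=1200`, `PQA_CHUNK_OVERLAP=100`, `PQA_CHUNK_MIN_CHARS=400`, `PQA_CHUNK_BACKOFF_RETRIES=3`). The halving schedule, the function names, and the use of `ValueError` as a stand-in for an embedding context-limit error are assumptions; the README does not specify the actual schedule.

```python
from typing import Callable, List, Optional


def chunk_text(text: str, size: int = 1200, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most `size` characters that overlap by `overlap`."""
    if size <= overlap:
        raise ValueError("chunk size must be larger than the overlap")
    if not text:
        return []
    step = size - overlap
    return [text[start:start + size] for start in range(0, max(len(text) - overlap, 1), step)]


def backoff_sizes(start: int = 1200, minimum: int = 400, retries: int = 3) -> List[int]:
    """Chunk sizes to try in order: halve after each failure, never below `minimum`."""
    sizes = [start]
    for _ in range(retries):
        smaller = max(sizes[-1] // 2, minimum)
        if smaller == sizes[-1]:
            break                      # already at the floor, nothing smaller to try
        sizes.append(smaller)
    return sizes


def index_with_backoff(
    text: str,
    embed: Callable[[str], List[float]],
    overlap: int = 100,
) -> List[List[float]]:
    """Embed every chunk, retrying the whole document with smaller chunks on failure."""
    last_error: Optional[Exception] = None
    for size in backoff_sizes():
        try:
            return [embed(chunk) for chunk in chunk_text(text, size, overlap)]
        except ValueError as exc:      # stand-in for a context-limit rejection
            last_error = exc
    raise RuntimeError("embedding failed at every chunk size") from last_error
```

Under these assumptions the sizes tried are 1200, 600, and 400 characters, so the retry budget of 3 is not exhausted before the minimum is reached.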
Troubleshooting
OLLAMA_UNREACHABLE — PaperQA or MedGemma fails to connect
Ollama must be running before OpenCode starts. MCP server processes inherit the environment at startup — run ollama serve first, then open OpenCode. Verify with:
curl http://localhost:11434/api/tags
MODEL_NOT_FOUND — Model missing
Pull the required model:
ollama pull medgemma:latest
ollama pull mxbai-embed-large
ollama pull hf.co/matrixportalx/txgemma-2b-predict-GGUF:Q4_K_M
If medgemma:latest resolves but the sidecar still says MODEL_NOT_FOUND, the PQA_LLM_MODEL env var may reference a different name than what Ollama has. Check with ollama list.
API_AUTH_FAILED — Cloud PaperQA authentication error
The API key for PQA_LLM_BACKEND was not found or is invalid. Export it in the shell before starting OpenCode:
export OPENROUTER_API_KEY="sk-or-..." # for PQA_LLM_BACKEND=openrouter
export ANTHROPIC_API_KEY="sk-ant-..." # for PQA_LLM_BACKEND=anthropic
ACQUIRE_NONE_SUCCESS — PaperQA couldn't fetch any papers
The papers may not be in the PMC Open Access subset (~3.5M articles). PaperQA falls back to abstract-only indexing for papers outside that subset. If you have access to the full text, use `acquire_documents` first to get pre-acquired documents, then pass them directly to `search_and_analyze`.
Python sidecar tools fail (drug/protein/omics)
The core Python environment may not be set up:
bun run setup:py:core # installs RDKit, BioPython, Scanpy into .venv/
Then verify MEDSCI_PYTHON in opencode.json points to .venv/bin/python3.
Scrapling warning in acquisition
If you see SCRAPLING_REQUIRED in logs, the acquisition sidecar is running without Scrapling. This is normal — ACQ_REQUIRE_SCRAPLING=false (the default) lets it fall back to BeautifulSoup/regex HTML extraction. To enable Scrapling:
bun run setup:py:paperqa # Scrapling is installed in the PaperQA venv
And set MEDSCI_PYTHON for medsci-acquisition in opencode.json to packages/server-paperqa/.venv-paperqa/bin/python3 (it already is by default).
Sandbox tools fail with docker: command not found or connection errors
The sandbox tools require Docker Desktop 4.40+ with the AI Sandbox feature enabled. If Docker Desktop isn't installed or the daemon isn't running, sandbox calls return errors — all other tools continue to work normally.
ACE MCP fails to start
Verify the PaperQA venv is set up (bun run setup:py:paperqa) and that OPENROUTER_API_KEY is exported before starting OpenCode (ACE uses OpenRouter by default).
Project Structure
medsci-agent/
packages/
core/ Shared library (Ollama client, Python sidecar, config, types)
python/
sidecar.py Long-running Python process with scientific library handlers
server-drug/ Drug discovery MCP server
server-protein/ Protein analysis MCP server
server-literature/ Literature search MCP server
server-acquisition/ Policy-controlled retrieval MCP server
server-imaging/ Medical imaging MCP server
server-omics/ Single-cell and omics MCP server
server-paperqa/ Deep literature synthesis MCP server (identifier or document ingestion)
server-sandbox/ Isolated sandbox execution MCP server
.opencode/
agents/ Agent definitions (orchestrator + 4 domain specialists + PaperQA routing)
commands/ Reusable slash commands for MedSci workflows
plugins/ OpenCode plugin guardrails and telemetry hooks
skills/ Skill definitions for OpenCode
opencode.json OpenCode configuration (model, MCP servers)
Runtime Artifacts
Sandbox runs generate runtime artifacts (for debugging and reproducibility), typically under sandbox-artifacts/ or a configured artifact root:
- `job-*/stdout.log`
- `job-*/stderr.log`
- `job-*/metadata.json`
These are not source files and are safe to delete. Sandbox teardown removes containers; host-side artifacts remain unless explicitly pruned.
Acknowledgments
This project builds on the following work:
- Prompt Repetition (Leviathan, Kalman & Matias, 2025): Our `interpretWithMedGemma` function repeats the instruction after the data context to improve non-reasoning model accuracy.
- TxGemma (Wang, Schmidgall, Jaeger et al., 2025): ADMET prediction uses verbatim prompt templates from Google's TxGemma, trained on Therapeutics Data Commons benchmarks.
- PaperQA2 (Future-House/paper-qa): Deep literature synthesis and RAG is powered by FutureHouse's PaperQA2 library, with text acquired via NCBI's BioC API.
- Agentic Context Engine (ACE) (kayba-ai/agentic-context-engine): ACE MCP integration powers MedSci's strategy-learning and post-run self-improvement loop.
- K-Dense Scientific Skills (K-Dense-AI/claude-scientific-skills): The OpenCode skills in this project are adapted from K-Dense AI's open-source collection of 147+ scientific skills for AI agents.
License
MIT. See the LICENSE file.