ScholarMCP
An MCP server that enables coding agents to search academic papers, ingest full-text PDFs, extract structured details, and manage citations in literature research workflows.
README

ScholarMCP
ScholarMCP is an MCP server for literature research workflows in coding agents. Official documentation: https://scholar-mcp.lstudlo.com/
Early Development Notice
This project is still in early development, and rough edges or bugs may occur. If you run into a problem, please open an issue and include:
- the agent used
- screenshots, if applicable
- steps to reproduce the issue
ScholarMCP gives your agent tools to:
- search papers across multiple sources
- ingest and parse full-text PDFs
- extract structured paper details
- suggest citations and build references
- validate manuscript citations
ScholarMCP is for...
Use this if you want Claude Code, Codex, or any MCP-compatible coding agent to run research tasks directly from chat.
Quick Start
1. Prerequisites
- Node.js
>=20 npm(for install/publish)pnpm(for contributors working from source)
2. Install as an npm package (recommended)
npm install -g scholar-mcp
One-off run without global install:
npx -y scholar-mcp --transport=stdio
Install from GitHub Packages (scoped mirror package):
npm install -g @lstudlo/scholar-mcp --registry=https://npm.pkg.github.com
3. Run
Stdio mode:
scholar-mcp --transport=stdio
HTTP mode:
scholar-mcp --transport=http
Health check (HTTP mode):
curl http://127.0.0.1:3000/health
4. Run from source (contributors)
pnpm install
pnpm dev:stdio
Use with Coding Agents
ScholarMCP works best over stdio for local coding agents. The docs site has full step-by-step guides for Claude Code, OpenAI Codex, and OpenCode. Anthropic officially documents claude mcp add ... -- <command>, and OpenAI officially documents codex mcp add ...; the short forms below keep those CLI flows as the primary setup path.
Shared environment values used below:
SCHOLAR_MCP_TRANSPORT=stdio
SCHOLAR_REQUEST_DELAY_MS=350
RESEARCH_ALLOW_REMOTE_PDFS=true
RESEARCH_ALLOW_LOCAL_PDFS=true
Claude Code
Add with the Claude CLI:
claude mcp add -s user \
--transport stdio \
-e SCHOLAR_MCP_TRANSPORT=stdio \
-e SCHOLAR_REQUEST_DELAY_MS=350 \
-e RESEARCH_ALLOW_REMOTE_PDFS=true \
-e RESEARCH_ALLOW_LOCAL_PDFS=true \
scholar_mcp -- npx -y scholar-mcp --transport=stdio
Verify:
claude mcp get scholar_mcp
Manual fallback:
- add
scholar_mcpundermcpServersin~/.claude.json - use project-local
.mcp.jsonif you want the config scoped to the repo - keep the
--separator in the CLI form; Claude needs it to stop parsing flags
OpenAI Codex
Add with the Codex CLI:
codex mcp add scholar_mcp \
--env SCHOLAR_MCP_TRANSPORT=stdio \
--env SCHOLAR_REQUEST_DELAY_MS=350 \
--env RESEARCH_ALLOW_REMOTE_PDFS=true \
--env RESEARCH_ALLOW_LOCAL_PDFS=true \
-- npx -y scholar-mcp --transport=stdio
Verify:
codex mcp list
codex mcp get scholar_mcp --json
Manual fallback:
- add the server to
~/.codex/config.tomlunder[mcp_servers.scholar_mcp] - Codex CLI and the Codex app share that MCP config model
OpenCode
Add with the OpenCode CLI:
opencode mcp add
Recommended interactive values:
- name:
scholar_mcp - type:
local - command:
npx -y scholar-mcp --transport=stdio - enabled:
true - env: use the four shared variables above
Verify:
opencode mcp list
Manual fallback:
- add the server to
~/.config/opencode/opencode.json - use
"type": "local"and a command array like["npx", "-y", "scholar-mcp", "--transport=stdio"]
Run from source
If you are developing ScholarMCP locally, use this launcher instead of npx -y scholar-mcp --transport=stdio:
pnpm --filter scholar-mcp dev:stdio
Use the same environment values shown above in whichever client you register.
Generic MCP clients
stdiocommand:scholar-mcp --transport=stdio- Or:
npx -y scholar-mcp --transport=stdio
- HTTP endpoint:
- Start server with
SCHOLAR_MCP_TRANSPORT=http scholar-mcp - Connect client to
http://127.0.0.1:3000/mcp - Optional auth: set
SCHOLAR_MCP_API_KEYand send bearer auth header from your client
- Start server with
MCP Tools
| Tool | Purpose |
|---|---|
search_literature_graph |
Federated search over OpenAlex/Crossref/Semantic Scholar (+ optional scholar scrape). |
search_google_scholar_key_words |
Keyword search on Google Scholar. |
search_google_scholar_advanced |
Scholar search with author/year/phrase filters. |
get_author_info |
Resolve author profile and top publications. |
ingest_paper_fulltext |
Start async full-text ingestion from DOI/URL/PDF/local path. |
get_ingestion_status |
Poll ingestion job status and parsed summary. |
extract_granular_paper_details |
Extract methods, claims, datasets, metrics, and references. |
suggest_contextual_citations |
Suggest citations from manuscript context. |
build_reference_list |
Generate formatted bibliography and BibTeX. |
validate_manuscript_citations |
Detect missing/uncited/duplicate citation issues. |
Example Agent Prompts
- "Find 10 recent papers on retrieval-augmented generation and summarize methods and datasets."
- "Ingest full text for DOI
10.1038/s41467-024-55563-6, then extract claims and limitations." - "Given this draft section, suggest citations in IEEE style and generate BibTeX."
- "Validate my manuscript citations against this reference list and show missing citations."
Configuration
Most users only need these:
SCHOLAR_MCP_TRANSPORT:stdio|http|both(default:stdio)SCHOLAR_REQUEST_DELAY_MS: request pacing to reduce rate-limit risk (default:250)RESEARCH_ALLOW_REMOTE_PDFS: allow remote PDF downloads for ingestion (default:true)RESEARCH_ALLOW_LOCAL_PDFS: allow local PDF ingestion (default:true)SCHOLAR_MCP_API_KEY: optional bearer token for HTTP modeRESEARCH_GROBID_URL: optional GROBID endpoint
The CLI loads .env from the current working directory automatically at startup.
Advanced options exist in src/config.ts for timeouts, retries, HTTP session capacity/TTL, provider tuning, and cache behavior.
Troubleshooting
Invalid environment variable formatinclaude mcp add:- Add
--before the MCP server name (see Claude setup command above).
- Add
Unable to resolve a downloadable PDF URL from inputon DOI ingestion:- The DOI and landing page may not expose an accessible PDF URL.
- Retry with
pdf_url(direct PDF) orlocal_pdf_path.
- Too many Scholar failures or throttling:
- Increase
SCHOLAR_REQUEST_DELAY_MS(for example500to1000).
- Increase
Usage Notes
Google Scholar may throttle automated traffic. Use conservative request pacing, respect provider terms, and avoid abusive query patterns.
Publishing
Releases publish to two registries:
- npm:
scholar-mcpvia.github/workflows/publish.yml - GitHub Packages:
@lstudlo/scholar-mcpvia.github/workflows/publish-github-packages.yml
Release with a minimal command set:
- Validate release readiness:
pnpm release:check - Cut and publish a release:
pnpm release(patch),pnpm release minor, orpnpm release major - Start from a clean git working tree (no unstaged/staged/untracked files).
- The release command runs checks, bumps
packages/scholar-mcp/package.json, creates a release commit/tag, pushes branch/tag, then creates a GitHub Release. - GitHub Actions publishes to npm and GitHub Packages from that release tag.
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.