arxiv-mcp

A streamlined MCP server that connects AI assistants to arXiv's vast collection of academic papers, enabling search, retrieval, and analysis of research papers.

<div align="center"> <h1> arXiv MCP Server </h1>

</div>

Access the world's largest repository of academic papers through the Model Context Protocol

A streamlined Model Context Protocol server that connects AI assistants to arXiv's vast collection of academic papers. Search, analyze, and download research papers directly from your AI workflow.

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • uv package manager

Installation

Option 1: Docker (Recommended)

# Pull and run the Docker image
docker run --rm -it ghcr.io/tejas242/arxiv-mcp:latest

# Or use Docker Compose from a local clone
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
docker compose up

Option 2: Local Development

# Clone and setup
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
uv sync

# Test the server
uv run main.py

๐Ÿ› ๏ธ Available Functions

<div align="center">

| Function | Status | Description | Parameters |
| --- | --- | --- | --- |
| search_papers | ✅ Working | Search arXiv papers with flexible query syntax | query, max_results, sort_by, sort_order |
| get_paper_details | ✅ Working | Retrieve complete metadata for any arXiv paper | arxiv_id |
| build_advanced_query | ✅ Working | Construct complex search queries with multiple fields | title_keywords, author_name, category, abstract_keywords |
| get_arxiv_categories | ✅ Working | List all available arXiv subject categories | None |
| search_by_author | ⚠️ Limited | Find papers by specific author (use search_papers instead) | author_name, max_results |
| search_by_category | ⚠️ Limited | Browse papers by category (use search_papers instead) | category, max_results |
| download_paper_pdf | 🔧 Needs Fix | Download paper PDFs (redirect handling issue) | arxiv_id, save_path |

</div>

Function Details

✅ Fully Working Functions

search_papers - The primary search function

  • Supports full arXiv query syntax
  • Handles keywords, authors, categories, titles
  • Configurable sorting and pagination
  • Returns formatted results with abstracts and links
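The sorting and pagination options correspond to parameters of the public arXiv API. The sketch below shows how such a request URL could be assembled. It is an illustration, not this server's code: `build_search_url` is a hypothetical helper, and whether the server's `sort_by` and `sort_order` arguments accept exactly these values is an assumption.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ARXIV_API = "http://export.arxiv.org/api/query"

# Values accepted by the arXiv API for sortBy and sortOrder.
SORT_FIELDS = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDERS = {"ascending", "descending"}


def build_search_url(query, max_results=10, start=0,
                     sort_by="relevance", sort_order="descending"):
    """Return an arXiv API URL for one page of search results (hypothetical helper)."""
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"sort_by must be one of {sorted(SORT_FIELDS)}")
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"sort_order must be one of {sorted(SORT_ORDERS)}")
    params = {
        "search_query": query,       # arXiv query syntax, e.g. ti:, au:, cat:
        "start": start,              # zero-based offset used for pagination
        "max_results": max_results,  # page size
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


# Second page of five results, newest submissions first.
url = build_search_url('ti:"attention mechanism" AND cat:cs.LG',
                       max_results=5, start=5, sort_by="submittedDate")
params = parse_qs(urlparse(url).query)
```

Paging works by advancing `start` in steps of `max_results` while keeping the query and sort settings fixed.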

get_paper_details - Detailed paper information

  • Complete metadata extraction
  • Author information with affiliations
  • Category classifications and links
  • Publication dates and updates
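The arXiv API returns each paper as an Atom entry, so metadata extraction comes down to reading a handful of namespaced elements. The following is a minimal sketch of that step, assuming the standard Atom and arXiv namespaces. `parse_entry` and the sample entry are made up for illustration (the ID, title, author, and affiliation are placeholders, not a real paper), and the server's own models may be structured differently.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Placeholder entry for illustration only; not real arXiv data.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <id>http://arxiv.org/abs/0000.00000v1</id>
  <title>Placeholder Title</title>
  <published>2000-01-01T00:00:00Z</published>
  <updated>2000-01-02T00:00:00Z</updated>
  <author>
    <name>A. Author</name>
    <arxiv:affiliation>Example University</arxiv:affiliation>
  </author>
  <category term="cs.LG"/>
  <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" rel="related"/>
</entry>
"""


def parse_entry(xml_text):
    """Extract basic metadata from a single Atom <entry> (hypothetical helper)."""
    entry = ET.fromstring(xml_text)
    authors = [
        {
            "name": author.findtext("atom:name", default="", namespaces=NS),
            "affiliations": [
                aff.text for aff in author.findall("arxiv:affiliation", NS)
            ],
        }
        for author in entry.findall("atom:author", NS)
    ]
    # The PDF link is the <link> element whose title attribute is "pdf".
    pdf_links = [
        link.get("href")
        for link in entry.findall("atom:link", NS)
        if link.get("title") == "pdf"
    ]
    return {
        "id": entry.findtext("atom:id", default="", namespaces=NS),
        "title": entry.findtext("atom:title", default="", namespaces=NS),
        "published": entry.findtext("atom:published", default="", namespaces=NS),
        "updated": entry.findtext("atom:updated", default="", namespaces=NS),
        "authors": authors,
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "pdf_url": pdf_links[0] if pdf_links else None,
    }


details = parse_entry(SAMPLE_ENTRY)
```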

build_advanced_query - Query construction helper

  • Combines multiple search criteria
  • Supports title, author, category, and abstract searches
  • Returns properly formatted query strings
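To make the combination step concrete, here is one way such a helper might join its arguments into an arXiv query string. This is a sketch under assumptions, not the server's implementation: `combine_fields` is a hypothetical name, and joining the clauses with `AND` is an assumption about the intended behavior.

```python
def combine_fields(title_keywords=None, author_name=None,
                   category=None, abstract_keywords=None):
    """Join the provided criteria into one arXiv query string (hypothetical helper)."""

    def quoted(prefix, value):
        # Quote the value so multi-word phrases stay together.
        return f'{prefix}:"{value}"'

    clauses = []
    if title_keywords:
        clauses.append(quoted("ti", title_keywords))
    if author_name:
        clauses.append(quoted("au", author_name))
    if category:
        # Category codes such as cs.LG contain no spaces, so no quoting is needed.
        clauses.append(f"cat:{category}")
    if abstract_keywords:
        clauses.append(quoted("abs", abstract_keywords))
    if not clauses:
        raise ValueError("at least one search criterion is required")
    return " AND ".join(clauses)


query = combine_fields(
    title_keywords="few-shot learning",
    author_name="Tom Brown",
    category="cs.LG",
)
print(query)  # ti:"few-shot learning" AND au:"Tom Brown" AND cat:cs.LG
```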

get_arxiv_categories - Category reference

  • Complete list of arXiv subject categories
  • Descriptions for each category
  • Helpful for constructing targeted searches

โš ๏ธ Limited Functions (Workarounds Available)

search_by_author - Use search_papers('au:"Author Name"') instead search_by_category - Use search_papers('cat:category_code') instead

🔧 Functions Needing Fixes

download_paper_pdf - HTTP redirect handling needs improvement

  • Currently fails due to HTTPS/HTTP redirect issues
  • PDFs can be accessed directly via the links provided in search results
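Until the built-in download is fixed, a PDF can be fetched outside the server. The sketch below is a standalone workaround, not the project's fix: `pdf_url` and `download_pdf` are hypothetical helpers, the `https://arxiv.org/pdf/<id>` URL pattern and the User-Agent string are assumptions, and the code relies on `urllib.request.urlopen` following HTTP redirects by default.

```python
import shutil
import urllib.request


def pdf_url(arxiv_id):
    """Build the direct PDF URL for an arXiv ID (assumed URL pattern)."""
    return f"https://arxiv.org/pdf/{arxiv_id}"


def download_pdf(arxiv_id, save_path, timeout=30.0):
    """Download a paper's PDF to save_path and return the path.

    urlopen follows redirects automatically, so no manual handling
    of 301/302 responses is needed here.
    """
    request = urllib.request.Request(
        pdf_url(arxiv_id),
        headers={"User-Agent": "arxiv-mcp-example/0.1"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(save_path, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk without loading it all
    return save_path


# Example call (requires network access, so it is left commented out):
# download_pdf("1706.03762", "attention.pdf")
```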

โš™๏ธ Configuration

Claude Desktop Setup

<details> <summary><strong>Configuration Instructions</strong></summary>

For Local Installation:

Add to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/arxiv-mcp",
        "run",
        "main.py"
      ]
    }
  }
}

For Docker Installation:

{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "ghcr.io/tejas242/arxiv-mcp:latest"
      ]
    }
  }
}
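If the config file already lists other servers, the arxiv-mcp entry has to be merged in rather than pasted over the file. The following sketch shows one way to script that merge. `add_mcp_server` is a hypothetical helper, and the demo runs against a temporary file so no real configuration is touched.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or update one entry under "mcpServers", keeping existing entries."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = {
        "command": command,
        "args": list(args),
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(
        '{"mcpServers": {"other": {"command": "x", "args": []}}}',
        encoding="utf-8",
    )
    merged = add_mcp_server(
        demo_path,
        "arxiv-mcp",
        "docker",
        ["run", "--rm", "-i", "ghcr.io/tejas242/arxiv-mcp:latest"],
    )
    written = json.loads(demo_path.read_text(encoding="utf-8"))
```

To use it for real, pass the config path for your platform listed above, and restart Claude Desktop afterwards.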

</details>

VS Code MCP Extension

<details> <summary><strong>VS Code Configuration</strong></summary>

{
  "mcp": {
    "servers": {
      "arxiv-mcp": {
        "command": "uv",
        "args": ["--directory", "/path/to/arxiv-mcp", "run", "main.py"]
      }
    }
  }
}

</details>

💡 Usage Examples

Core Search Operations

# Search for papers about transformers
search_papers("transformer architecture")

# Advanced query with specific fields
search_papers('ti:"attention mechanism" AND cat:cs.LG')

# Author-specific search (recommended approach)
search_papers('au:"Geoffrey Hinton"')

# Category browsing (recommended approach)
search_papers('cat:cs.AI')

Research Workflow

# 1. Find the famous "Attention" paper
search_papers('ti:"Attention Is All You Need"')
get_paper_details("1706.03762")

# 2. Explore related work
search_papers("transformer neural networks")

# 3. Build complex queries
query = build_advanced_query(
    title_keywords="few-shot learning",
    author_name="Tom Brown",
    category="cs.LG"
)
search_papers(query)

📊 arXiv Categories Reference

<details> <summary><strong>Popular Categories</strong></summary>

| Code | Description | Example Topics |
| --- | --- | --- |
| cs.AI | Artificial Intelligence | Machine learning, neural networks, AI theory |
| cs.LG | Machine Learning | Deep learning, reinforcement learning, statistical learning |
| cs.CV | Computer Vision | Image processing, object detection, visual recognition |
| cs.CL | Computation and Language | NLP, language models, text processing |
| cs.CR | Cryptography and Security | Security protocols, encryption, privacy |
| stat.ML | Machine Learning (Statistics) | Statistical learning theory, Bayesian methods |
| physics.gen-ph | General Physics | Theoretical physics, quantum mechanics |
| math.NA | Numerical Analysis | Computational mathematics, algorithms |
| q-bio.NC | Neurons and Cognition (Quantitative Biology) | Neuroscience, computational biology |

</details>

Use get_arxiv_categories() for the complete list of available categories.

🧪 Testing Results

Based on comprehensive testing of all functions:

✅ Reliable Functions

  • Paper search with keywords, authors, categories: 100% success rate
  • Paper detail retrieval: Complete metadata extraction working
  • Query construction: All syntax combinations supported
  • Category listing: All arXiv categories accessible

โš ๏ธ Alternative Approaches Recommended

  • Author search: Use search_papers('au:"Author Name"') instead of search_by_author()
  • Category browsing: Use search_papers('cat:category') instead of search_by_category()

🔧 Known Issues

  • PDF downloads: Redirect handling needs improvement (PDFs accessible via direct links)

🔧 Development

Project Structure

arxiv-mcp/
├── src/arxiv_mcp/          # Main package
│   ├── server.py           # MCP server implementation
│   ├── arxiv_client.py     # arXiv API wrapper
│   ├── models.py           # Pydantic data models
│   └── utils.py            # Helper functions
├── tests/                  # Test suite
├── main.py                 # Entry point
└── pyproject.toml          # Project config

Running Tests

uv run pytest tests/ -v

Debug Mode

# Enable detailed logging
PYTHONPATH=src uv run python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
from arxiv_mcp.server import main
main()
"

โš ๏ธ Troubleshooting

<details> <summary><strong>Common Issues & Solutions</strong></summary>

Server Not Detected

  • ✅ Verify absolute paths in MCP config
  • ✅ Test server runs: uv run main.py
  • ✅ Restart Claude Desktop after config changes

Search Issues

  • ✅ Use arXiv query syntax (see examples above)
  • ✅ Check category names: get_arxiv_categories()
  • ✅ Try broader search terms
  • ✅ Use search_papers() instead of specific search functions

PDF Download Failures

  • ✅ Access PDFs via links in search results
  • ✅ Check internet connection
  • ✅ Verify arXiv ID format (e.g., "1706.03762")
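
The ID format check can be automated before calling the server. Below is a minimal sketch, assuming the two public arXiv identifier schemes: new-style `YYMM.NNNNN` (four or five digits after the dot) and old-style `archive/YYMMNNN`, each with an optional version suffix. `is_valid_arxiv_id` is a hypothetical helper and checks only the shape of the ID, not whether the paper exists.

```python
import re

# New-style IDs, e.g. 1706.03762 or 1706.03762v5
NEW_STYLE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")
# Old-style IDs, e.g. hep-th/9901001 or math.GT/0309136
OLD_STYLE = re.compile(r"[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?")


def is_valid_arxiv_id(arxiv_id):
    """Return True if the string has the shape of an arXiv identifier."""
    return bool(NEW_STYLE.fullmatch(arxiv_id) or OLD_STYLE.fullmatch(arxiv_id))


for candidate in ["1706.03762", "1706.03762v5", "hep-th/9901001", "arxiv:1706"]:
    print(candidate, is_valid_arxiv_id(candidate))
```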

</details>

๐Ÿ™ Acknowledgments


<div align="center">

Made with ⚡ by screenager

</div>
