LitSynth MCP Server
Enables intelligent academic paper discovery through ArXiv search and AI-powered semantic similarity analysis. Helps researchers find and rank the most relevant papers using natural language queries and state-of-the-art sentence transformers.
<p align="center"> <picture style="display:block; margin: 0 auto; width: 400px;"> <source srcset="public/logo-dark.png" media="(prefers-color-scheme: dark)"> <img src="public/logo-light.png" alt="AI Research Assistant Logo" style="display:block; margin: 0 auto"> </picture> </p>
A Model Context Protocol (MCP) server for intelligent academic paper discovery and semantic search using ArXiv. This server provides tools for searching academic papers and performing semantic similarity analysis using state-of-the-art sentence transformers.
Features
- ArXiv Search: Query the ArXiv database with automatic URL encoding for complex search terms
- Semantic Search: Find papers most relevant to your research using AI-powered semantic similarity
- Robust Error Handling: Graceful handling of network issues and malformed data
- Flexible Input: Support for various query formats including spaces and special characters
Tools Available
1. greet(name: str)
Simple greeting function for testing server connectivity.
Parameters:
name: String - Name to greet
Returns: Greeting message
2. search_query_arxiv(query: str, max_results: int = 5)
Search the ArXiv database for academic papers matching your query.
Parameters:
query: String - Search terms (automatically URL encoded)
max_results: Integer - Maximum number of results to return (default: 5)
Returns: Structured response with papers including:
- Title
- Authors
- Summary/Abstract
- ArXiv link
- Status message
Example:
search_query_arxiv("multimodal agents", 3)
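To make the returned fields concrete, here is a minimal sketch of how an ArXiv Atom feed entry could be mapped onto them. It is not the server's code: the dependency list suggests the server parses feeds with feedparser, while this sketch uses the standard library's xml.etree as a stand-in. The function name parse_arxiv_feed and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # namespace of Atom feed elements


def parse_arxiv_feed(xml_text):
    """Map each Atom <entry> to a dict with title, author, summary, link.

    Hypothetical helper: field names follow the response format in this
    README, but the real server may build them differently.
    """
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall(f"{ATOM}entry"):
        papers.append({
            # findtext falls back to "" when an element is missing;
            # split/join collapses line breaks inside titles and abstracts
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "author": [
                a.findtext(f"{ATOM}name", default="")
                for a in entry.findall(f"{ATOM}author")
            ],
            "summary": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
            "link": entry.findtext(f"{ATOM}id", default=""),
        })
    return papers


# Made-up feed for illustration only; not a real paper.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>A Placeholder
      Title</title>
    <summary>Placeholder abstract.</summary>
    <author><name>Author One</name></author>
    <author><name>Author Two</name></author>
  </entry>
</feed>"""

papers = parse_arxiv_feed(SAMPLE_FEED)
```

Each dict in papers carries the four per-paper fields listed above; the status message would be attached by whatever wraps this step.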
3. search_semantic_arxiv(query: str, papers: list, top_k: int = 5)
Perform semantic search on a list of papers to find the most relevant ones.
Parameters:
query: String - Research query for semantic matching
papers: List - Papers to search through (from search_query_arxiv or a manual list)
top_k: Integer - Number of most relevant papers to return (default: 5)
Returns: Ranked papers with similarity scores including:
- Title
- Summary
- Authors
- ArXiv link
- Similarity score (0-1)
Example:
papers = search_query_arxiv("machine learning")
relevant = search_semantic_arxiv("deep reinforcement learning", papers, 3)
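The ranking step can be sketched independently of any particular model: embed the query, embed each paper, score by cosine similarity, sort, and keep the top results. The sketch below assumes papers are dicts with title and summary keys. The names rank_papers and toy_embed are hypothetical, and toy_embed is a word-count stand-in for a real sentence embedding model, used only so the example runs without downloads.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_papers(query, papers, embed, top_k=5):
    """Return the top_k papers most similar to the query, best first.

    `embed` is any callable mapping text to a vector; this is an assumption
    made to keep the sketch model-agnostic.
    """
    query_vec = embed(query)
    scored = []
    for paper in papers:
        text = f"{paper.get('title', '')} {paper.get('summary', '')}"
        # Copy the paper and attach its score rather than mutating the input
        scored.append({**paper, "similarity_score": cosine(query_vec, embed(text))})
    scored.sort(key=lambda p: p["similarity_score"], reverse=True)
    return scored[:top_k]


def toy_embed(text):
    """Stand-in embedder: counts of a few fixed words. Not a real model."""
    vocab = ["reinforcement", "learning", "graph", "vision"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


candidates = [
    {"title": "Graph methods", "summary": "graph graph"},
    {"title": "Deep reinforcement learning", "summary": "reinforcement learning agents"},
]
top = rank_papers("deep reinforcement learning", candidates, toy_embed, top_k=1)
```

With real embeddings, cosine similarity can in principle be slightly negative for unrelated texts, so a server reporting scores in the 0-1 range may clamp or rescale them; that detail is not confirmed by this README.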
Installation
Prerequisites
- Python 3.8+
- pip
Setup
- Clone or download the project files
- Install dependencies:
pip install -r requirements.txt
- Run the MCP server:
python my_server.py
Dependencies
The project requires the following packages (see requirements.txt):
- fastmcp>=0.1.0 - MCP framework
- feedparser>=6.0.10 - RSS/Atom feed parsing for ArXiv API
- requests>=2.31.0 - HTTP requests
- sentence-transformers>=2.2.2 - Semantic search and embeddings
- torch>=2.0.0 - PyTorch for neural networks
- transformers>=4.21.0 - Hugging Face transformers
- numpy>=1.21.0 - Numerical computing
Project Structure
ai-research-assistant/
├── my_server.py # Main MCP server implementation
├── requirements.txt # Python dependencies
└── README.md # This file
Usage Examples
Basic ArXiv Search
Search for papers on a specific topic:
# Search for papers about transformers
results = search_query_arxiv("attention mechanisms transformers", 5)
Semantic Paper Discovery
Find the most relevant papers from a search result:
# First, get papers on a broad topic
papers = search_query_arxiv("artificial intelligence", 20)
# Then find the most relevant ones for your specific research
relevant_papers = search_semantic_arxiv("graph neural networks", papers, 5)
Handling Complex Queries
The server automatically handles special characters and spaces:
# These work automatically without manual encoding
search_query_arxiv("machine learning & deep learning: survey")
search_query_arxiv("reinforcement learning (RL) applications")
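For readers curious what "automatic encoding" might involve, here is a minimal sketch that builds an ArXiv API request URL with the standard library. The endpoint is the public ArXiv API query URL; the function name build_arxiv_url and the use of the all: field prefix are assumptions for illustration, not necessarily what the server does.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(query, max_results=5):
    """Build a query URL with spaces and special characters percent-encoded.

    Hypothetical helper; the `all:` prefix (search all fields) is an assumption.
    """
    params = {
        "search_query": f"all:{query}",
        "start": 0,
        "max_results": max_results,
    }
    # urlencode applies quote_plus to each value: spaces become "+",
    # and characters such as "&", ":", "(" and ")" become %XX escapes
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_arxiv_url("machine learning & deep learning: survey")
```

Encoding the whole query as a single parameter value is what keeps the "&" in the search terms from being read as a parameter separator.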
Technical Details
Semantic Search Model
The server uses the sentence-transformers/all-MiniLM-L6-v2 model for semantic embeddings. This model:
- Provides 384-dimensional sentence embeddings
- Balances speed and accuracy
- Works well for academic text similarity
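As a rough sketch of how this model can be used for scoring, the snippet below embeds a query and a list of papers with the sentence-transformers library and compares them with cosine similarity. The helper names paper_text and score_papers, and the choice to embed title plus summary, are assumptions rather than the server's actual implementation.

```python
MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


def paper_text(paper):
    """Text to embed for one paper: title and summary joined (an assumption)."""
    parts = (paper.get("title", ""), paper.get("summary", ""))
    return " ".join(part for part in parts if part)


def score_papers(query, papers, batch_size=32):
    """Return one cosine similarity score per paper, in input order."""
    if not papers:
        return []
    # Imported lazily: loading the library and model is slow, and the
    # first call downloads the model weights.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer(MODEL_NAME)
    query_vec = model.encode(query, convert_to_tensor=True)
    paper_vecs = model.encode(
        [paper_text(p) for p in papers],
        batch_size=batch_size,  # lower this if memory is tight
        convert_to_tensor=True,
    )
    # cos_sim returns a 1 x len(papers) matrix; take its only row
    return util.cos_sim(query_vec, paper_vecs)[0].tolist()
```

A long-running server would normally load the model once at startup rather than on every call; it is loaded inside the function here only to keep the sketch self-contained.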
Error Handling
The server includes comprehensive error handling:
- URL Encoding: Automatic handling of spaces and special characters
- Network Errors: Graceful degradation when ArXiv is unavailable
- Data Validation: Safe handling of missing or malformed paper data
- Empty Results: Informative messages when no papers are found
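One way to get this kind of graceful degradation is to wrap the fetch step so that network failures and empty results both come back as a normal response with an explanatory message. The sketch below is an illustration under assumptions: it uses urllib from the standard library instead of requests, and the names fetch_feed, safe_search and offline_fetch, as well as the message strings, are hypothetical.

```python
from urllib.request import urlopen


def fetch_feed(url, timeout=10.0):
    """Download the feed body as text; raises OSError subclasses on failure."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def safe_search(url, parse, fetch=fetch_feed):
    """Fetch and parse, always returning {"message": ..., "results": [...]}.

    `parse` turns the raw feed text into a list of paper dicts; `fetch` is
    injectable so the function can be exercised without network access.
    """
    try:
        raw = fetch(url)
    except OSError as exc:
        # urllib's URLError and socket timeouts are both OSError subclasses
        return {"message": f"ArXiv request failed: {exc}", "results": []}
    papers = parse(raw)
    if not papers:
        return {"message": "No papers found", "results": []}
    return {"message": f"Found {len(papers)} paper(s)", "results": papers}


def offline_fetch(url):
    """Test stub that simulates ArXiv being unreachable."""
    raise OSError("network unreachable")


response = safe_search("http://export.arxiv.org/api/query", lambda raw: [], fetch=offline_fetch)
```

Returning the same shape in every case means callers never need a separate code path for failures; they can always read the message and iterate over results.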
Response Format
All functions return structured responses:
{
"message": "Status or info message",
"results": [
{
"title": "Paper Title",
"author": ["Author 1", "Author 2"],
"summary": "Abstract text...",
"link": "https://arxiv.org/abs/...",
"similarity_score": 0.85 // Only in semantic search
}
]
}
Troubleshooting
Common Issues
"URL can't contain control characters" error:
- This is fixed in the current version with automatic URL encoding
- Make sure you're using the latest version of the server
"No papers found" result:
- Check your query spelling
- Try broader search terms
- Verify ArXiv service availability
Slow semantic search:
- First run downloads the transformer model (~90MB)
- Subsequent runs are much faster
- Consider reducing top_k for faster results
Memory issues:
- The sentence transformer model requires ~500MB RAM
- Reduce batch sizes if experiencing memory problems
Contributing
Feel free to submit issues, feature requests, or pull requests to improve the AI Research Assistant.
License
This project is open source. Please check individual dependency licenses for commercial use.
Acknowledgments
- ArXiv for providing free access to academic papers
- Sentence Transformers for semantic search capabilities
- FastMCP for the MCP server framework