Google Scholar MCP Server

Enables academic research through Google Scholar by searching for papers, finding author publications, discovering recent research, and identifying highly cited works through web scraping with natural language queries.

🔬 Google Scholar MCP Server

A Model Context Protocol (MCP) server that provides access to Google Scholar for academic research through web scraping. This server enables you to search for papers, find author publications, discover recent research, and identify highly cited works.

✨ Features

🔍 Paper Search: Search Google Scholar for academic papers with flexible filtering
👨‍🔬 Author Research: Find papers by specific authors
📅 Recent Papers: Discover recent publications in any field
🏆 Highly Cited Papers: Find influential papers with citation filtering
⏱️ Rate Limiting: Respectful scraping with built-in delays
🛡️ Error Handling: Robust error handling and logging
🌐 Local Web Interface: Optional Flask web interface for testing
🧠 Smart Query Processing: Natural language query processing with AI integration

🚀 Quick Start

Prerequisites

Python 3.8 or higher
pip (Python package manager)

Installation

Clone the repository

```
git clone https://github.com/yourusername/google-scholar-mcp.git
cd google-scholar-mcp
```

Install dependencies
```
pip install -r requirements.txt
```

Optional: Set up environment variables

```
cp env.example .env
# Edit .env with your preferred settings
```

Running the MCP Server

Run the MCP server for use with MCP clients:

```
python main.py
```

Testing with Local Web Interface

For testing and development, you can run the local web interface:

```
python local_server.py
```

Then open your browser to http://localhost:5000

🔧 Configuration

The server can be configured through environment variables. Copy env.example to .env and modify as needed:

```
# Request delay between Google Scholar requests (seconds)
REQUEST_DELAY=5

# Maximum results per request
MAX_RESULTS_PER_REQUEST=20

# HTTP timeout (seconds)
TIMEOUT=15
```
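
As a rough illustration of how these settings could be consumed, here is a minimal sketch that reads the three variables from the process environment and falls back to the sample values shown above. `ScholarConfig` and `load_config` are hypothetical names, not part of this project, and the sketch assumes something else has already loaded `.env` into the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ScholarConfig:
    """Hypothetical settings container; defaults mirror the sample values above."""
    request_delay: float = 5.0  # seconds between Google Scholar requests
    max_results: int = 20       # maximum results per request
    timeout: float = 15.0       # HTTP timeout in seconds


def load_config(env=None) -> ScholarConfig:
    """Read settings from a mapping (os.environ by default), using defaults for missing keys."""
    env = os.environ if env is None else env
    return ScholarConfig(
        request_delay=float(env.get("REQUEST_DELAY", 5)),
        max_results=int(env.get("MAX_RESULTS_PER_REQUEST", 20)),
        timeout=float(env.get("TIMEOUT", 15)),
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.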

Available Tools

1. search_papers

Search for academic papers on Google Scholar.

Parameters:

query (required): Search query for papers
num_results (optional): Number of results to return (1-20, default: 10)
start_year (optional): Earliest publication year to include
end_year (optional): Latest publication year to include

Example:

```
{
  "query": "machine learning neural networks",
  "num_results": 15,
  "start_year": 2020,
  "end_year": 2024
}
```
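
To make the parameters more concrete, the sketch below shows one way they might translate into a Google Scholar search URL. It is not taken from this server's code: `build_search_url` is a hypothetical helper, and the use of the `q`, `num`, `as_ylo` and `as_yhi` query parameters is an assumption about how the request could be formed.

```python
from typing import Optional
from urllib.parse import urlencode

SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def build_search_url(query: str, num_results: int = 10,
                     start_year: Optional[int] = None,
                     end_year: Optional[int] = None) -> str:
    """Hypothetical helper: map search_papers arguments onto URL query parameters."""
    if not query.strip():
        raise ValueError("query is required")
    if start_year is not None and end_year is not None and start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    # Clamp to the documented 1-20 range.
    params = {"q": query, "num": max(1, min(num_results, 20))}
    if start_year is not None:
        params["as_ylo"] = start_year  # assumed lower year bound parameter
    if end_year is not None:
        params["as_yhi"] = end_year    # assumed upper year bound parameter
    return f"{SCHOLAR_SEARCH_URL}?{urlencode(params)}"
```

Clamping `num_results` rather than rejecting out-of-range values is a design choice of this sketch; the server itself may validate differently.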

2. get_author_papers

Search for papers by a specific author.

Parameters:

author_name (required): Name of the author to search for
num_results (optional): Number of results to return (default: 10)

Example:

```
{
  "author_name": "Geoffrey Hinton",
  "num_results": 20
}
```
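
One plausible way to turn `author_name` into a search is Google Scholar's `author:` query operator. The sketch below only builds the query string. `build_author_query` is a hypothetical name, and whether this server uses the operator at all is an assumption.

```python
def build_author_query(author_name: str) -> str:
    """Hypothetical helper: wrap a name in Scholar's author: operator."""
    # Drop stray double quotes and collapse whitespace so the phrase stays well-formed.
    name = " ".join(author_name.replace('"', " ").split())
    if not name:
        raise ValueError("author_name is required")
    return f'author:"{name}"'
```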

3. search_recent_papers

Search for recent papers in a specific field.

Parameters:

field (required): Research field or topic
years_back (optional): How many years back to search (1-10, default: 2)
num_results (optional): Number of results to return (default: 10)

Example:

```
{
  "field": "quantum computing",
  "years_back": 3,
  "num_results": 15
}
```
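
The `years_back` value presumably becomes a year range before the search runs. A minimal sketch of that arithmetic follows. `recent_year_range` is a hypothetical helper, and treating the range as inclusive of the current year is an assumption.

```python
from datetime import date
from typing import Optional, Tuple


def recent_year_range(years_back: int = 2,
                      today: Optional[date] = None) -> Tuple[int, int]:
    """Hypothetical helper: return (start_year, end_year) for a look-back window."""
    if not 1 <= years_back <= 10:
        raise ValueError("years_back must be between 1 and 10")
    current_year = (today or date.today()).year
    return current_year - years_back, current_year
```

The optional `today` argument exists only so the calculation can be tested against a fixed date.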

4. get_highly_cited_papers

Search for highly cited papers in a topic.

Parameters:

topic (required): Research topic or field
min_citations (optional): Minimum number of citations (default: 100)
num_results (optional): Number of results to return (default: 10)

Example:

```
{
  "topic": "transformer neural networks",
  "min_citations": 500,
  "num_results": 10
}
```
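
A citation threshold like this can be applied to already-parsed results on the client side. The sketch below assumes each paper is a dict with an optional `cited_by` count, as in the response format described in this README. `filter_highly_cited` is a hypothetical name, and sorting by citation count is a choice of this sketch rather than documented behavior.

```python
from typing import Any, Dict, List


def filter_highly_cited(papers: List[Dict[str, Any]],
                        min_citations: int = 100,
                        num_results: int = 10) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep papers at or above the threshold, most cited first."""
    def citations(paper: Dict[str, Any]) -> int:
        # A missing or null cited_by is treated as zero citations.
        return paper.get("cited_by") or 0

    qualifying = [p for p in papers if citations(p) >= min_citations]
    qualifying.sort(key=citations, reverse=True)
    return qualifying[:num_results]
```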

Response Format

Each tool returns a JSON response with paper information including:

title: Paper title
authors: Author names
url: Link to the paper
year: Publication year
snippet: Paper abstract/description snippet
cited_by: Number of citations (when available)
pdf_url: Direct PDF link (when available)
publication_info: Journal/conference information

Rate Limiting and Ethics

This server implements respectful scraping practices:

Delays between requests, set by REQUEST_DELAY (5 seconds in the sample configuration)
Proper User-Agent headers
Error handling for rate limits
Designed for research and educational purposes
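
The delay between requests can be pictured as a small limiter that sleeps until a minimum interval has passed since the previous call. This is a sketch of the general technique, not this server's implementation. `RateLimiter` is a hypothetical class, and the injectable `clock` and `sleep` arguments are there only to make it testable.

```python
import time


class RateLimiter:
    """Hypothetical limiter enforcing a minimum interval between calls."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = None  # no request has been made yet

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        slept = 0.0
        if self._last_request is not None:
            remaining = self.min_interval - (self._clock() - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last_request = self._clock()
        return slept
```

A caller would invoke `limiter.wait()` immediately before each outgoing request, for example with `RateLimiter(min_interval=5.0)`.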

🔍 MCP Client Integration

Claude Desktop

Add this to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

```
{
  "mcpServers": {
    "google-scholar": {
      "command": "python",
      "args": ["/path/to/google-scholar-mcp/main.py"],
      "cwd": "/path/to/google-scholar-mcp"
    }
  }
}
```
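
If the configuration file already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below does that merge on an in-memory dict. `add_scholar_server` is a hypothetical helper, and reading and writing the JSON file is left to the caller.

```python
from pathlib import Path
from typing import Any, Dict


def add_scholar_server(config: Dict[str, Any], repo_path: str,
                       python_cmd: str = "python") -> Dict[str, Any]:
    """Hypothetical helper: return a copy of config with the google-scholar entry added."""
    repo = Path(repo_path)
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep any existing servers
    servers["google-scholar"] = {
        "command": python_cmd,
        "args": [str(repo / "main.py")],
        "cwd": str(repo),
    }
    updated["mcpServers"] = servers
    return updated
```

The result can be serialized with `json.dumps(updated, indent=2)` and written back to the configuration file.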

Other MCP Clients

The server follows the Model Context Protocol and should work with any MCP-compatible client.

🧠 Smart Query Processing

The server includes intelligent query processing that can understand natural language requests:

```
# Example natural language queries:
"Find recent computer vision papers from CVPR 2023"
"Show me highly cited papers by Geoffrey Hinton"
"What are the latest developments in quantum computing?"
```
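
This README does not describe how the query processor works internally. Purely as an illustration of the idea of routing a natural language request to one of the four tools, here is a keyword-based stand-in. `route_query`, its keyword lists, and its rule priority are all assumptions, and it makes no attempt to reproduce the project's AI integration.

```python
import re


def route_query(text: str) -> str:
    """Hypothetical router: pick a tool name from simple keyword rules."""
    lowered = text.lower()
    # Rules are checked in order, so citation wording wins over author wording.
    if re.search(r"\b(highly cited|most cited|influential|seminal)\b", lowered):
        return "get_highly_cited_papers"
    if re.search(r"\b(recent|latest|newest)\b", lowered):
        return "search_recent_papers"
    if re.search(r"\b(papers|publications|work) by\b", lowered):
        return "get_author_papers"
    return "search_papers"
```

With these rules, a request that mentions both citations and an author is routed to the citation tool. A real processor would also need to extract the author name, topic, and years from the text.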

📊 Example Response

Each paper in a response is a JSON object with the fields listed under Response Format:

```
{
  "title": "Paper Title",
  "authors": "Author Names",
  "url": "Link to paper",
  "year": 2023,
  "snippet": "Abstract excerpt...",
  "cited_by": 150,
  "pdf_url": "Direct PDF link",
  "publication_info": "Journal/Conference"
}
```
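
On the client side, a response entry like this can be loaded into a typed structure. The sketch below is one possible shape. `Paper` is a hypothetical class, `cited_by` and `pdf_url` are optional because they are returned only when available, and treating `year` as optional too is an assumption.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class Paper:
    """Hypothetical client-side view of one paper entry."""
    title: str
    authors: str
    url: str
    year: Optional[int] = None
    snippet: str = ""
    cited_by: Optional[int] = None   # present only when available
    pdf_url: Optional[str] = None    # present only when available
    publication_info: str = ""

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Paper":
        """Build a Paper from a response dict, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```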

⚖️ Legal and Ethical Considerations

🎓 Educational Use: This tool is intended for research and educational purposes
📜 Terms of Service: Respect Google Scholar's terms of service
🤝 Responsible Use: Use responsibly and avoid excessive requests
🔌 Official APIs: Consider using official APIs when available
📚 Copyright: Be mindful of copyright and fair use policies

🔧 Troubleshooting

Common Issues

Rate Limiting: If you get blocked, wait and reduce request frequency
Network Errors: Check your internet connection
Parsing Errors: Google Scholar may have changed its HTML structure
Import Errors: Make sure all dependencies are installed
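
For the rate limiting case, "wait and reduce request frequency" is often implemented as retries with exponential backoff. The following is a sketch of that technique under stated assumptions. `RateLimitedError` and `retry_with_backoff` are hypothetical names, and the server may handle blocks differently.

```python
import time


class RateLimitedError(Exception):
    """Hypothetical error raised when a request appears to be blocked."""


def retry_with_backoff(fetch, max_attempts: int = 4, base_delay: float = 5.0,
                       sleep=time.sleep):
    """Call fetch(), doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # base, 2x base, 4x base, ...
```

Only `RateLimitedError` triggers a retry here, so network or parsing failures still surface immediately.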

Debug Mode

Enable debug logging by setting DEBUG=true in your .env file.
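
A flag like this typically just raises the logging level. Here is a minimal sketch using the standard `logging` module. `configure_logging` is a hypothetical function, and accepting values other than `true` (such as `1` or `yes`) is an assumption of the sketch.

```python
import logging
import os


def configure_logging(env=None) -> int:
    """Hypothetical setup: use DEBUG level when DEBUG is truthy, else INFO."""
    env = os.environ if env is None else env
    debug = env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes", "on"}
    level = logging.DEBUG if debug else logging.INFO
    # basicConfig only configures the root logger if it has no handlers yet.
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return level
```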

Logging

The server includes detailed logging. Check the console output for error messages and debugging information.

📦 Dependencies

mcp: Model Context Protocol library
requests: HTTP library for web scraping
beautifulsoup4: HTML parsing
lxml: XML/HTML parser
urllib3: HTTP client
flask: Web interface (optional)

🤝 Contributing

Contributions are welcome! Please ensure:

Respectful scraping practices
Error handling for edge cases
Clear documentation
Testing with various queries
Consistency with the existing code style

Development Setup

```
# Clone the repository
git clone https://github.com/yourusername/google-scholar-mcp.git
cd google-scholar-mcp

# Install dependencies
pip install -r requirements.txt

# Run tests
python test_server.py
python test_query_processor.py

# Run local development server
python local_server.py
```

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built on the Model Context Protocol by Anthropic
Inspired by the need for accessible academic research tools
Thanks to the open-source community for the excellent libraries used

⚠️ Disclaimer

This tool is for educational and research purposes. Please respect Google Scholar's terms of service and use responsibly. The authors are not responsible for any misuse of this tool.
