Research Paper Ingestion MCP Server

Enables searching, downloading, and analyzing academic papers from arXiv and Semantic Scholar to extract key insights and citation metrics. It facilitates autonomous knowledge acquisition by processing research findings and integrating them into persistent AI memory systems.

Autonomous knowledge acquisition from academic research papers for AGI self-improvement.

Part of the Agentic System - a 24/7 autonomous AI framework with persistent memory.

Features

Paper Discovery

arXiv Integration: Search and download from arXiv.org
Semantic Scholar: Citation analysis and academic impact metrics
PDF Download: Automatic paper retrieval and storage

Knowledge Extraction

Insight Extraction: Identify key findings and contributions
Citation Analysis: Understand paper influence and relationships
Technique Identification: Extract novel methods and approaches

Memory Integration

Enhanced Memory: Store extracted knowledge for AGI learning
Structured Entities: Create searchable memory representations
Citation Graphs: Track knowledge lineage

Installation

cd "${AGENTIC_SYSTEM_PATH:-/opt/agentic}/agentic-system/mcp-servers/research-paper-mcp"
pip install -r requirements.txt

Configuration

Add to ~/.claude.json:

{
  "mcpServers": {
    "research-paper-mcp": {
      "command": "python3",
      "args": [
        "${AGENTIC_SYSTEM_PATH:-/opt/agentic}/agentic-system/mcp-servers/research-paper-mcp/server.py"
      ],
      "env": {},
      "disabled": false
    }
  }
}
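The `${AGENTIC_SYSTEM_PATH:-/opt/agentic}` form is shell parameter-expansion syntax, and this README does not say whether the client expands it inside JSON. As a minimal sketch, assuming it may not be expanded, the entry can be generated with the path already resolved. The helper name `build_server_entry` is hypothetical and not part of this server.

```python
import json
import os


def build_server_entry(env=None):
    """Return the mcpServers entry with the server path fully resolved."""
    env = os.environ if env is None else env
    # Mirror the shell default ${AGENTIC_SYSTEM_PATH:-/opt/agentic}:
    # fall back when the variable is unset OR empty.
    base = env.get("AGENTIC_SYSTEM_PATH") or "/opt/agentic"
    server = f"{base}/agentic-system/mcp-servers/research-paper-mcp/server.py"
    return {
        "mcpServers": {
            "research-paper-mcp": {
                "command": "python3",
                "args": [server],
                "env": {},
                "disabled": False,
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be merged into ~/.claude.json by hand.
    print(json.dumps(build_server_entry(), indent=2))
```

Printing rather than writing the file keeps the sketch from overwriting other entries already present in ~/.claude.json.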

Available Tools

search_arxiv

Search arXiv for research papers by query.

Parameters:

query (required): Search query (e.g., "recursive self-improvement AGI")
max_results: Maximum results (default: 10)
sort_by: Sort order - relevance, lastUpdatedDate, submittedDate

Example:

results = mcp__research-paper-mcp__search_arxiv({
    "query": "meta-learning neural networks",
    "max_results": 20,
    "sort_by": "relevance"
})
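The three sort_by values match the sortBy parameter of the public arXiv API. The sketch below shows the query URL that a search like the one above could translate to. It is an illustration only: the server may go through the arxiv Python wrapper listed under Dependencies instead, and the function name `build_arxiv_query_url`, the `all:` field prefix, the descending sort order, and the default of `relevance` are assumptions.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
VALID_SORTS = {"relevance", "lastUpdatedDate", "submittedDate"}


def build_arxiv_query_url(query, max_results=10, sort_by="relevance"):
    """Build an arXiv API query URL (hypothetical helper, not the server's code)."""
    if sort_by not in VALID_SORTS:
        raise ValueError(f"sort_by must be one of {sorted(VALID_SORTS)}")
    params = {
        "search_query": f"all:{query}",  # assumption: search all fields
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": "descending",  # assumption: best or newest first
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_query_url("meta-learning neural networks", 20, "relevance"))
```

Validating sort_by up front gives a clear error for a misspelled value instead of passing it on to the API.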

search_semantic_scholar

Search Semantic Scholar for papers with citation metrics.

Parameters:

query (required): Search query
fields: Metadata fields to retrieve
limit: Maximum results (default: 10)

Example:

results = mcp__research-paper-mcp__search_semantic_scholar({
    "query": "transformer architecture attention",
    "fields": ["title", "authors", "citationCount", "year"],
    "limit": 15
})
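Since aiohttp is listed as the HTTP client for Semantic Scholar, the tool presumably calls the HTTP API directly. As a hedged sketch, the parameters above could map onto a Semantic Scholar Graph API paper-search URL as follows. The helper name `build_s2_search_url` is hypothetical, and whether this server uses exactly this endpoint is an assumption.

```python
from urllib.parse import urlencode

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_s2_search_url(query, fields=None, limit=10):
    """Build a paper-search URL (hypothetical helper, not the server's code)."""
    params = {"query": query, "limit": limit}
    if fields:
        # The Graph API takes the requested fields as one comma-separated string.
        params["fields"] = ",".join(fields)
    return f"{S2_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_s2_search_url(
        "transformer architecture attention",
        ["title", "authors", "citationCount", "year"],
        15,
    ))
```

Leaving `fields` out of the URL when none are requested lets the API apply its own defaults.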

download_paper

Download research paper PDF from URL.

Parameters:

url (required): PDF URL
paper_id (required): Unique identifier for filename

Example:

result = mcp__research-paper-mcp__download_paper({
    "url": "https://arxiv.org/pdf/1234.5678.pdf",
    "paper_id": "arxiv-1234.5678"
})
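Because paper_id becomes the filename ({paper_id}.pdf, per the Storage section), an ID containing a slash, such as an old-style arXiv ID, would otherwise point outside the papers directory. The sketch below shows one way to derive a safe path. The helper name `pdf_path_for` and the sanitizing rule are assumptions, and the default directory is the fallback path from this README.

```python
import re
from pathlib import PurePosixPath

DEFAULT_PAPERS_DIR = "/opt/agentic/agentic-system/research-papers"


def pdf_path_for(paper_id, papers_dir=DEFAULT_PAPERS_DIR):
    """Map a paper ID to {papers_dir}/{paper_id}.pdf without leaving papers_dir."""
    # Replace every character outside a conservative whitelist, then drop
    # leading dots so the name can be neither hidden nor a ".." segment.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", paper_id).lstrip(".")
    if not safe:
        raise ValueError("paper_id has no usable characters")
    return PurePosixPath(papers_dir) / f"{safe}.pdf"


if __name__ == "__main__":
    print(pdf_path_for("arxiv-1234.5678"))
    print(pdf_path_for("hep-th/9901001"))
```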

extract_insights

Extract key insights and findings from paper text.

Parameters:

paper_text (required): Full paper text or abstract
focus_areas: Optional specific areas to focus on

Example:

insights = mcp__research-paper-mcp__extract_insights({
    "paper_text": paper_abstract,
    "focus_areas": ["methodology", "results"]
})
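This README does not describe how insights are extracted. Purely as an illustration of what a non-LLM extractor with focus_areas could look like, the sketch below keeps sentences that contain cue phrases for each area. The cue lists, the function name `extract_insights_sketch`, and the per-insight fields are all assumptions; only the top-level "insights" key mirrors the usage shown later in this README.

```python
import re

# Hypothetical cue phrases per focus area; not taken from the server.
CUES = {
    "methodology": ("we propose", "we introduce", "we present", "our method", "our approach"),
    "results": ("we show", "achieve", "outperform", "improve", "state-of-the-art"),
}


def extract_insights_sketch(paper_text, focus_areas=None):
    """Return sentences that match a cue phrase for any requested area."""
    areas = focus_areas or list(CUES)
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paper_text) if s.strip()]
    insights = []
    for sentence in sentences:
        lowered = sentence.lower()
        for area in areas:
            if any(cue in lowered for cue in CUES.get(area, ())):
                insights.append({"area": area, "text": sentence})
                break  # report each sentence once, under its first matching area
    return {"insights": insights}


if __name__ == "__main__":
    abstract = (
        "Deep models are popular. We propose a recursive optimizer. "
        "It achieves 95% accuracy on the benchmark."
    )
    for item in extract_insights_sketch(abstract)["insights"]:
        print(item["area"], "->", item["text"])
```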

analyze_citations

Analyze citation relationships and paper influence.

Parameters:

paper_id (required): Semantic Scholar or arXiv paper ID
depth: Citation graph depth 1-3 (default: 1)

Example:

analysis = mcp__research-paper-mcp__analyze_citations({
    "paper_id": "arxiv:1706.03762",  # "Attention Is All You Need"
    "depth": 2
})
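The depth parameter can be read as the number of citation hops to follow from the starting paper. A minimal sketch of that idea is a breadth-first traversal that stops expanding at the requested depth. The function name `citation_graph`, the `get_citing` callback, and the returned structure are hypothetical; the real tool's output format is not documented here.

```python
from collections import deque


def citation_graph(paper_id, get_citing, depth=1):
    """Collect papers within `depth` citation hops of paper_id.

    get_citing(pid) must return an iterable of IDs of papers citing pid.
    """
    if not 1 <= depth <= 3:
        raise ValueError("depth must be between 1 and 3")
    levels = {paper_id: 0}  # paper ID -> number of hops from the start
    edges = []              # (citing paper, cited paper) pairs
    queue = deque([paper_id])
    while queue:
        current = queue.popleft()
        if levels[current] == depth:
            continue  # do not expand beyond the requested depth
        for citing in get_citing(current):
            edges.append((citing, current))
            if citing not in levels:
                levels[citing] = levels[current] + 1
                queue.append(citing)
    return {"levels": levels, "edges": edges}


if __name__ == "__main__":
    # Toy in-memory data standing in for a citation API.
    cited_by = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
    print(citation_graph("A", lambda pid: cited_by.get(pid, []), depth=2))
```

Tracking the hop count per paper also keeps the traversal from revisiting papers when citation links form a cycle.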

store_paper_knowledge

Store extracted knowledge in enhanced-memory for AGI learning.

Parameters:

paper_metadata (required): Paper metadata dict
insights (required): List of key insights
techniques: List of novel techniques

Example:

stored = mcp__research-paper-mcp__store_paper_knowledge({
    "paper_metadata": {
        "id": "arxiv-1234.5678",
        "title": "Novel AGI Approach",
        "authors": ["Smith", "Jones"],
        "year": 2024
    },
    "insights": [
        "Achieves 95% accuracy on benchmark",
        "10x faster than previous methods"
    ],
    "techniques": [
        "Recursive meta-optimization",
        "Self-modifying architectures"
    ]
})
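Storage goes through the enhanced-memory-mcp create_entities call, whose schema is not given in this README. The sketch below shows one plausible way to flatten the arguments above into a single entity. The name/entityType/observations shape is an assumption modeled on common knowledge-graph memory servers, and `paper_to_entity` is a hypothetical helper.

```python
def paper_to_entity(paper_metadata, insights, techniques=None):
    """Flatten paper metadata, insights, and techniques into one entity dict."""
    observations = [
        f"Title: {paper_metadata['title']}",
        f"Authors: {', '.join(paper_metadata.get('authors', []))}",
        f"Year: {paper_metadata.get('year', 'unknown')}",
    ]
    observations += [f"Insight: {text}" for text in insights]
    observations += [f"Technique: {text}" for text in (techniques or [])]
    return {
        "name": f"paper:{paper_metadata['id']}",  # assumed naming scheme
        "entityType": "research_paper",           # assumed type label
        "observations": observations,
    }


if __name__ == "__main__":
    entity = paper_to_entity(
        {"id": "arxiv-1234.5678", "title": "Novel AGI Approach",
         "authors": ["Smith", "Jones"], "year": 2024},
        ["Achieves 95% accuracy on benchmark"],
        ["Recursive meta-optimization"],
    )
    print(entity)
```

Prefixing each observation with its kind keeps insights and techniques distinguishable after they are merged into one list.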

Usage Patterns

Autonomous Research Workflow

# 1. Search for relevant papers
arxiv_results = mcp__research-paper-mcp__search_arxiv({
    "query": "recursive self-improvement",
    "max_results": 10
})

# 2. Get citation metrics
for paper in arxiv_results['papers']:
    scholar_data = mcp__research-paper-mcp__search_semantic_scholar({
        "query": paper['title'],
        "fields": ["title", "citationCount"],
        "limit": 1
    })

    # 3. Download high-impact papers; skip titles with no Semantic Scholar match
    matches = scholar_data['papers']
    if matches and matches[0]['citationCount'] > 50:
        pdf = mcp__research-paper-mcp__download_paper({
            "url": paper['pdf_url'],
            "paper_id": paper['id']
        })

        # 4. Extract and store insights
        insights = mcp__research-paper-mcp__extract_insights({
            "paper_text": paper['abstract']
        })

        mcp__research-paper-mcp__store_paper_knowledge({
            "paper_metadata": paper,
            "insights": insights['insights']
        })

Citation Network Analysis

# Analyze citation influence
analysis = mcp__research-paper-mcp__analyze_citations({
    "paper_id": "influential-paper-id",
    "depth": 2
})

# Identify most influential papers in field
if analysis['citation_graph']['influential_citations'] > 100:
    # Download and study this foundational paper
    pass

Storage

Papers Directory: ${AGENTIC_SYSTEM_PATH:-/opt/agentic}/agentic-system/research-papers/
PDFs: Saved as {paper_id}.pdf
Memory Integration: Via enhanced-memory-mcp create_entities

Dependencies

arxiv: arXiv API Python wrapper
aiohttp: Async HTTP client for Semantic Scholar API
mcp: Model Context Protocol SDK

Future Enhancements

PDF Text Extraction: Parse full paper text from PDFs
Figure/Diagram Analysis: Extract visual insights
Code Repository Links: Find implementation code
Related Papers: Automatic discovery of connected research
Trend Detection: Identify emerging research directions
LLM-Powered Insight Extraction: Use GPT-4 for deeper analysis

Integration with AGI System

This MCP server addresses Gap #1 from AGI_GAP_ANALYSIS.md. Research paper ingestion is complete; the remaining items are still pending:

Knowledge Acquisition Infrastructure ✅

✓ Research Paper Ingestion (arXiv + Semantic Scholar)
⏳ Video Transcript Processing (separate MCP)
⏳ GitHub Repository Analysis (future)
⏳ Documentation Scraping (future)
⏳ Knowledge Graph Integration (future)

Impact: The system can now autonomously search, retrieve, and store findings from recent AI research on arXiv and Semantic Scholar.
