LLM Graph Builder MCP

Enables Claude to automatically extract entities and relationships from URLs, PDFs, and YouTube videos to build structured knowledge graphs in Neo4j. It supports custom schemas, academic citation extraction, and community detection for advanced research and content analysis.

Build knowledge graphs from any URL using Claude Desktop and Neo4j.

What is this?

This Model Context Protocol (MCP) server enables Claude to automatically extract entities and relationships from unstructured text and build knowledge graphs in Neo4j. Simply give Claude a URL (Wikipedia article, PDF, web page, YouTube video) and ask it to build a knowledge graph - it handles the rest.

Perfect for: Research, Zotero integrations, academic papers, content analysis, and building structured knowledge from unstructured sources.

What's Included

This repository is a complete, ready-to-use package containing:

  • llm_graph_builder_mcp/ - The MCP server code
  • llm-graph-builder/ - Neo4j's LLM Graph Builder backend (June 24, 2025, commit 4d7bb5e8)

Both are included so you get a tested, working version out of the box. Just clone once and you're ready to go!

Why include the backend?

  • Guaranteed compatibility - this MCP is tested with this exact backend version
  • Zero configuration headaches - everything just works together
  • If Neo4j updates their backend, you still have a working version

Features

  • Multi-source support: Wikipedia, PDFs, web pages, YouTube videos
  • Academic mode: Extract citations, authors, journals, and bibliographic data
  • Custom schemas: Define allowed entity types and relationships
  • Community detection: Find clusters and groups in your knowledge graph
  • Zero modifications: Works with unmodified llm-graph-builder backend
  • Local processing: Your data, your Neo4j instance, your control

Quick Start

Prerequisites

  1. Neo4j database - a free Neo4j AuraDB instance works
    • Create an instance and note your connection URI, username, and password
  2. OpenAI API key
  3. Python 3.10+ with uv installed
  4. Claude Desktop installed

Step 1: Clone This Repository

# Clone the entire project (includes both MCP and backend)
git clone https://github.com/henrardo/llm-graph-builder-mcp.git
cd llm-graph-builder-mcp

Your directory structure will be:

llm-graph-builder-mcp/           # Repository root
  llm_graph_builder_mcp/         # The MCP server
  llm-graph-builder/             # The backend (included)

Step 2: Set Up the Backend

# Navigate to backend
cd llm-graph-builder/backend

# Create environment file
cp example.env .env

Edit .env with your credentials:

# Neo4j Connection (from your AuraDB instance)
NEO4J_URI=neo4j+s://your-instance-id.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-auradb-password
NEO4J_DATABASE=neo4j

# OpenAI Configuration
LLM_MODEL_CONFIG_openai_gpt_4.1=gpt-4-turbo-2024-04-09,sk-your-openai-api-key

Install and start the backend:

# Create virtual environment
uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
uv pip install -r requirements.txt

# Start the backend server
uvicorn score:app --reload --port 8000

Keep this terminal running. The backend must be running for the MCP to work.

Step 3: Install the MCP

Open a new terminal (keep the backend running in the first one):

# Navigate to the repository root you cloned in Step 1 (adjust the path if you cloned it elsewhere)
cd llm-graph-builder-mcp

# Install the MCP
uvx --from . llm-graph-builder-mcp

Step 4: Configure Claude Desktop

Edit your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this configuration:

{
  "mcpServers": {
    "llm-graph-builder": {
      "command": "uvx",
      "args": [
        "--from",
        "/absolute/path/to/llm-graph-builder-mcp",
        "llm-graph-builder-mcp"
      ],
      "env": {
        "NEO4J_URI": "neo4j+s://your-instance-id.databases.neo4j.io",
        "NEO4J_USERNAME": "neo4j",
        "NEO4J_PASSWORD": "your-auradb-password",
        "NEO4J_DATABASE": "neo4j",
        "GRAPH_BUILDER_URL": "http://localhost:8000"
      }
    }
  }
}

Important:

  • Replace /absolute/path/to/ with the full path to your llm-graph-builder-mcp directory
    • Run pwd in the llm-graph-builder-mcp directory to get this path
    • Example: /Users/yourname/projects/llm-graph-builder-mcp
  • Use the same credentials as in your backend .env file
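
If you would rather generate the config entry than edit the JSON by hand, the following is a minimal sketch of the merge. The helper name add_graph_builder_server is hypothetical and not part of this repository; it only reproduces the JSON shown above and rejects non-absolute paths, since the config needs the full path.

```python
import copy
import os


def add_graph_builder_server(config: dict, repo_path: str, env: dict) -> dict:
    """Return a copy of a Claude Desktop config with the llm-graph-builder entry added.

    Hypothetical helper for illustration; it mirrors the JSON shown in Step 4.
    """
    # The config needs an absolute path, so "~" and relative paths are rejected.
    if not os.path.isabs(repo_path):
        raise ValueError("repo_path must be an absolute path")

    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["llm-graph-builder"] = {
        "command": "uvx",
        "args": ["--from", repo_path, "llm-graph-builder-mcp"],
        "env": dict(env),
    }
    return merged
```

Load the existing file with json.load, pass the result through this function, and write it back with json.dump. Existing entries under mcpServers are preserved.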

Step 5: Restart Claude Desktop

Completely quit and restart Claude Desktop for the changes to take effect.

Step 6: Test It

In Claude Desktop, try:

Build a knowledge graph from this Wikipedia article:
https://en.wikipedia.org/wiki/The_Hitchhiker%27s_Guide_to_the_Galaxy

Claude should now use the MCP to build a knowledge graph in your Neo4j database!

Usage Examples

Basic Usage

Build a knowledge graph from this Wikipedia article:
https://en.wikipedia.org/wiki/The_Hitchhiker%27s_Guide_to_the_Galaxy

Academic Papers (with citations)

Build a knowledge graph from this PDF with bibliographic extraction:
https://example.com/research-paper.pdf

Custom Schema

Build a knowledge graph from this article with these entities:
- Nodes: Person, Organization, Location, Event
- Relationships: Person WORKS_FOR Organization, Person ATTENDED Event

https://example.com/article

With Community Detection

Build a knowledge graph from this page and enable community detection:
https://en.wikipedia.org/wiki/Renaissance

Querying the Graph

This MCP builds graphs. To query them, use the separate mcp-neo4j-cypher server.

After building a graph, ask Claude:

"What entities are connected to Arthur Dent?"
"Show me all the citations in my research papers"
"Find communities in the knowledge graph"

Tool Reference

build_knowledge_graph_from_url

Extracts entities and relationships from a URL and builds a knowledge graph.

Parameters:

  • url (required): URL to process (Wikipedia, PDF, web page, YouTube)
  • model (optional): LLM model to use (default: openai_gpt_4.1)
  • allowed_nodes (optional): Comma-separated entity types (e.g., "Person,Organization,Location")
  • allowed_relationships (optional): Relationship triples (e.g., "Person,WORKS_FOR,Organization")
  • enable_communities (optional): Enable community detection (default: false)
  • extract_bibliographic_info (optional): Extract academic citations and references (default: false)
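
Claude fills in these parameters itself when it calls the tool. To make the shape of the arguments concrete, here is a sketch that assembles them with the defaults listed above. The function build_tool_arguments is a hypothetical illustration, not code from this repository, and the URL check is an assumption about what counts as a valid input.

```python
from typing import Optional


def build_tool_arguments(
    url: str,
    model: str = "openai_gpt_4.1",
    allowed_nodes: Optional[str] = None,
    allowed_relationships: Optional[str] = None,
    enable_communities: bool = False,
    extract_bibliographic_info: bool = False,
) -> dict:
    """Assemble arguments for build_knowledge_graph_from_url (illustrative only)."""
    # Assumption: only http(s) URLs are meaningful inputs.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")

    args = {
        "url": url,
        "model": model,
        "enable_communities": enable_communities,
        "extract_bibliographic_info": extract_bibliographic_info,
    }
    # Optional schema hints are included only when provided.
    if allowed_nodes:
        args["allowed_nodes"] = allowed_nodes
    if allowed_relationships:
        args["allowed_relationships"] = allowed_relationships
    return args
```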

Supported Sources

Type      | Example                             | Notes
Wikipedia | https://en.wikipedia.org/wiki/...   | Any language supported
PDF URLs  | https://example.com/paper.pdf       | Full text extraction
Web pages | https://example.com/article         | Any accessible page
YouTube   | https://www.youtube.com/watch?v=... | Extracts from transcript

Architecture

Claude Desktop
    ↓ MCP Protocol
llm-graph-builder-mcp (this repo)
    ↓ HTTP
llm-graph-builder backend (FastAPI)
    ↓ Cypher
Neo4j Database
    ↑ Cypher Queries
mcp-neo4j-cypher (separate MCP)
    ↑ MCP Protocol
Claude Desktop

Research & Zotero Integration

This MCP is perfect for academic research workflows:

  1. Export PDF URLs from your Zotero library
  2. Ask Claude to process them with bibliographic extraction
  3. Query relationships between papers, authors, and concepts
  4. Discover connections in your research

Example:

"Build knowledge graphs from these papers with bibliographic extraction:
- https://paper1.pdf
- https://paper2.pdf
- https://paper3.pdf

Then show me how they cite each other and what common themes they share."

Backend Version & Updates

This repository includes llm-graph-builder from June 24, 2025 (commit 4d7bb5e8). This version is tested and fully compatible with the MCP.

Using the Included Backend (Recommended)

The included backend is frozen at a known-good version. This ensures:

  • Everything works out of the box
  • No compatibility issues
  • Predictable behavior

Using a Newer Backend Version

If you want to use the latest llm-graph-builder:

# Remove the included backend
rm -rf llm-graph-builder

# Clone the latest version
git clone https://github.com/neo4j-labs/llm-graph-builder.git

# Follow the same setup steps in Step 2

Note: Newer versions should work (the MCP uses standard endpoints), but haven't been tested. If you encounter issues, revert to the included version.

Troubleshooting

Backend won't start

cd llm-graph-builder/backend
source .venv/bin/activate
uvicorn score:app --reload --port 8000

Claude doesn't see the MCP

  1. Check config path is correct (use absolute path, not ~)
  2. Completely quit and restart Claude Desktop (not just close the window)
  3. Check Claude logs: ~/Library/Logs/Claude/mcp*.log (macOS)
  4. Verify the MCP path in config matches your actual directory

"Model not found" error

Make sure your backend .env has:

LLM_MODEL_CONFIG_openai_gpt_4.1=gpt-4-turbo-2024-04-09,YOUR-API-KEY
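
To check for that line without opening the file by eye, a small script can scan the .env text. This is a sketch that assumes the value has the form model-name,api-key as in the example above. The function find_model_config is hypothetical, and the stripping of surrounding quotes is an assumption about how the file may be written.

```python
from typing import Optional, Tuple


def find_model_config(env_text: str, model: str) -> Optional[Tuple[str, str]]:
    """Return (model_name, api_key) for LLM_MODEL_CONFIG_<model>, or None if absent."""
    key = f"LLM_MODEL_CONFIG_{model}"
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() != key:
            continue
        # Assumed value format: "<model-name>,<api-key>", optionally quoted.
        model_name, _, api_key = value.strip().strip('"').partition(",")
        return model_name.strip(), api_key.strip()
    return None
```

A return value of None for openai_gpt_4.1 (the tool's default model) means the entry is missing, which matches the situation this section describes.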

Backend shows "Connection refused"

  • Ensure the backend is running on port 8000
  • Check GRAPH_BUILDER_URL in Claude config matches the backend URL
  • Backend must be running before you use the MCP
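
A quick way to confirm the first two points is to test whether anything accepts TCP connections at the configured URL. The sketch below uses only the Python standard library. The function name backend_is_listening is hypothetical, and it only checks that the port is open, not that the process behind it is the graph builder backend.

```python
import socket
from urllib.parse import urlparse


def backend_is_listening(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(base_url)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL has none.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False
```

Calling backend_is_listening("http://localhost:8000") with the value of GRAPH_BUILDER_URL should return True while the uvicorn process from Step 2 is running.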

Empty graph / few entities

  • Enable extract_bibliographic_info for academic papers
  • Check OpenAI API key is valid and has credits
  • Verify Neo4j connection in backend .env
  • For PDFs: URL must be directly accessible (no authentication required)

Cache issues after updates

# Clear uvx cache
uv cache clean llm-graph-builder-mcp --force

# Completely quit and restart Claude Desktop

Development

# Install in development mode
git clone https://github.com/henrardo/llm-graph-builder-mcp.git
cd llm-graph-builder-mcp
uv pip install -e .

How It Works

PDF URLs

The MCP automatically detects PDF URLs, downloads them, and uploads to the backend for full-text extraction using PyMuPDF. No binary garbage, just clean text.
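
The detection logic itself is not described here. One plausible approach, shown purely as an assumption, is to look at the file extension in the URL path. The function looks_like_pdf_url is hypothetical; the real implementation may instead inspect the response's Content-Type header.

```python
from urllib.parse import urlparse


def looks_like_pdf_url(url: str) -> bool:
    """Guess whether a URL points at a PDF by checking its path extension."""
    # Only the path is examined, so query strings and fragments are ignored.
    path = urlparse(url).path
    return path.lower().endswith(".pdf")
```

An extension check like this misses PDFs served from paths without a .pdf suffix, which is one reason a header-based check could be preferable.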

Academic Extraction

When extract_bibliographic_info=true, the MCP instructs the LLM to specifically extract:

  • Authors, titles, journals, years, DOIs
  • Citations and references
  • Research concepts and methods
  • Relationships: AUTHORED, CITES, PUBLISHED_IN, DISCUSSES

Schema Specification

Define allowed entities and relationships to guide extraction:

allowed_nodes: "Person,Organization,Product"
allowed_relationships: "Person,FOUNDED,Organization,Organization,PRODUCES,Product"
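
The allowed_relationships value is a flat comma-separated list read in groups of three (source, relationship, target). The following sketch splits such a string into triples so a malformed schema can be caught before it is sent. The function parse_relationship_triples is a hypothetical helper, not part of this repository.

```python
from typing import List, Tuple


def parse_relationship_triples(spec: str) -> List[Tuple[str, str, str]]:
    """Split "A,REL,B,C,REL2,D" into [("A", "REL", "B"), ("C", "REL2", "D")]."""
    parts = [part.strip() for part in spec.split(",") if part.strip()]
    if len(parts) % 3 != 0:
        raise ValueError(
            "allowed_relationships must list source,RELATIONSHIP,target triples"
        )
    # Walk the flat list three items at a time.
    return [
        (parts[i], parts[i + 1], parts[i + 2])
        for i in range(0, len(parts), 3)
    ]
```

For the example above, this yields ("Person", "FOUNDED", "Organization") and ("Organization", "PRODUCES", "Product").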

Zero Backend Modifications

This MCP works with the unmodified llm-graph-builder backend. It uses compatibility tricks (like sending a space character for optional parameters) to work seamlessly with the original code.
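
As a rough illustration of the space-character trick, the sketch below replaces unset optional values with a single space before they are sent as form fields. The function to_form_fields and the field names in the usage line are assumptions for illustration; the backend's actual field names are not listed in this README.

```python
from typing import Dict, Optional


def to_form_fields(params: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Replace missing or empty optional values with a single space."""
    # Assumption: the backend expects every field to be present and non-empty,
    # so a lone space stands in for "not provided".
    return {name: (value if value else " ") for name, value in params.items()}
```

For example, to_form_fields({"model": "openai_gpt_4.1", "allowed_nodes": None}) keeps the model value and sends " " for allowed_nodes.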

Security

Never commit:

  • API keys (OpenAI, etc.)
  • Database passwords
  • Real Neo4j URIs

All credentials should be in .env files or Claude Desktop config (both gitignored).

License

Apache License 2.0 - see LICENSE file for details.

This project includes the Neo4j LLM Graph Builder, which is also licensed under Apache License 2.0.

Contributing

Contributions welcome! This project aims to be a clean wrapper with zero backend modifications required.

Open an issue or pull request on GitHub.
