# Wikidata MCP Server - Optimized Hybrid Architecture

A Model Context Protocol (MCP) server with Server-Sent Events (SSE) transport that connects Large Language Models to Wikidata's structured knowledge base. It features an optimized hybrid architecture that balances speed, accuracy, and verifiability: fast basic tools handle simple queries, and advanced orchestration is used only for complex temporal/relational queries.
## Architecture Highlights
- 🚀 Fast Basic Tools: 140-250ms for simple entity/property searches
- 🧠 Advanced Orchestration: 1-11s for complex temporal queries (when needed)
- ⚡ 50x Performance Difference: empirically measured gap between basic tools (~200ms) and complex orchestration (up to 11s)
- 🔄 Hybrid Approach: Right tool for each query type
- 🛡️ Graceful Degradation: Works with or without Vector DB API key
## MCP Tools

### Basic Tools (Fast & Reliable)

- `search_wikidata_entity`: Find entities by name (140-250ms)
- `search_wikidata_property`: Find properties by name (~200ms)
- `get_wikidata_metadata`: Entity labels, descriptions (~200ms)
- `get_wikidata_properties`: All entity properties (~200ms)
- `execute_wikidata_sparql`: Direct SPARQL queries (~200ms)

### Advanced Tool (Complex Queries)

- `query_wikidata_complex`: Temporal/relational queries (1-11s)
  - ✅ "last 3 popes", "recent presidents of France"
  - ❌ Simple entity searches (use basic tools instead)
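To give a sense of what a fast entity search involves, here is a minimal sketch of building a request to Wikidata's public `wbsearchentities` API, a plausible backend for a tool like `search_wikidata_entity`. The helper name and defaults are illustrative, not the server's actual implementation:

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_entity_search_url(term: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities request URL for a fast entity lookup by name."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"

url = build_entity_search_url("Douglas Adams")
print(url)
```

A single HTTP GET to this URL returns ranked entity matches with IDs and descriptions, which is why this path stays in the low hundreds of milliseconds.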
## Live Demo
The server is deployed and accessible at:
- URL: https://wikidata-mcp-mirror.onrender.com
- MCP Endpoint: https://wikidata-mcp-mirror.onrender.com/mcp
- Health Check: https://wikidata-mcp-mirror.onrender.com/health
## Usage with Claude Desktop

To use this server with Claude Desktop:

1. Install `mcp-remote` (if not already installed):

   ```bash
   npm install -g @modelcontextprotocol/mcp-remote
   ```

2. Edit the Claude Desktop configuration file located at:
   `~/Library/Application Support/Claude/claude_desktop_config.json`

3. Configure it to use the remote MCP server:

   ```json
   {
     "mcpServers": {
       "Wikidata MCP": {
         "command": "npx",
         "args": [
           "mcp-remote",
           "https://wikidata-mcp-mirror.onrender.com/mcp"
         ]
       }
     }
   }
   ```

4. Restart Claude Desktop

When using Claude, you can now access Wikidata knowledge through the configured MCP server.
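If you already have other MCP servers configured, the new entry should be merged into the existing `mcpServers` object rather than overwriting the whole file. A small sketch of that merge (the helper name is illustrative):

```python
import json

# The server entry from the configuration step above.
WIKIDATA_ENTRY = {
    "command": "npx",
    "args": ["mcp-remote", "https://wikidata-mcp-mirror.onrender.com/mcp"],
}

def add_wikidata_server(config: dict) -> dict:
    """Merge the Wikidata MCP entry into a claude_desktop_config.json
    structure without clobbering servers that are already configured."""
    config.setdefault("mcpServers", {})["Wikidata MCP"] = WIKIDATA_ENTRY
    return config

# Example: an existing config keeps its other entries after the merge.
merged = add_wikidata_server({"mcpServers": {"other-server": {"command": "foo"}}})
print(json.dumps(merged, indent=2))
```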
## Deployment

### Deploying to Render

1. Create a new Web Service in your Render dashboard
2. Connect your GitHub repository
3. Configure the service:
   - Build Command: `pip install -e .`
   - Start Command: `python -m wikidata_mcp.api`
4. Set environment variables:
   - Add all variables from `.env.example`
   - For production, set `DEBUG=false`
   - Make sure to set a proper `WIKIDATA_VECTORDB_API_KEY`
5. Deploy

The service will be available at `https://your-service-name.onrender.com`.
## Environment Setup

### Prerequisites

- Python 3.10+
- Virtual environment tool (venv, conda, etc.)
- Vector DB API key (for enhanced semantic search)

### Environment Variables

Create a `.env` file in the project root with the following variables:

```bash
# Required for Vector DB integration
WIKIDATA_VECTORDB_API_KEY=your_key_here
```

### Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/wikidata-mcp-mirror.git
   cd wikidata-mcp-mirror
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: .\venv\Scripts\activate
   ```

3. Install the required dependencies:

   ```bash
   pip install -e .
   ```

4. Create a `.env` file based on `.env.example` and configure your environment variables:

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

5. Run the application:

   ```bash
   # Development
   python -m wikidata_mcp.api

   # Production (with Gunicorn)
   gunicorn --bind 0.0.0.0:8000 --workers 4 --timeout 120 --keep-alive 5 --worker-class uvicorn.workers.UvicornWorker wikidata_mcp.api:app
   ```

   The server will start on `http://localhost:8000` by default with the following endpoints:

   - `GET /health` - Health check
   - `GET /messages/` - SSE endpoint for MCP communication
   - `GET /docs` - Interactive API documentation (if enabled)
   - `GET /metrics` - Prometheus metrics (if enabled)
### Environment Variables

| Variable | Default | Description |
|---|---|---|
| `PORT` | `8000` | Port to run the server on |
| `WORKERS` | `4` | Number of worker processes |
| `TIMEOUT` | `120` | Worker timeout in seconds |
| `KEEPALIVE` | `5` | Keep-alive timeout in seconds |
| `DEBUG` | `false` | Enable debug mode |
| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| `USE_VECTOR_DB` | `true` | Enable/disable vector DB integration |
| `USE_CACHE` | `true` | Enable/disable caching system |
| `USE_FEEDBACK` | `true` | Enable/disable feedback system |
| `CACHE_TTL_SECONDS` | `3600` | Cache time-to-live in seconds |
| `CACHE_MAX_SIZE` | `1000` | Maximum number of items in cache |
| `WIKIDATA_VECTORDB_API_KEY` | (none) | API key for the vector DB service |
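Boolean flags such as `DEBUG` and `USE_CACHE` arrive from the environment as strings, so the server has to parse them somewhere. Below is a hedged sketch of how the defaults in the table might be applied; the `load_config` function and its exact behavior are illustrative, not the project's actual configuration code:

```python
def load_config(env: dict) -> dict:
    """Apply defaults from the table above to a mapping of environment
    variable strings (illustrative sketch, not the server's real loader)."""
    def flag(name: str, default: bool) -> bool:
        # Accept common truthy spellings like "1", "true", "yes".
        return env.get(name, str(default)).strip().lower() in ("1", "true", "yes")

    return {
        "port": int(env.get("PORT", "8000")),
        "workers": int(env.get("WORKERS", "4")),
        "debug": flag("DEBUG", False),
        "use_cache": flag("USE_CACHE", True),
        "cache_ttl_seconds": int(env.get("CACHE_TTL_SECONDS", "3600")),
    }

print(load_config({}))                                    # all defaults
print(load_config({"DEBUG": "true", "PORT": "9000"}))     # overrides applied
```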
### Running with Docker

1. Build the Docker image:

   ```bash
   docker build -t wikidata-mcp .
   ```

2. Run the container:

   ```bash
   docker run -p 8000:8000 --env-file .env wikidata-mcp
   ```

### Running with Docker Compose

1. Start the application:

   ```bash
   docker-compose up --build
   ```

2. For production, use the production compose file:

   ```bash
   docker-compose -f docker-compose.prod.yml up --build -d
   ```
## Monitoring

The service exposes Prometheus metrics at `/metrics` when the `PROMETHEUS_METRICS` environment variable is set to `true`.

### Health Check

```bash
curl http://localhost:8000/health
```

### Metrics

```bash
curl http://localhost:8000/metrics
```
## Testing

### Running Tests

Run the test suite with:

```bash
# Run all tests
pytest

# Run specific test file
pytest tests/orchestration/test_query_orchestrator.py -v

# Run with coverage report
pytest --cov=wikidata_mcp tests/
```
### Integration Tests

To test the Vector DB integration, you'll need to set the `WIKIDATA_VECTORDB_API_KEY` environment variable:

```bash
WIKIDATA_VECTORDB_API_KEY=your_key_here pytest tests/orchestration/test_vectordb_integration.py -v
```
### Test Client

You can also test the server using the included test client:

```bash
python test_mcp_client.py
```

Or manually with curl:

```bash
# Connect to SSE endpoint
curl -N -H "Accept: text/event-stream" https://wikidata-mcp-mirror.onrender.com/messages/

# Send a message (replace YOUR_SESSION_ID with the session ID received from the SSE endpoint)
curl -X POST "https://wikidata-mcp-mirror.onrender.com/messages/?session_id=YOUR_SESSION_ID" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test-client","version":"0.1.0"}},"id":0}'
```
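The `-d` payload in the curl example above is a standard JSON-RPC 2.0 `initialize` message. Building it programmatically avoids shell-quoting mistakes; a small sketch (the function name is illustrative):

```python
import json

def make_initialize_message(client_name: str, client_version: str, msg_id: int = 0) -> str:
    """Serialize the MCP initialize request shown in the curl example."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
        "id": msg_id,
    })

print(make_initialize_message("test-client", "0.1.0"))
```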
## Deployment on Render.com

This server is configured for deployment on Render.com using the `render.yaml` file.

### Deployment Configuration

- Build Command: `pip install -r requirements.txt`
- Start Command: `gunicorn -k uvicorn.workers.UvicornWorker server_sse:app`
- Environment Variables:
  - `PORT`: 10000
- Health Check Path: `/health`
### Docker Support

The repository includes a Dockerfile that Render.com uses for containerized deployment, so the server runs in a consistent environment with all dependencies properly installed.

### How to Deploy

1. Fork or clone this repository to your GitHub account
2. Create a new Web Service on Render.com
3. Connect your GitHub repository
4. Render will automatically detect the `render.yaml` file and configure the deployment
5. Click "Create Web Service"

After deployment, you can access your server at the URL provided by Render.com.
## Architecture

The server is built using:

- FastAPI: For handling HTTP requests and routing
- SSE Transport: For bidirectional communication with clients
- MCP Framework: For implementing the Model Context Protocol
- Wikidata API: For accessing Wikidata's knowledge base

### Key Components

- `server_sse.py`: Main server implementation with SSE transport
- `wikidata_api.py`: Functions for interacting with Wikidata's API and SPARQL endpoint
- `requirements.txt`: Dependencies for the project
- `Dockerfile`: Container configuration for Docker deployment on Render
- `render.yaml`: Configuration for deployment on Render.com
- `test_mcp_client.py`: Test client for verifying server functionality

### Available MCP Tools

The server provides the following MCP tools:

- `search_wikidata_entity`: Search for entities by name
- `search_wikidata_property`: Search for properties by name
- `get_wikidata_metadata`: Get entity metadata (label, description)
- `get_wikidata_properties`: Get all properties for an entity
- `execute_wikidata_sparql`: Execute a SPARQL query
- `find_entity_facts`: Search for an entity and find its facts
- `get_related_entities`: Find entities related to a given entity
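To illustrate the kind of input `execute_wikidata_sparql` accepts, here is a sketch that builds a simple SPARQL query listing an entity's `instance of` (P31) values. The query text uses standard Wikidata SPARQL prefixes, but the helper function itself is illustrative, not part of the server:

```python
def sparql_instance_of(entity_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing the 'instance of' (P31) values of an entity."""
    return (
        "SELECT ?class ?classLabel WHERE {\n"
        f"  wd:{entity_qid} wdt:P31 ?class .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        f"}} LIMIT {limit}"
    )

# Q42 is Douglas Adams; the resulting query can be passed to
# execute_wikidata_sparql or pasted into https://query.wikidata.org directly.
print(sparql_instance_of("Q42"))
```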
## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgments

- Based on the Model Context Protocol (MCP) specification
- Uses Wikidata as the knowledge source
- Inspired by the MCP examples from the official documentation