World Bank Documents MCP Server
Enables discovery and retrieval of World Bank reports and publications through the Documents & Reports API. It supports full-text search, structured filtering by topic or country, and metadata extraction for research and data analysis.
World Bank Documents MCP Agent
A structured MCP-based document discovery system for the World Bank Documents & Reports API, built with:
- FastMCP for exposing tools as an MCP server
- LangGraph / LangChain for agent orchestration
- Groq LLM for reasoning and tool selection
- World Bank Documents API for search, filtering, metadata retrieval, and facet discovery
This project lets an LLM-powered agent query World Bank reports using a local MCP server over stdio, making it useful for interview demos, MCP experimentation, and building intelligent research assistants.
Architecture
User Question
    ↓
LangGraph / LangChain Agent
    ↓
MultiServerMCPClient
    ↓
FastMCP Server (stdio)
    ↓
World Bank MCP Tools
    ├── search_documents
    ├── filter_documents
    ├── get_document
    └── get_facets
    ↓
World Bank Documents API
Component Overview
- Client Agent (client/agent_client.py)
  - Accepts user questions
  - Connects to the MCP server through MultiServerMCPClient
  - Lets the LLM decide which tool to call
  - Produces a final natural-language answer
- MCP Server (server/wb_docs_mcp.py)
  - Exposes World Bank document tools over stdio
  - Validates and normalizes tool inputs
  - Wraps World Bank API operations in MCP-compatible functions
- Tool Layer (server/tools/)
  - Handles all World Bank API interactions
  - Builds and normalizes query parameters
  - Uses a shared API client for requests
  - Extracts structured results (documents, facets)
  - Returns consistent responses using safe_result and safe_error
  - Supports filtering, search, pagination, and sorting
  - search.py: Full-text search with fallback and sorting
  - filter.py: Structured filtering (country, topic, date)
  - document.py: Retrieve document by ID
  - facets.py: Discover valid filter values
- Schema Layer (server/schemas.py)
  - Defines structured input models for all MCP tools
  - Keeps the interface clean and validation consistent
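The Tool Layer is described as returning consistent responses through safe_result and safe_error. The README does not show their signatures, so the following is only a minimal sketch of what such an envelope could look like. The field names (ok, data, meta, error) and the get_document_stub helper are assumptions made for illustration, not the project's actual code.

```python
from typing import Any


def safe_result(data: Any, **meta: Any) -> dict[str, Any]:
    """Wrap a successful tool result in a uniform envelope (assumed shape)."""
    return {"ok": True, "data": data, "meta": meta, "error": None}


def safe_error(message: str, **meta: Any) -> dict[str, Any]:
    """Wrap a failure in the same envelope so callers never see a raw exception."""
    return {"ok": False, "data": None, "meta": meta, "error": message}


def get_document_stub(doc_id: str) -> dict[str, Any]:
    # Hypothetical tool body: validate the input, then return an envelope
    # either way. Treating IDs as numeric is an assumption based on the
    # sample ID shown later in this README.
    cleaned = doc_id.strip()
    if not cleaned.isdigit():
        return safe_error("doc_id must be numeric", doc_id=doc_id)
    return safe_result({"id": cleaned}, source="stub")
```

With one shape for both outcomes, the agent can check a single field before deciding whether to answer or to retry with another tool.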
Features
- MCP Server exposing structured tools
- Integration with World Bank public API (no authentication required)
- Agentic client with multi-step tool execution
- Natural language query support
- Pydantic validation for robust inputs
- Error handling and pagination support
Project Structure
WorldBank_MCP/
├── server/
│   ├── __init__.py
│   ├── wb_docs_mcp.py        # MCP server with tools
│   ├── schemas.py            # Pydantic input schemas
│   ├── config.py             # constants & settings
│   ├── utils.py              # helper logic
│   ├── api_client.py         # API request implementation
│   └── tools/
│       ├── __init__.py
│       ├── search.py         # search_documents implementation
│       ├── filter.py         # filter_documents implementation
│       ├── document.py       # get_document implementation
│       └── facets.py         # get_facets implementation
│
├── client/
│   ├── __init__.py
│   └── agent_client.py       # Agent loop (LLM + MCP tools)
│
├── requirements.txt          # Python dependencies
├── .gitignore
├── .env.example              # Environment variables template
└── README.md
MCP Tools
1. search_documents
Use this as the first tool for most natural-language questions.
Best for:
- broad topic search
- unknown filter values
- initial exploration
Example:
- "climate resilience in Kenya"
- "education financing in Africa"
- "water security in Brazil"
2. filter_documents
Use this when exact structured filters are already known.
Supported filters:
- count_exact
- topic_exact
- docty_exact
- strdate
- enddate
Best for:
- narrowing results after discovery
- exact country/topic/type filtering
- time-bounded report lookup
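As a rough illustration of how the supported filters could be turned into a request, the sketch below builds a query URL without sending it. The base URL and the format, rows, and os parameters are assumptions about the Documents & Reports API and should be checked against its documentation. build_filter_url is a hypothetical helper, not a function from this project.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Documents & Reports API; confirm before relying on it.
BASE_URL = "https://search.worldbank.org/api/v3/wds"

# Filter names taken from the list above.
ALLOWED_FILTERS = {"count_exact", "topic_exact", "docty_exact", "strdate", "enddate"}


def build_filter_url(filters: dict[str, str], rows: int = 10, offset: int = 0) -> str:
    """Build a filtered query URL; rejects filter names outside the supported set."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # "rows" (page size) and "os" (offset) are assumed pagination parameters.
    params: dict[str, object] = {"format": "json", "rows": rows, "os": offset}
    # Drop empty values so they are not sent as blank filters.
    params.update({key: value for key, value in filters.items() if value})
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_filter_url({
        "count_exact": "Kenya",
        "strdate": "2019-01-01",
        "enddate": "2022-12-31",
    }))
```

Rejecting unknown keys up front keeps a guessed filter name from silently returning unfiltered results.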
3. get_document
Retrieves full metadata for a specific document by its World Bank document ID.
Best for:
- inspecting a selected result
- retrieving richer metadata
- building a final grounded answer
4. get_facets
Returns valid exact values for supported filter fields.
Supported facet fields:
- docty_exact
- lang_exact
- count_exact
- topic_exact
Best for:
- discovering valid country names
- discovering exact topic values
- avoiding guessed filters
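One way to avoid guessed filters is to resolve whatever the user typed against the values returned by get_facets before calling filter_documents. The sketch below assumes the facet values are already available as a plain list of strings. resolve_facet_value is a hypothetical helper, and the sample values are illustrative rather than real API output.

```python
from typing import Optional


def resolve_facet_value(user_value: str, facet_values: list[str]) -> Optional[str]:
    """Return the exact facet spelling that matches user_value, ignoring case and outer spaces."""
    wanted = user_value.strip().casefold()
    for value in facet_values:
        if value.casefold() == wanted:
            return value
    # No match: the caller should ask for facets again or fall back to search.
    return None


if __name__ == "__main__":
    countries = ["Kenya", "Brazil", "Indonesia"]  # illustrative values only
    print(resolve_facet_value(" kenya ", countries))
    print(resolve_facet_value("Kenia", countries))
```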
LLM Choice
We use Groq (llama-3.1-8b-instant) because:
- Very fast inference
- Free tier available
- Supports tool/function calling natively
- No local GPU required
Quick Start
Prerequisites
- Python 3.14 or above
- A virtual environment
- Groq API key
- Internet access for the World Bank API
Note: Python 3.14 may cause compatibility warnings with some LangChain / Pydantic integrations.
Installation
1. Clone the repository
git clone <your-repo-url>
cd WorldBank_MCP
2. Create and activate a virtual environment
Windows PowerShell
python -m venv venv
venv\Scripts\activate
macOS / Linux
python -m venv venv
source venv/bin/activate
3. Install dependencies
pip install -r requirements.txt
Environment Variables
Create a .env file in the project root:
groq_model=your_groq_model
groq_api_key=your_groq_api_key_here
If your client uses additional provider keys or tracing variables, add them there as well.
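A client could check these two settings at startup along the following lines. This is a sketch under the assumption that the variables have already been loaded into the process environment, for example by a dotenv loader. It does not read the .env file itself, and load_groq_settings is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Variable names as written in the .env example above.
REQUIRED_VARS = ("groq_model", "groq_api_key")


def load_groq_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the Groq settings, failing early with a clear message if any are missing."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {"model": source["groq_model"], "api_key": source["groq_api_key"]}


if __name__ == "__main__":
    try:
        settings = load_groq_settings()
        print("Using model:", settings["model"])
    except RuntimeError as exc:
        print(exc)
```

Failing before the agent starts gives a clearer error than an authentication failure surfacing in the middle of a tool call.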
Running the Project
Run the agent client
python client/agent_client.py
You can then ask questions like:
- What World Bank reports exist on climate resilience in Kenya?
- Find documents about education financing in Sub-Saharan Africa published between 2019 and 2022.
- What document types does the World Bank publish most frequently?
- Find the most recent economic sector work on Indonesia.
- List all languages the World Bank publishes documents in.
- Find project documents related to water and sanitation in Brazil.
Example Queries & Responses
Sample-01
Ask a question or Type 'exit' to quit: What World Bank reports exist on climate resilience in Kenya?
=== Response ===
The World Bank has several reports on climate resilience in Kenya, including:
1. Kenya - Coastal Region Water Security And Climate Resilience Project : Environmental Assessment
2. Kenya - Coastal Region Water Security and Climate Resilience Project : Resettlement Plan (Vol. 4 of 4) : Resettlement Action Plan 2 for Mwache Multipurpose Dam Project
3. Kenya - Climate Smart Agriculture Project : Environmental Assessment (Vol. 6 of 8) : Environmental and Social Impact Assessment Report for the Rehabilitation of Kamola Water Pan, East Yimbo Location Bondo Sub County in Siaya County
4. Kenya - Climate Smart Agriculture Project : Environmental Assessment (Vol. 4 of 8) : Environmental and Social Impact Assessment Report for the Desilting and Rehabilitation of Kivuno Water Pan in Sorget-Tendeno Ward, Kipkelion East Sub-County
5. Kenya - Climate Smart Agriculture Project : Environmental Assessment (Vol. 3 of 8) : Environmental and Social Impact Assessment Report for the Kimana Livestock Sale Yard in Kimana Ward, Kajiado South Sub-County, Kajiado County
6. Kenya - Climate Smart Agriculture Project : Environmental Assessment (Vol. 1 of 2) : Environmental Management Plan for Aekumi Rock Catchment, Tharaka Nithi County
7. Kenya - Nairobi Metropolitan Services Improvement Project : resettlement plan : Final abbreviated resettlement action plan report in the construction of Kangundo-Kenol, Kenol-Koma, Katumani-Mombasa road, and rehabilitation of Kenol-Machakos Town-Katumani Link Road Project
8. Kenya - Nairobi Metropolitan Services Improvement Project : Environmental Assessment (Vol. 27 of 41) : Environmental and social impact assessment report for construction of Karandini market
These reports are available on the World Bank website and provide information on the environmental and social impacts of various projects in Kenya, including climate resilience initiatives.
Sample-02
Ask a question or Type 'exit' to quit: Find documents about education financing in Sub-Saharan Africa published between 2019 and 2022
=== Response ===
The search results did not provide any documents about education financing in Sub-Saharan Africa published between 2019 and 2022. However, the following documents were found:
1. Project Information Document - Adolescent Girls Initiative for Learning and Empowerment - P170664 (2020)
2. Concept Project Information Document (PID) - Ten Million Adolescent Girls Project - P170664 (2019)
3. Additional Financing Environmental and Social Review Summary (ESRS) - Nigeria COVID-19 Preparedness and Response Project Additional Financing - P177076 (2021)
4. Concept Stage Program Information Document (PID) - Edo Economic Transformation Program for Results - P169921 (2019)
5. Concept Project Information Document (PID) - Sustainable Procurement, Environmental and Social Standards Enhancement Project (SPESSE) - P169405 (2019)
6. Project Information Document - Sustainable Procurement, Environmental and Social Standards Enhancement Project (SPESSE) - P169405 (2019)
These documents may not be directly related to education financing in Sub-Saharan Africa, but they may provide some relevant information or context.
Sample-03
Ask a question or Type 'exit' to quit: What document types does the World Bank publish most frequently?
=== Response ===
The World Bank publishes most frequently the following document types:
1. Procurement Plan
2. Implementation Status and Results Report
3. Auditing Document
4. Working Paper
5. Brief
6. Agreement
7. Environmental Assessment
8. Report
9. Project Information Document
10. Announcement
These are the top 10 most frequent document types published by the World Bank, based on the data provided by the get_facets tool.
Sample-04
Ask a question or Type 'exit' to quit: Find the most recent economic sector work on Indonesia.
=== Response ===
The most recent economic sector work on Indonesia is the "Indonesia - Tourism Development Project : environmental assessment : Kerangka pengelolaan lingkungan dan sosial" document, with an ID of 29890113, published on January 19, 2018. The document discusses the development objective of the Tourism Development Program Project for Indonesia, which is to improve tourism-relevant road quality and basic services accessibility, strengthen local economy linkages to tourism, and promote private investment in three tourism destinations in Indonesia. The document also highlights some of the negative impacts and mitigation measures related to the project.
How the Agent Works
- The user asks a question in natural language.
- The LangGraph/LangChain agent analyzes the query.
- The agent selects an MCP tool exposed by the server.
- The MCP server validates inputs and calls the corresponding tool.
- The tool queries the World Bank API.
- The result is returned to the agent.
- The agent converts the tool output into a final user-friendly response.
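The middle of this flow (tool selection, input validation, and the tool call) can be sketched without any LLM or network access. In the real project these steps are handled by the agent framework and the MCP server, so the code below is only a simplified stand-in. The stub tools, the TOOLS registry, and dispatch are hypothetical names introduced for illustration.

```python
import inspect
from typing import Any, Callable


def search_documents(query: str, rows: int = 5) -> dict[str, Any]:
    # Stub standing in for the real tool; it makes no network call.
    return {"tool": "search_documents", "query": query, "rows": rows}


def get_document(doc_id: str) -> dict[str, Any]:
    # Stub standing in for the real tool.
    return {"tool": "get_document", "doc_id": doc_id}


# Registry mapping the tool name chosen by the LLM to a callable.
TOOLS: dict[str, Callable[..., dict[str, Any]]] = {
    "search_documents": search_documents,
    "get_document": get_document,
}


def dispatch(tool_name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Validate a tool call against the registry and signature, then run it."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        raise ValueError(f"unknown tool: {tool_name}")
    try:
        # bind() raises TypeError on missing or unexpected argument names.
        bound = inspect.signature(tool).bind(**arguments)
    except TypeError as exc:
        raise ValueError(f"invalid arguments for {tool_name}: {exc}") from exc
    return tool(*bound.args, **bound.kwargs)


if __name__ == "__main__":
    print(dispatch("search_documents", {"query": "water security in Brazil"}))
```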
Future Enhancements
- Add result ranking / summarization
- Add citation-aware final answers
- Support multiple MCP servers in one client
- Add Streamlit or FastAPI UI
- Add local caching for repeated document queries