Ukraine War MCP
Provides text analysis tools including TF-IDF document search and text profiling (sentiment, readability, keywords) for analyzing text corpora through structured MCP interfaces.
2025-autumn-mcp
Project Background
This project is about learning how to turn basic data science skills into real, usable services. Rather than running code in isolation, you’ll package text analysis tools into a Model Context Protocol (MCP) server that can be consumed by any MCP-aware client, including modern AI assistants. Along the way you’ll learn how to design structured inputs and outputs (schemas), containerize and run services with Docker, and expose your work in a way that others — whether researchers, policymakers, or fellow students — could immediately integrate into their own workflows. The goal is not to build the most advanced NLP system, but to see how small, well-defined analytics can be made reusable, composable, and sharable across disciplines.
Goals
This sprint focuses on learning the Model Context Protocol by building a text-analysis MCP server.
What You'll Build:
- An MCP server with baseline text-analysis tools (group work)
- Your own custom MCP tool relevant to your field (individual work on a feature branch)
Using Python, Pydantic schemas, and FastMCP, you'll gain experience with natural language processing techniques (TF-IDF, sentiment analysis, readability metrics), structured data exchange, and service-oriented design.
Deliverables:
- Working baseline MCP server with corpus_answer and text_profile tools
- Your custom tool on a feature branch with tests
- Demo showing your tool in action
- Documentation explaining your tool's domain application
Project Structure
mcp/
├── data/
│ └── corpus/ # Your text corpus (.txt files)
│ ├── climate_policy.txt
│ ├── urban_planning.txt
│ ├── ai_ethics.txt
│ └── public_health.txt
├── notebooks/
│ └── MCP_Introduction.ipynb # Interactive tutorial
├── src/
│ ├── utils/ # Utility code from week 1
│ └── mcp_server/ # MCP server implementation
│ ├── __init__.py
│ ├── server.py # Main FastMCP server
│ ├── schemas.py # Pydantic data models
│ ├── config/
│ │ ├── __init__.py
│ │ └── settings.py # Configuration settings
│ └── tools/
│ ├── __init__.py
│ ├── corpus_answer.py # Document search tool
│ └── text_profile.py # Text analysis tool
├── tests/
│ └── mcp_server/ # Tests for MCP tools
│ ├── test_corpus_answer.py
│ └── test_text_profile.py
├── pyproject.toml # Python dependencies
└── README.md
Introduction & Setup
Getting Started:
- Review the demonstration notebook: notebooks/MCP_Introduction.ipynb
- Read about MCP
- Skim this page on Pydantic
- Complete the Quick Start below to set up your environment
Phase 1: Group Work - Baseline MCP Server
Part 1: Schemas & Text Analysis Foundations
Objectives (Complete together as a group):
- Understand Pydantic schemas and data validation
- Learn TF-IDF basics for document search
- Set up a shared corpus
- Understand MCP tool design patterns
Tasks:
- Complete the notebook notebooks/MCP_Introduction.ipynb, which covers:
- Build your first MCP tool
- Work with TF-IDF for document search
- Define Pydantic schemas
- Register tools with FastMCP
- Create a shared corpus:
- Add 3-5 .txt files to data/corpus/
- Sample documents provided: climate policy, urban planning, AI ethics, public health
- Choose documents that demonstrate the tools' capabilities
- Review the provided code structure in src/mcp_server/ (a minimal schema-and-registration sketch follows this task list):
- schemas.py - Pydantic models for tool inputs/outputs
- tools/corpus_answer.py - Document search skeleton
- tools/text_profile.py - Text analytics skeleton
- server.py - Main MCP server application
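To see the whole pattern in one place before working through the skeletons, here is a minimal sketch of a Pydantic input/output pair registered as a FastMCP tool. It is illustrative rather than this project's actual code: the tool name, schema names, and the fastmcp import path are assumptions, and your schemas.py and server.py will look different.

```python
# Minimal sketch (illustrative): one Pydantic input/output pair and one FastMCP tool.
from fastmcp import FastMCP  # assumption: the standalone fastmcp package
from pydantic import BaseModel, Field

mcp = FastMCP("text-analysis-demo")

class EchoInput(BaseModel):
    """Input schema: validated before the tool runs."""
    text: str = Field(..., min_length=1, description="Text to echo back")

class EchoOutput(BaseModel):
    """Output schema: structured, machine-readable result."""
    text: str
    n_words: int

@mcp.tool
def echo(input: EchoInput) -> EchoOutput:
    """Echo the text back along with a word count."""
    return EchoOutput(text=input.text, n_words=len(input.text.split()))

if __name__ == "__main__":
    mcp.run()  # serves over STDIO by default
```

The point of the schemas is that malformed input is rejected before your logic runs, and the output is structured data any MCP client can consume directly.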
Deliverable: Completed notebook and shared corpus
Part 2: Baseline Tool Implementation
Objectives (Implement together as a group):
- Implement the corpus_answer tool with TF-IDF search
- Implement the text_profile tool with text analytics
- Test the baseline implementation
Tasks:
- Implement the corpus_answer tool (src/mcp_server/tools/corpus_answer.py)
Complete the TODOs in:
- _load_corpus() - Load .txt files from the corpus directory
- _ensure_index() - Build the TF-IDF index from the documents
- _synthesize_answer() - Create concise answer snippets
- corpus_answer() - Main search and ranking logic
Key steps (a TF-IDF retrieval sketch follows this task):
- Load all .txt files from data/corpus/
- Build a TF-IDF vectorizer with appropriate parameters
- Transform the query and compute cosine similarity
- Return the top 3-5 results with snippets and scores
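As a reference for the key steps above, here is a minimal, self-contained sketch of TF-IDF retrieval with scikit-learn. It is not the project's corpus_answer implementation; the function name, corpus path, and vectorizer parameters are assumptions. It only illustrates vectorizing the corpus, scoring a query with cosine similarity, and taking the top results.

```python
# Minimal TF-IDF retrieval sketch (illustrative, not the project's corpus_answer).
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

CORPUS_DIR = Path("data/corpus")  # assumed location of the .txt files

def search(query: str, top_k: int = 3) -> list[tuple[str, float, str]]:
    """Return (doc_id, score, snippet) for the top_k most similar documents."""
    paths = sorted(CORPUS_DIR.glob("*.txt"))
    docs = [p.read_text(encoding="utf-8") for p in paths]

    # Build the index: one TF-IDF vector per document.
    vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
    doc_matrix = vectorizer.fit_transform(docs)

    # Score the query against every document with cosine similarity.
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_matrix).ravel()

    ranked = scores.argsort()[::-1][:top_k]
    return [(paths[i].stem, float(scores[i]), docs[i][:200]) for i in ranked]

if __name__ == "__main__":
    for doc_id, score, snippet in search("carbon emissions policy"):
        print(f"{doc_id}  score={score:.3f}  {snippet[:80]!r}")
```

In the real tool, _load_corpus(), _ensure_index(), and _synthesize_answer() split these responsibilities apart and cache the index rather than rebuilding it for every query.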
- Implement the text_profile tool (src/mcp_server/tools/text_profile.py)
Complete the TODOs in:
- _read_doc() - Read a document by ID from the corpus
- _tokenize() - Extract words from text
- _flesch_reading_ease() - Calculate the readability score
- _top_terms() - Extract keywords using TF-IDF
- text_profile() - Compute all text features
Features to calculate (a metrics sketch follows this task):
- Character and word counts
- Type-token ratio (lexical diversity)
- Flesch Reading Ease score
- VADER sentiment analysis
- Top n-grams and keywords
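For orientation, the sketch below computes most of the listed features on a raw string. It is not the project's text_profile code: the tokenizer is deliberately crude, the syllable count is a rough heuristic, and it assumes the vaderSentiment package is installed.

```python
# Rough text-profiling sketch (illustrative, not the project's text_profile).
import re

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def profile(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Very rough syllable heuristic: count vowel groups per word.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)

    # Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    flesch = (
        206.835
        - 1.015 * (len(words) / max(1, len(sentences)))
        - 84.6 * (syllables / max(1, len(words)))
    )

    return {
        "n_chars": len(text),
        "n_words": len(words),
        "type_token_ratio": len(set(words)) / max(1, len(words)),  # lexical diversity
        "flesch_reading_ease": round(flesch, 1),
        "sentiment": SentimentIntensityAnalyzer().polarity_scores(text),  # VADER
    }

if __name__ == "__main__":
    print(profile("Climate policy is hard. It is also urgent and deeply contested."))
```

Top n-grams and keywords can reuse the TfidfVectorizer from the previous sketch, scoring a single document's terms against the rest of the corpus.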
- Test your tools
# Run tests
make test

# Test specific tool
uv run pytest tests/mcp_server/test_corpus_answer.py -v
- Debug and refine
- Use logging to debug
- Test with different queries and documents
- Ensure all tests pass
Deliverable: Working baseline server with corpus_answer and text_profile tools
Phase 2: Individual Work - Custom Tool Development
Creating Your Own MCP Tool
Now that you understand MCP fundamentals, each student will create their own custom tool on a feature branch.
Objectives (Individual work):
- Apply MCP concepts to your own field or interests
- Design and implement a non-trivial tool
- Write tests for your tool
- Demonstrate domain-specific application
Tasks:
- Create your feature branch
git checkout -b student/my-custom-tool
- Design your tool
Choose a tool relevant to your field or interests. Examples:
- Policy analysis: Extract policy recommendations from documents
- Data science: Statistical analysis or data transformation tool
- Research: Literature review summarization or citation extraction
- Education: Readability adaptation or concept explanation
- Healthcare: Medical terminology extraction or symptom checking
- Environmental: Climate data analysis or carbon footprint calculation
Your tool should:
- Be non-trivial (more complex than a simple calculation)
- Have a clear use case in your domain
- Use Pydantic schemas for inputs/outputs
- Return structured, useful data
- Implement your tool
Create src/mcp_server/tools/my_tool_name.py (a complete hypothetical example follows this task list):
from pydantic import BaseModel, Field

class MyToolInput(BaseModel):
    """Input schema for my tool."""
    # Define your inputs

class MyToolOutput(BaseModel):
    """Output schema for my tool."""
    # Define your outputs

def my_tool(input: MyToolInput) -> MyToolOutput:
    """Your tool implementation."""
    # Your logic here
- Register your tool in src/mcp_server/server.py:
from mcp_server.tools.my_tool_name import my_tool, MyToolInput, MyToolOutput

@mcp.tool
def my_tool_tool(input: MyToolInput) -> MyToolOutput:
    """My custom tool description."""
    return my_tool(input)
- Write tests in tests/mcp_server/test_my_tool.py:
def test_my_tool():
    result = my_tool(MyToolInput(...))
    assert result.some_field == expected_value
- Test and document
- Run make test to verify tests pass
- Run uv run python tests/manual_server_test.py to test end-to-end
- Document your tool's purpose and usage in comments
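To make the skeletons above concrete, here is one possible end-to-end example: a hypothetical keyword_counter tool. Every name in it (the module, schemas, fields, and test) is invented for illustration; substitute your own domain logic.

```python
# Hypothetical example: src/mcp_server/tools/keyword_counter.py (names are illustrative).
import re
from collections import Counter

from pydantic import BaseModel, Field

class KeywordCounterInput(BaseModel):
    """Input: free text plus how many keywords to return."""
    text: str = Field(..., min_length=1)
    top_n: int = Field(5, ge=1, le=50)

class KeywordCounterOutput(BaseModel):
    """Output: (keyword, count) pairs, most frequent first."""
    keywords: list[tuple[str, int]]

def keyword_counter(input: KeywordCounterInput) -> KeywordCounterOutput:
    words = re.findall(r"[a-z']{4,}", input.text.lower())  # crude token filter
    return KeywordCounterOutput(keywords=Counter(words).most_common(input.top_n))

# A matching test, e.g. in tests/mcp_server/test_keyword_counter.py:
def test_keyword_counter():
    result = keyword_counter(KeywordCounterInput(text="policy policy climate", top_n=2))
    assert result.keywords[0] == ("policy", 2)
```

Registration in server.py follows the same @mcp.tool pattern shown above.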
Deliverable: Working custom tool with tests on your feature branch
Demo & Presentation
Objectives:
- Demonstrate your custom tool in action
- Show how it applies MCP concepts to your domain
- Present test results
- Reflect on real-world applications
Tasks:
- Test your server
Option A: Quick test (validate tools work)
make run-interactive
uv run pytest tests/manual_server_test.py -v

Option B: MCP Inspector (full protocol test)
# Terminal 1: Start server
make run-interactive
uv run python -m mcp_server.server

# Terminal 2: Run Inspector on HOST (not in container)
npx @modelcontextprotocol/inspector
# Choose: STDIO transport, command: ./run_mcp_server.sh

See notebooks/MCP_Introduction.ipynb for complete Inspector setup instructions.
- Prepare your demo presentation
Your demo should show:
- All three tools: Baseline tools (corpus_answer, text_profile) + your custom tool
- Your custom tool in depth:
- What problem it solves in your domain
- Example inputs and outputs
- How the Pydantic schemas are designed
- Test results proving it works
- Real-world application: How someone in your field would actually use this tool
- Write documentation for your custom tool
In your tool file or a separate doc, explain:
- What problem your tool solves
- How to use it (with examples)
- Design decisions (why this schema? why this approach?)
- Potential applications in your field
- Limitations and future improvements
- Reflection questions (for your documentation)
- How does your tool address a real need in your domain?
- What challenges did you face in implementing it?
- How could it be extended or improved?
- How might it integrate with other tools or systems?
Final Deliverable:
- Feature branch with your custom tool
- Passing test suite
- Documentation explaining your tool and its domain application
Quick Start
Note: The corpus files are included in the repository at data/corpus/. You can modify or add to them for your project.
Option A: Using VS Code/Cursor (Recommended)
If you're using VS Code or Cursor, you can use the devcontainer:
# Prepare the devcontainer
make devcontainer
# Then in VS Code/Cursor:
# - Command Palette (Cmd/Ctrl+Shift+P)
# - Select "Dev Containers: Reopen in Container"
Option B: Using Make Commands
# Build the Docker image
make build-only
# Test that everything works
make test
Technical Expectations
Prerequisites
We use Docker, Make, uv, and Node.js as part of our curriculum. If you are unfamiliar with them, it is strongly recommended you read over the following:
Required on your HOST machine:
- Docker: An introduction to Docker
- Make: Usually pre-installed on macOS/Linux. Windows users: install via Chocolatey or use WSL
- Node.js: Required for MCP Inspector testing tool
- Install from nodejs.org (LTS version)
- Or use a package manager: brew install node (macOS), apt install nodejs npm (Ubuntu)
- Verify: node --version should show v18.x or higher
Inside the Docker container:
- uv: An introduction to uv - for Python package management
Container-Based Development
All code must be run inside the Docker container. This ensures consistent environments across different machines and eliminates "works on my machine" issues.
Environment Management with uv
We use uv for Python environment and package management inside the container. uv handles:
- Virtual environment creation and management (replaces venv/pyenv)
- Package installation and dependency resolution (replaces pip)
- Project dependency management via pyproject.toml
Important: When running Python code, prefix commands with uv run to maintain the proper environment:
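For example, start the server with uv run python -m mcp_server.server and run the test suite with uv run pytest, rather than calling python or pytest directly.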
Usage & Testing
Running Tests
# Run all pytest tests
make test
# Run specific test file
make run-interactive
uv run pytest tests/mcp_server/test_corpus_answer.py -v
Docker & Make
We use docker and make to run our code. Common make commands:
- make build-only: Build the Docker image only
- make run-interactive: Start an interactive bash session in the container
- make test: Run all tests with pytest
- make devcontainer: Build and prepare the devcontainer for VS Code/Cursor
- make clean: Clean up Docker images and containers
The Makefile documents the exact commands run by each of these targets.
Additional Resources
MCP and FastMCP
Text Analysis Libraries
Reference Implementation
- Review notebooks/MCP_Introduction.ipynb for interactive examples
Style
We use ruff to enforce style standards and grade code quality. This is an automated code checker that looks for specific issues in the code that need to be fixed to make it readable and consistent with common standards. ruff is run before each commit via pre-commit. If it fails, the commit will be blocked and the user will be shown what needs to be changed.
To run pre-commit inside the container:
pre-commit install
pre-commit run --all-files
You can also run ruff directly:
ruff check
ruff format