Internship Scout & Quality of Life MCP Server

Integrates Eurostat quality-of-life metrics and real-time job searching to help users find international internships in high-ranking European cities. It enables ranking cities based on personalized criteria like safety or transport and retrieves structured internship listings via the Tavily API.


Project Report: AI-Powered Internship Scout & European City Quality of Life MCP Server

1. Executive Summary

This project automates the search for an international internship by cross-referencing real-time job availability with official quality-of-life (QoL) metrics. By bridging the gap between employment opportunities and living standards, the system allows users to find roles in cities that align with their personal values.

2. System Architecture

The system uses a containerized architecture managed with Docker, which gives the MCP server and the n8n automation engine a consistent environment.

  • MCP Server: A custom Python application built with FastAPI (exposed via FastMCP) that serves as the
  • n8n: Orchestrates the logic flow, LLM prompting, and final delivery.
  • External APIs:
    • Eurostat: Official European statistical data for city-level indicators (https://ec.europa.eu/eurostat/web/cities/database).
    • Tavily API: An AI-optimized search engine used to scrape real-time internship listings.
    • SimpleMaps: Provides the geographic backbone (lat/long and population) for city data (https://simplemaps.com/data/world-cities).

3. MCP Server Logic & Data Management


3.1 Eurostat Data Manager

The EurostatDataManager class is responsible for the lifecycle of city data, from ingestion to ranking.

Ranking Methodology

The system calculates a weighted, normalized score for each city. Three logic layers keep the ranking meaningful:

  1. Polarity Correction: To ensure a high score always represents "better" quality, the system identifies "negative" indicators (e.g., crime, pollution, or long commute times). For these, the normalized value is inverted.

  2. Normalization: Values are scaled according to their unit types (percentages, minutes, or currency) to allow for a fair comparison between disparate metrics like "Average Rent" and "Air Quality."

  3. Fallback Logic: Recognizing that data for smaller municipalities can be sparse, the server implements a recursive fallback. If no statistical indicators are available for a requested city, it defaults to a population-based ranking to ensure the user still receives valid geographic suggestions.
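The three layers above can be sketched in a few lines of Python. This is a minimal illustration, not the EurostatDataManager code: the indicator codes, the rank_cities function, and the input layout are hypothetical, and plain min-max scaling stands in for the unit-aware normalization described above. The fallback here triggers when none of the requested indicators has data for any candidate city, which is one possible reading of the rule.

```python
# Hypothetical indicator codes; the real Eurostat codes are not listed in this report.
NEGATIVE_INDICATORS = {"crime_rate", "commute_minutes"}


def min_max(values):
    """Scale numbers to [0, 1]; a constant column maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_cities(cities, weights):
    """Return city names sorted from best to worst.

    cities:  {name: {"population": int, "indicators": {code: value}}}
    weights: {code: weight}
    """
    names = list(cities)
    score = {name: 0.0 for name in names}
    used_weight = {name: 0.0 for name in names}

    for code, weight in weights.items():
        # Only cities that report this indicator take part in its scaling.
        reporting = [n for n in names if code in cities[n]["indicators"]]
        if not reporting:
            continue
        scaled = min_max([cities[n]["indicators"][code] for n in reporting])
        for name, value in zip(reporting, scaled):
            if code in NEGATIVE_INDICATORS:
                value = 1.0 - value  # polarity correction: lower raw value is better
            score[name] += weight * value
            used_weight[name] += weight

    if not any(used_weight.values()):
        # Fallback (assumed trigger): no indicator data at all, rank by population.
        return sorted(names, key=lambda n: cities[n]["population"], reverse=True)

    def weighted_mean(name):
        total = used_weight[name]
        return score[name] / total if total else 0.0

    return sorted(names, key=weighted_mean, reverse=True)
```

Dividing by the weight actually used keeps a city with one missing indicator comparable to cities that report all of them, which is a design choice of this sketch rather than something the report specifies.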

3.2 Tavily API Manager

The TavilyManager serves as the bridge to the live web. Unlike standard search engines, Tavily is optimized for LLM context, returning clean, high-relevance content.
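A minimal sketch of what a search call might look like follows. The search_offers signature, the query template, and the returned fields are assumptions for illustration; the report does not show the TavilyManager internals. The client is passed in so the function can be exercised without network access, and its search(query=..., max_results=...) call is modeled on the tavily-python client, whose responses carry a "results" list.

```python
def build_query(city, country, field):
    """Compose a plain-text search query (hypothetical template)."""
    return f"{field} internship {city} {country}"


def search_offers(client, city, country, field, max_results=5):
    """Run one search and keep only the fields later stages need.

    client is any object with a search(query=..., max_results=...) method
    that returns a dict containing a "results" list.
    """
    response = client.search(
        query=build_query(city, country, field),
        max_results=max_results,
    )
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "content": item.get("content", ""),
        }
        for item in response.get("results", [])
    ]


# Possible usage with the real client (requires the tavily-python package
# and a TAVILY_API_KEY environment variable):
#   import os
#   from tavily import TavilyClient
#   client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
#   offers = search_offers(client, "Basel", "Switzerland", "Computer Science")
```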

4. Workflow Implementation

The n8n workflow follows an 8-stage pipeline to transform a user's vague preferences into a structured report:

  1. User Input Capture: Collects target technologies, countries of interest, and QoL preferences.

  2. Schema Alignment: Retrieves valid Eurostat criteria via get_criteria_list().

  3. LLM Mapping: Translates natural language (e.g., "I hate traffic") into database-friendly codes.

  4. Multi-City Ranking: Identifies the top 2 cities per country that match the weighted criteria.

  5. Targeted Web Scraping: Calls search_offers() for each identified city.

  6. Information Extraction: An LLM parses raw HTML/text into a structured schema: Company, Position, Skills, Summary, Location, and URL.

  7. Data Consolidation: Aggregates the top 10 offers into a CSV and a Discord-ready summary.

  8. Final Delivery: Disseminates the findings via Discord webhook.
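Stages 6 and 7 can be illustrated with a short Python sketch. In the project these steps run inside n8n nodes, so the code below is only a rendering of the idea: the Offer dataclass and the offers_to_csv helper are hypothetical names, and "top 10" is taken to mean the first ten offers in the order received, since the report does not say how offers are ranked.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Offer:
    """Structured schema from stage 6 (field names lowercased for Python)."""
    company: str
    position: str
    skills: str
    summary: str
    location: str
    url: str


def offers_to_csv(offers, limit=10):
    """Serialize at most `limit` offers to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Offer)])
    writer.writeheader()
    for offer in offers[:limit]:
        writer.writerow(asdict(offer))
    return buffer.getvalue()
```

The csv module quotes fields that contain commas, so free-text summaries do not break the column layout.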


5. Setup and Installation

To get this project up and running, follow these steps:

Prerequisites

  • Python 3.12

  • Docker & Docker Compose

  • uv

5.1. Clone the Repository

First, clone the project repository to your local machine:

git clone git@github.com:c-r-lewis/n8n-international-internship-scout.git

5.2. Environment Configuration

Create a .env file in the root of the project directory. This file will hold your environment variables, including API keys.

TAVILY_API_KEY="your_tavily_api_key_here"
POSTGRES_PASSWORD="your_n8n_postgres_password_here"

Note: Obtain your Tavily API key from the Tavily website.

5.3. Install Dependencies

Using uv, install the project dependencies:

uv sync

5.4. Prepare Data

The application relies on Eurostat data and a world cities dataset.

  • Eurostat Data: The eurostat_data_manager.py script populates the eurostat_data/cities.db SQLite database. To ingest the data, open the server in the MCP Inspector and call the download_eurostat_data function.

  • World Cities Data: Ensure ressources/worldcities.csv is present. This file provides base geographical information for cities.

6. Running the Application

You can run the application using Docker Compose for a containerized setup or directly using uvicorn for local development.

6.1. With Docker Compose (Recommended)

This method ensures all services run in isolated containers.

docker compose up --build

The application should then be accessible at http://localhost:8000 (or the port defined in your docker-compose.yml).

6.2. Locally (for Development)

Ensure you have activated your Python virtual environment (if using one) and installed dependencies (uv sync).

uvicorn app:app --reload --host 0.0.0.0 --port 8000

The --reload flag enables auto-reloading on code changes, which is useful for development. The application will be available at http://localhost:8000.

6.3. Setting credentials for n8n workflow

To run the workflow in the n8n interface, set two credentials first: a Mistral cloud API secret and a Discord webhook secret.

7. Example Results

For a search in the United Kingdom and Switzerland targeting "Computer Science", with a focus on low crime and good transport, the system identified roles in cities such as Basel and Southampton. It provided a direct Excel export and an automated Discord notification for immediate review.

8. Challenges & Solutions

  • Multi-dimensional Data Parsing: Eurostat’s JSON-STAT format is difficult to flatten. The solution involved developing a custom _index_to_coords mapper to navigate the nested dimension sizes and IDs programmatically.

  • Payload Management: Discord’s 2000-character limit posed a risk for the final summary. I implemented a "round-robin" selection logic in the final Code node so that the summary includes a representative sample from every top-ranked city without exceeding the character cap.
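Both solutions can be sketched briefly in Python. These are illustrations under stated assumptions, not the project's code: index_to_coords is a stand-in for the _index_to_coords mapper and relies on the JSON-stat convention that the flat value array is stored in row-major order (the last dimension varies fastest), and round_robin_summary is a hypothetical helper that assumes each offer has already been rendered as one line of text.

```python
def index_to_coords(flat_index, sizes):
    """Convert a flat position in a JSON-stat value array to per-dimension positions.

    sizes is the dataset's "size" list; the last dimension varies fastest.
    """
    coords = []
    for size in reversed(sizes):
        coords.append(flat_index % size)
        flat_index //= size
    return list(reversed(coords))


def round_robin_summary(lines_by_city, limit=2000):
    """Take one line per city in turn until the next line would exceed the limit."""
    queues = {city: list(lines) for city, lines in lines_by_city.items()}
    picked = []
    used = 0
    while any(queues.values()):
        for queue in queues.values():
            if not queue:
                continue
            line = queue.pop(0)
            # Count the newline that joins this line to the previous one.
            cost = len(line) + (1 if picked else 0)
            if used + cost > limit:
                return "\n".join(picked)
            picked.append(line)
            used += cost
    return "\n".join(picked)
```

For a dataset with sizes [2, 3, 4], position 17 maps to [1, 1, 1] because 1*12 + 1*4 + 1 = 17. The round-robin loop stops at the first line that does not fit, so a long offer from one city cannot crowd out the others that were already selected.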
