IMDB MCP
Enables semantic and similarity search across IMDB movie data using vector embeddings and PostgreSQL with pgvector, supporting traditional filters and hybrid search.
Model Context Protocol (MCP) server for movie data with semantic vector search using embeddings and PostgreSQL with pgvector.
Overview
Provides semantic search, similarity matching, and traditional filtering across IMDB movie data:
- Semantic Search: Find movies by meaning using embeddings
- Similarity Search: Get similar movies based on descriptions
- Hybrid Search: Combine semantic and keyword matching
- Traditional Filters: Genre, country, title, ratings
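As a rough illustration of what "find movies by meaning" amounts to, the sketch below ranks movies by cosine similarity between a query embedding and stored movie embeddings. It is not the project's implementation: the function names, the `MOVIES` dictionary, and the 3-dimensional vectors are all made up for the example, and a real deployment would compute embeddings with a model and let pgvector do the ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, movie_vecs, top_k=3):
    """Return up to top_k (title, score) pairs, best match first."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in movie_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical values, for illustration only).
MOVIES = {
    "Space Odyssey": [0.9, 0.1, 0.0],
    "Romantic Comedy": [0.0, 0.2, 0.9],
    "Alien Invasion": [0.8, 0.3, 0.1],
}

if __name__ == "__main__":
    for title, score in rank_by_similarity([1.0, 0.0, 0.0], MOVIES, top_k=2):
        print(f"{title}: {score:.3f}")
```

Similarity search works the same way, except that the query vector is the embedding of an existing movie's description rather than of free text.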
Setup
Prerequisites
- Python 3.12+
- PostgreSQL 12+ with pgvector extension
- GCP Secret Manager (for credentials)
- ~400MB for embedding model download
Installation
uv sync
Environment
Set the required environment variables:
export GCP_PROJECT_ID=your-gcp-project-id
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GCP Secret Manager must contain:
- db-host: PostgreSQL host
- db-port: PostgreSQL port
- db-name: Database name
- db-user: Database user
- db-password: Database password
- db-admin-password: Admin password
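One possible way to turn these secrets into a connection string is sketched below. The `build_dsn` helper and the `get_secret` callable are hypothetical: the sketch assumes the secrets have already been fetched (for example by a thin wrapper around the Secret Manager client) and only shows how the five connection values might be combined. It does not use db-admin-password, and it is not necessarily how this project builds its connection.

```python
from urllib.parse import quote

# Secret names taken from the list above; db-admin-password is left out on purpose.
REQUIRED_SECRETS = ("db-host", "db-port", "db-name", "db-user", "db-password")


def build_dsn(get_secret):
    """Build a PostgreSQL connection URI from secret values.

    get_secret is any callable mapping a secret name to its string value
    (hypothetical; in practice a wrapper around your secret store).
    """
    values = {name: get_secret(name) for name in REQUIRED_SECRETS}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise KeyError(f"missing secrets: {', '.join(missing)}")
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    user = quote(values["db-user"], safe="")
    password = quote(values["db-password"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{values['db-host']}:{values['db-port']}/{values['db-name']}"
    )


if __name__ == "__main__":
    fake_secrets = {
        "db-host": "localhost",
        "db-port": "5432",
        "db-name": "movies",
        "db-user": "imdb",
        "db-password": "p@ss",
    }
    print(build_dsn(fake_secrets.get))
```

Passing the lookup in as a callable keeps the helper testable without GCP credentials.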
Usage - Database
Run the ETL pipeline to set up and seed the database:
python extract.py # Extract from source
python transform.py # Generate embeddings
python load.py # Load into PostgreSQL with pgvector
Before running the pipeline, place the source CSV file in the data/ folder as data/imdb_movies.csv.
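To give a feel for what the load step involves, here is a minimal sketch of storing and querying embeddings with pgvector. The table name, column names, 3-dimensional vector size, and `%s` placeholder style (as used by psycopg) are assumptions for illustration; the project's actual schema is defined by its own scripts. The parts that are standard pgvector are the `vector` extension, the `vector(n)` column type, the `[x,y,z]` text format, and the `<=>` cosine distance operator.

```python
def to_pgvector_literal(embedding):
    """Format a sequence of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Hypothetical schema: a real embedding column would use the model's dimension.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS movies (
    id BIGINT PRIMARY KEY,
    title TEXT NOT NULL,
    embedding vector(3)
);
"""

# Parameterized statements; %s placeholders assume a psycopg-style driver.
INSERT_SQL = (
    "INSERT INTO movies (id, title, embedding) VALUES (%s, %s, %s::vector)"
)
SEARCH_SQL = (
    "SELECT id, title FROM movies ORDER BY embedding <=> %s::vector LIMIT %s"
)


def insert_params(movie_id, title, embedding):
    """Parameters for INSERT_SQL, with the embedding as a pgvector literal."""
    return (movie_id, title, to_pgvector_literal(embedding))


if __name__ == "__main__":
    print(insert_params(1, "Space Odyssey", [0.9, 0.1, 0.0]))
```

With a driver connection in hand, `cursor.execute(INSERT_SQL, insert_params(...))` would write one row, and `SEARCH_SQL` would return the rows nearest to a query vector by cosine distance.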
Usage - MCP
Start the MCP server:
python -m mcp_server
Server runs on port 3000 with tools for:
- semantic_search: Search by description meaning
- similarity_search: Find similar movies
- hybrid_search: Combined semantic and keyword search
- get_movie_by_id: Retrieve movie details
- search_movies: Title-based search
- Additional filtering and stats tools
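The README does not say how hybrid_search combines its two signals, so the following is only one common approach, sketched under assumptions: a weighted blend of a semantic score and a simple keyword-overlap score. The function names, the `alpha` weight, and the word-overlap measure are all hypothetical and may differ from what the server does.

```python
def keyword_score(query, text):
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def hybrid_score(semantic, keyword, alpha=0.7):
    """Weighted blend of two scores; alpha is the weight on the semantic score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * semantic + (1.0 - alpha) * keyword


def hybrid_rank(query, candidates, alpha=0.7):
    """Rank (title, description, semantic_score) tuples by blended score, best first."""
    scored = [
        (title, hybrid_score(sem, keyword_score(query, desc), alpha))
        for title, desc, sem in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = [
        ("A", "a love story", 0.6),
        ("B", "a war in deep space", 0.5),
    ]
    for title, score in hybrid_rank("space war", candidates, alpha=0.5):
        print(title, round(score, 3))
```

A higher `alpha` favors matches by meaning, while a lower one favors literal keyword hits. Both inputs are assumed to be on a comparable 0 to 1 scale.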
Tests
Run manually via GitHub Actions or locally:
uv run pytest tests/ -v --cov=. --cov-report=term-missing
Future
My next step for this project is to host the PostgreSQL database on GCP and connect the MCP server to it instead of a local PostgreSQL database.
Deployment
Currently this project is meant for local use only, but I have added workflows for deployment to GCP, with a small modification to the MCP server to read from BigQuery or Cloud SQL instead of a local PostgreSQL database.
Contributing
- Write tests for new features
- Run test suite locally
- Push to feature branch
- Manual test trigger in Actions
- Deploy on approval