MCP Research Tools
Local-first MCP server that gives Claude Code web search, page reading, video transcription, and image analysis — without paid API keys. Runs SearXNG + whisper.cpp natively on Apple Silicon for zero-cost, low-latency research workflows.
Why this exists
If you run AI agents that search the web as part of their workflow, the API bills add up fast. A single search API call costs fractions of a cent, but when your agent is working around the clock — researching, fetching pages, pulling transcripts — those fractions compound into real money. This server eliminates that cost entirely by running everything locally: SearXNG aggregates 70+ search engines with no API key, trafilatura extracts clean text from any page, and whisper.cpp transcribes audio on-device using Metal acceleration. Plug it into Claude Code via MCP and your agent can research freely without a meter running.
Architecture
     Docker                             Native (Mac mini)
┌──────────────┐                   ┌─────────────────────────┐
│   SearXNG    │◄── JSON API ──────│  MCP Server (FastMCP)   │
│   :8080      │                   │  ├─ searxng_search      │
│   Redis      │                   │  ├─ web_fetch           │
└──────────────┘                   │  ├─ process_video       │
                                   │  └─ analyze_image       │
                                   └────────┬────────────────┘
                                      stdio │ streamable-http
                                   Claude Code / Agent Harness
SearXNG + Redis run in Docker. The MCP server and media tools (ffmpeg, yt-dlp, whisper-cli) run natively so whisper gets Apple Silicon Metal acceleration.
Quick Start
git clone https://github.com/hippogriff-ai/mcp-research-tools.git
cd mcp-research-tools
chmod +x install.sh
./install.sh
The install script:
- Installs ffmpeg, yt-dlp, whisper-cpp via Homebrew
- Downloads the Whisper small model (~465MB)
- Starts SearXNG + Redis via Docker Compose
- Creates a Python venv and installs the project
Tools
searxng_search — Web Search
Queries the local SearXNG instance (70+ search engines aggregated, zero API cost).
searxng - searxng_search(query="your search", max_results=10, engines="duckduckgo,brave")
Returns: { query, results: [{ title, url, snippet, engine, score }], count }
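To make the return shape above concrete, here is a minimal sketch of how a SearXNG JSON response could be mapped onto it. The helper names `build_search_url` and `normalize_results` are hypothetical, and mapping SearXNG's `content` field to `snippet` is an assumption about how the server behaves, not something confirmed by this README.

```python
from urllib.parse import urlencode

SEARXNG_URL = "http://localhost:8080"  # default SEARXNG_URL from this README


def build_search_url(query, engines=None, base_url=SEARXNG_URL):
    """Build a SearXNG JSON API URL (hypothetical helper)."""
    params = {"q": query, "format": "json"}
    if engines:
        # Comma-separated engine names, e.g. "duckduckgo,brave"
        params["engines"] = engines
    return f"{base_url}/search?{urlencode(params)}"


def normalize_results(query, payload, max_results=10):
    """Map a SearXNG JSON payload onto { query, results, count }.

    Assumption: the snippet comes from SearXNG's "content" field.
    """
    results = [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "snippet": item.get("content", ""),
            "engine": item.get("engine", ""),
            "score": item.get("score", 0.0),
        }
        for item in payload.get("results", [])[:max_results]
    ]
    return {"query": query, "results": results, "count": len(results)}
```

The payload passed to `normalize_results` would be the parsed JSON body returned by the URL from `build_search_url`.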
web_fetch — Page Fetch
Downloads a web page and extracts clean readable text via trafilatura. No DOM bloat.
searxng - web_fetch(url="https://example.com", extract_text=true)
Returns: { status_code, url, content_type, content_text, title }
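The README does not show how the download size limit is enforced. One plausible approach, sketched here under that assumption, is to read the response in chunks and stop once the cap is exceeded. `read_capped` is a hypothetical helper; the 10 MB default mirrors MAX_FETCH_SIZE_MB.

```python
import io

MAX_FETCH_SIZE_MB = 10  # default from this README's configuration


def read_capped(stream, max_mb=MAX_FETCH_SIZE_MB, chunk_size=64 * 1024):
    """Read a binary stream in chunks, raising once it exceeds max_mb.

    Hypothetical helper: works with any object exposing .read(n),
    such as an HTTP response body or io.BytesIO.
    """
    limit = int(max_mb * 1024 * 1024)
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf.extend(chunk)
        if len(buf) > limit:
            raise ValueError(f"response exceeds {max_mb} MB limit")
    return bytes(buf)


# Example with an in-memory stream standing in for a response body
body = read_capped(io.BytesIO(b"<html><body>hello</body></html>"))
```

Reading incrementally means an oversized page is rejected shortly after the cap is crossed rather than after the whole body has been buffered.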
process_video — Video Processing
Downloads YouTube/TikTok videos, trims to 120s, extracts audio + keyframes every 3s, transcribes audio via whisper-cli.
searxng - process_video(url="https://youtube.com/watch?v=...")
Returns: { transcript, frames: ["/tmp/.../frame_0001.jpg", ...], audio_path, duration_seconds }
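The pipeline described above (trim, extract audio, sample keyframes, transcribe) could be assembled from commands like the following. This is a sketch, not the server's actual code: the function names are hypothetical, and the specific flags are an assumption about how the steps might be invoked. The 16 kHz mono WAV output is chosen because that is the input format whisper.cpp works with.

```python
from pathlib import Path

MAX_VIDEO_SECONDS = 120   # defaults from this README's configuration
FRAME_EVERY_SECONDS = 3


def audio_cmd(video, out_wav, max_seconds=MAX_VIDEO_SECONDS):
    """ffmpeg command: trim, drop video, write 16 kHz mono PCM WAV."""
    return ["ffmpeg", "-y", "-i", str(video), "-t", str(max_seconds),
            "-vn", "-ac", "1", "-ar", "16000", "-c:a", "pcm_s16le",
            str(out_wav)]


def frames_cmd(video, out_dir, every=FRAME_EVERY_SECONDS,
               max_seconds=MAX_VIDEO_SECONDS):
    """ffmpeg command: write one JPEG frame every `every` seconds."""
    pattern = Path(out_dir) / "frame_%04d.jpg"
    return ["ffmpeg", "-y", "-i", str(video), "-t", str(max_seconds),
            "-vf", f"fps=1/{every}", str(pattern)]


def whisper_cmd(wav, model_path, binary="whisper-cli"):
    """whisper-cli command: transcribe a WAV and write a .txt next to it."""
    model = Path(model_path).expanduser()
    return [binary, "-m", str(model), "-f", str(wav), "-otxt"]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Building the commands as lists rather than shell strings avoids quoting problems with unusual file names.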
analyze_image — Image Analysis
Fetches an image from a URL (caching it locally) or validates a local path, so that Claude can read the file for vision reasoning.
searxng - analyze_image(source="https://example.com/photo.jpg")
Returns: { path, exists, content_type }
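As an illustration of the caching and validation behaviour described above, the sketch below derives a stable cache filename from a URL and reports the documented `{ path, exists, content_type }` shape for a local file. The cache directory, the hashing scheme, and both function names are assumptions made for this example.

```python
import hashlib
import mimetypes
from pathlib import Path
from urllib.parse import urlparse

CACHE_DIR = Path("/tmp/mcp-image-cache")  # hypothetical cache location


def cache_path_for(url, cache_dir=CACHE_DIR):
    """Derive a stable local filename from the URL (hypothetical scheme).

    The same URL always maps to the same path, so a repeat request
    can reuse the file instead of downloading it again.
    """
    suffix = Path(urlparse(url).path).suffix or ".bin"
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}{suffix}"


def describe(path):
    """Return { path, exists, content_type } for a local file path."""
    p = Path(path).expanduser()
    # The content type is guessed from the extension, not the file bytes.
    content_type, _ = mimetypes.guess_type(p.name)
    return {"path": str(p), "exists": p.is_file(), "content_type": content_type}
```

Guessing the type from the extension is a simplification; a stricter version might inspect the file header instead.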
Usage
Claude Code (stdio)
Add to your project .mcp.json:
{
"mcpServers": {
"searxng": {
"command": "/path/to/mcp-research-tools/venv/bin/python",
"args": ["-m", "mcp_research_tools.server"]
}
}
}
Or add globally in ~/.claude.json under the mcpServers key for every session.
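If the config file already lists other servers, the entry can be merged in rather than pasted by hand. The following is a small sketch of that idea; `add_server` is a hypothetical helper and not part of this project.

```python
import json
from pathlib import Path


def add_server(config_path, name, command, args):
    """Add or replace one entry under mcpServers, keeping other entries."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {
        "command": command,
        "args": list(args),
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_server(".mcp.json", "searxng", "/path/to/mcp-research-tools/venv/bin/python", ["-m", "mcp_research_tools.server"])` would produce the entry shown above.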
Remote Agent Harness (streamable-http)
source venv/bin/activate
MCP_TRANSPORT=streamable-http MCP_HOST=0.0.0.0 MCP_PORT=9000 \
python -m mcp_research_tools.server
Connect from the client at http://<mac-mini-ip>:9000/mcp.
Configuration
All settings via environment variables:
| Variable | Default | Description |
|---|---|---|
| SEARXNG_URL | http://localhost:8080 | SearXNG instance URL |
| MCP_TRANSPORT | stdio | Transport: stdio or streamable-http |
| MCP_HOST | 127.0.0.1 | Bind host for HTTP transport |
| MCP_PORT | 9000 | Port for HTTP transport |
| MAX_VIDEO_SECONDS | 120 | Max video duration to process (seconds) |
| FRAME_EVERY_SECONDS | 3 | Keyframe extraction interval (seconds) |
| WHISPER_MODEL | small | Whisper model (tiny/base/small/medium/large) |
| WHISPER_CPP_BIN | whisper-cli | Whisper binary name |
| WHISPER_MODEL_PATH | ~/.cache/whisper-cpp/ggml-small.bin | Path to whisper model |
| MAX_FETCH_SIZE_MB | 10 | Max page download size (MB) |
| FETCH_TIMEOUT_SECONDS | 30 | Page fetch timeout (seconds) |
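For readers wiring these variables into their own tooling, the sketch below shows one way to read a subset of them with the documented defaults. The `Settings` class and `load_settings` function are hypothetical and do not describe the server's actual configuration code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    searxng_url: str
    transport: str
    host: str
    port: int
    max_video_seconds: int
    frame_every_seconds: int
    max_fetch_size_mb: int
    fetch_timeout_seconds: int


def load_settings(env=None):
    """Read settings from an env mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio")
    if transport not in ("stdio", "streamable-http"):
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    return Settings(
        searxng_url=env.get("SEARXNG_URL", "http://localhost:8080"),
        transport=transport,
        host=env.get("MCP_HOST", "127.0.0.1"),
        port=int(env.get("MCP_PORT", "9000")),
        max_video_seconds=int(env.get("MAX_VIDEO_SECONDS", "120")),
        frame_every_seconds=int(env.get("FRAME_EVERY_SECONDS", "3")),
        max_fetch_size_mb=int(env.get("MAX_FETCH_SIZE_MB", "10")),
        fetch_timeout_seconds=int(env.get("FETCH_TIMEOUT_SECONDS", "30")),
    )
```

Accepting an explicit mapping keeps the function easy to test without touching the real process environment.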
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Docker Desktop
- Homebrew
- Python 3.13+
Development
source venv/bin/activate
# Run tests
pytest tests/ -v
# Lint
ruff check src/ tests/
Managing SearXNG
# Start
docker compose up -d
# Stop
docker compose down
# Logs
docker compose logs -f searxng
# Test JSON API
curl 'http://localhost:8080/search?q=test&format=json' | python3 -m json.tool
Acknowledgements
This project is built on top of SearXNG, a free internet metasearch engine that aggregates results from 70+ search services. SearXNG is what makes zero-cost, private web search possible — massive thanks to their maintainers and contributors.