MCP Local LLM Server

A FastMCP server that exposes a locally-hosted llama.cpp language model as MCP tools, plus a suite of utility tools for weather, news, web fetching, file I/O, stock data, summarization, and more.

Overview

This server lets any MCP-compatible client (e.g. Claude Desktop, Cursor) use a local GGUF model for text generation and chat, without sending data to an external API. The model is served by a llama-server subprocess; the MCP server communicates with it over a local HTTP port.

src/
├── llm_server.py        # Entry point, argument parsing, server startup
├── model.py             # llama-server lifecycle, token generation
├── upload.py            # POST /upload endpoint for PDF and Markdown file uploads
├── resources.py         # MCP resource: llm://info
└── tools/
    ├── generate.py      # Tool: generate
    ├── chat.py          # Tool: chat
    ├── weather.py       # Tool: get_weather
    ├── date_time.py     # Tool: get_datetime
    ├── fetch_url.py     # Tool: fetch_url
    ├── news.py          # Tool: news_headlines
    ├── read_pdf.py      # Tool: read_pdf
    ├── read_markdown.py # Tool: read_markdown
    ├── create_file.py   # Tool: create_file
    ├── list_directory.py # Tool: list_directory
    ├── stock_price.py   # Tool: get_stock_price
    ├── summarize.py     # Tool: summarize_text
    ├── agent.py         # Tool: run_agent (autonomous ReAct agent)
    ├── explain_code.py  # Tool: explain_code (coding tutor)
    ├── review_code.py   # Tool: review_code  (coding tutor)
    ├── coding_tutor.py  # Tool: coding_tutor (orchestrating tutor agent)
    ├── transcribe_audio.py # Tool: transcribe_audio
    ├── text_to_speech.py   # Tool: text_to_speech
    ├── word_definition.py  # Tool: define_word
    └── random_joke.py      # Tool: get_random_joke

Requirements

Python 3.10+
A llama-server binary from llama.cpp
A GGUF model file (e.g. models/Qwen2.5-7B-Instruct-Q4_K_M.gguf)
CUDA-capable GPU recommended for large models

Install dependencies:

pip install -r requirements.txt

On Windows, IANA timezone data is not bundled with Python. Install it for the get_datetime tool to support non-UTC timezones:

pip install tzdata

The transcribe_audio tool requires faster-whisper, which is listed in requirements.txt. GPU acceleration is optional: for GPU inference, install the matching PyTorch CUDA build first, then faster-whisper:

# CPU-only (default)
pip install faster-whisper

# GPU (CUDA 12.8 build of PyTorch)
pip install torch --index-url https://download.pytorch.org/whl/cu128
pip install faster-whisper

Configuration

Some tools require API keys. Create a .env file in the project root (already gitignored):

NEWSAPI_KEY=your_key_here

The server loads this file automatically on startup.

Variable	Required by	Where to get it
`NEWSAPI_KEY`	`news_headlines`	newsapi.org — free tier available

All other tools work without any API key.

Starting the Server

stdio transport (for MCP clients like Claude Desktop)

python src/llm_server.py \
  --model models/Qwen2.5-7B-Instruct-Q4_K_M.gguf \
  --llama-server /path/to/llama-server

HTTP transport (for network clients or testing with curl)

python src/llm_server.py \
  --model models/Qwen2.5-7B-Instruct-Q4_K_M.gguf \
  --llama-server /path/to/llama-server \
  --transport http \
  --port 5174

CLI flags

Flag	Default	Description
`--model`	(required)	Path to a `.gguf` file, or a directory containing one
`--llama-server`	(required)	Path to the `llama-server` executable
`--transport`	`stdio`	`stdio` or `http`
`--host`	`0.0.0.0`	Host to bind for HTTP transport (use `127.0.0.1` for localhost only)
`--port`	`5174`	MCP server port (HTTP transport only)
`--server-port`	`8080`	Port for the internal llama-server backend
`--gpu-layers`	`-1`	Layers to offload to GPU; `-1` = all
`--context-size`	`16384`	Total context window in tokens (prompt + output combined)
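
As a rough illustration of how the flags in the table map onto an argument parser, here is a minimal sketch using argparse. It mirrors the documented flag names and defaults only; the function name `build_parser` is hypothetical and the actual parsing code in llm_server.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the CLI flags table (illustrative only)."""
    parser = argparse.ArgumentParser(description="MCP Local LLM Server (sketch)")
    parser.add_argument("--model", required=True,
                        help="Path to a .gguf file, or a directory containing one")
    parser.add_argument("--llama-server", required=True,
                        help="Path to the llama-server executable")
    parser.add_argument("--transport", choices=["stdio", "http"], default="stdio")
    parser.add_argument("--host", default="0.0.0.0",
                        help="Use 127.0.0.1 to accept local connections only")
    parser.add_argument("--port", type=int, default=5174)
    parser.add_argument("--server-port", type=int, default=8080)
    parser.add_argument("--gpu-layers", type=int, default=-1)
    parser.add_argument("--context-size", type=int, default=16384)
    return parser


# Parse the HTTP example from above without touching sys.argv.
args = build_parser().parse_args([
    "--model", "models/Qwen2.5-7B-Instruct-Q4_K_M.gguf",
    "--llama-server", "/path/to/llama-server",
    "--transport", "http",
])
print(args.transport, args.host, args.port, args.context_size)
```

Flags that are omitted fall back to the defaults listed in the table, so the example above binds to 0.0.0.0 on port 5174.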

HTTP Endpoints

These endpoints are only available when using --transport http.

`POST /upload`

Upload a PDF or Markdown file to the server and receive an upload_id to pass to run_agent.

Supported types: .pdf, .md, .markdown

Request: multipart/form-data with a single field named file.

Response:

{
  "upload_id": "3f8a1c...",
  "filename": "report.pdf",
  "size": 84210
}

Uploaded files are stored in uploads/ at the project root and deleted when the server shuts down.

Example:

curl -X POST http://localhost:5174/upload \
  -F "file=@/path/to/report.pdf"

curl -X POST http://localhost:5174/upload \
  -F "file=@/path/to/notes.md"
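
The same upload can be done from Python. The sketch below is a client-side example that uses only the standard library; the helper names `build_multipart` and `upload` are hypothetical, and the base URL assumes the default port from the curl examples.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

ALLOWED_SUFFIXES = {".pdf", ".md", ".markdown"}  # supported types listed above


def build_multipart(filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data under the field name 'file'.

    Returns the request body and the matching Content-Type header value.
    """
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {filename}")
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{Path(filename).name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(path: str, base_url: str = "http://localhost:5174") -> dict:
    """POST a file to /upload and return the decoded JSON response."""
    body, content_type = build_multipart(path, Path(path).read_bytes())
    request = urllib.request.Request(
        f"{base_url}/upload", data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running in HTTP mode, `upload("/path/to/report.pdf")["upload_id"]` would give the value to pass to run_agent.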

Tools

`generate`

Generate text from a raw prompt. The input prompt is not included in the returned text.

Parameters

Parameter	Type	Default	Description
`prompt`	`string`	(required)	Input text to continue
`max_new_tokens`	`int`	`512`	Maximum tokens to generate
`temperature`	`float`	`0.7`	Sampling temperature; `0` = greedy (deterministic)
`top_p`	`float`	`0.9`	Nucleus-sampling cumulative probability cutoff
`top_k`	`int`	`0`	Top-k vocabulary filter; `0` = disabled
`repetition_penalty`	`float`	`1.0`	Penalty for repeating tokens; `1.0` = no penalty
`stop_sequences`	`list[string]`	`null`	Strings that halt generation when produced
`seed`	`int`	`null`	RNG seed for reproducible outputs

Returns: The generated text as a plain string.

Example

{
  "prompt": "The capital of France is",
  "max_new_tokens": 50,
  "temperature": 0
}
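
To make the sampling parameters concrete, the sketch below shows one common way temperature, top_k, and top_p narrow a token distribution before a token is drawn. It is a teaching illustration on toy scores, not the code llama-server runs; llama.cpp applies its own sampler chain, and the function name `filter_distribution` is hypothetical.

```python
import math


def filter_distribution(logits: dict[str, float], temperature: float = 0.7,
                        top_k: int = 0, top_p: float = 0.9) -> dict[str, float]:
    """Return the renormalised token probabilities left after filtering."""
    if temperature == 0:
        # Greedy decoding: all probability mass on the highest-scoring token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda item: item[1], reverse=True)

    if top_k > 0:  # 0 disables the top-k filter
        ranked = ranked[:top_k]

    # Nucleus (top-p): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        mass += prob
        if mass >= top_p:
            break

    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


toy_logits = {"Paris": 5.0, "Lyon": 2.0, "Nice": 1.0}
print(filter_distribution(toy_logits, temperature=0))
print(filter_distribution(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

Lower temperatures and smaller top_p values concentrate the mass on fewer tokens, which is why temperature 0 is described as deterministic.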

`chat`

Chat with the local LLM using a conversation history.

Parameters

Parameter	Type	Default	Description
`messages`	`list[{"role": string, "content": string}]`	(required)	Conversation history. Valid roles: `"system"`, `"user"`, `"assistant"`
`max_new_tokens`	`int`	`512`	Maximum tokens to generate
`temperature`	`float`	`0.7`	Sampling temperature; `0` = greedy (deterministic)
`top_p`	`float`	`0.9`	Nucleus-sampling cumulative probability cutoff
`top_k`	`int`	`0`	Top-k vocabulary filter; `0` = disabled
`repetition_penalty`	`float`	`1.0`	Penalty for repeating tokens; `1.0` = no penalty
`stop_sequences`	`list[string]`	`null`	Strings that halt generation when produced
`seed`	`int`	`null`	RNG seed for reproducible outputs

Returns: The assistant's reply as plain text.

Example

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",   "content": "What is the Eiffel Tower?"}
  ],
  "temperature": 0.7,
  "max_new_tokens": 256
}

`get_weather`

Fetch current weather for any location using the free Open-Meteo API. No API key required.

Parameters

Parameter	Type	Default	Description
`location`	`string`	(required)	City name or region (e.g. `"London"`, `"New York"`, `"Tokyo"`)
`units`	`string`	`"metric"`	`"metric"` (°C, km/h) or `"imperial"` (°F, mph)

Returns: Current conditions, temperature, humidity, and wind speed.

Example output

Weather in London, England, United Kingdom:
  Conditions:  Partly cloudy
  Temperature: 12.3°C
  Humidity:    74%
  Wind:        18.5 km/h

`get_datetime`

Return the current date and time for any IANA timezone. No API key required.

Parameters

Parameter	Type	Default	Description
`timezone`	`string`	`"UTC"`	IANA timezone name (e.g. `"America/New_York"`, `"Europe/London"`, `"Asia/Tokyo"`)

Returns: A formatted date/time string, e.g. "2025-03-01 14:30:00 EST (UTC-0500)".

Windows note: Non-UTC timezones require pip install tzdata.

Example

{"timezone": "America/Chicago"}
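
The return format can be reproduced with the standard zoneinfo module. This is a minimal sketch of the idea, assuming a strftime pattern that matches the example string above; the function name `format_datetime` is hypothetical and the tool's real implementation may differ.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def format_datetime(tz_name: str = "UTC", now: Optional[datetime] = None) -> str:
    """Format a moment in the given IANA timezone, e.g. '... EST (UTC-0500)'."""
    try:
        tz = ZoneInfo(tz_name)
    except ZoneInfoNotFoundError:
        if tz_name != "UTC":
            raise  # on Windows this usually means tzdata is not installed
        tz = timezone.utc  # UTC needs no timezone database
    moment = (now or datetime.now(timezone.utc)).astimezone(tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S %Z (UTC%z)")


# A fixed instant keeps the example reproducible.
fixed = datetime(2025, 3, 1, 19, 30, tzinfo=timezone.utc)
print(format_datetime("UTC", fixed))  # 2025-03-01 19:30:00 UTC (UTC+0000)
```

With timezone data installed, passing "America/New_York" for the same instant yields the EST string shown in the Returns line.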

`fetch_url`

Fetch the content of any URL and return it as plain text. HTML pages are stripped of tags; JSON and plain-text responses are returned as-is. No API key required.

Parameters

Parameter	Type	Default	Description
`url`	`string`	(required)	URL to fetch (must start with `http://` or `https://`)
`max_chars`	`int`	`4000`	Maximum characters to return before truncating

Returns: Extracted page text, truncated to max_chars if needed. Returns a descriptive error string on connection failure rather than raising an exception.

Example

{
  "url": "https://en.wikipedia.org/wiki/Python_(programming_language)",
  "max_chars": 2000
}
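
Tag stripping and truncation can be sketched with the standard html.parser module. This illustrates the behaviour described above under stated assumptions: the class and function names are hypothetical, skipping script and style contents is a choice made for this example, and the "[truncated]" marker is invented here rather than taken from the tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str, max_chars: int = 4000) -> str:
    """Strip tags, join the text fragments, and cut the result at max_chars."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    if len(text) > max_chars:
        text = text[:max_chars] + " [truncated]"  # marker is an assumption
    return text


page = ("<html><head><style>p{}</style></head><body><h1>Title</h1>"
        "<p>Hello <b>world</b></p><script>x=1</script></body></html>")
print(html_to_text(page))  # Title Hello world
```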

`news_headlines`

Fetch the latest news headlines, optionally filtered by topic. Requires a free NewsAPI key.

Parameters

Parameter	Type	Default	Description
`topic`	`string`	`""`	Keyword(s) to filter by (e.g. `"AI"`, `"climate change"`). Leave blank for general top headlines
`country`	`string`	`"us"`	2-letter country code used when `topic` is blank (e.g. `us`, `gb`, `au`, `de`)
`max_results`	`int`	`5`	Number of headlines to return (1–10)

Returns: Numbered list of headlines with source, publication date, and URL.

Example

{"topic": "artificial intelligence", "max_results": 3}

Example output

1. [BBC News] OpenAI releases new model
   Published: 2025-03-01
   https://bbc.co.uk/...

2. [Reuters] ...

`read_pdf`

Extract and return the text content of a PDF file, organised by page. Image-only (scanned) PDFs return a clear message rather than empty output.

Parameters

Parameter	Type	Default	Description
`file_path`	`string`	(required)	Absolute or relative path to the PDF file
`max_chars`	`int`	`8000`	Maximum characters to return before truncating

Returns: Extracted text organised by page, truncated to max_chars if needed.

Example

{"file_path": "/home/user/documents/report.pdf", "max_chars": 10000}

`read_markdown`

Read and return the contents of a Markdown file.

Parameters

Parameter	Type	Default	Description
`file_path`	`string`	(required)	Absolute or relative path to the `.md` or `.markdown` file
`max_chars`	`int`	`8000`	Maximum characters to return before truncating

Returns: The file's text content, truncated to max_chars if needed.

Example

{"file_path": "/home/user/documents/notes.md"}

`create_file`

Create a new file with the given name and content. Rejects path traversal attempts and validates the filename before writing.

Parameters

Parameter	Type	Default	Description
`file_name`	`string`	(required)	Basename of the file to create (no path separators allowed)
`content`	`string`	(required)	Text content to write to the file
`directory`	`string`	`null`	Subdirectory to create the file in (relative to the server's working directory). Created if it doesn't exist
`encoding`	`string`	`"utf-8"`	Character encoding for the file
`overwrite`	`bool`	`false`	If `true`, overwrite an existing file at the same path

Returns: A JSON object with keys:

success — true if the file was created
file_path — absolute path to the created file (null on failure)
error — error message if success is false
message — human-readable status

Examples

{"file_name": "notes.md", "content": "# My Notes\n"}

{
  "file_name": "config.json",
  "content": "{\"debug\": true}",
  "directory": "src",
  "overwrite": true
}
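
The filename validation and path traversal check could look roughly like the sketch below. It is an assumption about how such a check is commonly written with pathlib, not a copy of create_file.py; the function name `resolve_target` and the `root` parameter are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_target(file_name: str, directory: Optional[str] = None,
                   root: str = ".") -> Path:
    """Return the absolute path to write, or raise ValueError if unsafe."""
    if not file_name or file_name in (".", ".."):
        raise ValueError("file_name must be a non-empty basename")
    if "/" in file_name or "\\" in file_name:
        raise ValueError("file_name must not contain path separators")

    base = Path(root).resolve()
    target = (base / (directory or "") / file_name).resolve()
    # Reject anything that escapes the root, e.g. directory="../../etc".
    if not target.is_relative_to(base):
        raise ValueError("path escapes the working directory")
    return target


print(resolve_target("notes.md", "docs", root="sandbox"))
```

Resolving the path before comparing it with the root is what catches `..` segments hidden inside the directory argument.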

`list_directory`

List the files and directories at a given path, with optional glob filtering and recursive traversal.

Parameters

Parameter	Type	Default	Description
`path`	`string`	(required)	Absolute or relative path to the directory to list
`pattern`	`string`	`"*"`	Glob pattern to filter results (e.g. `"*.py"`, `"data_*"`, `"**/*.json"`)
`recursive`	`bool`	`false`	If `true`, traverse all subdirectories
`include_hidden`	`bool`	`false`	If `true`, include files and folders whose names start with `"."`
`max_results`	`int`	`200`	Maximum number of entries to return

Returns: A formatted listing showing [DIR] and [FILE] entries with file sizes, plus a summary count. Truncation is noted if max_results is reached.

Examples

{"path": "/home/user/projects"}

{
  "path": "/home/user/projects/mcp-server",
  "pattern": "*.py",
  "recursive": true
}

Example output

Directory: /home/user/projects/mcp-server/src

  [DIR]  tools/
  [FILE] llm_server.py  (4 KB)
  [FILE] model.py       (7 KB)
  [FILE] resources.py   (1 KB)
  [FILE] upload.py      (2 KB)

5 item(s) shown
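
The filtering options can be sketched with pathlib. The example below assumes the tool behaves like `Path.glob` and `Path.rglob` with a hidden-file filter on top; the function name `list_entries`, the size rounding, and the truncation message are hypothetical.

```python
import tempfile
from pathlib import Path


def list_entries(path: str, pattern: str = "*", recursive: bool = False,
                 include_hidden: bool = False, max_results: int = 200) -> list[str]:
    """Return [DIR]/[FILE] lines for entries under path that match pattern."""
    root = Path(path)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    lines = []
    for entry in sorted(matches):
        relative = entry.relative_to(root)
        # Hide entries whose name, or any parent folder name, starts with ".".
        if not include_hidden and any(part.startswith(".") for part in relative.parts):
            continue
        if len(lines) >= max_results:
            lines.append(f"... truncated at {max_results} entries")
            break
        if entry.is_dir():
            lines.append(f"[DIR]  {relative.as_posix()}/")
        else:
            size_kb = max(1, round(entry.stat().st_size / 1024))
            lines.append(f"[FILE] {relative.as_posix()}  ({size_kb} KB)")
    return lines


# Build a tiny throwaway tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tools").mkdir()
    (Path(tmp) / "tools" / "chat.py").write_text("# chat tool\n")
    (Path(tmp) / "model.py").write_text("# model\n")
    (Path(tmp) / ".env").write_text("NEWSAPI_KEY=your_key_here\n")
    demo_recursive = list_entries(tmp, pattern="*.py", recursive=True)
    demo_top_level = list_entries(tmp)

print("\n".join(demo_recursive))
```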

`get_stock_price`

Get the current stock price and key market data for a ticker symbol via the Yahoo Finance API. No API key required.

Parameters

Parameter	Type	Default	Description
`ticker`	`string`	(required)	Stock ticker symbol (e.g. `"AAPL"`, `"MSFT"`, `"TSLA"`, `"BTC-USD"`)

Returns: Current price, daily change, day range, 52-week range, volume, and market cap.

Example

{"ticker": "NVDA"}

Example output

NVIDIA Corp (NVDA)
  Price:       875.40 USD
  Change:      +12.30 (+1.43%)
  Prev close:  863.10 USD
  Day range:   860.00 – 879.50 USD
  52-wk range: 410.00 – 974.00 USD
  Volume:      42,381,200
  Market cap:  2.16T USD

Tip: Crypto pairs are also supported (e.g. "BTC-USD", "ETH-USD").

`summarize_text`

Summarize a block of text using the local LLM. Long texts are automatically split into chunks, each summarized independently, then merged into a single coherent summary.

Parameters

Parameter	Type	Default	Description
`text`	`string`	(required)	The text to summarize. Can be arbitrarily long
`focus`	`string`	`""`	Optional instruction to guide the summary (e.g. `"key risks"`, `"action items"`, `"technical details"`). Leave blank for a general summary
`max_length`	`int`	`200`	Approximate maximum length of the summary in tokens. Controls verbosity

Returns: A concise summary of the input text.

Examples

{"text": "... (long article) ..."}

{
  "text": "... (meeting transcript) ...",
  "focus": "action items",
  "max_length": 150
}

{
  "text": "... (technical document) ...",
  "focus": "key risks",
  "max_length": 300
}
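
The chunk-then-merge strategy described above can be sketched as follows. The model call is passed in as a plain function so the example stays self-contained; the names `split_into_chunks` and `summarize`, the character-based chunk size, and the prompt wording are all assumptions, not the tool's actual implementation.

```python
from typing import Callable


def split_into_chunks(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # An oversized paragraph is hard-split so no chunk exceeds the limit.
        while len(paragraph) > max_chars:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks


def summarize(text: str, llm: Callable[[str], str], focus: str = "",
              max_chars: int = 2000) -> str:
    """Summarize each chunk, then merge the partial summaries in one more call."""
    instruction = (f"Summarize the text, focusing on {focus}:" if focus
                   else "Summarize the text:")
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return ""
    partials = [llm(f"{instruction}\n\n{chunk}") for chunk in chunks]
    if len(partials) == 1:
        return partials[0]  # short input: no merge step needed
    return llm(f"{instruction}\n\n" + "\n\n".join(partials))
```

Here `llm` stands in for a call to the generate or chat tool. A text that fits in one chunk costs one model call; a text split into N chunks costs N + 1.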

`transcribe_audio`

Transcribe an audio file to text using a local Whisper model via faster-whisper. Runs entirely on-device — no API key or internet connection required. Models are downloaded automatically on first use and cached locally.

Supported formats: mp3, mp4, wav, flac, ogg, m4a, webm, and most ffmpeg-readable formats.

Parameters

Parameter	Type	Default	Description
`audio_path`	`string`	(required)	Absolute or relative path to the audio file
`model_size`	`string`	`"base"`	Whisper model to use: `"tiny"` (~150 MB), `"base"` (~290 MB), `"small"` (~970 MB), `"medium"` (~3.1 GB), `"large-v3"` (~6.2 GB)
`language`	`string`	`null`	ISO-639-1 language code to force (e.g. `"en"`, `"fr"`). Leave `null` to auto-detect
`device`	`string`	`"auto"`	Inference device. `"auto"` selects CUDA if available, else CPU. Explicit: `"cuda"`, `"cpu"`
`compute_type`	`string`	`"auto"`	Precision. `"auto"` uses `float16` on GPU and `int8` on CPU. Explicit: `"float16"`, `"int8"`, `"float32"`

Returns: A JSON object with keys:

success — true if transcription succeeded
text — the transcribed text (null on failure)
language — detected or forced language code
duration_seconds — audio duration in seconds
error — error message if success is false

Examples

{"audio_path": "/home/user/recordings/meeting.mp3"}

{
  "audio_path": "/home/user/recordings/lecture.wav",
  "model_size": "small",
  "language": "en"
}

Tip: Use "tiny" or "base" for fast transcription of short clips. Use "small" or higher for better accuracy on noisy audio or non-English speech.

`run_agent`

Run an autonomous ReAct agent powered by the local LLM. The agent reasons step by step and calls tools as many times as needed, up to max_steps iterations, before producing a final answer.

How it works

User goal
   ↓
LLM decides: call a tool or answer?
   ↓ (if tool)
Tool executes → result fed back to LLM
   ↓
LLM decides again … (repeats up to max_steps)
   ↓ (when done)
FINAL answer returned
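
The loop in the diagram can be sketched in a few lines. This is a simplified illustration, not the code in agent.py: the model is passed in as a function, and the "TOOL <name> <argument>" and "FINAL:" reply formats are hypothetical conventions chosen for the example.

```python
from typing import Callable

Tool = Callable[[str], str]


def run_agent(goal: str, llm: Callable[[list[dict]], str],
              tools: dict[str, Tool], max_steps: int = 10) -> str:
    """Alternate between model replies and tool results until FINAL or max_steps."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = llm(messages).strip()
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        # Anything else is read as "TOOL <name> <argument>" (hypothetical format).
        _, _, rest = reply.partition("TOOL ")
        name, _, argument = rest.partition(" ")
        tool = tools.get(name)
        result = tool(argument) if tool else f"Unknown tool: {name!r}"
        # Feed the tool result back so the model can decide the next step.
        messages.append({"role": "user", "content": f"Observation: {result}"})
    return f"Agent stopped after {max_steps} steps without a FINAL answer"


# A scripted stand-in for the LLM: one tool call, then the final answer.
script = iter(["TOOL get_weather Paris", "FINAL: Bring a light jacket."])
answer = run_agent(
    "What should I wear in Paris today?",
    llm=lambda messages: next(script),
    tools={"get_weather": lambda city: f"Partly cloudy, 12.3°C in {city}"},
)
print(answer)  # Bring a light jacket.
```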

Parameters

Parameter	Type	Default	Description
`goal`	`string`	(required)	The task or question for the agent to solve
`max_steps`	`int`	`10`	Maximum tool-call iterations before stopping
`max_new_tokens`	`int`	`4096`	Token ceiling per LLM call. Mainly affects the length of the final answer
`max_history_pairs`	`int`	`4`	Recent assistant+tool rounds to keep in full; older rounds are summarised
`summary_strategy`	`string`	`"deterministic"`	`"deterministic"` — fast rule-based bullet points. `"llm"` — model-generated prose (adds an extra generation call)
`upload_id`	`string`	`""`	ID returned by `POST /upload`. The file's contents are injected into the agent's context before the loop starts

Returns: The agent's final answer as plain text.

If you see "Agent stopped after N steps without a FINAL answer", the agent exhausted its iterations. Either increase max_steps or simplify the goal.
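
Long runs stay within the context window because of max_history_pairs: recent rounds are kept in full and older ones are condensed. The sketch below shows one way a rule-based ("deterministic") compaction could work. The function name `compact_history`, the 80-character cut, and the summary wording are assumptions made for illustration.

```python
def compact_history(messages: list[dict], max_history_pairs: int = 4) -> list[dict]:
    """Keep the goal and the newest rounds; fold older rounds into one summary."""
    goal, rounds = messages[:1], messages[1:]
    keep = max_history_pairs * 2  # one round = assistant message + tool result
    if len(rounds) <= keep:
        return list(messages)
    split = len(rounds) - keep
    older, recent = rounds[:split], rounds[split:]
    # Rule-based summary: one bullet per older message, cut to 80 characters.
    bullets = [f"- {m['role']}: {m['content'][:80]}" for m in older]
    summary = {"role": "user",
               "content": "Summary of earlier steps:\n" + "\n".join(bullets)}
    return goal + [summary] + recent


history = [{"role": "user", "content": "Compare the weather in London and Tokyo"}]
for step in range(6):
    history.append({"role": "assistant", "content": f"TOOL call {step}"})
    history.append({"role": "user", "content": f"Observation {step}"})

compacted = compact_history(history, max_history_pairs=2)
print(len(history), "->", len(compacted))  # 13 -> 6
```

The "llm" strategy would replace the bullet list with a model-written paragraph, at the cost of one extra generation call.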

Tools available to the agent

Tool	Description
`get_weather`	Fetch current weather for any city
`get_datetime`	Get the current date and time in any timezone
`fetch_url`	Fetch and extract text from any URL
`news_headlines`	Fetch the latest news headlines by topic
`read_pdf`	Extract text from a PDF file at a given path
`read_markdown`	Read the contents of a Markdown file at a given path
`get_stock_price`	Get the current stock price and market data for a ticker symbol
`summarize_text`	Summarize a block of text using the local LLM
`create_file`	Create a new file with the given name and content
`list_directory`	List files and directories at a given path
`transcribe_audio`	Transcribe an audio file to text using a local Whisper model

Examples

{"goal": "What should I wear in Paris today?"}

{"goal": "Compare the weather in London and Tokyo, then tell me which city is warmer."}

{"goal": "Summarise this document", "upload_id": "3f8a1c..."}

{
  "goal": "What are the top AI news stories today?",
  "max_steps": 5
}

`text_to_speech`

Convert text to an MP3 audio file using Google Text-to-Speech (gTTS). Requires an internet connection. No API key required.

Dependency: Install gTTS before using this tool:
pip install gtts

Parameters

Parameter	Type	Default	Description
`text`	`string`	(required)	The text to convert to speech
`output_path`	`string`	(required)	File path where the MP3 will be saved (e.g. `"output/speech.mp3"`). Parent directories are created automatically
`lang`	`string`	`"en"`	BCP-47 language code (e.g. `"en"`, `"fr"`, `"es"`, `"de"`, `"ja"`)
`slow`	`bool`	`false`	If `true`, speech is generated at a slower rate

Returns: A JSON object with keys:

success — true if the file was saved successfully
output_path — absolute path to the saved MP3 (null on failure)
error — error message if success is false

Examples

{
  "text": "Hello, world!",
  "output_path": "output/hello.mp3"
}

{
  "text": "Bonjour le monde",
  "output_path": "output/bonjour.mp3",
  "lang": "fr",
  "slow": true
}

`define_word`

Look up the definition, phonetics, synonyms, and antonyms of an English word using the free Dictionary API. No API key required.

Parameters

Parameter	Type	Default	Description
`word`	`string`	(required)	The English word to look up (e.g. `"ephemeral"`, `"serendipity"`)

Returns: A JSON object with keys:

success — true if the word was found
word — the normalised word that was looked up
results — list of meanings, each containing:
- phonetic — IPA phonetic spelling (may be null)
- part_of_speech — e.g. "noun", "verb", "adjective"
- definitions — up to 3 definitions, each with definition, example (may be null), synonyms (up to 5), antonyms (up to 5)
- synonyms — up to 5 synonyms for this part of speech
- antonyms — up to 5 antonyms for this part of speech
error — error message if success is false

Example

{"word": "ephemeral"}

Example output (abbreviated)

{
  "success": true,
  "word": "ephemeral",
  "results": [
    {
      "phonetic": "/ɪˈfɛm(ə)r(ə)l/",
      "part_of_speech": "adjective",
      "definitions": [
        {
          "definition": "Lasting for a very short time.",
          "example": "fashions are ephemeral",
          "synonyms": ["transitory", "transient", "fleeting"],
          "antonyms": ["permanent", "eternal"]
        }
      ],
      "synonyms": ["transitory", "transient"],
      "antonyms": ["permanent"]
    }
  ],
  "error": null
}

`get_random_joke`

Fetch a random joke from the free JokeAPI. No API key required.

Parameters

Parameter	Type	Default	Description
`category`	`string`	`"Any"`	Joke category: `"Any"`, `"Programming"`, `"Misc"`, `"Dark"`, `"Pun"`, `"Spooky"`, `"Christmas"`
`joke_type`	`string`	`"any"`	Format filter: `"any"`, `"single"` (one-liner), `"twopart"` (setup + punchline)
`safe_mode`	`bool`	`true`	If `true`, excludes explicit, racist, sexist, and religious jokes

Returns: A JSON object with keys:

success — true if a joke was returned
category — the category the joke belongs to
type — "single" or "twopart"
joke — the joke text; two-part jokes are formatted as "setup\n\n— delivery"
error — error message if success is false

Note: The "Dark" category is unavailable when safe_mode is true.

Examples

{"category": "Programming"}

{
  "category": "Pun",
  "joke_type": "twopart",
  "safe_mode": true
}

Coding Tutor Tools

Three tools that turn the server into an interactive programming tutor. The high-level entry point is coding_tutor; the two supporting tools (explain_code, review_code) can also be called directly.

`coding_tutor`

An autonomous ReAct agent specialised for teaching. It reasons step by step, calling explain_code, review_code, and fetch_url as needed, then produces a pedagogical response.

Parameters

Parameter	Type	Default	Description
`question`	`string`	(required)	Your coding question, code snippet, or error message
`max_steps`	`int`	`8`	Maximum tool-call iterations before stopping
`max_new_tokens`	`int`	`1024`	Token ceiling per LLM call
`max_history_pairs`	`int`	`4`	Recent assistant+tool rounds to keep before older ones are summarised
`summary_strategy`	`string`	`"deterministic"`	`"deterministic"` (fast) or `"llm"` (prose, slower)

Returns: A teaching response as plain text.

Tools available to the tutor

Tool	Description
`explain_code`	Explain a code snippet at the learner's skill level
`review_code`	Review code for bugs, style, security, or performance issues
`fetch_url`	Fetch documentation or a GitHub link referenced by the learner

Examples

{"question": "Why does my list comprehension give the wrong result?\n\nresult = [x * 2 for x in [1, 2, 3] if x > 1]"}

{"question": "Explain the difference between a shallow copy and a deep copy in Python, with examples."}

Tip: The tutor infers skill level from your question — use plain language for beginner explanations, technical terminology for advanced ones.

`explain_code`

Explain a code snippet using the local LLM, tailored to the learner's skill level.

Parameters

Parameter	Type	Default	Description
`code`	`string`	(required)	The source code to explain (capped at 6 000 chars)
`language`	`string`	`"python"`	Programming language of the snippet
`level`	`string`	`"beginner"`	Explanation depth: `"beginner"`, `"intermediate"`, or `"advanced"`
`max_new_tokens`	`int`	`1024`	Maximum tokens for the explanation

Returns: A plain-text explanation of the code.

Example

{
  "code": "result = {k: v for k, v in zip(keys, values)}",
  "language": "python",
  "level": "beginner"
}

`review_code`

Review a code snippet for issues. Outputs a structured report: overall impression, numbered issues with severity, positives, and a top recommendation.

Parameters

Parameter	Type	Default	Description
`code`	`string`	(required)	The source code to review (capped at 6 000 chars)
`language`	`string`	`"python"`	Programming language of the snippet
`focus`	`string`	`"general"`	Review focus: `"general"`, `"security"`, `"performance"`, or `"style"`
`max_new_tokens`	`int`	`768`	Maximum tokens for the review

Returns: A structured review with four sections: Overall Impression, Issues Found, Positives, and Top Recommendation.

Example

{
  "code": "def get_user(id):\n    return db.execute(f'SELECT * FROM users WHERE id={id}')",
  "language": "python",
  "focus": "security"
}

Resources

`llm://info`

Returns metadata about the currently loaded model (path, context size, GPU layers).

Configuring with Claude Desktop

Add the server to your claude_desktop_config.json:

{
  "mcpServers": {
    "local-llm": {
      "command": "python",
      "args": [
        "src/llm_server.py",
        "--model", "models/Qwen2.5-7B-Instruct-Q4_K_M.gguf",
        "--llama-server", "/path/to/llama-server"
      ],
      "cwd": "/absolute/path/to/mcp-server"
    }
  }
}

Notes

The model is loaded once at startup via llama-server and held in memory for the lifetime of the server.
GPU offloading is controlled by --gpu-layers; -1 offloads all layers.
--context-size sets the total token budget shared between prompt and generated output. Increase it if you experience truncation on long responses.
The HTTP transport binds to 0.0.0.0 by default, making it accessible from other machines on the network. Use --host 127.0.0.1 to restrict to localhost. CORS is enabled for all origins — restrict allow_origins before exposing to untrusted networks.
Uploaded files (POST /upload) are stored in uploads/ at the project root and automatically deleted on server shutdown.

Running a standalone llama-server (e.g. for opencode)

llama-server \
  --model /path/to/model.gguf \
  --port 8000 \
  --host 127.0.0.1 \
  --n-gpu-layers -1 \
  --ctx-size 16384 \
  --no-mmap
