STT2TTS MCP
Local-first speech-to-text and text-to-speech MCP server. Hot-swappable engines via config.yaml — no code changes, no API keys required.
┌──────────────┐     stdio      ┌──────────────────┐
│  MCP client  │ ◀────────────▶ │  stt2tts-mcp     │
│              │                │  ├─ STT engine   │ ──▶ faster-whisper
│              │                │  └─ TTS engine   │ ──▶ piper / kokoro / coqui
└──────────────┘                └──────────────────┘
                                         │
                                         ▼
                                config.yaml (hot-reload)
Why
A replacement for whisper-mcp. The local engines work fully offline; the server ships with five STT and six TTS engines and switches between them per task via config.yaml.
Install
pip install stt2tts-mcp
# Add the engines you actually use:
pip install "stt2tts-mcp[stt-faster-whisper]"   # local STT
pip install "stt2tts-mcp[tts-piper]"            # local TTS (~50 MB voices)
# Register with your MCP client (consult your client's docs for the exact
# config file location — most use mcp_config.json or a per-client equivalent):
{
  "mcp": {
    "stt2tts": {
      "type": "local",
      "command": ["stt2tts-mcp"],
      "enabled": true
    }
  }
}
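
If the client already has other servers registered, the entry can be merged in rather than pasted over the file. The following is a minimal sketch, assuming the client config is a JSON object with a top-level "mcp" key as in the snippet above; the helper name `register_server` is hypothetical and not part of stt2tts-mcp.

```python
import json
from copy import deepcopy

# Entry copied from the registration snippet above.
STT2TTS_ENTRY = {"type": "local", "command": ["stt2tts-mcp"], "enabled": True}


def register_server(client_config: dict, name: str = "stt2tts") -> dict:
    """Return a copy of client_config with the server entry added under "mcp".

    Hypothetical helper: existing servers are preserved, and an existing
    entry with the same name is overwritten.
    """
    merged = deepcopy(client_config)
    merged.setdefault("mcp", {})[name] = deepcopy(STT2TTS_ENTRY)
    return merged


if __name__ == "__main__":
    # "other-server" is a made-up placeholder for a server that is already configured.
    existing = {"mcp": {"other-server": {"type": "local", "command": ["other"], "enabled": True}}}
    print(json.dumps(register_server(existing), indent=2))
```

Working on a copy keeps the original dict untouched, so the result can be inspected before anything is written back to disk.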
Engines
| STT | Size | License | Best for |
|---|---|---|---|
| faster-whisper | 39M – 2.9 GB | MIT | English, INT8 CPU, fastest |
| sherpa-onnx | 39M – large | Apache 2.0 | Multilingual |
| OpenAI API | cloud | Proprietary | Highest accuracy, needs key |
| Ollama | varies | MIT | Local LLM integration |
| LMStudio | varies | MIT | Local model server |

| TTS | Voice size | License | Best for |
|---|---|---|---|
| Piper | 20 – 50 MB | Apache 2.0 | Smallest, 10-20× realtime |
| Kokoro-82M | ~330 MB | Apache 2.0 | Quality/size ratio |
| Coqui XTTS | ~1.5 GB | MPL 2.0 | Voice cloning, needs GPU |
| OpenAI API | cloud | Proprietary | All voices, needs key |
| Ollama | varies | MIT | LLM-based voices |
| LMStudio | varies | MIT | Local model server |
Configure
config.yaml:
stt:
  engine: faster_whisper        # sherpa_onnx | openai_api | ollama | lmstudio
  enabled: true
  params:
    model_size: base.en         # tiny.en | base.en | small.en | medium.en
    device: cpu                 # cpu | cuda
tts:
  engine: piper                 # kokoro | coqui | openai_api | ollama | lmstudio
  enabled: true
  params:
    voice: en_US-lessac-medium
    model_dir: ~/.cache/piper
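
A typo in an engine name is easy to make when editing this file by hand. The sketch below checks an already-parsed config (for example the dict returned by a YAML parser such as PyYAML's `yaml.safe_load`) against the engine names listed in the comments above. The function `validate_config` and its messages are illustrative assumptions, not the server's actual validation logic.

```python
# Engine names taken from the comments in the config.yaml example.
STT_ENGINES = {"faster_whisper", "sherpa_onnx", "openai_api", "ollama", "lmstudio"}
TTS_ENGINES = {"piper", "kokoro", "coqui", "openai_api", "ollama", "lmstudio"}


def validate_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    for section, known in (("stt", STT_ENGINES), ("tts", TTS_ENGINES)):
        block = cfg.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing '{section}' section")
            continue
        engine = block.get("engine")
        if engine not in known:
            problems.append(f"{section}.engine={engine!r} is not one of {sorted(known)}")
        if not isinstance(block.get("enabled", True), bool):
            problems.append(f"{section}.enabled must be true or false")
        if not isinstance(block.get("params", {}), dict):
            problems.append(f"{section}.params must be a mapping")
    return problems


if __name__ == "__main__":
    cfg = {
        "stt": {"engine": "faster_whisper", "enabled": True,
                "params": {"model_size": "base.en", "device": "cpu"}},
        "tts": {"engine": "pipper", "enabled": True, "params": {}},  # deliberate typo
    }
    for problem in validate_config(cfg):
        print(problem)
```

Returning every problem at once, rather than raising on the first, lets all mistakes be fixed in a single edit.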
Reload without restart by calling the reload_config MCP tool.
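
One way to picture a reload is a registry of engine factories keyed by the `engine` value, with both engines rebuilt whenever the config is re-read. The sketch below illustrates that pattern under stated assumptions: `StubEngine`, `FACTORIES`, and `EngineManager` are hypothetical names, and the real server may be organised differently.

```python
from typing import Callable, Dict


class StubEngine:
    """Stand-in for a real STT/TTS backend; it only records its settings."""

    def __init__(self, name: str, params: dict):
        self.name = name
        self.params = params


# Hypothetical registry: engine name -> factory taking the `params` mapping.
FACTORIES: Dict[str, Callable[[dict], StubEngine]] = {
    "faster_whisper": lambda p: StubEngine("faster_whisper", p),
    "piper": lambda p: StubEngine("piper", p),
    "kokoro": lambda p: StubEngine("kokoro", p),
}


class EngineManager:
    def __init__(self, load_config: Callable[[], dict]):
        self._load_config = load_config
        self.engines: Dict[str, StubEngine] = {}
        self.reload()

    def reload(self) -> Dict[str, str]:
        """Re-read the config and rebuild engines.

        If any engine fails to build, the exception propagates and the
        previously built engines stay in place.
        """
        cfg = self._load_config()
        rebuilt = {}
        for section in ("stt", "tts"):
            block = cfg.get(section, {})
            if not block.get("enabled", False):
                continue
            factory = FACTORIES[block["engine"]]  # KeyError for unknown engines
            rebuilt[section] = factory(block.get("params", {}))
        self.engines = rebuilt  # swap only after every engine was built
        return {section: engine.name for section, engine in rebuilt.items()}
```

Building the new engines first and swapping them in at the end means a bad edit to config.yaml leaves the running engines usable.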
MCP Tools
| Tool | What it does |
|---|---|
| transcribe(audio_path, language?) | Audio file → text |
| speak(text, output_path, voice?) | Text → WAV file |
| list_stt_models | Available STT models |
| list_tts_voices | Available TTS voices |
| reload_config | Re-read config.yaml, rebuild engines |
| health_check | Engine status |
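
MCP clients normally issue these calls for you, but it can help to see what goes over stdio. In MCP, a tool is invoked with a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below only builds such requests; it assumes the argument names shown for the tools above, uses made-up file names, and leaves out the initialization handshake a real client performs first.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP `tools/call` request as one line of JSON."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # File names are placeholders.
    print(tool_call("transcribe", audio_path="meeting.wav", language="en"))
    print(tool_call("speak", text="Hello", output_path="hello.wav"))
    print(tool_call("reload_config"))
```

Optional parameters such as `language` and `voice` are simply omitted from `arguments` when not needed.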
Any audio format ffmpeg can decode (wav, mp3, ogg, flac, m4a, and others) is accepted; STT input is auto-converted to 16 kHz mono.
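
To reproduce the conversion outside the server, for instance when debugging a file that transcribes poorly, the same result can be obtained with the ffmpeg CLI. This is a sketch of one reasonable invocation, not necessarily the exact command the server runs; the function names are hypothetical and `ffmpeg` is assumed to be on PATH.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def ffmpeg_16k_mono_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit WAV."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,             # any input ffmpeg can decode
        "-ac", "1",            # downmix to one channel
        "-ar", "16000",        # resample to 16 kHz
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        dst,
    ]


def to_16k_mono(src: str, dst: Optional[str] = None) -> str:
    """Run the conversion and return the output path; raises if ffmpeg fails."""
    if dst is None:
        source = Path(src)
        dst = str(source.with_name(source.stem + "_16k.wav"))
    subprocess.run(ffmpeg_16k_mono_cmd(src, dst), check=True, capture_output=True)
    return dst
```

Keeping command construction separate from execution makes the flags easy to inspect or log without running ffmpeg.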
Develop
git clone https://github.com/your-org/stt2tts-mcp
cd stt2tts-mcp
pip install -e ".[all]"
python -m stt2tts_mcp.server
License
Apache 2.0