STT2TTS MCP

Local-first speech-to-text and text-to-speech MCP server. Hot-swappable engines via config.yaml — no code changes, no API keys required.

┌──────────────┐     stdio      ┌──────────────────┐
│ MCP client   │ ◀────────────▶ │ stt2tts-mcp      │
│              │                │  ├─ STT engine   │ ──▶ faster-whisper
│              │                │  └─ TTS engine   │ ──▶ piper / kokoro / coqui
└──────────────┘                └──────────────────┘
                                       │
                                       ▼
                              config.yaml (hot-reload)

Why

Replaces whisper-mcp. Works offline when a local engine is selected, ships with five STT and six TTS engines, and switches between them per task via config.yaml.

Install

pip install stt2tts-mcp

# Add the engines you actually use (quote the extras so shells such as zsh
# do not treat the brackets as a glob pattern):
pip install "stt2tts-mcp[stt-faster-whisper]"   # local STT
pip install "stt2tts-mcp[tts-piper]"            # local TTS (20-50 MB voices)

# Register with your MCP client (consult your client's docs for the exact
# config file location — most use mcp_config.json or a per-client equivalent):
{
  "mcp": {
    "stt2tts": {
      "type": "local",
      "command": ["stt2tts-mcp"],
      "enabled": true
    }
  }
}
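
If the client config is maintained by a script rather than by hand, the entry above can be merged in programmatically. The following is a minimal sketch, not part of stt2tts-mcp: the helper names `register_stt2tts` and `update_config_file` are hypothetical, and the top-level `mcp` key simply mirrors the example above, so it may differ for your client.

```python
import json
from pathlib import Path


def register_stt2tts(config: dict) -> dict:
    """Return a copy of a client config with the stt2tts entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcp", {}))  # keep any servers already registered
    servers["stt2tts"] = {
        "type": "local",
        "command": ["stt2tts-mcp"],
        "enabled": True,
    }
    merged["mcp"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read a JSON config (if present), add the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(register_stt2tts(existing), indent=2) + "\n")
```

Existing servers and unrelated top-level keys are preserved; only the `stt2tts` entry is added or overwritten.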

Engines

STT             Size           License      Best for
faster-whisper  39M – 2.9 GB   MIT          English, INT8 CPU, fastest
sherpa-onnx     39M – large    Apache 2.0   Multilingual
OpenAI API      cloud          Proprietary  Highest accuracy, needs key
Ollama          varies         MIT          Local LLM integration
LMStudio        varies         MIT          Local model server

TTS             Voice size     License      Best for
Piper           20 – 50 MB     Apache 2.0   Smallest, 10-20× realtime
Kokoro-82M      ~330 MB        Apache 2.0   Quality/size ratio
Coqui XTTS      ~1.5 GB        MPL 2.0      Voice cloning, needs GPU
OpenAI API      cloud          Proprietary  All voices, needs key
Ollama          varies         MIT          LLM-based voices
LMStudio        varies         MIT          Local model server

Configure

config.yaml:

stt:
  engine: faster_whisper   # sherpa_onnx | openai_api | ollama | lmstudio
  enabled: true
  params:
    model_size: base.en     # tiny.en | base.en | small.en | medium.en
    device: cpu             # cpu | cuda

tts:
  engine: piper             # kokoro | coqui | openai_api | ollama | lmstudio
  enabled: true
  params:
    voice: en_US-lessac-medium
    model_dir: ~/.cache/piper

Reload without restart by calling the reload_config MCP tool.
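
Because a typo in an engine name only surfaces at reload time, it can help to check the file before calling reload_config. The sketch below is an illustration, not the server's own validation: `check_config` is a hypothetical helper, the allowed engine names are copied from the comments in the example above, and the input is assumed to be the dict a YAML parser (for instance PyYAML's `yaml.safe_load`) would return for config.yaml.

```python
# Engine names taken from the comments in the example config.yaml.
STT_ENGINES = {"faster_whisper", "sherpa_onnx", "openai_api", "ollama", "lmstudio"}
TTS_ENGINES = {"piper", "kokoro", "coqui", "openai_api", "ollama", "lmstudio"}


def check_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a parsed config (empty if none)."""
    problems = []
    for section, allowed in (("stt", STT_ENGINES), ("tts", TTS_ENGINES)):
        block = cfg.get(section)
        if not isinstance(block, dict):
            problems.append(f"{section}: section missing")
            continue
        engine = block.get("engine")
        if engine not in allowed:
            problems.append(f"{section}.engine: unknown engine {engine!r}")
        if not isinstance(block.get("params", {}), dict):
            problems.append(f"{section}.params: expected a mapping")
    return problems


# The example config above, as a YAML parser would load it.
example = {
    "stt": {"engine": "faster_whisper", "enabled": True,
            "params": {"model_size": "base.en", "device": "cpu"}},
    "tts": {"engine": "piper", "enabled": True,
            "params": {"voice": "en_US-lessac-medium", "model_dir": "~/.cache/piper"}},
}
print(check_config(example))  # prints []
```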

MCP Tools

Tool                                What it does
transcribe(audio_path, language?)   Audio file → text
speak(text, output_path, voice?)    Text → WAV file
list_stt_models                     Available STT models
list_tts_voices                     Available TTS voices
reload_config                       Re-read config.yaml, rebuild engines
health_check                        Engine status
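
To show how a client might call one of these tools over stdio, here is a rough sketch using the official MCP Python SDK (the `mcp` package). The tool name and argument names come from the table above; the helper names are hypothetical, and the shape of the result (text content blocks holding the transcript) is an assumption that the README does not confirm.

```python
from typing import Optional


def transcribe_args(audio_path: str, language: Optional[str] = None) -> dict:
    """Build the arguments for the transcribe tool, omitting unset optionals."""
    args = {"audio_path": audio_path}
    if language is not None:
        args["language"] = language
    return args


async def transcribe(audio_path: str, language: Optional[str] = None) -> list:
    # Imported lazily so transcribe_args stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Launch the server as a subprocess and talk to it over stdio.
    params = StdioServerParameters(command="stt2tts-mcp", args=[])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "transcribe", arguments=transcribe_args(audio_path, language)
            )
            # Assumption: the transcript comes back as text content blocks.
            return [block.text for block in result.content if hasattr(block, "text")]
```

From a script, this would be driven with `asyncio.run(transcribe("meeting.wav", language="en"))`, where `meeting.wav` stands in for a real file. The same pattern applies to `speak` and the other tools by changing the tool name and arguments.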

Any audio format ffmpeg can decode is accepted (wav, mp3, ogg, flac, m4a, and others); STT input is auto-converted to 16 kHz mono.
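
For readers who want to pre-convert audio themselves, the 16 kHz mono conversion can be approximated with a plain ffmpeg call. This is a sketch of the general technique, not the server's actual code path: the helper names are hypothetical, and the 16-bit PCM WAV output is an assumption, since the README only specifies the sample rate and channel count.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build an ffmpeg invocation that writes 16 kHz mono audio to dst."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input in any format ffmpeg can decode
        "-ac", "1",           # one channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # assumption: 16-bit PCM, the usual WAV encoding
        str(dst),
    ]


def to_16k_mono(src: Path, dst: Path) -> Path:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Building the argument list separately from running it keeps the command easy to inspect or log before ffmpeg is invoked.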

Develop

git clone https://github.com/your-org/stt2tts-mcp
cd stt2tts-mcp
pip install -e ".[all]"
python -m stt2tts_mcp.server

License

Apache 2.0
