mcp-coqui-tts
Provides text-to-speech synthesis and voice cloning using Coqui TTS, enabling natural speech output from text and audio samples.
MCP Coqui TTS Server
A Model Context Protocol (MCP) server that provides text-to-speech synthesis capabilities using Coqui TTS, including voice cloning support.
Features
- Text-to-Speech Synthesis: Convert text to natural-sounding speech
- Multiple Models: Support for various TTS models and languages
- Voice Cloning: Clone voices from audio samples using XTTS models
- Long Text Support: Automatic chunking for longer texts
- Customizable Output: Control speed, speaker, and language settings
Prerequisites
Before using this MCP server, you need to install Coqui TTS:
pip install TTS
For voice cloning and concatenation features, you'll also need ffmpeg:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt-get install ffmpeg
# Windows (using chocolatey)
choco install ffmpeg
Installation
From npm
npm install -g @s.lfr/mcp-coqui-tts
From Source
git clone https://github.com/yourusername/mcp-coqui-tts.git
cd mcp-coqui-tts
npm install
npm link
Usage
With Claude Desktop
Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"coqui-tts": {
"command": "npx",
"args": ["@s.lfr/mcp-coqui-tts"]
}
}
}
Or if installed from source:
{
"mcpServers": {
"coqui-tts": {
"command": "node",
"args": ["/path/to/mcp-coqui-tts/index.js"]
}
}
}
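If the configuration file already lists other servers, the new entry needs to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, not part of this package; the helper names `add_server` and `update_config_file` are hypothetical.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep existing servers
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path) -> None:
    """Merge the coqui-tts entry into the JSON config file at path."""
    config = json.loads(path.read_text()) if path.exists() else {}
    config = add_server(config, "coqui-tts", "npx", ["@s.lfr/mcp-coqui-tts"])
    path.write_text(json.dumps(config, indent=2) + "\n")
```

Calling `update_config_file` with the path to claude_desktop_config.json would write the npx variant shown above; back up the file first, since the sketch overwrites it.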
Available Tools
1. speak
Convert text to speech with customizable parameters.
Parameters:
- text (required): The text to convert to speech
- model: TTS model to use (default: "tts_models/en/ljspeech/tacotron2-DDC")
- output_path: Where to save the audio file
- speaker_idx: Speaker index for multi-speaker models
- language_idx: Language index for multi-language models
- speed: Speed factor (1.0 is normal speed)
Example:
{
"text": "Hello, this is a test of the text to speech system.",
"model": "tts_models/en/ljspeech/tacotron2-DDC",
"output_path": "/tmp/output.wav",
"speed": 1.2
}
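MCP clients normally build the request for you, but it can help to see how the example arguments travel over the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below wraps the arguments in such a message; the helper name `build_tool_call` is hypothetical and the request id is arbitrary.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = build_tool_call(
    1,
    "speak",
    {
        "text": "Hello, this is a test of the text to speech system.",
        "output_path": "/tmp/output.wav",
        "speed": 1.2,
    },
)
```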
2. list_models
List all available TTS models.
Parameters: None
3. synthesize_long_text
Synthesize longer texts with automatic chunking and concatenation.
Parameters:
- text (required): The long text to convert
- model: TTS model to use
- output_path: Where to save the final audio
- chunk_size: Maximum characters per chunk (default: 500)
Example:
{
"text": "This is a very long text that will be automatically split into chunks...",
"output_path": "/tmp/long_speech.wav",
"chunk_size": 500
}
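This README does not describe how the server splits text, so the following is only one plausible illustration of what chunk_size controls: sentences are packed into chunks of at most chunk_size characters, and a sentence that is too long on its own is cut at the limit. The function name `chunk_text` and the sentence-splitting rule are assumptions.

```python
import re


def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    breaking at sentence boundaries where possible."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single oversized sentence is cut hard at the limit.
        while len(sentence) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:chunk_size])
            sentence = sentence[chunk_size:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio files joined, which is the step that needs ffmpeg.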
4. clone_voice
Clone a voice from an audio sample (requires an XTTS model).
Parameters:
- text (required): Text to speak in the cloned voice
- reference_audio (required): Path to reference audio for voice cloning
- output_path: Where to save the output
- language: Language code (default: "en")
Example:
{
"text": "This will be spoken in the cloned voice.",
"reference_audio": "/path/to/sample.wav",
"output_path": "/tmp/cloned_voice.wav",
"language": "en"
}
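For debugging outside the MCP server, the same request can be reproduced with the Coqui `tts` command-line tool directly. The sketch below builds the argument list using the CLI's `--model_name`, `--text`, `--speaker_wav`, `--language_idx`, and `--out_path` flags. Whether the server invokes the CLI in exactly this way is an assumption, and `build_clone_command` is a hypothetical helper.

```python
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_clone_command(
    text: str,
    reference_audio: str,
    output_path: str,
    language: str = "en",
) -> list[str]:
    """Build an argv list for the Coqui `tts` CLI that clones a voice."""
    return [
        "tts",
        "--model_name", XTTS_MODEL,
        "--text", text,
        "--speaker_wav", reference_audio,  # reference sample to clone
        "--language_idx", language,
        "--out_path", output_path,
    ]


command = build_clone_command(
    "This will be spoken in the cloned voice.",
    "/path/to/sample.wav",
    "/tmp/cloned_voice.wav",
)
```

The list can be passed to `subprocess.run(command, check=True)`; using a list rather than a shell string avoids quoting problems in the text.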
Popular TTS Models
English Models
- tts_models/en/ljspeech/tacotron2-DDC - High quality English TTS
- tts_models/en/ljspeech/fast_pitch - Fast English TTS
- tts_models/en/vctk/vits - Multi-speaker English (110 speakers)
Multilingual Models
- tts_models/multilingual/multi-dataset/xtts_v2 - Supports voice cloning
- tts_models/multilingual/multi-dataset/your_tts - Multilingual with voice cloning
Other Languages
Run list_models to see all available models for different languages.
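The model identifiers listed here share a four-part shape: type, language, dataset, and model name, separated by slashes. A small sketch for splitting an identifier into those parts, for example to filter list_models output by language, might look like this; the `ModelName` type and `parse_model_name` helper are hypothetical.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    kind: str          # e.g. "tts_models"
    language: str      # e.g. "en" or "multilingual"
    dataset: str       # e.g. "ljspeech"
    architecture: str  # e.g. "tacotron2-DDC"


def parse_model_name(name: str) -> ModelName:
    """Split a model identifier of the form kind/language/dataset/model."""
    parts = name.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"unexpected model name: {name!r}")
    return ModelName(*parts)


model = parse_model_name("tts_models/en/vctk/vits")
```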
Deployment on Smithery
To deploy this MCP server on Smithery:
- Fork this repository
- Connect your GitHub account to Smithery
- Create a new MCP server on Smithery
- Select this repository
- Deploy
The server will be automatically available for use with any MCP-compatible client.
Development
Running Locally
npm start
Testing
You can test the server using the MCP inspector:
npx @modelcontextprotocol/inspector node index.js
Troubleshooting
Common Issues
- "tts: command not found"
  - Make sure Coqui TTS is installed: pip install TTS
  - Ensure Python/pip binaries are in your PATH
- "ffmpeg: command not found"
  - Install ffmpeg for your operating system (see Prerequisites)
- Model download fails
  - First run may take time as models are downloaded
  - Check internet connection
  - Ensure sufficient disk space (~1-5GB per model)
- Voice cloning not working
  - Requires XTTS v2 model
  - Reference audio should be clear, 5-10 seconds long
  - WAV format recommended for reference audio
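The two "command not found" errors can be checked for up front. The sketch below uses Python's standard `shutil.which` to report which of the required executables are missing from PATH; the helper name `find_missing_tools` is hypothetical.

```python
import shutil


def find_missing_tools(tools: tuple[str, ...] = ("tts", "ffmpeg")) -> list[str]:
    """Return the names of the given executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
```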
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT
Acknowledgments
- Coqui TTS for the excellent TTS library
- Model Context Protocol for the MCP SDK
Support
For issues and questions, please open an issue on GitHub.