mcp-coqui-tts

Provides text-to-speech synthesis and voice cloning using Coqui TTS, enabling natural speech output from text and audio samples.

MCP Coqui TTS Server

A Model Context Protocol (MCP) server that provides text-to-speech synthesis capabilities using Coqui TTS, including voice cloning support.

Features

  • Text-to-Speech Synthesis: Convert text to natural-sounding speech
  • Multiple Models: Support for various TTS models and languages
  • Voice Cloning: Clone voices from audio samples using XTTS models
  • Long Text Support: Automatic chunking for longer texts
  • Customizable Output: Control speed, speaker, and language settings

Prerequisites

Before using this MCP server, you need to install Coqui TTS:

pip install TTS

For voice cloning and concatenation features, you'll also need ffmpeg:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows (using chocolatey)
choco install ffmpeg
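
To confirm that both prerequisites are reachable before starting the server, a small Node.js check like the one below may help. It is a minimal sketch: the helper names findOnPath and missingPrerequisites are hypothetical and not part of this package, and the check only looks for a file with the right name in each PATH directory.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Return the full path of `command` if a file with that name exists in one
// of the PATH directories, otherwise null. Hypothetical helper, not part of
// mcp-coqui-tts.
function findOnPath(command, pathEnv = process.env.PATH || "") {
  // On Windows, executables carry an extension such as .EXE or .CMD.
  const extensions = process.platform === "win32"
    ? (process.env.PATHEXT || ".EXE;.CMD;.BAT").split(";")
    : [""];
  for (const dir of pathEnv.split(path.delimiter)) {
    if (!dir) continue;
    for (const ext of extensions) {
      const candidate = path.join(dir, command + ext);
      if (fs.existsSync(candidate) && fs.statSync(candidate).isFile()) {
        return candidate;
      }
    }
  }
  return null;
}

// List which of the required commands could not be found.
function missingPrerequisites(commands = ["tts", "ffmpeg"], pathEnv) {
  return commands.filter((cmd) => findOnPath(cmd, pathEnv) === null);
}

const missing = missingPrerequisites();
console.log(missing.length ? `Missing: ${missing.join(", ")}` : "All prerequisites found");
```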

Installation

From npm

npm install -g @s.lfr/mcp-coqui-tts

From Source

git clone https://github.com/yourusername/mcp-coqui-tts.git
cd mcp-coqui-tts
npm install
npm link

Usage

With Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "coqui-tts": {
      "command": "npx",
      "args": ["@s.lfr/mcp-coqui-tts"]
    }
  }
}

Or if installed from source:

{
  "mcpServers": {
    "coqui-tts": {
      "command": "node",
      "args": ["/path/to/mcp-coqui-tts/index.js"]
    }
  }
}
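
If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge in Node.js; withMcpServer and the "other" entry are hypothetical names used only for illustration.

```javascript
// Return a copy of a Claude Desktop config with one MCP server entry added
// or replaced. Existing entries under "mcpServers" are preserved.
// Hypothetical helper, not part of mcp-coqui-tts.
function withMcpServer(config, name, entry) {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers || {}), [name]: entry },
  };
}

// Example: a config that already contains another (made-up) server.
const existing = {
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};
const updated = withMcpServer(existing, "coqui-tts", {
  command: "npx",
  args: ["@s.lfr/mcp-coqui-tts"],
});
console.log(JSON.stringify(updated, null, 2));
```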

Available Tools

1. speak

Convert text to speech with customizable parameters.

Parameters:

  • text (required): The text to convert to speech
  • model: TTS model to use (default: "tts_models/en/ljspeech/tacotron2-DDC")
  • output_path: Where to save the audio file
  • speaker_idx: Speaker index for multi-speaker models
  • language_idx: Language index for multi-language models
  • speed: Speed factor (1.0 is normal speed)

Example:

{
  "text": "Hello, this is a test of the text to speech system.",
  "model": "tts_models/en/ljspeech/tacotron2-DDC",
  "output_path": "/tmp/output.wav",
  "speed": 1.2
}
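
The "tts: command not found" entry under Troubleshooting suggests the server calls the Coqui tts command-line tool. Under that assumption, the sketch below shows how the speak parameters might map to CLI arguments. The flags --text, --model_name, --out_path, --speaker_idx, and --language_idx belong to the Coqui CLI; the function name buildSpeakArgs and the mapping itself are hypothetical, and speed is left out because this sketch does not assume a corresponding CLI flag.

```javascript
// Default model as documented in the speak parameter list.
const DEFAULT_MODEL = "tts_models/en/ljspeech/tacotron2-DDC";

// Build an argument array for the Coqui `tts` CLI from speak parameters.
// Hypothetical helper: the real server may construct its command differently.
function buildSpeakArgs({ text, model = DEFAULT_MODEL, output_path, speaker_idx, language_idx }) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text is required");
  }
  const args = ["--text", text, "--model_name", model];
  // Optional parameters are only passed when the caller supplied them.
  if (output_path !== undefined) args.push("--out_path", output_path);
  if (speaker_idx !== undefined) args.push("--speaker_idx", String(speaker_idx));
  if (language_idx !== undefined) args.push("--language_idx", String(language_idx));
  return args;
}

console.log(buildSpeakArgs({
  text: "Hello, this is a test of the text to speech system.",
  output_path: "/tmp/output.wav",
}));
```

Passing the result as an array to child_process.spawn("tts", args) avoids shell quoting problems when the text contains quotes or other special characters.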

2. list_models

List all available TTS models.

Parameters: None

3. synthesize_long_text

Synthesize longer texts with automatic chunking and concatenation.

Parameters:

  • text (required): The long text to convert
  • model: TTS model to use
  • output_path: Where to save the final audio
  • chunk_size: Maximum characters per chunk (default: 500)

Example:

{
  "text": "This is a very long text that will be automatically split into chunks...",
  "output_path": "/tmp/long_speech.wav",
  "chunk_size": 500
}
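
The README does not specify how the text is split, so the following is only a sketch of one plausible chunking strategy: pack whole sentences into chunks of at most chunk_size characters, fall back to word boundaries for oversized sentences, and hard-cut words that are longer than the limit. The function names chunkText and splitOversized are hypothetical.

```javascript
// Split text into chunks of at most `chunkSize` characters, preferring
// sentence boundaries. Hypothetical sketch, not the server's actual algorithm.
function chunkText(text, chunkSize = 500) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.trim().split(/(?<=[.!?])\s+/).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    for (const piece of splitOversized(sentence, chunkSize)) {
      if (current === "") {
        current = piece;
      } else if (current.length + 1 + piece.length <= chunkSize) {
        // The piece still fits, joined by a single space.
        current += " " + piece;
      } else {
        chunks.push(current);
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Break a sentence longer than `chunkSize` into words, hard-cutting any
// single word that is itself longer than the limit.
function splitOversized(sentence, chunkSize) {
  if (sentence.length <= chunkSize) return [sentence];
  const pieces = [];
  for (const word of sentence.split(/\s+/)) {
    for (let i = 0; i < word.length; i += chunkSize) {
      pieces.push(word.slice(i, i + chunkSize));
    }
  }
  return pieces;
}

console.log(chunkText("One. Two. Three.", 10));
```

Each chunk would then be synthesized separately and the resulting audio files concatenated, which is where ffmpeg comes in.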

4. clone_voice

Clone a voice from an audio sample (requires an XTTS model).

Parameters:

  • text (required): Text to speak in the cloned voice
  • reference_audio (required): Path to reference audio for voice cloning
  • output_path: Where to save the output
  • language: Language code (default: "en")

Example:

{
  "text": "This will be spoken in the cloned voice.",
  "reference_audio": "/path/to/sample.wav",
  "output_path": "/tmp/cloned_voice.wav",
  "language": "en"
}
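
Assuming the server drives the Coqui tts CLI with the XTTS v2 model, the clone_voice parameters could translate into arguments roughly as sketched below. The flags --speaker_wav and --language_idx are part of the Coqui CLI; the function name buildCloneArgs and the exact mapping are hypothetical.

```javascript
// XTTS v2 model name as listed under "Multilingual Models".
const XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2";

// Build an argument array for the Coqui `tts` CLI from clone_voice parameters.
// Hypothetical helper: the real server may construct its command differently.
function buildCloneArgs({ text, reference_audio, output_path, language = "en" }) {
  if (!text) throw new Error("text is required");
  if (!reference_audio) throw new Error("reference_audio is required");
  const args = [
    "--model_name", XTTS_MODEL,
    "--text", text,
    "--speaker_wav", reference_audio, // audio sample of the voice to clone
    "--language_idx", language,
  ];
  if (output_path !== undefined) args.push("--out_path", output_path);
  return args;
}

console.log(buildCloneArgs({
  text: "This will be spoken in the cloned voice.",
  reference_audio: "/path/to/sample.wav",
  output_path: "/tmp/cloned_voice.wav",
}));
```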

Popular TTS Models

English Models

  • tts_models/en/ljspeech/tacotron2-DDC - High quality English TTS
  • tts_models/en/ljspeech/fast_pitch - Fast English TTS
  • tts_models/en/vctk/vits - Multi-speaker English (110 speakers)

Multilingual Models

  • tts_models/multilingual/multi-dataset/xtts_v2 - Supports voice cloning
  • tts_models/multilingual/multi-dataset/your_tts - Multilingual with voice cloning

Other Languages

Run list_models to see all available models for different languages.
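
Coqui model names follow the pattern type/language/dataset/model, as in the entries above. Given an array of such names, the sketch below groups them by language. How list_models formats its output is not documented here, so extracting the names from the tool result is left out; parseModelName and groupByLanguage are hypothetical helpers.

```javascript
// Split a Coqui model name of the form type/language/dataset/model.
function parseModelName(name) {
  const parts = name.split("/");
  if (parts.length !== 4 || parts.some((p) => p === "")) {
    throw new Error(`Unexpected model name: ${name}`);
  }
  const [type, language, dataset, model] = parts;
  return { type, language, dataset, model };
}

// Group model names by their language segment.
function groupByLanguage(names) {
  const groups = {};
  for (const name of names) {
    const { language } = parseModelName(name);
    if (!groups[language]) groups[language] = [];
    groups[language].push(name);
  }
  return groups;
}

console.log(groupByLanguage([
  "tts_models/en/ljspeech/tacotron2-DDC",
  "tts_models/en/vctk/vits",
  "tts_models/multilingual/multi-dataset/xtts_v2",
]));
```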

Deployment on Smithery

To deploy this MCP server on Smithery:

  1. Fork this repository
  2. Connect your GitHub account to Smithery
  3. Create a new MCP server on Smithery
  4. Select this repository
  5. Deploy

The server will be automatically available for use with any MCP-compatible client.

Development

Running Locally

npm start

Testing

You can test the server using the MCP inspector:

npx @modelcontextprotocol/inspector node index.js
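
Behind the inspector UI, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds such a message for the speak tool to show its shape; toolCallRequest is a hypothetical helper, and a real session must complete the MCP initialize handshake before any tool call is accepted.

```javascript
// Build a JSON-RPC 2.0 request that asks an MCP server to run a tool.
function toolCallRequest(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = toolCallRequest(1, "speak", {
  text: "Hello, this is a test of the text to speech system.",
  output_path: "/tmp/output.wav",
});

// Over the stdio transport, each message is one line of JSON.
process.stdout.write(JSON.stringify(request) + "\n");
```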

Troubleshooting

Common Issues

  1. "tts: command not found"

    • Make sure Coqui TTS is installed: pip install TTS
    • Ensure Python/pip binaries are in your PATH
  2. "ffmpeg: command not found"

    • Install ffmpeg for your operating system (see Prerequisites)
  3. Model download fails

    • The first run can take a while because models are downloaded on demand
    • Check your internet connection
    • Ensure sufficient disk space (roughly 1-5 GB per model)
  4. Voice cloning not working

    • Requires the XTTS v2 model (tts_models/multilingual/multi-dataset/xtts_v2)
    • Use clear reference audio, 5-10 seconds long
    • WAV format is recommended for reference audio
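
To check whether a WAV reference file falls in the recommended 5-10 second range, the duration can be read from the file header without any audio library. The sketch below assumes an uncompressed RIFF/WAVE file and divides the size of the data chunk by the byte rate from the fmt chunk; wavDurationSeconds and isRecommendedLength are hypothetical helpers, not part of this package.

```javascript
// Compute the duration of a RIFF/WAVE file from its header fields.
// Assumes uncompressed PCM, where duration = data bytes / bytes per second.
function wavDurationSeconds(buffer) {
  if (
    buffer.length < 12 ||
    buffer.toString("ascii", 0, 4) !== "RIFF" ||
    buffer.toString("ascii", 8, 12) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }
  let byteRate = null;
  let offset = 12; // first chunk starts right after the RIFF header
  while (offset + 8 <= buffer.length) {
    const id = buffer.toString("ascii", offset, offset + 4);
    const size = buffer.readUInt32LE(offset + 4);
    if (id === "fmt ") {
      // Byte rate sits 8 bytes into the fmt chunk body.
      byteRate = buffer.readUInt32LE(offset + 8 + 8);
    } else if (id === "data") {
      if (!byteRate) throw new Error("fmt chunk missing or invalid");
      return size / byteRate;
    }
    // Chunk bodies are padded to an even number of bytes.
    offset += 8 + size + (size % 2);
  }
  throw new Error("data chunk not found");
}

// True if the duration lies in the range recommended for reference audio.
function isRecommendedLength(seconds, min = 5, max = 10) {
  return seconds >= min && seconds <= max;
}
```

For example, isRecommendedLength(wavDurationSeconds(require("node:fs").readFileSync("/path/to/sample.wav"))) reports whether the sample is a reasonable length for cloning.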

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT

Acknowledgments

Support

For issues and questions, please open an issue on GitHub.
