Kokoro TTS MCP Server

Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.

giannisanni

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

Prerequisites

  • Python 3.10 or higher
  • uv package manager

Installation

  1. First, install the uv package manager:
curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Clone this repository, then, from the repository root, create a virtual environment and install the dependencies:
uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
uv pip install .

Features

  • Text-to-speech synthesis with customizable voices
  • Adjustable speech speed
  • Support for saving audio to files or direct playback
  • Cross-platform audio playback support (Windows, macOS, Linux)

Usage

The server provides a single MCP tool, generate_speech, with the following parameters:

  • text (required): The text to convert to speech
  • voice (optional): Voice to use for synthesis (default: "af_heart")
  • speed (optional): Speech speed multiplier (default: 1.0)
  • save_path (optional): Directory to save audio files
  • play_audio (optional): Whether to play the audio immediately (default: False)
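The parameters and defaults listed above can be collected into an argument dictionary before calling the tool. The following is a minimal sketch, not part of this server: the SpeechRequest class and its to_arguments method are hypothetical helpers, and the checks on empty text and non-positive speed are assumptions rather than documented server behavior.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    """Hypothetical helper mirroring the generate_speech parameters."""

    text: str                        # required
    voice: str = "af_heart"          # documented default
    speed: float = 1.0               # documented default
    save_path: Optional[str] = None  # optional directory for audio files
    play_audio: bool = False         # documented default

    def to_arguments(self) -> dict:
        """Return the tool arguments, leaving out save_path when it is unset."""
        # These two checks are assumptions, not documented server rules.
        if not self.text.strip():
            raise ValueError("text is required and must not be empty")
        if self.speed <= 0:
            raise ValueError("speed must be a positive multiplier")
        return {key: value for key, value in asdict(self).items() if value is not None}


args = SpeechRequest(text="Hello, world!", play_audio=True).to_arguments()
print(args)
```

The resulting dictionary can be passed as the arguments of a generate_speech tool call.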

Example Usage

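import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Start the server over stdio; replace the placeholder path with your checkout.
server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "/path/to/kokoro-tts-mcp", "run", "tts-mcp.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Generate and play speech
            result = await session.call_tool(
                "generate_speech",
                {
                    "text": "Hello, world!",
                    "voice": "af_heart",
                    "speed": 1.0,
                    "play_audio": True,
                },
            )

asyncio.run(main())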

Dependencies

  • kokoro >= 0.8.4
  • mcp[cli] >= 1.3.0
  • soundfile >= 0.13.1

Platform Support

Audio playback is supported on:

  • Windows (using start)
  • macOS (using afplay)
  • Linux (using aplay)
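The per-platform player choice above could be expressed as a small lookup. This is a sketch under assumptions, not the server's actual code: playback_command is a hypothetical function, and wrapping start in cmd /c is an assumption based on start being a cmd.exe builtin rather than a standalone executable.

```python
import platform
from typing import List, Optional


def playback_command(audio_path: str, system: Optional[str] = None) -> List[str]:
    """Return the command line that would play audio_path on the given platform."""
    system = system or platform.system()
    if system == "Windows":
        # start is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", audio_path]
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return ["afplay", audio_path]
    if system == "Linux":
        return ["aplay", audio_path]
    raise RuntimeError(f"Audio playback is not supported on {system}")


# Build the command for the current machine without running it.
try:
    print(playback_command("speech.wav"))
except RuntimeError as error:
    print(error)
```

To actually play a file, the returned list can be handed to subprocess.run(..., check=True).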

MCP Configuration

Add the following configuration to your MCP settings file, replacing the uv path and the project directory with the locations on your machine:

{
  "mcpServers": {
    "kokoro-tts": {
      "command": "/Users/giannisan/pinokio/bin/miniconda/bin/uv",
      "args": [
        "--directory",
        "/Users/giannisan/Documents/Cline/MCP/kokoro-tts-mcp",
        "run",
        "tts-mcp.py"
      ]
    }
  }
}
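Because the paths in the configuration above are specific to one machine, it may help to generate the entry for your own setup. The sketch below is one possible way to do that; build_mcp_config is a hypothetical helper, and the paths in the example call are placeholders.

```python
import json
import os
import shutil
from typing import Optional


def build_mcp_config(project_dir: str, uv_path: Optional[str] = None) -> dict:
    """Return an MCP settings entry for the kokoro-tts server."""
    # Prefer an explicit path, then the uv found on PATH, then the bare name.
    command = uv_path or shutil.which("uv") or "uv"
    return {
        "mcpServers": {
            "kokoro-tts": {
                "command": command,
                "args": [
                    "--directory",
                    os.path.expanduser(project_dir),
                    "run",
                    "tts-mcp.py",
                ],
            }
        }
    }


# Placeholder paths; substitute the locations on your machine.
config = build_mcp_config("/path/to/kokoro-tts-mcp", uv_path="/usr/local/bin/uv")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the mcpServers section of your MCP settings file.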

License

[Add your license information here]
