
Advanced TTS MCP Server

Provides high-quality text-to-speech synthesis with 10 natural voices, emotion control, and dynamic pacing for professional applications requiring expressive speech output.

A high-quality, feature-rich Text-to-Speech MCP server with a native TypeScript implementation. It is designed for professional applications that require natural, expressive speech synthesis with advanced controls and zero external dependencies.

✨ Features

🎯 Advanced Voice Control

10 High-Quality Voices - Male and female voices with distinct personalities
Emotion Control - Neutral, happy, excited, calm, serious, casual, confident
Dynamic Pacing - Natural, conversational, presentation, tutorial, narrative, fast, and slow modes
Speed & Volume - Precise control from 0.25x to 3.0x speed, 0.1x to 2.0x volume

🚀 Professional Capabilities

Streaming Audio - Real-time synthesis and playback
Batch Processing - Handle multiple text segments efficiently
Multiple Formats - WAV, MP3, FLAC, OGG output support
Natural Speech Enhancement - Automatic pause insertion and emotion markers
Queue Management - Handle multiple concurrent requests

🔧 MCP Integration

6 Powerful Tools - Complete synthesis, batch processing, voice management
2 Rich Resources - Voice capabilities and usage examples
Real-time Status - Track processing progress and manage requests
File Management - Save, list, and organize audio outputs

🚀 Quick Start

Option 1: Deploy to Smithery.ai (Recommended)

🎯 One-Click Deployment to Smithery Platform

Deploy Now: Visit Smithery.ai and import this repository
Configure: Set your preferred voice and speech settings
Use Instantly: Access via Claude Desktop or any MCP-compatible client

Benefits:

✅ Zero setup required
✅ Automatic scaling and updates
✅ No model downloads needed
✅ Enterprise-grade hosting

📋 Full Smithery Deployment Guide →

Option 2: Local Installation

Prerequisites:

Node.js 18+

Installation:

Clone the repository

git clone https://github.com/samihalawa/advanced-tts-mcp.git
cd advanced-tts-mcp

Install dependencies

npm install

Configure Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "advanced-tts": {
      "command": "node",
      "args": ["dist/index.js"],
      "cwd": "/path/to/advanced-tts-mcp"
    }
  }
}
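If `claude_desktop_config.json` already lists other servers, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, written in Python to match the usage examples later in this README. The helper name `add_tts_server` is hypothetical and not part of this project; the entry itself is copied from the snippet above.

```python
import json


def add_tts_server(config: dict, repo_path: str) -> dict:
    """Return a copy of a Claude Desktop config with the advanced-tts entry added.

    Hypothetical helper for illustration; existing servers are left untouched.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    servers = updated.setdefault("mcpServers", {})
    servers["advanced-tts"] = {
        "command": "node",
        "args": ["dist/index.js"],
        "cwd": repo_path,  # path to your local clone of the repository
    }
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "npx", "args": ["other"]}}}
    print(json.dumps(add_tts_server(existing, "/path/to/advanced-tts-mcp"), indent=2))
```

Load your real config with `json.load`, pass it through the helper, and write the result back with `json.dump`.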

Build and start the server

# Build TypeScript (the config above expects the compiled dist/index.js)
npm run build

# Start server
npm start

Restart Claude Desktop and start synthesizing with natural, expressive voices.

🎙️ Available Voices

Voice ID	Name	Gender	Description
`af_heart`	Heart	Female	Warm, friendly voice (default)
`af_sky`	Sky	Female	Clear, bright voice
`af_bella`	Bella	Female	Elegant, sophisticated voice
`af_sarah`	Sarah	Female	Professional, confident voice
`af_nicole`	Nicole	Female	Gentle, soothing voice
`am_adam`	Adam	Male	Strong, authoritative voice
`am_michael`	Michael	Male	Friendly, approachable voice
`bf_emma`	Emma	Female	Young, energetic voice
`bf_isabella`	Isabella	Female	Mature, expressive voice
`bm_lewis`	Lewis	Male	Deep, resonant voice

📚 Usage Examples

Basic Synthesis

# Simple text-to-speech
await synthesize_speech(
    text="Hello! Welcome to Advanced TTS.",
    voice_id="af_heart"
)

Emotional Expression

# Excited announcement
await synthesize_speech(
    text="This is amazing news! You're going to love this new feature!",
    voice_id="af_heart",
    emotion="excited",
    pacing="conversational",
    speed=1.1
)

Professional Presentation

# Tutorial narration
await synthesize_speech(
    text="Step one: Open your browser. Step two: Navigate to the website.",
    voice_id="am_adam", 
    emotion="calm",
    pacing="tutorial",
    speed=0.9
)

Batch Processing

# Multiple segments with pauses
await batch_synthesize(
    segments=[
        "Welcome to our presentation.",
        "Today we'll cover three main topics.", 
        "Let's begin with the first topic."
    ],
    voice_id="af_sarah",
    emotion="confident",
    pacing="presentation",
    merge_output=True,
    segment_pause=1.0,
    save_file=True
)
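The examples above are written as direct function calls, but an MCP client actually invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below shows roughly what such a request looks like for `synthesize_speech`. The helper `build_tool_call` is a hypothetical name used only for illustration; a real client library builds and sends this message for you.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "synthesize_speech",
        {
            "text": "Hello! Welcome to Advanced TTS.",
            "voice_id": "af_heart",
            "emotion": "excited",
            "speed": 1.1,
        },
    )
    print(json.dumps(request, indent=2))
```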

🛠️ Available Tools

`synthesize_speech`

Convert text to natural speech with full control over voice characteristics.

Parameters:

text - Text to synthesize (max 10,000 chars)
voice_id - Voice selection (see table above)
speed - Speech rate (0.25-3.0)
emotion - Voice emotion (neutral, happy, excited, calm, serious, casual, confident)
pacing - Speech style (natural, conversational, presentation, tutorial, narrative, fast, slow)
volume - Audio volume (0.1-2.0)
output_format - File format (wav, mp3, flac, ogg)
save_file - Save to file (boolean)
filename - Custom filename
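The limits listed above can be checked on the client side before a request is sent. This is a minimal sketch, not the server's own validation code: `SynthesisRequest` and `validate` are hypothetical names, and every default except the `af_heart` voice is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

VOICE_IDS = {
    "af_heart", "af_sky", "af_bella", "af_sarah", "af_nicole",
    "am_adam", "am_michael", "bf_emma", "bf_isabella", "bm_lewis",
}
EMOTIONS = {"neutral", "happy", "excited", "calm", "serious", "casual", "confident"}
PACING = {"natural", "conversational", "presentation", "tutorial", "narrative", "fast", "slow"}
FORMATS = {"wav", "mp3", "flac", "ogg"}
MAX_TEXT_CHARS = 10_000


@dataclass(frozen=True)
class SynthesisRequest:
    text: str
    voice_id: str = "af_heart"    # documented default voice
    speed: float = 1.0            # assumed default
    emotion: str = "neutral"      # assumed default
    pacing: str = "natural"       # assumed default
    volume: float = 1.0           # assumed default
    output_format: str = "wav"    # assumed default
    save_file: bool = False       # assumed default
    filename: Optional[str] = None


def validate(req: SynthesisRequest) -> List[str]:
    """Return a list of human-readable problems; an empty list means the request is valid."""
    errors: List[str] = []
    if not req.text.strip():
        errors.append("text must not be empty")
    if len(req.text) > MAX_TEXT_CHARS:
        errors.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    if req.voice_id not in VOICE_IDS:
        errors.append(f"unknown voice_id: {req.voice_id}")
    if not 0.25 <= req.speed <= 3.0:
        errors.append("speed must be between 0.25 and 3.0")
    if not 0.1 <= req.volume <= 2.0:
        errors.append("volume must be between 0.1 and 2.0")
    if req.emotion not in EMOTIONS:
        errors.append(f"unknown emotion: {req.emotion}")
    if req.pacing not in PACING:
        errors.append(f"unknown pacing: {req.pacing}")
    if req.output_format not in FORMATS:
        errors.append(f"unsupported output_format: {req.output_format}")
    return errors
```

Returning all problems at once, rather than raising on the first one, lets a caller report every invalid field in a single pass.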

`batch_synthesize`

Process multiple text segments efficiently with optional merging.

Parameters:

segments - List of text segments
merge_output - Combine into single file
segment_pause - Pause between segments (0.0-5.0s)
All synthesis parameters from above
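To illustrate what `merge_output` with `segment_pause` amounts to, here is a sketch that joins raw PCM chunks with silence in between. It is an illustration under stated assumptions, not the server's implementation: the function names are hypothetical, and the 24 kHz, 16-bit, mono audio layout is assumed rather than documented here.

```python
from typing import List


def silence(seconds: float, sample_rate: int = 24000,
            sample_width: int = 2, channels: int = 1) -> bytes:
    """Return zero-valued PCM bytes lasting `seconds` (assumed 16-bit mono by default)."""
    frames = int(round(seconds * sample_rate))
    return b"\x00" * (frames * sample_width * channels)


def merge_segments(chunks: List[bytes], segment_pause: float,
                   sample_rate: int = 24000) -> bytes:
    """Concatenate PCM chunks, inserting `segment_pause` seconds of silence between them."""
    if not 0.0 <= segment_pause <= 5.0:
        raise ValueError("segment_pause must be between 0.0 and 5.0 seconds")
    gap = silence(segment_pause, sample_rate)
    return gap.join(chunks)  # the gap appears only between chunks, not at the ends
```

The merged bytes could then be written out with the standard-library `wave` module, using the same sample rate and sample width.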

`get_voices`

Retrieve complete voice information and capabilities.

`get_status`

Check processing status for synthesis requests.

`cancel_request`

Cancel active synthesis operations.

`list_output_files`

Browse saved audio files with metadata.
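A rough idea of the kind of listing `list_output_files` could return is sketched below. The function `list_audio_files` and the metadata fields it reports are hypothetical and are not taken from this project; the sketch simply scans a directory for the four supported formats.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def list_audio_files(output_dir: str) -> List[Dict[str, object]]:
    """List audio files in `output_dir` with basic metadata, sorted by file name."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    files: List[Dict[str, object]] = []
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            stat = path.stat()
            files.append({
                "name": path.name,
                "format": path.suffix.lower().lstrip("."),
                "size_bytes": stat.st_size,
                "modified": stat.st_mtime,  # seconds since the epoch
            })
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "intro.wav").write_bytes(b"\x00" * 16)
        (Path(tmp) / "notes.txt").write_text("not audio")
        print(list_audio_files(tmp))
```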

🎛️ Voice Controls

Emotions

Neutral - Standard, professional tone
Happy - Upbeat, cheerful expression
Excited - Enthusiastic, energetic delivery
Calm - Relaxed, soothing tone
Serious - Formal, authoritative delivery
Casual - Relaxed, conversational style
Confident - Assured, professional tone

Pacing Styles

Natural - Balanced, human-like rhythm
Conversational - Casual discussion pace
Presentation - Professional speaking rhythm
Tutorial - Educational, clear delivery
Narrative - Storytelling pace
Fast - Quick delivery (1.2x base speed)
Slow - Deliberate delivery (0.8x base speed)
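The README does not say how a pacing style combines with an explicit `speed` value. One plausible reading, shown here purely as an assumption, is that the pacing multiplier scales the requested speed and the result is clamped to the documented 0.25-3.0 range. The name `effective_speed` is hypothetical, and styles other than fast and slow are assumed to leave speed unchanged.

```python
# Multipliers taken from the list above; all other styles are assumed to use 1.0.
PACING_SPEED_FACTOR = {"fast": 1.2, "slow": 0.8}

MIN_SPEED, MAX_SPEED = 0.25, 3.0


def effective_speed(speed: float, pacing: str) -> float:
    """Scale `speed` by the pacing multiplier and clamp it to the supported range."""
    factor = PACING_SPEED_FACTOR.get(pacing, 1.0)
    return min(MAX_SPEED, max(MIN_SPEED, speed * factor))
```

Under this reading, a request with `speed=1.0` and `pacing="fast"` would be rendered at roughly 1.2x.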

🎵 Audio Formats

Format	Quality	Use Case
WAV	Uncompressed	Highest quality, editing
MP3	Compressed	Web, streaming, sharing
FLAC	Lossless	Archival, high-quality storage
OGG	Compressed	Open source alternative

🔧 Configuration

Environment Variables

# Model paths (optional)
KOKORO_MODEL_PATH=./kokoro-v1.0.onnx
KOKORO_VOICES_PATH=./voices-v1.0.bin

# Output settings
TTS_OUTPUT_DIR=./audio_output
TTS_MAX_QUEUE_SIZE=100

# Audio settings  
TTS_DEFAULT_VOICE=af_heart
TTS_ENABLE_STREAMING=true

Server Configuration

config = ServerConfig(
    model_path="./kokoro-v1.0.onnx",
    voices_path="./voices-v1.0.bin", 
    output_dir="./audio_output",
    max_queue_size=100,
    enable_streaming=True,
    default_voice="af_heart"
)
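The environment variables and the `ServerConfig` fields above map onto each other one to one. The sketch below shows how that mapping might look; the dataclass is a stand-in with the same field names, not the project's actual class, and `config_from_env` is a hypothetical helper. The fallback defaults are the values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass
class ServerConfig:
    """Stand-in for the server's config object, with the fields shown above."""
    model_path: str = "./kokoro-v1.0.onnx"
    voices_path: str = "./voices-v1.0.bin"
    output_dir: str = "./audio_output"
    max_queue_size: int = 100
    enable_streaming: bool = True
    default_voice: str = "af_heart"


def config_from_env(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build a ServerConfig from environment variables, falling back to the defaults."""
    defaults = ServerConfig()
    streaming_raw = env.get("TTS_ENABLE_STREAMING", str(defaults.enable_streaming))
    return ServerConfig(
        model_path=env.get("KOKORO_MODEL_PATH", defaults.model_path),
        voices_path=env.get("KOKORO_VOICES_PATH", defaults.voices_path),
        output_dir=env.get("TTS_OUTPUT_DIR", defaults.output_dir),
        max_queue_size=int(env.get("TTS_MAX_QUEUE_SIZE", defaults.max_queue_size)),
        enable_streaming=streaming_raw.strip().lower() in {"1", "true", "yes"},
        default_voice=env.get("TTS_DEFAULT_VOICE", defaults.default_voice),
    )
```

Passing the environment in as a mapping keeps the function easy to test with a plain dictionary.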

🏗️ Architecture

├── src/advanced_tts/
│   ├── __init__.py          # Package initialization
│   ├── server.py            # MCP server implementation  
│   ├── engine.py            # Kokoro TTS engine wrapper
│   ├── models.py            # Data models and validation
│   └── utils.py             # Utility functions
├── pyproject.toml           # Project configuration
├── README.md                # Documentation
└── LICENSE                  # MIT License

🤝 Contributing

Contributions welcome! Areas for improvement:

Additional voice models
Real-time streaming synthesis
Advanced audio effects
Multi-language support
Performance optimizations

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

Kokoro TTS - High-quality neural voice synthesis
MCP Protocol - Seamless AI model integration
FastMCP - Efficient server framework

Developed by Sami Halawa

Transform your text into natural, expressive speech with Advanced TTS MCP Server.
