Zonos TTS MCP Server

Lets Claude generate speech directly, in multiple languages and emotions, by connecting to a Zonos TTS setup through the Model Context Protocol.

PhialsBasement

Speech Processing

Tools

speak_response

README

Zonos MCP Integration

A Model Context Protocol integration for Zonos TTS, allowing Claude to generate speech directly.

Setup

Installing via Smithery

To install Zonos TTS Integration for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @PhialsBasement/zonos-tts-mcp --client claude

Manual installation

Make sure Zonos is running with our API implementation (PhialsBasement/zonos-api).

Install dependencies:

npm install @modelcontextprotocol/sdk axios

Configure PulseAudio access:

# PulseAudio must be properly configured for audio playback.
# The MCP server will automatically try to connect to your PulseAudio server.
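
The README does not say how the server locates PulseAudio. As a rough illustration only, the sketch below follows the usual PulseAudio client convention: an explicit PULSE_SERVER environment variable wins, otherwise the per-user socket under XDG_RUNTIME_DIR is used. The function name resolvePulseServer is hypothetical and is not taken from this project's source.

```typescript
// Resolves which PulseAudio server address a client would typically try.
// Assumption: this mirrors common PulseAudio client behaviour, NOT the
// actual logic of the Zonos MCP server.
function resolvePulseServer(
  env: Record<string, string | undefined>
): string | undefined {
  // An explicit PULSE_SERVER (e.g. "unix:/path/to/socket" or "tcp:host:4713") wins.
  const explicit = env["PULSE_SERVER"];
  if (explicit) return explicit;

  // Otherwise fall back to the per-user socket under XDG_RUNTIME_DIR, if set.
  const runtimeDir = env["XDG_RUNTIME_DIR"];
  if (runtimeDir) return `unix:${runtimeDir}/pulse/native`;

  // Nothing to go on; the caller has to report a configuration problem.
  return undefined;
}

console.log(resolvePulseServer({ XDG_RUNTIME_DIR: "/run/user/1000" }));
// prints "unix:/run/user/1000/pulse/native"
```

If this returns undefined in your environment, audio playback is unlikely to work until PulseAudio is configured.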

Build the MCP server:

npm run build
# This will create the dist folder with the compiled server

Add to Claude's config file:

Edit your Claude config file (usually ~/.config/claude/config.json) and add this entry to the mcpServers section:

"zonos-tts": {
  "command": "node",
  "args": [
    "/path/to/your/zonos-mcp/dist/server.js"
  ]
}

Replace /path/to/your/zonos-mcp with the actual path where you installed the MCP server.
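
If you would rather script this step, the edit amounts to merging one entry into the mcpServers object. Below is a minimal sketch, assuming the config file is plain JSON with an optional top-level mcpServers key. The helper addZonosServer and the McpServerEntry and ClaudeConfig types are hypothetical names used for illustration, and the "other" server in the example is made up.

```typescript
// Hypothetical types covering only the parts of the config this sketch touches.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns a copy of `config` with a "zonos-tts" entry pointing at `serverJsPath`.
// Existing servers and unrelated keys are preserved; the input is not mutated.
function addZonosServer(config: ClaudeConfig, serverJsPath: string): ClaudeConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "zonos-tts": { command: "node", args: [serverJsPath] },
    },
  };
}

// Example: merge into a config that already lists another (made-up) server.
const existing: ClaudeConfig = {
  mcpServers: { other: { command: "node", args: ["/path/to/other/server.js"] } },
};
const updated = addZonosServer(existing, "/path/to/your/zonos-mcp/dist/server.js");
console.log(JSON.stringify(updated, null, 2));
```

Reading the file, applying the helper, and writing the result back is left out so the sketch stays independent of where your config actually lives.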

Using with Claude

Once configured, Claude automatically knows how to use the speak_response tool:

speak_response(
    text="Your text here",
    language="en-us",  # optional, defaults to en-us
    emotion="happy"    # optional: "neutral", "happy", "sad", "angry"
)
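
The call above is shorthand. An MCP client invokes a tool by sending a JSON-RPC tools/call request with the tool name and its arguments. The sketch below builds such a request for speak_response, using the en-us default and the emotion list from this README. The helper buildSpeakRequest and its validation are assumptions for illustration; the server applies its own defaults and checks.

```typescript
// Emotions listed in this README; the server may accept others.
const EMOTIONS = ["neutral", "happy", "sad", "angry"] as const;
type Emotion = (typeof EMOTIONS)[number];

interface SpeakArgs {
  text: string;
  language?: string;
  emotion?: Emotion;
}

// Builds an MCP `tools/call` JSON-RPC request for the speak_response tool.
// Hypothetical helper, not part of this project.
function buildSpeakRequest(id: number, args: SpeakArgs) {
  if (args.text.trim() === "") {
    throw new Error("text must not be empty");
  }
  if (args.emotion !== undefined && EMOTIONS.indexOf(args.emotion) === -1) {
    throw new Error(`unsupported emotion: ${args.emotion}`);
  }

  const toolArgs: { text: string; language: string; emotion?: Emotion } = {
    text: args.text,
    language: args.language ?? "en-us", // default stated in the README
  };
  // Emotion is optional; leave it out when the caller did not set one.
  if (args.emotion !== undefined) {
    toolArgs.emotion = args.emotion;
  }

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "speak_response", arguments: toolArgs },
  };
}

const request = buildSpeakRequest(1, { text: "Your text here", emotion: "happy" });
console.log(JSON.stringify(request, null, 2));
```

In normal use Claude constructs this request itself; the sketch is only useful when testing the server from a custom client.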

Features

Text-to-speech through Claude
Multiple emotions support
Multi-language support
Proper audio playback through PulseAudio

Requirements

Node.js
PulseAudio setup
Running instance of Zonos API (PhialsBasement/zonos-api)
Working audio output device

Notes

Make sure both the Zonos API server and this MCP server are running.
Audio playback requires a working PulseAudio configuration.
