MCP Servers

ElevenLabs MCP Server

Integrates with ElevenLabs text-to-speech API.

mamertofabian

Speech Processing

Visit Server

Tools

generate_audio_simple

Generate audio from plain text using default voice settings

generate_audio_script

Generate audio from a structured script with multiple voices and actors. Accepts either: 1. Plain text string 2. JSON string with format: { "script": [ { "text": "Text to speak", "voice_id": "optional-voice-id", "actor": "optional-actor-name" }, ... ] }

delete_job

Delete a voiceover job and its associated files

get_audio_file

Get the audio file content for a specific job

list_voices

Get a list of all available ElevenLabs voices with metadata

get_voiceover_history

Get voiceover job history. Optionally specify a job ID for a specific job.

README

ElevenLabs MCP Server

A Model Context Protocol (MCP) server that integrates with ElevenLabs text-to-speech API, featuring both a server component and a sample web-based MCP Client (SvelteKit) for managing voice generation tasks.

Features

Generate audio from text using ElevenLabs API
Support for multiple voices and script parts
SQLite database for persistent history storage
Sample SvelteKit MCP Client for:
- Simple text-to-speech conversion
- Multi-part script management
- Voice history tracking and playback
- Audio file downloads

Installation

Installing via Smithery

To install ElevenLabs MCP Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install elevenlabs-mcp-server --client claude

Using uvx (recommended)

When using uvx, no specific installation is needed.

Add the following configuration to your MCP settings file (e.g., cline_mcp_settings.json for Claude Desktop):

{
  "mcpServers": {
    "elevenlabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp-server"],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key",
        "ELEVENLABS_VOICE_ID": "your-voice-id",
        "ELEVENLABS_MODEL_ID": "eleven_flash_v2",
        "ELEVENLABS_STABILITY": "0.5",
        "ELEVENLABS_SIMILARITY_BOOST": "0.75",
        "ELEVENLABS_STYLE": "0.1",
        "ELEVENLABS_OUTPUT_DIR": "output"
      }
    }
  }
}

Development Installation

Clone this repository
Install dependencies:
```
uv venv
```
Copy .env.example to .env and fill in your ElevenLabs credentials

{
  "mcpServers": {
    "elevenlabs": {
      "command": "uv",
      "args": [
        "--directory",
        "path/to/elevenlabs-mcp-server",
        "run",
        "elevenlabs-mcp-server"
      ],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key",
        "ELEVENLABS_VOICE_ID": "your-voice-id",
        "ELEVENLABS_MODEL_ID": "eleven_flash_v2",
        "ELEVENLABS_STABILITY": "0.5",
        "ELEVENLABS_SIMILARITY_BOOST": "0.75",
        "ELEVENLABS_STYLE": "0.1",
        "ELEVENLABS_OUTPUT_DIR": "output"
      }
    }
  }
}

Using the Sample SvelteKit MCP Client

Navigate to the web UI directory:
```
cd clients/web-ui
```
Install dependencies:
```
pnpm install
```
Copy .env.example to .env and configure as needed
Run the web UI:
```
pnpm dev
```
Open http://localhost:5174 in your browser

Available Tools

generate_audio_simple: Generate audio from plain text using default voice settings
generate_audio_script: Generate audio from a structured script with multiple voices and actors
delete_job: Delete a job by its ID
get_audio_file: Get the audio file by its ID
list_voices: List all available voices
get_voiceover_history: Get voiceover job history. Optionally specify a job ID for a specific job.

Available Resources

voiceover://history/{job_id}: Get the audio file by its ID
voiceover://voices: List all available voices

License

This project is licensed under the MIT License - see the LICENSE file for details.

Recommended Servers

mcp-server-youtube-transcript

A Model Context Protocol server that enables retrieval of transcripts from YouTube videos. This server provides direct access to video captions and subtitles through a simple interface.

Featured

JavaScript

Say MCP Server

Enables text-to-speech functionality on macOS using the say command, offering extensive control over speech parameters like voice, rate, volume, and pitch for a customizable auditory experience.

Local

JavaScript

Ollama MCP Server

Enables seamless integration between Ollama's local LLM models and MCP-compatible applications, supporting model management and chat interactions.

Local

TypeScript

mcp-hfspace

Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.

Local

TypeScript

Kokoro TTS MCP Server

Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.

Local

Python

Voice Recorder MCP Server

Enables recording audio from a microphone and transcribing it using OpenAI's Whisper model. Works as both a standalone MCP server and a Goose AI agent extension.

Local

Python

Speech MCP

A Goose MCP extension providing voice interaction with modern audio visualization, allowing users to communicate with Goose through speech rather than text.

Local

Python

Home Assistant MCP

Expose all Home Assistant voice intents through a Model Context Protocol Server allowing home control.

Local

Python

MS-Lucidia-Voice-Gateway-MCP

A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.

Local

JavaScript

Zonos TTS MCP Server

Facilitates direct speech generation using Claude for multiple languages and emotions, integrating with a Zonos TTS setup via the Model Context Protocol.

Local

TypeScript