Voice Generation MCP Server

A Model Context Protocol (MCP) server that provides voice generation capabilities using the Minimax AI API. This server converts text to speech and automatically uploads the generated audio files to Amazon S3 for easy access and sharing.

Features

  • Text-to-Speech Generation: Convert text to high-quality speech using Minimax AI's voice synthesis API
  • S3 Integration: Automatically upload generated audio files to Amazon S3 with organized directory structure
  • MCP Protocol Support: Full compatibility with Model Context Protocol for seamless integration with AI assistants
  • Authentication: Built-in API key authentication for secure access
  • Multiple Transport Modes: Support for HTTP, SSE, and STDIO transport protocols
  • Docker Support: Easy deployment with Docker and Docker Compose
  • Configurable Audio Settings: Customizable sample rate, bitrate, and format options

Prerequisites

  • Python 3.8 or higher
  • Minimax AI API credentials
  • Amazon S3 bucket and credentials
  • (Optional) Docker and Docker Compose for containerized deployment

Installation

Local Installation

  1. Clone the repository

    git clone <repository-url>
    cd voice-gen-mcp
    
  2. Create a virtual environment

    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies

    pip install -r requirements.txt
    
  4. Configure environment variables

    cp env.example .env
    # Edit .env with your actual configuration values
    

Docker Installation

  1. Build the Docker image

    docker build -t voice-gen-mcp .
    
  2. Run with Docker Compose

    cp env.example .env
    # Edit .env with your configuration
    docker-compose up -d
    

Configuration

Environment Variables

Create a .env file based on env.example with the following required variables:

Voice Generation API (Required)

VOICE_GEN_API_GROUP_ID=your_minimax_group_id
VOICE_GEN_API_KEY=your_minimax_api_key

S3 Configuration (Required)

S3_BUCKET_NAME=your_s3_bucket_name
S3_REGION=us-east-1
S3_ACCESS_KEY_ID=your_s3_access_key_id
S3_SECRET_ACCESS_KEY=your_s3_secret_access_key
S3_ENDPOINT=https://s3.amazonaws.com
S3_PREFIX=voice-gen/
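
The VOICE_GEN_* credentials are passed to the Minimax API, while the S3_* settings configure where generated audio is uploaded. As a rough, illustrative sketch only (not necessarily the server's actual implementation), the S3_* variables might be wired into a boto3 client like this; the helper functions below are hypothetical:

import os

import boto3  # assumed dependency for the S3 upload


def build_s3_client():
    """Construct an S3 client from the variables defined in .env (illustrative only)."""
    return boto3.client(
        "s3",
        region_name=os.environ["S3_REGION"],
        aws_access_key_id=os.environ["S3_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["S3_SECRET_ACCESS_KEY"],
        endpoint_url=os.environ.get("S3_ENDPOINT", "https://s3.amazonaws.com"),
    )


def upload_audio(data: bytes, filename: str) -> str:
    """Upload audio bytes under S3_PREFIX and return the resulting object URL."""
    bucket = os.environ["S3_BUCKET_NAME"]
    key = os.environ.get("S3_PREFIX", "voice-gen/") + filename
    build_s3_client().put_object(Bucket=bucket, Key=key, Body=data, ContentType="audio/mpeg")
    return f"{os.environ.get('S3_ENDPOINT', 'https://s3.amazonaws.com')}/{bucket}/{key}"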

Usage

Starting the Server

Local Development

python3 server.py

Docker

docker run -d \
  --name voice-gen-mcp \
  -p 8000:8000 \
  --env-file .env \
  voice-gen-mcp

Docker Compose

docker-compose up -d

MCP Clients

The server supports multiple transport modes:

  • HTTP: http://localhost:8000/mcp
  • SSE: http://localhost:8000/sse
  • STDIO: Direct process communication
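
As an example of connecting over the SSE endpoint, the sketch below uses the MCP Python SDK (the mcp package); it is a minimal client-side sketch, and if the server's API key authentication is enabled you may also need to supply the key (for example via request headers), depending on how the server is configured:

import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client  # SSE transport from the MCP Python SDK


async def main():
    # Connect to the server's SSE endpoint and open an MCP session.
    async with sse_client("http://localhost:8000/sse") as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])  # expect: ['generate_voice']


asyncio.run(main())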

Available Tools

generate_voice

Converts text to speech and uploads the generated audio file to S3.

Parameters:

  • text (string, required): The text to convert to speech
  • model (string, optional): Model to use (default: "speech-2.5-hd-preview")
  • voice_id (string, optional): Voice ID to use (default: "mylxsw_voice_1")
  • speed (float, optional): Speech speed (default: 1.0, typically 0.5-2.0)

Returns:

  • Success message with S3 URL and file size
  • Error message if generation fails

Example:

{
  "text": "Hello, this is a test of the voice generation system.",
  "model": "speech-2.5-hd-preview",
  "voice_id": "mylxsw_voice_1",
  "speed": 1.2
}
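
With an open session (see the connection sketch under MCP Clients), the same payload can be passed to session.call_tool. This is an illustrative call assuming the MCP Python SDK:

# Inside an active ClientSession (see the SSE connection sketch above).
result = await session.call_tool(
    "generate_voice",
    {
        "text": "Hello, this is a test of the voice generation system.",
        "model": "speech-2.5-hd-preview",
        "voice_id": "mylxsw_voice_1",
        "speed": 1.2,
    },
)
# On success, the returned content includes the S3 URL and file size of the audio.
for item in result.content:
    if item.type == "text":
        print(item.text)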

Speed Control:

  • speed = 0.5: Half speed (slower speech)
  • speed = 1.0: Normal speed (default)
  • speed = 1.5: 1.5x speed (faster speech)
  • speed = 2.0: Double speed (very fast speech)
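
Playback duration scales inversely with speed: a clip that takes 10 seconds at speed 1.0 takes roughly 10 / 1.5 ≈ 6.7 seconds at speed 1.5. A hypothetical client-side guard to keep requests within the typical range might look like this:

def clamp_speed(speed: float, low: float = 0.5, high: float = 2.0) -> float:
    """Keep the requested speech speed within the typically supported 0.5-2.0 range."""
    return max(low, min(high, speed))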

License

MIT License
