MCP Servers

MS-Lucidia-Voice-Gateway-MCP

A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.

ExpressionsBot

Speech Processing

Visit Server

README

MS-Lucidia-Voice-Gateway-MCP

A Model Context Protocol (MCP) server that provides text-to-speech and speech-to-text capabilities using Windows' built-in speech services. This server leverages the native Windows Speech API (SAPI) through PowerShell commands, eliminating the need for external APIs or services.

Features

Text-to-Speech (TTS) using Windows SAPI voices
Speech-to-Text (STT) using Windows Speech Recognition
Simple web interface for testing
No external API dependencies
Uses native Windows capabilities

Prerequisites

Windows 10/11 with Speech Recognition enabled
Node.js 16+
PowerShell

Installation

Clone the repository:

git clone https://github.com/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP.git
cd MS-Lucidia-Voice-Gateway-MCP

Install dependencies:

npm install

Build the project:

npm run build

Usage

Testing Interface

Start the test server:

npm run test

Open http://localhost:3000 in your browser
Use the web interface to test TTS and STT capabilities

Available Tools

text_to_speech

Converts text to speech using Windows SAPI.

Parameters:

text (required): The text to convert to speech
voice (optional): The voice to use (e.g., "Microsoft David Desktop")
speed (optional): Speech rate from 0.5 to 2.0 (default: 1.0)

Example:

fetch('http://localhost:3000/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: "Hello, this is a test",
    voice: "Microsoft David Desktop",
    speed: 1.0
  })
});

speech_to_text

Records audio and converts it to text using Windows Speech Recognition.

Parameters:

duration (optional): Recording duration in seconds (default: 5, max: 60)

Example:

fetch('http://localhost:3000/stt', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    duration: 5
  })
}).then(response => response.json())
  .then(data => console.log(data.text));

Troubleshooting

Make sure Windows Speech Recognition is enabled:
- Open Windows Settings
- Go to Time & Language > Speech
- Enable Speech Recognition

Check available voices:

Open PowerShell and run:

Add-Type -AssemblyName System.Speech
(New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices().VoiceInfo.Name

Test speech recognition:
- Open Speech Recognition in Windows Settings
- Run through the setup wizard if not already done
- Test that Windows can recognize your voice

Contributing

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a new Pull Request

License

MIT

Recommended Servers

mcp-server-youtube-transcript

A Model Context Protocol server that enables retrieval of transcripts from YouTube videos. This server provides direct access to video captions and subtitles through a simple interface.

Featured

JavaScript

Zonos TTS MCP Server

Facilitates direct speech generation using Claude for multiple languages and emotions, integrating with a Zonos TTS setup via the Model Context Protocol.

Local

TypeScript

Say MCP Server

Enables text-to-speech functionality on macOS using the say command, offering extensive control over speech parameters like voice, rate, volume, and pitch for a customizable auditory experience.

Local

JavaScript

Ollama MCP Server

Enables seamless integration between Ollama's local LLM models and MCP-compatible applications, supporting model management and chat interactions.

Local

TypeScript

mcp-hfspace

Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.

Local

TypeScript

Home Assistant MCP

Expose all Home Assistant voice intents through a Model Context Protocol Server allowing home control.

Local

Python

Speech MCP

A Goose MCP extension providing voice interaction with modern audio visualization, allowing users to communicate with Goose through speech rather than text.

Local

Python

Voice Recorder MCP Server

Enables recording audio from a microphone and transcribing it using OpenAI's Whisper model. Works as both a standalone MCP server and a Goose AI agent extension.

Local

Python

Kokoro TTS MCP Server

Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.

Local

Python

ElevenLabs Text-to-Speech MCP

Contribute to georgi-io/jessica development by creating an account on GitHub.

Local

Python