
MS-Lucidia-Voice-Gateway-MCP
A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.
ExpressionsBot
A Model Context Protocol (MCP) server that provides text-to-speech and speech-to-text capabilities using Windows' built-in speech services. This server leverages the native Windows Speech API (SAPI) through PowerShell commands, eliminating the need for external APIs or services.
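As a rough illustration of the approach described above, the following sketch builds a PowerShell script that speaks a string through System.Speech. It is not taken from this project's source. The names psQuote and buildSpeakScript, the linear mapping from the documented speed range to the synthesizer's integer Rate (-10 to 10), and the quoting strategy are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not the project's actual implementation.
// Builds a PowerShell script that speaks `text` via System.Speech.

// Quote a value as a PowerShell single-quoted string literal.
// PowerShell treats ' and its curly variants as quote characters,
// so each one is doubled to keep the text inside the literal.
function psQuote(value) {
  return "'" + String(value).replace(/['\u2018\u2019\u201A\u201B]/g, (c) => c + c) + "'";
}

function buildSpeakScript(text, { voice, speed = 1.0 } = {}) {
  // Assumed mapping: speed 1.0 -> Rate 0, 2.0 -> Rate 10, 0.5 -> Rate -5.
  // SpeechSynthesizer.Rate only accepts integers from -10 to 10.
  const rate = Math.max(-10, Math.min(10, Math.round((speed - 1) * 10)));
  const lines = [
    'Add-Type -AssemblyName System.Speech',
    '$s = New-Object System.Speech.Synthesis.SpeechSynthesizer',
  ];
  if (voice) lines.push(`$s.SelectVoice(${psQuote(voice)})`);
  lines.push(`$s.Rate = ${rate}`);
  lines.push(`$s.Speak(${psQuote(text)})`);
  return lines.join('\n');
}

// Print the generated script instead of running it, so this works on any OS.
console.log(buildSpeakScript("It's working", { voice: 'Microsoft David Desktop' }));
```

On Windows, a script like this could be passed to Node's child_process.execFile('powershell.exe', ['-NoProfile', '-Command', script]). Passing arguments as an array avoids a second layer of shell quoting.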
Features
- Text-to-Speech (TTS) using Windows SAPI voices
- Speech-to-Text (STT) using Windows Speech Recognition
- Simple web interface for testing
- No external API dependencies
- Uses native Windows capabilities
Prerequisites
- Windows 10/11 with Speech Recognition enabled
- Node.js 16+
- PowerShell
Installation
- Clone the repository:
git clone https://github.com/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP.git
cd MS-Lucidia-Voice-Gateway-MCP
- Install dependencies:
npm install
- Build the project:
npm run build
Usage
Testing Interface
- Start the test server:
npm run test
- Open http://localhost:3000 in your browser
- Use the web interface to test TTS and STT capabilities
Available Tools
text_to_speech
Converts text to speech using Windows SAPI.
Parameters:
- text (required): The text to convert to speech
- voice (optional): The voice to use (e.g., "Microsoft David Desktop")
- speed (optional): Speech rate from 0.5 to 2.0 (default: 1.0)
Example:
// Send a text_to_speech request to the local test server
fetch('http://localhost:3000/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: "Hello, this is a test",
    voice: "Microsoft David Desktop",
    speed: 1.0
  })
});
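The parameter rules above can be checked on the client before a request is sent. The sketch below is one way to do that. buildTtsRequest is a hypothetical helper, not part of this project, and how the server itself reacts to out-of-range values is not documented here.

```javascript
// Hypothetical helper (not part of the project): validates text_to_speech
// parameters against the documented rules and returns fetch() options.
function buildTtsRequest({ text, voice, speed = 1.0 } = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new TypeError('text is required and must be a non-empty string');
  }
  if (typeof speed !== 'number' || Number.isNaN(speed) || speed < 0.5 || speed > 2.0) {
    throw new RangeError('speed must be a number between 0.5 and 2.0');
  }
  const body = { text, speed };
  // voice is optional, so it is left out of the payload when not given.
  if (voice !== undefined) body.voice = voice;
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  };
}

// Usage (needs the test server running):
// fetch('http://localhost:3000/tts', buildTtsRequest({ text: 'Hello', speed: 1.5 }));
```

Failing early in the client gives a clearer error than a rejected or silently adjusted request.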
speech_to_text
Records audio and converts it to text using Windows Speech Recognition.
Parameters:
- duration (optional): Recording duration in seconds (default: 5, max: 60)
Example:
fetch('http://localhost:3000/stt', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    duration: 5
  })
})
  .then(response => response.json())
  .then(data => console.log(data.text));
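A slightly more defensive variant of the call above might apply the documented default and maximum and check the HTTP status before reading the result. normalizeDuration and transcribe are hypothetical names used only for this sketch. Clamping to 60 seconds is a client-side choice made here, since the server's behavior for larger values is not documented.

```javascript
// Hypothetical helpers, not part of the project.

// Apply the documented default (5 s) and cap at the documented maximum (60 s).
function normalizeDuration(duration = 5) {
  if (!Number.isFinite(duration) || duration <= 0) {
    throw new RangeError('duration must be a positive number of seconds');
  }
  return Math.min(duration, 60);
}

// Record and transcribe; resolves to the recognized text.
// fetchImpl is injectable so the function can be tested without a server.
async function transcribe(duration, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl('http://localhost:3000/stt', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ duration: normalizeDuration(duration) }),
  });
  if (!response.ok) {
    throw new Error(`STT request failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.text;
}

// Usage (needs the test server running and a microphone):
// transcribe(10).then(text => console.log(text));
```

Global fetch is only available from Node.js 18. On Node.js 16, pass a fetch implementation as the second argument.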
Troubleshooting
- Make sure Windows Speech Recognition is enabled:
  - Open Windows Settings
  - Go to Time & Language > Speech
  - Enable Speech Recognition
- Check available voices:
  - Open PowerShell and run:
    Add-Type -AssemblyName System.Speech
    (New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices().VoiceInfo.Name
- Test speech recognition:
  - Open Speech Recognition in Windows Settings
  - Run through the setup wizard if not already done
  - Test that Windows can recognize your voice
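The voice check above prints one installed voice name per line. If a text_to_speech call with a specific voice misbehaves, that output can be compared against the requested name. The sketch below does this with two hypothetical helpers, parseVoiceNames and hasVoice. The sample output is illustrative and will differ between machines.

```javascript
// Hypothetical helpers, not part of the project.

// Split the PowerShell output into voice names, one per non-empty line.
function parseVoiceNames(stdout) {
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '');
}

// Case-insensitive check that a requested voice is installed.
function hasVoice(stdout, name) {
  const wanted = name.trim().toLowerCase();
  return parseVoiceNames(stdout).some((v) => v.toLowerCase() === wanted);
}

// Illustrative output only; real machines list their own voices.
const sample = 'Microsoft David Desktop\r\nMicrosoft Zira Desktop\r\n';
console.log(parseVoiceNames(sample));
console.log(hasVoice(sample, 'microsoft zira desktop'));
```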
Contributing
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
License
MIT