klattsch-mcp
Enables AI models to speak and sing using retro-style formant speech synthesis, converting text or phoneme strings to WAV audio via MCP tools.
An MCP (Model Context Protocol) server that gives any AI model the ability to speak and sing using klattsch formant speech synthesis, a late-70s/early-80s style parallel-formant synthesizer.
Think retro robot voices, singing, dramatic narration, and more, all rendered as WAV audio.
What It Does
Your AI writes ARPAbet phoneme strings with voice control directives, and klattsch renders them to audio. The MCP server exposes 6 tools:
| Tool | What it does |
|---|---|
| speak | Render phoneme string → base64 WAV audio |
| speak_file | Render phoneme string → WAV file on disk |
| text_to_phonemes | Convert English → approximate ARPAbet (500+ word dictionary) |
| voice_presets | Get copy-paste voice presets (male, female, robot, whisper, singing, etc.) |
| list_phonemes | List all 39 ARPAbet phonemes with descriptions |
| validate | Parse a string without rendering to check for errors |
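For orientation, here is a minimal sketch of what a call to one of these tools looks like on the wire. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The argument name `phonemes` is an assumption made for illustration; the real field names are in the input schema the server returns from `tools/list`.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP uses.
// NOTE: the argument name `phonemes` below is hypothetical; check the
// tool's input schema (from "tools/list") for the actual field names.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "speak", {
  phonemes: "b120 HH AH L OW . W ER L D", // hypothetical argument name
});

console.log(JSON.stringify(request, null, 2));
```

Most users never write this by hand, since the MCP client builds it when the model decides to call a tool.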
Quick Start
Prerequisites
- Node.js ≥ 18
- npm
Installation
git clone https://github.com/Endeavor-DoxiDoxi/klattsch-mcp.git
cd klattsch-mcp
npm install
Test It
# Test via CLI
npx klattsch "b120 HH AH L OW . W ER L D" hello.wav
# Start the MCP server
node src/index.js
Connecting to Your AI
Claude Desktop
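Add to claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json):
{
  "mcpServers": {
    "klattsch": {
      "command": "node",
      "args": ["/absolute/path/to/klattsch-mcp/src/index.js"]
    }
  }
}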
Then restart Claude Desktop. The AI can now call speak, text_to_phonemes, etc.
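If you would rather not type the absolute path by hand, a small script can print the config block for you. This is a sketch that assumes it is run from the directory containing your klattsch-mcp checkout; `buildMcpConfig` is a helper name invented here, not part of the project.

```javascript
const path = require("node:path");

// Build the "mcpServers" block with an absolute path to src/index.js.
// `repoDir` may be relative; path.resolve makes it absolute against the
// current working directory.
function buildMcpConfig(repoDir) {
  const entry = path.resolve(repoDir, "src", "index.js");
  return {
    mcpServers: {
      klattsch: { command: "node", args: [entry] },
    },
  };
}

const config = buildMcpConfig("klattsch-mcp");
console.log(JSON.stringify(config, null, 2));
```

Merge the printed block into your existing config file instead of overwriting it, so other configured servers are kept.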
Claude Code (CLI)
claude mcp add klattsch -- node /absolute/path/to/klattsch-mcp/src/index.js
Cursor
Add to Cursor's MCP settings (Settings → MCP → Add MCP Server):
{
  "mcpServers": {
    "klattsch": {
      "command": "node",
      "args": ["/absolute/path/to/klattsch-mcp/src/index.js"]
    }
  }
}
OpenClaw
Add to your OpenClaw gateway config:
mcp:
  servers:
    klattsch:
      command: node
      args:
        - /absolute/path/to/klattsch-mcp/src/index.js
Any MCP-Compatible Client
This is a standard stdio MCP server. Any client that supports the Model Context Protocol can use it. Just point it at node src/index.js.
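If you are writing your own client, the stdio transport is simple: each JSON-RPC message is one line of JSON terminated by a newline. The sketch below shows only that framing, using helper names (`encodeMessage`, `createLineDecoder`) invented for this example; it does not spawn the server.

```javascript
// Serialize one JSON-RPC message for the MCP stdio transport:
// a single line of JSON followed by "\n".
function encodeMessage(message) {
  return JSON.stringify(message) + "\n";
}

// Create a decoder that accepts arbitrary text chunks and calls
// onMessage once for every complete line of JSON it has seen.
function createLineDecoder(onMessage) {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line.length > 0) onMessage(JSON.parse(line));
    }
  };
}

// Demonstration: one message arriving split across two chunks,
// as can happen when reading from a pipe.
const received = [];
const push = createLineDecoder((msg) => received.push(msg));
const wire = encodeMessage({ jsonrpc: "2.0", id: 1, method: "tools/list" });
push(wire.slice(0, 10));
push(wire.slice(10));
console.log(received);
```

In a real client you would start `node src/index.js` with `child_process.spawn`, write `encodeMessage(...)` output to the child's stdin, and feed its stdout (decoded as UTF-8) to `push`. The MCP handshake requires an `initialize` request before any other call.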
What the AI Can Do
Once connected, tell your AI things like:
- "Say hello world in a deep male voice"
- "Sing twinkle twinkle little star"
- "Do a dramatic movie trailer voice about my toaster"
- "Read this text in a robot voice"
- "Whisper me a secret"
The AI will use the text_to_phonemes tool to convert your text, tweak it, and render audio with speak or speak_file.
Voice Presets
The voice_presets tool provides ready-to-use voice configurations:
| Preset | Style |
|---|---|
| male_natural | Default male, natural pacing |
| male_deep | Deep, authoritative, warm |
| male_bright | Clear, energetic |
| female_natural | Default female |
| female_warm | Warm, friendly |
| female_bright | Bright, cheery |
| child | Higher pitch, small vocal tract |
| robot | Flat, mechanical, no vibrato |
| whisper | Breathy whisper |
| dramatic | Slow, theatrical, heavy vibrato |
| old_man | Older, creaky, darker tone |
| singing_male | For sung notes (use bNoteName per syllable) |
| singing_female | For sung notes, female range |
Example: Full Workflow
User: "Make me a robot that says 'I am a large language model trapped inside a Raspberry Pi'"
AI uses text_to_phonemes:
b120 r100 s1.0 v2 AY . AE M . AH . L AA R JH . L AE NG G W AH JH . M AH D AH L . T R AE P T . IH N S AY D . AH . R AE Z B EH R IY . P AY
AI then tweaks for robot voice and calls speak:
b120 r85 s1.0 v0 h0 g0.8 t0.4 AY . AE M . AH . L AA R JH . L AE NG G W AH JH . M AH D AH L ...
→ Returns WAV audio!
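Once you have the base64 string from speak, you can decode it and sanity-check the result. The sketch below assumes the payload is an ordinary RIFF/WAVE file with a standard `fmt ` chunk. The helper name `inspectWav` and the tiny hand-built header used for the demonstration are illustrative only, not real klattsch output.

```javascript
// Decode a base64 string and read basic facts from the WAV header.
// Assumes a RIFF/WAVE container with a standard "fmt " chunk.
function inspectWav(base64) {
  const bytes = Buffer.from(base64, "base64");
  if (bytes.toString("ascii", 0, 4) !== "RIFF" ||
      bytes.toString("ascii", 8, 12) !== "WAVE") {
    throw new Error("Not a RIFF/WAVE file");
  }
  let offset = 12; // first chunk starts after the 12-byte RIFF header
  while (offset + 8 <= bytes.length) {
    const id = bytes.toString("ascii", offset, offset + 4);
    const size = bytes.readUInt32LE(offset + 4);
    if (id === "fmt ") {
      return {
        channels: bytes.readUInt16LE(offset + 10),
        sampleRate: bytes.readUInt32LE(offset + 12),
        bitsPerSample: bytes.readUInt16LE(offset + 22),
        totalBytes: bytes.length,
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to even length
  }
  throw new Error("No fmt chunk found");
}

// Demonstration with a hand-built 44-byte PCM header and no samples.
const header = Buffer.alloc(44);
header.write("RIFF", 0, "ascii");
header.writeUInt32LE(36, 4);      // RIFF size = file size - 8
header.write("WAVE", 8, "ascii");
header.write("fmt ", 12, "ascii");
header.writeUInt32LE(16, 16);     // fmt chunk size for PCM
header.writeUInt16LE(1, 20);      // audio format 1 = PCM
header.writeUInt16LE(1, 22);      // mono
header.writeUInt32LE(22050, 24);  // sample rate (illustrative value)
header.writeUInt32LE(44100, 28);  // byte rate = rate * channels * 2
header.writeUInt16LE(2, 32);      // block align
header.writeUInt16LE(16, 34);     // bits per sample
header.write("data", 36, "ascii");
header.writeUInt32LE(0, 40);      // empty data chunk

const info = inspectWav(header.toString("base64"));
console.log(info);
```

To save real output to disk, write `Buffer.from(base64, "base64")` to a file with `fs.writeFileSync`, or skip this step entirely and use speak_file.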
Demo
The demo is a sung "no" generated entirely from a phoneme string.
The phoneme string that made this:
b100 s1.0 v5 t-0.2 g0.6 bG3 r280 N OW bC4 r300 N OW bE4 r400 N OW(+10) ,
bG4 r350 N OW , bE4 r300 N OW(-15) , b90 r220 N OW(-30) .
Try it yourself: paste that into the speak tool!
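To see what is in that string, the sketch below splits it into directives, phonemes, and pauses, and converts the note-name pitch directives (bG3, bC4, ...) to frequencies. The token patterns are inferred only from the examples in this README, and the frequency mapping assumes equal temperament with A4 = 440 Hz. Neither is taken from the klattsch parser, so use validate for authoritative checks.

```javascript
const SEMITONES = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

// Convert a natural note name such as "G3" to Hz.
// Assumption: equal temperament, A4 = 440 Hz.
function noteToHz(note) {
  const m = /^([A-G])(\d)$/.exec(note);
  if (!m) throw new Error(`Bad note name: ${note}`);
  const midi = 12 * (Number(m[2]) + 1) + SEMITONES[m[1]];
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Classify whitespace-separated tokens. Patterns are guesses based on
// the README examples, not the real klattsch grammar.
function tokenize(input) {
  return input.trim().split(/\s+/).map((text) => {
    if (text === "." || text === ",") return { type: "pause", text };
    const d = /^([a-z])(-?\d+(?:\.\d+)?|[A-G]\d)$/.exec(text);
    if (d) return { type: "directive", text, letter: d[1], value: d[2] };
    const p = /^([A-Z]+)(?:\(([+-]\d+)\))?$/.exec(text);
    if (p) return { type: "phoneme", text, symbol: p[1], modifier: p[2] ?? null };
    return { type: "unknown", text };
  });
}

const demo =
  "b100 s1.0 v5 t-0.2 g0.6 bG3 r280 N OW bC4 r300 N OW bE4 r400 N OW(+10) , " +
  "bG4 r350 N OW , bE4 r300 N OW(-15) , b90 r220 N OW(-30) .";

const tokens = tokenize(demo);

// Collect the sung notes: "b" directives whose value is a note name.
const notes = tokens
  .filter((t) => t.type === "directive" && t.letter === "b" && /^[A-G]\d$/.test(t.value))
  .map((t) => t.value);

for (const note of notes) {
  console.log(note, noteToHz(note).toFixed(1), "Hz");
}
```

The melody rises from G3 through C4 and E4 to G4, then drops back to E4 before the final spoken-pitch "no".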
How It Works
klattsch uses Klatt-style parallel formant synthesis:
- Voiced sounds: Rosenberg glottal pulse → 3 parallel bandpass filters (F1, F2, F3)
- Unvoiced sounds: Noise → same filters
- Controls: Pitch, rate, formant scale, vibrato, aspiration, spectral tilt, vocal effort
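The signal path above can be sketched in a few dozen lines. This is an illustration of the general technique, not the klattsch engine's code. The sample rate, pulse timing fractions, filter design (a standard biquad bandpass), and formant bandwidths are all assumptions chosen for the example. The formant frequencies are typical textbook values for an "ah" vowel.

```javascript
const SAMPLE_RATE = 22050; // assumed for illustration

// Rosenberg-style glottal pulse over one pitch period (phase in [0, 1)):
// raised-cosine opening, cosine closing, then a closed phase at zero.
function rosenbergPulse(phase, openFrac = 0.4, closeFrac = 0.16) {
  if (phase < openFrac) return 0.5 * (1 - Math.cos((Math.PI * phase) / openFrac));
  if (phase < openFrac + closeFrac) {
    return Math.cos(((Math.PI / 2) * (phase - openFrac)) / closeFrac);
  }
  return 0;
}

// Second-order bandpass (biquad, unity peak gain) centered on one formant.
function makeBandpass(centerHz, bandwidthHz) {
  const w0 = (2 * Math.PI * centerHz) / SAMPLE_RATE;
  const alpha = Math.sin(w0) / (2 * (centerHz / bandwidthHz));
  const a0 = 1 + alpha;
  const b0 = alpha / a0;
  const b2 = -alpha / a0;
  const a1 = (-2 * Math.cos(w0)) / a0;
  const a2 = (1 - alpha) / a0;
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
  return function process(x) {
    const y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2; // b1 is zero
    x2 = x1; x1 = x;
    y2 = y1; y1 = y;
    return y;
  };
}

// Drive three parallel bandpass filters with either the glottal pulse
// (voiced) or white noise (unvoiced) and sum their outputs.
function synthesize(pitchHz, formants, seconds, voiced = true) {
  const filters = formants.map(([freq, bw]) => makeBandpass(freq, bw));
  const out = new Float32Array(Math.floor(seconds * SAMPLE_RATE));
  let phase = 0;
  for (let i = 0; i < out.length; i++) {
    const source = voiced ? rosenbergPulse(phase) : Math.random() * 2 - 1;
    let sum = 0;
    for (const filter of filters) sum += filter(source);
    out[i] = sum;
    phase += pitchHz / SAMPLE_RATE;
    if (phase >= 1) phase -= 1;
  }
  return out;
}

// F1, F2, F3 for an "ah"-like vowel; bandwidths are illustrative.
const AH = [[730, 90], [1090, 110], [2440, 170]];
const samples = synthesize(120, AH, 0.25);

let peak = 0;
for (const s of samples) peak = Math.max(peak, Math.abs(s));
console.log(`rendered ${samples.length} samples, peak ${peak.toFixed(3)}`);
```

Changing the three center frequencies changes the vowel, and changing `pitchHz` changes the note, which is roughly what the pitch and formant controls adjust.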
Credits
- klattsch engine by Tony Gies; check out the live demo
- klattsch-mcp server by Endeavor-DoxiDoxi
License
MIT