Voice Recognition MCP Service
Provides voice recognition and text extraction in both stdio and MCP modes. It processes audio files or base64-encoded data and returns structured results with language, emotion, and speaker information.
This service provides voice recognition and text extraction capabilities through both stdio and MCP modes.
Features
- Voice recognition from file
- Voice recognition from base64-encoded data
- Text extraction
- Support for both stdio and MCP modes
- Structured voice recognition results
Project Structure
- voice_service.py - Core service implementation
- stdio_server.py - stdio mode entry point
- mcp_server.py - MCP mode entry point
- build.py - Build script for executables
- build_exec.sh - Build execution script
- test_*.sh - Test scripts for different functionalities
Installation
- Clone the repository:
git clone https://github.com/AIO-2030/mcp_voice_identify.git
cd mcp_voice_identify
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables in .env:
API_URL=your_api_url
API_KEY=your_api_key
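As a minimal sketch of how these two values might be read at startup, assuming the .env file contains plain KEY=value lines. The helper names load_env_file and get_setting are hypothetical and not part of this repository; the project itself may rely on a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=value lines from a .env-style file into a dict.

    Hypothetical helper for illustration only.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, file_values):
    """Prefer a real environment variable, then fall back to the file value."""
    return os.environ.get(name, file_values.get(name))
```

With the .env file shown above, get_setting("API_URL", load_env_file()) would return your_api_url unless API_URL is already set in the process environment.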
Usage
stdio Mode
- Run the service:
python stdio_server.py
- Send JSON-RPC requests via stdin:
{
  "jsonrpc": "2.0",
  "method": "help",
  "params": {},
  "id": 1
}
- Or use the executable:
./dist/voice_stdio
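The JSON-RPC request shown above can also be built programmatically. The following is a minimal sketch: build_request and parse_response are hypothetical helper names, and it assumes the server reads one JSON message per line on stdin and writes one per line on stdout, which should be checked against stdio_server.py.

```python
import json


def build_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request as a single line of text."""
    request = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params if params is not None else {},
        "id": request_id,
    }
    return json.dumps(request)


def parse_response(line):
    """Decode one JSON-RPC response line and return its result.

    Raises RuntimeError if the response carries an "error" member,
    as defined by the JSON-RPC 2.0 specification.
    """
    response = json.loads(line)
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    return response["result"]
```

For example, build_request("help") produces the same request as above on a single line, which can then be piped into python stdio_server.py or ./dist/voice_stdio.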
MCP Mode
- Run the service:
python mcp_server.py
- Or use the executable:
./dist/voice_mcp
Voice Recognition Results
The service converts the label string returned by the upstream API into structured fields. The two examples below show the same response before and after restructuring:
Original API Response
{
  "jsonrpc": "2.0",
  "result": {
    "message": "input processed successfully",
    "results": "test test test",
    "label_result": "<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"
  },
  "id": 1
}
Restructured Response
{
  "jsonrpc": "2.0",
  "result": {
    "message": "input processed successfully",
    "results": "test test test",
    "label_result": {
      "lan": "en",
      "emo": "unknown",
      "type": "speech",
      "speaker": "woitn",
      "text": "test test test"
    }
  },
  "id": 1
}
Label Result Fields
The label_result field contains the following structured information:
| Field | Description | Example Value |
|---|---|---|
| lan | Language code | "en" |
| emo | Emotion state | "unknown" |
| type | Audio type | "speech" |
| speaker | Speaker identifier | "woitn" |
| text | Recognized text content | "test test test" |
Special Labels
The service recognizes and processes the following special labels in the original response:
- <|en|> - Language code
- <|EMO_UNKNOWN|> - Emotion state
- <|Speech|> - Audio type
- <|woitn|> - Speaker identifier
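The mapping from the original label string to the restructured fields can be sketched as follows. This is an illustration, not the code in voice_service.py: the function name parse_label_result is hypothetical, and it assumes the labels always appear in the order language, emotion, audio type, speaker, as in the example response.

```python
import re

# Matches one special label such as <|en|> and captures its content.
LABEL_PATTERN = re.compile(r"<\|([^|]*)\|>")

# Assumed positional order of the labels in the original response.
FIELD_NAMES = ("lan", "emo", "type", "speaker")


def parse_label_result(raw):
    """Split a labelled string into the structured label_result fields."""
    labels = LABEL_PATTERN.findall(raw)
    # Whatever remains after removing the labels is the recognized text.
    text = LABEL_PATTERN.sub("", raw).strip()

    parsed = dict(zip(FIELD_NAMES, labels))
    if "emo" in parsed:
        emo = parsed["emo"]
        # "EMO_UNKNOWN" becomes "unknown".
        if emo.startswith("EMO_"):
            emo = emo[len("EMO_"):]
        parsed["emo"] = emo.lower()
    if "type" in parsed:
        # "Speech" becomes "speech".
        parsed["type"] = parsed["type"].lower()
    parsed["text"] = text
    return parsed
```

Applied to the label_result string from the original API response above, this returns the same five fields listed in the Label Result Fields table. Matching labels by position keeps the sketch short; a more defensive version would classify each label by its content.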
Building Executables
- Make the build script executable:
chmod +x build_exec.sh
- Build stdio mode executable:
./build_exec.sh
- Build MCP mode executable:
./build_exec.sh mcp
The executables will be created at:
- stdio mode: dist/voice_stdio
- MCP mode: dist/voice_mcp
Testing
Run the test scripts:
chmod +x test_*.sh
./test_help.sh
./test_voice_file.sh
./test_voice_base64.sh
License
This project is licensed under the MIT License - see the LICENSE file for details.