Voice Recognition MCP Service

This service provides voice recognition and text extraction through both stdio and MCP modes. It processes audio files or base64-encoded data and returns structured results with language, emotion, and speaker information.

Features

Voice recognition from file
Voice recognition from base64-encoded data
Text extraction
Support for both stdio and MCP modes
Structured voice recognition results
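
For the base64 input path, audio bytes have to be turned into text before they can travel inside a JSON request. The following is a minimal sketch using only the Python standard library. The function names are illustrative and not taken from the repository, and the request parameter that carries the encoded data is not documented here, so it is left out.

```python
import base64


def encode_audio_bytes(audio_bytes):
    """Encode raw audio bytes as an ASCII base64 string."""
    return base64.b64encode(audio_bytes).decode("ascii")


def encode_audio_file(path):
    """Read an audio file from disk and return its base64 encoding."""
    with open(path, "rb") as handle:
        return encode_audio_bytes(handle.read())
```

The returned string can be placed in a JSON payload as-is, since base64 output contains only ASCII characters.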

Project Structure

voice_service.py - Core service implementation
stdio_server.py - stdio mode entry point
mcp_server.py - MCP mode entry point
build.py - Build script for executables
build_exec.sh - Build execution script
test_*.sh - Test scripts for different functionalities

Installation

Clone the repository:

git clone https://github.com/AIO-2030/mcp_voice_identify.git
cd mcp_voice_identify

Install dependencies:

pip install -r requirements.txt

Set up environment variables in .env:

API_URL=your_api_url
API_KEY=your_api_key
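
As a rough illustration of how these two settings might be consumed, the sketch below reads them from the process environment and fails early if either is missing. It assumes the values in .env have already been exported into the environment (for example by the shell or a dotenv loader); the helper name load_api_config is hypothetical and the repository may handle configuration differently.

```python
import os


def load_api_config(env=None):
    """Return (api_url, api_key), raising if either variable is unset or empty."""
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    missing = [name for name in ("API_URL", "API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["API_URL"], env["API_KEY"]
```

Failing at startup keeps a missing key from surfacing later as an opaque authentication error.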

Usage

stdio Mode

Run the service:

python stdio_server.py

Send JSON-RPC requests via stdin:

{
    "jsonrpc": "2.0",
    "method": "help",
    "params": {},
    "id": 1
}
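
The same request can be sent from a script. The sketch below builds the JSON-RPC payload shown above and pipes it to the server's stdin. It assumes the server reads one JSON object per line and writes one JSON response per line, which this README does not state; both function names are illustrative.

```python
import json
import subprocess
import sys


def build_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params or {}, "id": request_id}
    )


def call_stdio_server(request_line, command=(sys.executable, "stdio_server.py")):
    """Start the server, send one request on stdin, and return the parsed reply."""
    # Assumption: newline-delimited JSON in both directions.
    completed = subprocess.run(
        list(command),
        input=request_line + "\n",
        capture_output=True,
        text=True,
        timeout=30,
    )
    lines = completed.stdout.strip().splitlines()
    if not lines:
        raise RuntimeError("No response received: " + completed.stderr.strip())
    return json.loads(lines[0])
```

From the repository root, call_stdio_server(build_request("help")) would issue the help request shown above.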

Or use the executable:

./dist/voice_stdio

MCP Mode

Run the service:

python mcp_server.py

Or use the executable:

./dist/voice_mcp

Voice Recognition Results

The service converts the raw label string in the API response into structured fields. The two examples below show the same response before and after this restructuring:

Original API Response

{
    "jsonrpc": "2.0",
    "result": {
        "message": "input processed successfully",
        "results": "test test test",
        "label_result": "<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"
    },
    "id": 1
}

Restructured Response

{
    "jsonrpc": "2.0",
    "result": {
        "message": "input processed successfully",
        "results": "test test test",
        "label_result": {
            "lan": "en",
            "emo": "unknown",
            "type": "speech",
            "speaker": "woitn",
            "text": "test test test"
        }
    },
    "id": 1
}

Label Result Fields

The label_result field contains the following structured information:

Field	Description	Example Value
lan	Language code	"en"
emo	Emotion state	"unknown"
type	Audio type	"speech"
speaker	Speaker identifier	"woitn"
text	Recognized text content	"test test test"

Special Labels

The service recognizes and processes the following special labels in the original response:

<|en|> - Language code
<|EMO_UNKNOWN|> - Emotion state
<|Speech|> - Audio type
<|woitn|> - Speaker identifier
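
One way to turn the raw label string into the structured fields is sketched below. This is not the repository's implementation. It assumes the four labels always appear in the order language, emotion, audio type, speaker, and that emotion labels carry an EMO_ prefix, as in the example above; the function name parse_label_result is hypothetical.

```python
import re

# Matches one <|...|> label and captures the text between the delimiters.
LABEL_PATTERN = re.compile(r"<\|([^|]*)\|>")


def parse_label_result(raw):
    """Split a labelled transcript into lan, emo, type, speaker, and text."""
    tags = LABEL_PATTERN.findall(raw)
    text = LABEL_PATTERN.sub("", raw).strip()
    # Assumption: fixed label order; pad with empty strings if labels are missing.
    lan, emo, audio_type, speaker = (tags + [""] * 4)[:4]
    if emo.upper().startswith("EMO_"):
        emo = emo[len("EMO_"):]
    return {
        "lan": lan.lower(),
        "emo": emo.lower(),
        "type": audio_type.lower(),
        "speaker": speaker,
        "text": text,
    }
```

Applied to the label_result string from the original response, this yields the same five fields as the restructured example.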

Building Executables

Make the build script executable:

chmod +x build_exec.sh

Build stdio mode executable:

./build_exec.sh

Build MCP mode executable:

./build_exec.sh mcp

The executables will be created at:

stdio mode: dist/voice_stdio
MCP mode: dist/voice_mcp

Testing

Run the test scripts:

chmod +x test_*.sh
./test_help.sh
./test_voice_file.sh
./test_voice_base64.sh

License

This project is licensed under the MIT License - see the LICENSE file for details.
