Gemini Transcription MCP

An MCP server for audio-to-text transcription using Google's Gemini API via OpenRouter, offering multiple tools for raw, cleaned, or formatted transcripts with support for local and remote deployment.

An MCP server for audio-to-text transcription using Google's Gemini multimodal API.

Quick Start

Claude Code (Recommended)

claude mcp add gemini-transcription -s user \
  -e OPENROUTER_API_KEY=your-key \
  -- npx -y gemini-transcription-mcp

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "gemini-transcription": {
      "command": "npx",
      "args": ["-y", "gemini-transcription-mcp"],
      "env": {
        "OPENROUTER_API_KEY": "your-key"
      }
    }
  }
}

MetaMCP

Add the server via the MetaMCP UI, or import this JSON:

{
  "mcpServers": {
    "gemini-transcription": {
      "command": "npx",
      "args": ["-y", "gemini-transcription-mcp"],
      "env": {
        "OPENROUTER_API_KEY": "your-key"
      },
      "description": "Audio transcription using Gemini models via OpenRouter"
    }
  }
}

Or fill in the Add Server form manually:

Field	Value
Command	`npx`
Arguments	`-y gemini-transcription-mcp`
Environment Variables	`OPENROUTER_API_KEY=your-key`

Remote Deployment (HTTP Transport)

For deployments that require HTTP transport:

# Using Docker (recommended for remote)
docker run -d \
  -p 3000:3000 \
  -e OPENROUTER_API_KEY=your-key \
  ghcr.io/danielrosehill/gemini-transcription-mcp

# Or run directly with HTTP transport
OPENROUTER_API_KEY=your-key npx gemini-transcription-mcp --http 3000

The server exposes:

http://host:3000/mcp - MCP endpoint (streamable HTTP)
http://host:3000/health - Health check
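
A remote deployment can be probed before it is wired into an aggregator. The sketch below builds the two URLs listed above and issues a GET to the health endpoint. It assumes only that a healthy server answers with a 2xx status, since the response body is not documented here; the function names and the `localhost` default are illustrative.

```python
import urllib.error
import urllib.request


def endpoint_urls(host: str = "localhost", port: int = 3000) -> dict:
    """Return the MCP and health-check URLs for a given host and port."""
    base = f"http://{host}:{port}"
    return {"mcp": f"{base}/mcp", "health": f"{base}/health"}


def is_healthy(host: str = "localhost", port: int = 3000, timeout: float = 5.0) -> bool:
    """Return True if GET /health answers with a 2xx status (assumed convention)."""
    url = endpoint_urls(host, port)["health"]
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print(endpoint_urls())
```

Calling `is_healthy("host", 3000)` after `docker run` gives a quick yes/no answer without needing an MCP client.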

Tools

Tool	Description
`transcribe_audio`	Lightly edited transcript (removes filler words, applies corrections)
`transcribe_audio_raw`	Verbatim transcript with no cleanup
`transcribe_audio_vad`	Applies voice activity detection (VAD) to strip silence before transcription
`transcribe_audio_format`	Transcribe and format as a document type (email, to-do list, etc.)
`transcribe_audio_large`	Compresses oversized files to Opus before transcribing
`transcribe_audio_custom`	Full control with your own prompt
`transcribe_audio_devspec`	Format as a development specification for AI coding agents
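
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what that payload looks like. The helper name, the request `id`, and the example URL are illustrative, and the `file_url` argument name is taken from the Input Methods list in this README. A real session also needs the MCP initialization handshake first, which an MCP client library normally handles.

```python
import json

# Tool names as listed in the table above.
TOOL_NAMES = {
    "transcribe_audio",
    "transcribe_audio_raw",
    "transcribe_audio_vad",
    "transcribe_audio_format",
    "transcribe_audio_large",
    "transcribe_audio_custom",
    "transcribe_audio_devspec",
}


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for one of the tools."""
    if tool not in TOOL_NAMES:
        raise ValueError(f"unknown tool: {tool}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # Hypothetical audio URL, for illustration only.
    print(build_tool_call("transcribe_audio_raw", {"file_url": "https://example.com/memo.mp3"}))
```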

Input Methods

All tools accept audio via:

file_content: Base64-encoded audio
file_url: HTTP(S) URL to fetch
ssh_host + ssh_path: Pull via SCP (local deployment only)
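
Each input method corresponds to a different set of tool arguments. The sketch below prepares them on the client side, assuming the parameter names listed above are passed as plain string arguments. The helper names are hypothetical and not part of the server.

```python
import base64
from pathlib import Path


def from_file(path: str) -> dict:
    """Read a local audio file and return it as a Base64 `file_content` argument."""
    raw = Path(path).read_bytes()
    return {"file_content": base64.b64encode(raw).decode("ascii")}


def from_url(url: str) -> dict:
    """Return a `file_url` argument after checking the scheme is HTTP(S)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("file_url must be an HTTP(S) URL")
    return {"file_url": url}


def from_ssh(host: str, remote_path: str) -> dict:
    """Return the `ssh_host` + `ssh_path` pair (local deployment only)."""
    return {"ssh_host": host, "ssh_path": remote_path}
```

Base64 inflates the payload by roughly a third, so for large recordings a URL or SSH path may be the more practical choice.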

Supported Formats

Native: MP3, WAV, OGG, FLAC, AAC, AIFF
Auto-converted: Opus, M4A, WebM, WMA, and others (converted to OGG/Opus)

Note: When converting audio manually, prefer MP3 over WAV. MP3 compresses well and is broadly compatible, while uncompressed WAV files are much larger.
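
To anticipate whether a file will go through conversion, a client can compare its extension against the native list above. This is only an approximation: the extension-to-format mapping is an assumption, and the server may decide from the file contents rather than the name.

```python
from pathlib import Path

# Extensions assumed to correspond to the "Native" formats listed above.
NATIVE_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".aac", ".aiff"}


def is_native_format(filename: str) -> bool:
    """Return True if the file extension matches a natively supported format."""
    return Path(filename).suffix.lower() in NATIVE_EXTENSIONS


if __name__ == "__main__":
    for name in ("talk.mp3", "memo.m4a", "call.webm"):
        label = "native" if is_native_format(name) else "likely auto-converted"
        print(f"{name}: {label}")
```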

Configuration

Environment Variable	Description
`OPENROUTER_API_KEY`	Required. Your OpenRouter API key
`OPENROUTER_MODEL`	Optional. Model to use (default: Gemini Flash Lite)
`TRANSCRIPT_OUTPUT_DIR`	Optional. Auto-save location (default: `./transcripts`). Set to empty string to disable.
`MCP_TRANSPORT`	Optional. Set to `http` for HTTP transport mode
`MCP_PORT`	Optional. Port for HTTP mode (default: `3000`)
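
The defaults in the table can be read as the following resolution logic. This is an illustrative sketch, not the server's actual code: the function name and the returned keys are hypothetical, and treating stdio as the transport when `MCP_TRANSPORT` is unset is an inference from the Deployment Options section.

```python
import os


def resolve_config(env=None) -> dict:
    """Resolve settings from environment variables using the documented defaults."""
    env = os.environ if env is None else env

    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise ValueError("OPENROUTER_API_KEY is required")

    output_dir = env.get("TRANSCRIPT_OUTPUT_DIR", "./transcripts")
    return {
        "api_key": api_key,
        # None means "use the server default" (Gemini Flash Lite per the table).
        "model": env.get("OPENROUTER_MODEL"),
        # An empty string disables auto-save, represented here as None.
        "output_dir": output_dir or None,
        # Assumption: anything other than "http" falls back to stdio.
        "transport": "http" if env.get("MCP_TRANSPORT") == "http" else "stdio",
        "port": int(env.get("MCP_PORT", "3000")),
    }
```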

Deployment Options

Local (Claude Code, Claude Desktop)

Uses stdio transport. All features are available, including SSH file retrieval.

# Via npx (recommended)
npx gemini-transcription-mcp

# Or install globally
npm install -g gemini-transcription-mcp
gemini-transcription-mcp

Remote/Docker (MetaMCP, Aggregators)

Uses HTTP transport. Requires a container or server with ffmpeg installed.

Docker Compose:

# docker-compose.yml
services:
  gemini-transcription:
    image: ghcr.io/danielrosehill/gemini-transcription-mcp
    ports:
      - "3000:3000"
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}

# Create .env file with your API key
echo "OPENROUTER_API_KEY=your-key" > .env

# Start the service
docker compose up -d

Feature Availability by Deployment Type

Feature	Local (stdio)	Remote (HTTP)
Base64 audio input	Yes	Yes
URL audio input	Yes	Yes
SSH file retrieval	Yes	No*
Transcript auto-save	Yes	Container volume
VAD preprocessing	Yes	Yes
Format conversion	Yes	Yes

* SSH retrieval requires local access to SSH keys and the network.
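
A client that targets both deployment types can guard against the one unsupported combination in the table before sending a request. A minimal sketch, assuming the argument names from the Input Methods section; the function name and the `"stdio"` / `"http"` labels are hypothetical.

```python
def check_input_supported(arguments: dict, transport: str) -> None:
    """Raise ValueError if SSH retrieval is requested over HTTP transport."""
    uses_ssh = "ssh_host" in arguments or "ssh_path" in arguments
    if transport == "http" and uses_ssh:
        raise ValueError(
            "SSH file retrieval is only available for local (stdio) deployments; "
            "use file_content or file_url instead"
        )


if __name__ == "__main__":
    # Passes silently: URL input works on both transports.
    check_input_supported({"file_url": "https://example.com/memo.mp3"}, "http")
```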

Requirements

Node.js 18+
ffmpeg (for format conversion and VAD preprocessing)
OpenRouter API key

When using Docker, ffmpeg is included in the image.

Building from Source

git clone https://github.com/danielrosehill/Gemini-Transcription-MCP.git
cd Gemini-Transcription-MCP
npm install
npm run build

# Run locally
OPENROUTER_API_KEY=your-key npm start

# Run with HTTP transport
OPENROUTER_API_KEY=your-key MCP_TRANSPORT=http npm start

# Build Docker image
docker build -t gemini-transcription-mcp .

License

MIT
