MCP Audio RAG Server

Enables transcription of audio files using Google Gemini AI and creates a searchable knowledge base, allowing users to ask natural language questions about content in meetings, podcasts, lectures, and other audio recordings.

Transform your audio files into a searchable knowledge base using AI. Ask Claude questions about your meetings, podcasts, lectures, or any audio content.

What is this?

This is an MCP (Model Context Protocol) server that lets you:

Transcribe any audio file using Google's Gemini AI
Store the transcriptions in a searchable database
Search through all your audio content using natural language

Once set up, you can simply ask Claude things like:

"What did they discuss about the budget in my meeting recording?"
"Find mentions of machine learning in my podcast collection"
"What were the key points from yesterday's lecture?"

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Audio File  │ ──▶ │   Gemini    │ ──▶ │  Chunking   │ ──▶ │  Supabase   │
│ (.mp3, etc) │     │ Transcribe  │     │ + Embedding │     │  (pgvector) │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
┌─────────────┐     ┌─────────────┐     ┌─────────────┐            │
│   Claude    │ ◀── │   Results   │ ◀── │   Search    │ ◀──────────┘
│  Response   │     │ + Snippets  │     │   Query     │
└─────────────┘     └─────────────┘     └─────────────┘
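The chunking stage in the diagram can be pictured with a short sketch. This is an illustration only: the helper name `chunkTranscript`, the word-based windows, and the default sizes are assumptions, not the project's actual settings.

```typescript
// Hypothetical helper: split a transcript into overlapping word windows.
// chunkSize and overlap are illustrative values, not the project's settings.
function chunkTranscript(text: string, chunkSize = 200, overlap = 40): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once this window has reached the end of the transcript.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval. Each chunk would then be embedded and stored alongside its vector.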

Quick Start

Prerequisites

Node.js 18 or later
Gemini API key - free to create at aistudio.google.com/apikey
Supabase account - free to sign up at supabase.com

Step 1: Clone & Install

git clone https://github.com/matheusslg/mcp-audio-rag.git
cd mcp-audio-rag
npm install

Step 2: Set Up Supabase Database

Create a new project at supabase.com
Go to SQL Editor in your dashboard
Paste and run the contents of supabase/schema.sql

Step 3: Get Your API Keys

Supabase (Settings → API):

Copy Project URL → SUPABASE_URL
Copy service_role key → SUPABASE_SERVICE_KEY

Google AI Studio:

Create key at aistudio.google.com/apikey → GEMINI_API_KEY

Step 4: Configure

cp .env.example .env

Edit .env:

GEMINI_API_KEY=your-key-here
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
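All three variables are required, so a start-up check can catch a missing one early. A minimal sketch, assuming a hypothetical helper named `findMissingVars`; the server's own validation may look different.

```typescript
// The three variable names come from the .env example above.
const REQUIRED_VARS = ["GEMINI_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY"] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: return the names of required variables that are
// unset or blank in the given environment object.
function findMissingVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `findMissingVars(process.env)`, with the result used to print a clear error before the server starts.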

Step 5: Add to Claude

For Claude Code CLI (~/.claude.json):

{
  "mcpServers": {
    "audio-rag": {
      "command": "npx",
      "args": ["tsx", "/full/path/to/mcp-audio-rag/src/server.ts"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_SERVICE_KEY": "your-service-role-key"
      }
    }
  }
}

For Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on Mac):

Use the same configuration as above.

Usage

Transcribe Audio

Just tell Claude to transcribe a file:

Transcribe /path/to/meeting.mp3

Want to use a specific model? Just ask:

Transcribe /path/to/lecture.m4a using gemini-2.5-pro

Search Your Audio

Ask natural questions:

What did they say about the project timeline?
Search for mentions of "budget" in my recordings
Find discussions about AI in my podcasts
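Behind a natural-language search, the question is turned into an embedding and compared against the stored chunk embeddings. The sketch below shows that idea in memory with cosine similarity. It is an assumption-laden illustration: in this project the comparison presumably happens inside Supabase via pgvector, and the `Segment` shape and function names here are hypothetical.

```typescript
// Hypothetical shape of a stored transcript chunk.
interface Segment {
  text: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k segments most similar to the query embedding, best first.
function topMatches(query: number[], segments: Segment[], k = 3): Segment[] {
  return [...segments]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

A vector database performs the same ranking with an index instead of a full sort, which is what makes it practical for large libraries.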

Manage Your Library

List all my transcribed audio files
Delete the recording from last week
Get the full transcript of meeting.mp3
Summarize the podcast episode

Available Models

| Model | Best For |
| --- | --- |
| `gemini-2.5-flash` | Default - Fast & accurate, great balance |
| `gemini-2.5-flash-lite` | Fastest, cheapest - good for bulk processing |
| `gemini-2.5-pro` | Best quality - complex audio, multiple speakers |
| `gemini-3-pro-preview` | Newest - cutting edge capabilities |
| `gemini-2.0-flash` | Reliable - previous generation |
| `gemini-2.0-flash-lite` | Fast - previous generation |
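Model selection with a default could be handled along these lines. This is a sketch under assumptions: the helper `resolveModel` is hypothetical, and whether the real server rejects unknown model names or passes them through to Gemini is not stated in this README.

```typescript
// Model names taken from the table above.
const MODELS = [
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.5-pro",
  "gemini-3-pro-preview",
  "gemini-2.0-flash",
  "gemini-2.0-flash-lite",
] as const;

type GeminiModel = (typeof MODELS)[number];

// The table lists gemini-2.5-flash as the default.
const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Hypothetical helper: fall back to the default when no model is requested,
// and reject names that are not in the list (an assumed policy).
function resolveModel(requested?: string): GeminiModel {
  if (requested === undefined || requested.trim() === "") return DEFAULT_MODEL;
  const name = requested.trim().toLowerCase();
  const match = MODELS.find((m) => m === name);
  if (!match) {
    throw new Error(`Unknown model "${requested}". Expected one of: ${MODELS.join(", ")}`);
  }
  return match;
}
```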

Supported Audio Formats

.mp3 .mp4 .m4a .wav .webm .mpeg .mpga
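A file can be checked against this list before it is sent for transcription. A minimal sketch, assuming a hypothetical helper `isSupportedAudio` that looks only at the file extension; the server may validate files differently.

```typescript
// Extensions taken from the list above.
const SUPPORTED_EXTENSIONS = new Set([".mp3", ".mp4", ".m4a", ".wav", ".webm", ".mpeg", ".mpga"]);

// Hypothetical helper: true when the path ends in a supported extension.
// The comparison is case-insensitive and handles both / and \ separators.
function isSupportedAudio(filePath: string): boolean {
  const name = filePath.split(/[\\/]/).pop() ?? "";
  const dot = name.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".mp3"
  return SUPPORTED_EXTENSIONS.has(name.slice(dot).toLowerCase());
}
```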

Available Tools

| Tool | Description |
| --- | --- |
| `ingest_audio` | Transcribe and store an audio file |
| `search_transcripts` | Search through your audio using natural language |
| `list_transcripts` | List all transcribed audio files |
| `get_full_transcript` | Get the complete transcript of a file |
| `summarize_audio` | Generate an AI summary of a transcript |
| `delete_transcript` | Remove a transcribed file from the database |
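Claude normally calls these tools for you, but it can help to see what a call looks like on the wire. MCP clients invoke tools with a JSON-RPC `tools/call` request. The sketch below builds one. The tool names come from the table above; the argument name `file_path` in the usage line is a guess for illustration, since this README does not list each tool's parameters.

```typescript
// Tool names taken from the table above.
const TOOL_NAMES = [
  "ingest_audio",
  "search_transcripts",
  "list_transcripts",
  "get_full_transcript",
  "summarize_audio",
  "delete_transcript",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Hypothetical helper: build the request object for a given tool.
function buildToolCall(id: number, name: ToolName, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "file_path" is an assumed argument name, not confirmed by this README.
const exampleRequest = buildToolCall(1, "ingest_audio", { file_path: "/path/to/meeting.mp3" });
```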

Troubleshooting

| Problem | Solution |
| --- | --- |
| "No relevant segments found" | Try rephrasing your search, or check that the audio was ingested |
| "Missing environment variable" | Check that your `.env` file or Claude config contains all three keys |
| Supabase errors | Make sure you're using the `service_role` key, not the `anon` key |
| Slow transcription | Use `gemini-2.5-flash-lite` for faster processing |

Support This Project

If this project saved you time or helped you out, consider buying me a coffee!

License

MIT - Use it however you want!

Made with Gemini + Supabase + Claude
