AI Sound MCP Server

Enables AI assistants to programmatically edit, analyze, and export audio projects through MCP tools, including multi-track editing, effects, transcription, and semantic search.

README

An AI-native audio editor that works with any OpenAI-compatible LLM.

What is AI Sound?

AI Sound is an AI-native audio editor designed as a modern replacement for desktop tools like Audacity. Instead of bolting AI onto an existing app, AI Sound is built from the ground up with LLM integration at its core — enabling conversational audio editing, automatic transcription, speaker diarization, and semantic search across your audio content.

It works with any OpenAI-compatible API — run it fully local with Ollama, or connect to OpenAI, Groq, or any other compatible provider. No vendor lock-in, no API keys required for local use.

AI Sound also exposes a full MCP (Model Context Protocol) server, letting AI assistants like Claude Desktop directly edit, analyze, and export your audio projects.

Key Features

  • Multi-track editing — import, arrange, and mix multiple audio tracks
  • AI-powered transcription — speech-to-text with word-level timestamps
  • Speaker diarization — automatically identify and split by speaker
  • Semantic search — find content by meaning, not just keywords
  • Audio effects — normalize, compress, EQ, reverb, noise reduction, fade, pitch shift, speed
  • Non-destructive editing — full undo/redo history
  • Export — export individual tracks or full project mixes
  • MCP integration — expose all editing tools to AI assistants

Quickstart

Prerequisites

  • Bun — JavaScript/TypeScript runtime
  • FFmpeg — audio processing (brew install ffmpeg on macOS)
  • An OpenAI-compatible LLM — Ollama for local, or any cloud provider

Install & Run

git clone https://github.com/your-username/ai-sound.git
cd ai-sound
bun install
bun run dev

Open http://localhost:5175 in your browser.

On first launch, configure your LLM provider in Settings (gear icon). The default is Ollama at localhost:11434.

LLM Configuration

AI Sound works with any OpenAI-compatible API. Configure your provider in-app via Settings — no .env files needed.

| Provider | Base URL | Model Example |
| --- | --- | --- |
| Ollama (local) | http://localhost:11434/v1 | llama3.2 |
| OpenAI | https://api.openai.com/v1 | gpt-4o |
| Anthropic (via proxy) | provider-specific | claude-sonnet-4-20250514 |
| Groq | https://api.groq.com/openai/v1 | llama-3.3-70b |

For cloud providers, enter your API key in the Settings panel. For Ollama, no API key is needed.
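
As a rough illustration of what "OpenAI-compatible" means in practice, the sketch below builds a chat completion request from the three settings listed above (base URL, model, optional API key). The `LlmSettings` type and `buildChatRequest` helper are hypothetical names used for illustration and are not taken from the AI Sound codebase; the `/chat/completions` path and the `Bearer` header follow the OpenAI API convention.

```typescript
// Hypothetical settings shape; AI Sound's real settings model may differ.
interface LlmSettings {
  baseUrl: string; // e.g. "http://localhost:11434/v1"
  model: string;   // e.g. "llama3.2"
  apiKey?: string; // omitted for local Ollama
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) an OpenAI-style chat completion request.
function buildChatRequest(settings: LlmSettings, prompt: string): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (settings.apiKey) {
    // Only cloud providers need a key; local Ollama runs without one.
    headers["Authorization"] = `Bearer ${settings.apiKey}`;
  }
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${settings.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: JSON.stringify({
      model: settings.model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

const local = buildChatRequest(
  { baseUrl: "http://localhost:11434/v1", model: "llama3.2" },
  "Summarize the transcript of track 1.",
);
console.log(local.url);
```

To actually send the request, pass the result to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`. Switching providers then only means changing the settings object, not the calling code.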

MCP Server

AI Sound includes a built-in Model Context Protocol server, allowing AI assistants like Claude Desktop to interact with your audio projects programmatically.

Claude Desktop Configuration

Add the following to your Claude Desktop config (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "ai-sound": {
      "command": "bun",
      "args": ["run", "/absolute/path/to/ai-sound/server/lib/mcp/server.ts"]
    }
  }
}

Replace /absolute/path/to/ai-sound with the actual path to your installation.
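
If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge in memory; `addAiSoundServer` and the `ClaudeConfig` type are hypothetical helpers written for this example, and the install path is a placeholder.

```typescript
// Shape of the relevant part of claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Return a copy of `config` with an "ai-sound" entry added,
// leaving any other configured servers untouched.
function addAiSoundServer(config: ClaudeConfig, installDir: string): ClaudeConfig {
  const root = installDir.replace(/\/+$/, ""); // drop trailing slashes
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "ai-sound": {
        command: "bun",
        args: ["run", `${root}/server/lib/mcp/server.ts`],
      },
    },
  };
}

const updated = addAiSoundServer(
  { mcpServers: { other: { command: "node", args: ["other.js"] } } },
  "/Users/me/code/ai-sound/", // example path; replace with your own
);
console.log(JSON.stringify(updated, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can be kept as a backup before the file is rewritten.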

Available Tools

Project Management

  • list_projects — List all projects
  • set_active_project — Set the active project for subsequent operations
  • get_project_status — Get all tracks, durations, regions, and transcriptions
  • get_track_info — Get detailed info about a specific track

Audio Effects & Processing

  • normalize_audio — Normalize audio levels
  • adjust_volume — Adjust volume by relative dB amount
  • trim_audio — Trim to a specific time range
  • remove_silence — Detect and remove silent sections
  • apply_fade — Apply fade in/out
  • apply_effect — Apply effects: noise reduction, compressor, EQ, reverb, speed, pitch shift

Segment Operations

  • remove_segments — Remove multiple time ranges from a track
  • replace_audio_segment — Replace a time range with silence or a beep

Track Operations

  • rename_track — Rename a track
  • delete_track — Delete a track and its audio
  • merge_tracks — Merge multiple tracks into one
  • duplicate_track — Duplicate a track
  • export_audio — Export a track or full project mix

Transcription & Search

  • transcribe_track — Transcribe audio using speech-to-text
  • split_by_speaker — Split a track by speaker
  • rename_speaker — Rename a speaker label
  • search_transcription — Search transcription text by pattern
  • search_transcript_semantic — Semantic search across transcriptions
  • copy_transcriptions — Copy transcription data between tracks
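
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` messages. The sketch below builds the message a client might send for `trim_audio`. The argument names (`track_id`, `start_time`, `end_time`) are assumptions made for illustration; the real input schema for each tool is whatever the server reports in its `tools/list` response.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in a tools/call request.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical arguments: check the tool's published schema before relying on them.
const request = toolCall(1, "trim_audio", {
  track_id: "track-1",
  start_time: 5,
  end_time: 30,
});
console.log(JSON.stringify(request));
```

In normal use an MCP client such as Claude Desktop constructs these messages itself; writing one by hand is mainly useful for debugging the server over stdio.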

Tech Stack

| Layer | Technology |
| --- | --- |
| Runtime | Bun |
| Server | Hono |
| Database | SQLite via Drizzle ORM |
| Frontend | React 19 + Tailwind CSS 4 |
| Audio | wavesurfer.js + FFmpeg |
| AI Integration | OpenAI-compatible API + MCP SDK |

License

MIT
