MCP Servers

phonecall-mcp

Enables Claude to make and manage real phone calls through Twilio, allowing it to think, search, and respond in real-time conversations.

README

phonecall-mcp

An MCP server that enables Claude to make and manage real phone calls through Twilio. Claude becomes the brain behind the conversation — thinking, searching, and responding — while the phone is just an I/O channel.

What makes this different? Most phone AI solutions put a pre-prompted, standalone agent on the line. Here, your full Claude instance — with all its tools (web search, calendar, Drive, email, etc.) — handles the call. It's not a script-following bot, but a reflective assistant that adapts to the conversation in real time.

How It Works

Claude Desktop (MCP client)
    ↕ MCP protocol (stdio)
phonecall-mcp server (Python, local)
    ↕ WebSocket (Twilio Media Streams)
Twilio (PSTN telephone network)
    ↕
Callee's phone

Parallel services:

phonecall-mcp server
    → ElevenLabs Scribe v2 Realtime (STT: speech → text)
    → ElevenLabs TTS (text → speech)

The entire audio pipeline runs natively in μ-law 8kHz — no format conversion needed.

Features

Outbound calls — Claude initiates calls on your behalf, with natural TTS voice
Real-time transcription — ElevenLabs Scribe v2 transcribes the callee's speech live
GDPR consent — Built-in consent mechanism: no transcription occurs until the callee explicitly agrees (DTMF "5"). Enforced at the server level.
DTMF turn-taking — Callee presses "1" when done speaking, press "1" to interrupt (barge-in)
Tool use during calls — Claude can search the web, check calendars, look up documents — all while the callee hears a hold tone
Inbound voicemail — Incoming calls are handled as a voicemail recorder
Multi-language — Works in any language supported by ElevenLabs (tested: Hungarian, English, German)
Full transcript — Every call produces a timestamped transcript with speaker labels

Prerequisites

Python 3.12+
Twilio account with a phone number (console.twilio.com)
ElevenLabs account with API access (elevenlabs.io)
ngrok account with a static domain (ngrok.com) — free tier works
Claude Desktop app

Setup

1. Clone the repository

git clone https://github.com/leszini/phonecall-mcp.git
cd phonecall-mcp

2. Install dependencies

pip install -r requirements.txt

On Windows, the tzdata package (included in requirements.txt) is required for timezone support.

3. Download ngrok

Download the ngrok binary for your platform from ngrok.com/download and place ngrok.exe (or ngrok) in the project directory. Authenticate it with your ngrok auth token:

ngrok config add-authtoken YOUR_AUTH_TOKEN

If you have a static ngrok domain (recommended), note it for the next step.

4. Configure environment variables

Copy the example files:

cp .env.example .env
cp config.example.json config.json

Edit .env with your credentials:

TWILIO_ACCOUNT_SID=your_account_sid
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+1234567890

ELEVENLABS_API_KEY=your_elevenlabs_api_key

NGROK_URL=https://your-subdomain.ngrok-free.app

5. Configure config.json

Edit config.json to set your preferences:

{
  "timezone": "UTC",
  "tts": {
    "voice_id": "YOUR_VOICE_ID_HERE",
    "model_id": "eleven_v3",
    "language_code": "en"
  },
  "stt": {
    "model_id": "scribe_v2_realtime",
    "language_code": "en",
    "vad_silence_threshold": 1.5
  },
  "server": {
    "host": "0.0.0.0",
    "port": 8765
  },
  "call": {
    "max_duration_seconds": 900
  },
  "voicemail": {
    "default_language": "en",
    "local_prefixes": [],
    "greeting": {
      "en": "Hello! You've reached the voicemail. Please leave your message after the tone. When you're done, press 1 on your phone. Thank you!"
    },
    "thanks": {
      "en": "Thank you for your message. Goodbye!"
    }
  }
}

Key settings to customize:

Setting	Description
`timezone`	Your local timezone (e.g. `"America/New_York"`, `"Europe/London"`)
`tts.voice_id`	Your ElevenLabs voice ID (find it in your ElevenLabs dashboard)
`tts.language_code`	Default TTS language code
`stt.language_code`	Default STT language code
`voicemail.default_language`	Language for voicemail greetings
`voicemail.local_prefixes`	Phone prefixes for your country (e.g. `["+1"]` for US, `["+44"]` for UK). Calls from these prefixes use `default_language`; others get English.
`voicemail.greeting`	Add greetings in any language you need
`voicemail.thanks`	Add thank-you messages in matching languages

6. Configure Claude Desktop

Add the MCP server to your Claude Desktop configuration. Open Claude Desktop's settings and add a new MCP server, or edit the config file directly:

Config file location:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Add the phonecall-mcp entry under mcpServers:

{
  "mcpServers": {
    "phonecall-mcp": {
      "command": "python",
      "args": [
        "/absolute/path/to/phonecall-mcp/server.py"
      ]
    }
  }
}

Replace /absolute/path/to/phonecall-mcp/server.py with the actual path on your system.

7. Start ngrok

Before making calls, start the ngrok tunnel. You can use the included launcher (Windows):

Double-click ngrok_launcher.pyw

Or start it manually:

ngrok http 8765 --domain=your-subdomain.ngrok-free.app

8. Restart Claude Desktop

Restart Claude Desktop so it picks up the new MCP server configuration. You should see phonecall-mcp in the available tools.

Usage

Making a call

Simply ask Claude to call someone:

"Call +1234567890 and ask about their business hours."

Claude will:

Tell you who it's calling and why, and wait for your approval
Start the call with a GDPR-compliant greeting
Wait for the callee's consent (DTMF "5")
Conduct the conversation
Provide you with a transcript and summary after hanging up

The call flow

1. Claude plays GDPR notice + "press 5 to consent"
2. Callee presses 5 → transcription starts, beeping confirms
3. Claude sends confirmation + explains "press 1 when done"
4. Callee speaks → presses 1 → Claude receives transcript
5. Claude thinks (callee hears beeping) → responds via TTS
6. Repeat 4-5 until conversation ends
7. Claude hangs up → provides transcript + summary

GDPR consent mechanism

Every outbound call includes a mandatory consent step:

The callee hears who is calling, why, and that the conversation will be transcribed
The callee must press 5 on their phone to consent
No transcription occurs before consent — this is enforced at the server level (STT engine does not start until DTMF "5" is received)
If the callee doesn't press 5 within 30 seconds, Claude politely ends the call
If the callee hangs up, nothing was recorded

Inbound calls (voicemail)

When someone calls your Twilio number, the server automatically handles it as a voicemail:

Plays a greeting in the configured language
Records the caller's message
Logs the transcript

You can ask Claude: "Did I get any calls?" and it will check the call log.

Using the skill file

The SKILL_EN.md file teaches Claude how to use phonecall-mcp effectively — including GDPR notice templates, conversation flow, and best practices. To use it:

Copy SKILL_EN.md to your Claude Desktop skills directory
Customize the templates with your name and preferences
Claude will automatically follow the skill guidelines when making calls

MCP Tools

Tool	Description
`phone_call_start`	Initiate an outbound call
`phone_call_listen`	Wait for callee to speak + press DTMF "1"
`phone_call_respond`	Send a spoken response via TTS
`phone_call_control`	Check call status
`phone_call_end`	Hang up and get transcript

Architecture

State machine (AudioBridge)

IDLE → CONSENT_PENDING → LISTENING ⇄ PROCESSING → SPEAKING → LISTENING
                              ↑                                    |
                              └────────────────────────────────────┘

IDLE — Call not yet connected
CONSENT_PENDING — Waiting for GDPR consent (DTMF "5"), no STT active
LISTENING — Routing callee audio to STT, waiting for DTMF "1"
PROCESSING — DTMF received, Claude is thinking (hold tone plays)
SPEAKING — TTS audio being sent to Twilio

Key design decisions

DTMF-based turn-taking instead of silence detection — more reliable on phone lines with background noise
Server-enforced GDPR consent — STT physically cannot start before consent, regardless of what Claude sends
Pre-rendered audio — First message TTS is synthesized while the phone rings, eliminating the initial silence gap
Scribe audio throttling — During PROCESSING state, only keepalive packets are sent to STT to prevent resource_exhausted errors
Barge-in support — Callee can interrupt Claude by pressing "1" during TTS playback

Security Considerations

Caller data extraction risk

⚠️ Important: During a phone call, Claude has access to all its connected tools — web search, Google Drive, email, calendar, and any other MCP connectors you have enabled. This means the callee could potentially extract sensitive information from Claude through social engineering (e.g., "Can you check if there's a document about..." or "What's on the calendar for...").

Recommended mitigations:

Disable sensitive connectors during calls — temporarily turn off MCP connectors that access private data (Drive, email, calendar) if the call doesn't require them
Define boundaries in your skill file — explicitly instruct Claude what it should NOT do during calls (e.g., "Never read emails or share calendar details during a call", "Only use web search, no internal tools")
Review transcripts — check call transcripts for any unexpected information disclosure

The tool use capability during calls is powerful — Claude can look up information in real time to help the conversation. But that same power means you should carefully consider which tools Claude should have access to for each call.

API keys and credentials

All secrets are stored in .env (excluded from git via .gitignore)
Never commit .env or config.json to version control
The Twilio auth token is used for request validation — keep it secure

GDPR consent

The built-in consent mechanism ensures no transcription occurs without the callee's explicit agreement. However, this is a technical safeguard — you are responsible for ensuring your use of this tool complies with applicable privacy laws in your jurisdiction.

Troubleshooting

Call fails immediately

Check that ngrok is running and the URL in .env matches
Verify your Twilio credentials and phone number
Check logs/phonecall-mcp.log for error details

No audio / silence after connecting

Verify your ElevenLabs API key and voice ID
Check that the voice ID exists in your ElevenLabs account
Look for TTS errors in the log file

Callee can't hear the greeting

Ensure ngrok tunnel is active (the TwiML webhook URL must be reachable)
Check Twilio console for call logs and error codes

Empty transcripts

Check for resource_exhausted errors in the log — this means Scribe is being overloaded
Verify stt.language_code matches the language being spoken

Timezone shows UTC

Install tzdata: pip install tzdata
Set timezone in config.json to your timezone (e.g. "America/New_York")

File Structure

phonecall-mcp/
├── server.py              # MCP server entry point
├── audio_bridge.py        # Bidirectional audio management + state machine
├── call_manager.py        # Call lifecycle (start, listen, respond, end)
├── twilio_handler.py      # Twilio webhooks + WebSocket handler
├── stt.py                 # ElevenLabs Scribe v2 Realtime STT client
├── tts.py                 # ElevenLabs TTS synthesis
├── config.py              # Configuration loader
├── log_setup.py           # Centralized logging
├── models.py              # Data models (CallState, TranscriptEntry)
├── ngrok_launcher.pyw     # Windows ngrok launcher with notifications
├── ngrok_stop.pyw         # Windows ngrok stop utility
├── config.example.json    # Configuration template
├── .env.example           # Environment variables template
├── .gitignore
├── requirements.txt
└── SKILL_EN.md            # Claude skill file for phonecall-mcp

Requirements

Python 3.12+
Twilio account (trial accounts work but play a short message before connecting)
ElevenLabs account with API access
ngrok (free tier with static domain)
Claude Desktop

License

MIT License — see LICENSE for details.

Acknowledgments

Built with Twilio Media Streams, ElevenLabs (TTS + Scribe STT), and the Model Context Protocol.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured