Gemini MCP
Remote MCP server that exposes Google Gemini's text, image, video (Veo), and audio transcription capabilities as tools any MCP client (Claude.ai, Claude Code, Cowork) can call directly. Same architecture as the Bosta and Salestrail MCP connectors: Node.js + Express, deployed as a Render web service, talking to Gemini's REST API.
Tools
| Tool | What it does |
|---|---|
| `generate_text` | Text generation: captions, copy, summaries, translation, analysis. Optional `system_instruction`. |
| `generate_image` | Generate an image from a prompt, or edit/compose an existing one via `reference_image_base64` (Nano Banana / Gemini image model). |
| `start_video_generation` | Kicks off a Veo video job from a prompt (plus an optional reference image for image-to-video). Returns an `operation_name`; generation takes roughly 1 to 6 minutes. |
| `check_video_status` | Polls an `operation_name` from `start_video_generation`. Returns `{"status":"pending"}` or the finished video as an embedded base64 resource. |
| `transcribe_audio` | Transcribes audio (m4a/mp3/wav/aac/ogg/flac) from base64, with an optional instruction (e.g. "transcribe in Egyptian Arabic"). |
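Because video generation is asynchronous, a client has to poll `check_video_status` until the job finishes. The following is a minimal sketch of such a loop, not code from this repo. `waitForVideo`, `isPending`, and the injected `callTool(name, args)` function are hypothetical names. The sketch also assumes that the pending marker arrives as a text content item containing `{"status":"pending"}` and that the argument is named `operation_name`.

```javascript
// Sketch only: `callTool` is any async (name, args) => MCP tool result,
// e.g. a thin wrapper around whichever MCP client you use (hypothetical).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Collect the text parts of a tool result for error messages.
function textOf(result) {
  return (result.content ?? [])
    .filter((item) => item.type === "text")
    .map((item) => item.text)
    .join("\n");
}

// Assumption: a pending job is reported as text content holding {"status":"pending"}.
function isPending(result) {
  return (result.content ?? []).some((item) => {
    if (item.type !== "text") return false;
    try {
      return JSON.parse(item.text).status === "pending";
    } catch {
      return false; // not JSON, so not a pending marker
    }
  });
}

// Poll check_video_status until the result is no longer pending.
async function waitForVideo(callTool, operationName, options = {}) {
  const { intervalMs = 15000, maxAttempts = 40 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await callTool("check_video_status", {
      operation_name: operationName, // assumed argument name
    });
    if (result.isError) {
      throw new Error(`check_video_status failed: ${textOf(result)}`);
    }
    if (!isPending(result)) return result; // finished: contains the video resource
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Video still pending after ${maxAttempts} attempts`);
}
```

With the defaults, the loop checks every 15 seconds for up to 40 attempts (about 10 minutes), which leaves headroom over the 1 to 6 minute generation time quoted above.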
All tools return MCP error results (isError: true) on failure rather than crashing
the server — confirmed against the live Gemini API during build (an invalid key
correctly came back as a tool error, not a connection drop).
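One way to get this behavior is to wrap every tool handler so that thrown exceptions become MCP error results. The sketch below illustrates the pattern under that assumption; it is not necessarily how this server implements it, and `withToolErrors` is a hypothetical name.

```javascript
// Sketch: turn any thrown error into an MCP tool result with isError: true,
// so a failing Gemini call never takes down the server process.
function withToolErrors(handler) {
  return async (args) => {
    try {
      return await handler(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text", text: `Error: ${message}` }],
        isError: true,
      };
    }
  };
}
```

A handler wrapped this way can throw freely (for example on a non-2xx response from Gemini), and the client sees a normal tool response flagged as an error.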
Known limitations / things to verify with a real key
- Audio inline size limit: requests with inline base64 data top out at around 20 MB total. Longer call recordings will need the Gemini File API (upload first, then reference by URI), which is not implemented yet. Worth adding if your typical `.m4a` files are long.
- `.m4a` MIME type: the tool defaults to `audio/mp4` for `.m4a` files, since that's the container format (AAC inside MP4). This has not been verified against a real audio file yet. If Gemini rejects it, try `audio/aac` instead by passing it explicitly via the `mime_type` argument.
- Veo response shape: `check_video_status` tries a couple of known response shapes (`generateVideoResponse.generatedSamples[0].video` and `predictions[0]`). Worth confirming against a real job once you have Veo access, since Google has changed this shape across API versions before.
- Veo access: Veo 3 may require allowlisting / billing on your Gemini API key. Check the Gemini API console if `start_video_generation` errors out.
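Several of these caveats can be handled with small defensive helpers. The sketch below is illustrative only, and all function names are hypothetical. It assumes "~20MB" means 20 MiB measured on the base64 string itself, and that the Veo payload sits under `operation.response`. The MIME table beyond `.m4a` is an assumption too, not taken from this repo.

```javascript
// Assumption: the inline limit is about 20 MiB, counted on the base64 text sent in the request.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Only the m4a entry comes from this README; the rest are assumed defaults.
const MIME_BY_EXTENSION = new Map([
  ["m4a", "audio/mp4"], // unverified default; pass "audio/aac" if Gemini rejects it
  ["mp3", "audio/mp3"],
  ["wav", "audio/wav"],
  ["aac", "audio/aac"],
  ["ogg", "audio/ogg"],
  ["flac", "audio/flac"],
]);

// An explicit mime_type always wins over the extension-based guess.
function resolveAudioMimeType(filename, mimeTypeOverride) {
  if (mimeTypeOverride) return mimeTypeOverride;
  const extension = filename.split(".").pop().toLowerCase();
  const mimeType = MIME_BY_EXTENSION.get(extension);
  if (!mimeType) {
    throw new Error(`Unknown audio extension "${extension}"; pass mime_type explicitly`);
  }
  return mimeType;
}

// True when the base64 payload is small enough to send inline.
function fitsInline(base64Data) {
  return base64Data.length <= INLINE_LIMIT_BYTES;
}

// Try the two response shapes named above; return null if neither matches.
function extractVideo(operation) {
  const response = operation?.response ?? {}; // assumed nesting
  return (
    response.generateVideoResponse?.generatedSamples?.[0]?.video ??
    response.predictions?.[0] ??
    null
  );
}
```

Returning `null` from `extractVideo` instead of throwing lets the caller report an unrecognized shape as a tool error and include the raw response for debugging.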
Local setup
npm install
cp .env.example .env # fill in GEMINI_API_KEY
npm start
Health check: curl http://localhost:3000/health
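If `npm start` comes up but every tool call fails, a missing or empty `GEMINI_API_KEY` is a likely cause. A fail-fast check at startup makes that obvious. This is a sketch of the idea and may not match what the server actually does; `requireEnv` is a hypothetical helper.

```javascript
// Sketch: read a required environment variable or stop with a clear message.
// `env` is injectable so the check can be exercised without touching process.env.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(
      `Missing required environment variable ${name}; copy .env.example to .env and set it.`
    );
  }
  return value;
}

// Typical use at startup (commented out so this snippet has no side effects):
// const apiKey = requireEnv("GEMINI_API_KEY");
```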
Deploying to Render
- Push this folder to a GitHub repo.
- Render dashboard → New → Web Service → connect the repo.
- Build command: `npm install`
- Start command: `npm start`
- Add environment variable `GEMINI_API_KEY` (and optionally the model overrides from `.env.example`) in Render's Environment tab.
- Once deployed, your MCP endpoint is `https://<your-app>.onrender.com/mcp`.
Connecting in Claude
Add it as a custom connector pointing at https://<your-app>.onrender.com/mcp,
the same way Bosta MCP / Salestrail MCP are connected.