Replicate Anywhere
An MCP (Model Context Protocol) server that enables AI assistants to search, discover, and run any model on Replicate. No hardcoded model lists - just describe what you want and let the AI find and run the right model.
Features
- Smart Model Search - Find models by fuzzy name matching (e.g., "flux", "stable diffusion", "nano banana pro")
- Automatic Model Discovery - AI searches first, asks questions later
- Parameter Detection - Automatically retrieves and understands model input schemas
- Inline Image Display - Image outputs are formatted as markdown for inline display
- Async Prediction Handling - Long-running predictions return status URLs instead of timing out
- Prediction Status Checking - Check on running predictions that haven't completed yet
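To make "fuzzy name matching" concrete, here is a minimal sketch of one way a query could be ranked against model names and descriptions. This is an illustration only: the README does not say how the server matches queries (it may simply forward them to Replicate's search), and `ModelSummary`, `tokenize`, `scoreModel`, and `rankModels` are hypothetical names.

```typescript
// Hypothetical sketch: NOT the server's actual matching logic.
interface ModelSummary {
  owner: string;
  name: string;
  description?: string;
}

// Lowercase the text and split it into alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score in [0, 1]: exact name-token hits count most, then partial
// name hits, then description hits.
function scoreModel(query: string, model: ModelSummary): number {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;
  const nameTokens = tokenize(`${model.owner} ${model.name}`);
  const descTokens = tokenize(model.description ?? "");
  let total = 0;
  for (const q of queryTokens) {
    if (nameTokens.includes(q)) total += 1;
    else if (nameTokens.some((t) => t.includes(q))) total += 0.5;
    else if (descTokens.includes(q)) total += 0.25;
  }
  return total / queryTokens.length;
}

// Keep only models with a non-zero score, best match first.
function rankModels(query: string, models: ModelSummary[]): ModelSummary[] {
  return models
    .map((model) => ({ model, score: scoreModel(query, model) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.model);
}
```

With this scheme, a query such as "flux" ranks a model named `flux-pro` first because the token matches the name exactly, while a query that only appears in a description still surfaces the model with a lower score.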
Installation
Prerequisites
- Node.js 18+
- A Replicate API token
NPM (Global)
npm install -g replicate-anywhere
From Source
git clone https://github.com/fifthseason-ai/replicate-anywhere.git
cd replicate-anywhere
npm install
npm run build
Configuration
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| REPLICATE_API_TOKEN | Yes | - | Your Replicate API token |
| MAX_POLL_TIME | No | 300000 | Maximum time (ms) to wait for predictions before returning async status |
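As an illustration of how these two variables might be read and validated at startup, here is a minimal sketch. `loadConfig` and `ServerConfig` are hypothetical names, and the validation rules are assumptions; only the variable names and the 300000 ms default come from the table above.

```typescript
// Hypothetical sketch: not the server's actual startup code.
interface ServerConfig {
  apiToken: string;
  maxPollTimeMs: number;
}

// Default taken from the table above (300000 ms = 5 minutes).
const DEFAULT_MAX_POLL_TIME_MS = 300000;

// Takes the environment as a plain object so it can be exercised without Node globals.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiToken = env.REPLICATE_API_TOKEN;
  if (!apiToken) {
    throw new Error("REPLICATE_API_TOKEN is required");
  }
  const raw = env.MAX_POLL_TIME;
  const maxPollTimeMs =
    raw === undefined || raw === "" ? DEFAULT_MAX_POLL_TIME_MS : Number(raw);
  if (!Number.isFinite(maxPollTimeMs) || maxPollTimeMs < 0) {
    throw new Error(`MAX_POLL_TIME must be a non-negative number of milliseconds, got "${raw}"`);
  }
  return { apiToken, maxPollTimeMs };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Failing fast on a missing token keeps the error close to its cause instead of surfacing later as an authentication failure from the API.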
MCP Client Configuration
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "replicate-anywhere": {
      "command": "npx",
      "args": ["-y", "replicate-anywhere"],
      "env": {
        "REPLICATE_API_TOKEN": "r8_your_token_here"
      }
    }
  }
}
LibreChat
Add to your librechat.yaml:
mcpServers:
  replicate-anywhere:
    type: stdio
    command: npx
    args:
      - -y
      - replicate-anywhere
    env:
      REPLICATE_API_TOKEN: "${REPLICATE_API_TOKEN}"
Docker
services:
  replicate-anywhere:
    build:
      context: ./replicate-anywhere
    environment:
      REPLICATE_API_TOKEN: ${REPLICATE_API_TOKEN}
Tools
search-models
Search for AI models on Replicate by name or description. This tool is designed to be called first when a user mentions any model name.
{
  "query": "flux pro"
}
get-model-info
Get detailed information about a specific model, including its input parameters schema.
{
  "owner": "black-forest-labs",
  "name": "flux-pro"
}
run-model
Run a prediction on any Replicate model.
{
  "model": "black-forest-labs/flux-pro",
  "input": {
    "prompt": "A beautiful sunset over mountains",
    "aspect_ratio": "16:9"
  }
}
list-models
List public models on Replicate (paginated).
{
  "cursor": "optional_pagination_cursor"
}
check-prediction
Check the status of a running prediction.
{
  "prediction_id": "abc123xyz"
}
Usage Examples
Generate an Image
User: "Generate an image of a cat wearing a space helmet using flux"
The AI will:
- Call search-models with query "flux"
- Call get-model-info to get parameters for the best match
- Call run-model with appropriate parameters
- Return the image inline (markdown formatted)
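The sequence above can be sketched as the JSON-RPC messages an MCP client would send. The `tools/call` method and its `{ name, arguments }` params follow the MCP specification; `toolCallRequest` is a hypothetical helper, and the choice of black-forest-labs/flux-pro as the "best match" is an assumption borrowed from the Tools examples.

```typescript
// Hypothetical sketch of the client-side requests, not code from this server.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build one MCP tools/call request.
function toolCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// In practice the arguments of steps 2 and 3 depend on the results of the previous
// step; fixed example values are used here to show the message shapes.
const requests: ToolCallRequest[] = [
  toolCallRequest(1, "search-models", { query: "flux" }),
  toolCallRequest(2, "get-model-info", { owner: "black-forest-labs", name: "flux-pro" }),
  toolCallRequest(3, "run-model", {
    model: "black-forest-labs/flux-pro",
    input: { prompt: "A cat wearing a space helmet" },
  }),
];
```

The fourth step needs no further tool call: the assistant renders the markdown returned by run-model.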
Use a Specific Model
User: "Use stable diffusion xl to create a cyberpunk cityscape"
The AI will search for "stable diffusion xl", find stability-ai/sdxl, and run it.
Check a Long-Running Prediction
User: "Check on my prediction abc123"
The AI will call check-prediction to get the current status and output if complete.
Output Formatting
Images
When a model returns image URLs, the output is automatically formatted as markdown:
**Generated Image:**
![Generated Image](https://replicate.delivery/...)
**Direct link:** https://replicate.delivery/...
Other Outputs
Non-image outputs are returned as JSON.
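The two formatting rules above could look roughly like the following sketch. `formatOutput` and `isImageUrl` are hypothetical names, and both the extension-based image check and the image alt text are assumptions; the README only states that image URLs become markdown and everything else is returned as JSON.

```typescript
// Hypothetical sketch: the server's real detection logic may differ.
const IMAGE_EXTENSIONS = [".png", ".jpg", ".jpeg", ".webp", ".gif"];

// Treat a value as an image if it is an http(s) URL whose path ends in a known extension.
function isImageUrl(value: unknown): value is string {
  if (typeof value !== "string" || !/^https?:\/\//.test(value)) return false;
  const path = value.replace(/[?#].*$/, "").toLowerCase();
  return IMAGE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Image URL(s) become markdown; any other output is pretty-printed JSON.
function formatOutput(output: unknown): string {
  const items: unknown[] = Array.isArray(output) ? output : [output];
  if (items.length > 0 && items.every(isImageUrl)) {
    return items
      .map((url) => `**Generated Image:**\n![Generated Image](${url})\n**Direct link:** ${url}`)
      .join("\n\n");
  }
  return JSON.stringify(output, null, 2);
}
```

Accepting either a single URL or an array covers models that return one image as well as those that return several.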
Development
# Install dependencies
npm install
# Build
npm run build
# Watch mode
npm run dev
# Run locally
REPLICATE_API_TOKEN=your_token npm start
Architecture
┌─────────────────┐     ┌────────────────────┐     ┌─────────────────┐
│  AI Assistant   │────▶│ replicate-anywhere │────▶│  Replicate API  │
│ (Claude, etc.)  │◀────│     MCP Server     │◀────│                 │
└─────────────────┘     └────────────────────┘     └─────────────────┘
The server acts as a bridge between MCP-compatible AI assistants and the Replicate API, providing:
- Tool definitions that guide the AI on how to search and run models
- Smart response formatting for different output types
- Timeout handling for long-running predictions
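The timeout handling mentioned above might be structured like this sketch: poll until the prediction reaches a terminal state, and once the MAX_POLL_TIME budget is spent, hand back the still-running prediction so it can be followed up with check-prediction. The status values are those used by the Replicate predictions API; `decide`, `waitForPrediction`, the injected `fetchPrediction` callback, and the 1-second interval are assumptions.

```typescript
// Hypothetical sketch: not the server's actual polling code.
type PredictionStatus = "starting" | "processing" | "succeeded" | "failed" | "canceled";

interface Prediction {
  id: string;
  status: PredictionStatus;
  output?: unknown;
}

type PollDecision = "return-result" | "keep-polling" | "return-async-status";

// Pure decision step: terminal states always win, then the time budget is checked.
function decide(status: PredictionStatus, elapsedMs: number, maxPollTimeMs: number): PollDecision {
  if (status === "succeeded" || status === "failed" || status === "canceled") {
    return "return-result";
  }
  return elapsedMs >= maxPollTimeMs ? "return-async-status" : "keep-polling";
}

// Poll via the caller-supplied fetchPrediction (for example, a Replicate API call).
async function waitForPrediction(
  id: string,
  fetchPrediction: (id: string) => Promise<Prediction>,
  maxPollTimeMs = 300000,
  intervalMs = 1000,
): Promise<{ done: boolean; prediction: Prediction }> {
  const start = Date.now();
  for (;;) {
    const prediction = await fetchPrediction(id);
    const decision = decide(prediction.status, Date.now() - start, maxPollTimeMs);
    if (decision === "return-result") return { done: true, prediction };
    if (decision === "return-async-status") return { done: false, prediction };
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Keeping the decision in a pure function makes the timeout rule easy to test without timers or network calls.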
License
MIT
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Credits
Built by Aaron Sherrill