minimax-llm-mcp
Exposes the MiniMax M3 LLM API to MCP-compatible clients, enabling chat completions, text completions, tool calls, and token counting via stdio or SSE transport.
<div align="center">
minimax-llm-mcp
MCP server that exposes the MiniMax M3 LLM API to MCP-compatible clients over stdio and SSE.
</div>
Overview
minimax-llm-mcp is a Model Context Protocol (MCP) server that exposes the MiniMax M3 LLM API to any MCP-compatible client — Claude Desktop, Cursor, CyOps, Windsurf, and others. The server speaks JSON-RPC over stdio (the default, suitable for child-process clients) and HTTP + Server-Sent Events (SSE, for browser- and network-based clients), and registers four tools that map cleanly onto the upstream's chat-completions and tool-use surface:
| Tool | Purpose |
|---|---|
| minimax_chat | Non-streaming chat completion. |
| minimax_complete | Single-turn text completion (prompt + optional system). |
| minimax_tool_call | M3-native tool-use passthrough. Forwards tools and tool_choice verbatim. |
| minimax_count_tokens | Local token count using cl100k_base (no upstream call). |
The MiniMax M3 endpoint is OpenAI-compatible; the server wraps a small, well-tested HTTP client that handles auth, timeouts, retry-on-429, error mapping, and request-secret redaction.
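The retry-on-429 behaviour can be pictured roughly as follows. This is a minimal sketch, not the project's actual client: `UpstreamResult`, `shouldRetry`, `sendWithRetry`, and the 500 ms back-off are all assumptions made for illustration.

```typescript
// Hypothetical shape of an upstream result; the real client's types may differ.
interface UpstreamResult {
  status: number;
  body: string;
}

// Decide whether a response should be retried: only 429 qualifies,
// and only when no retry has been attempted yet (attempt === 0).
function shouldRetry(status: number, attempt: number, retryOn429: boolean): boolean {
  return retryOn429 && status === 429 && attempt === 0;
}

// Call `send` once, then at most one more time after `backoffMs` if it returned 429.
async function sendWithRetry(
  send: () => Promise<UpstreamResult>,
  retryOn429: boolean,
  backoffMs: number = 500, // assumed value; the README only promises a short back-off
): Promise<UpstreamResult> {
  let result = await send();
  if (shouldRetry(result.status, 0, retryOn429)) {
    await new Promise<void>((resolve) => setTimeout(resolve, backoffMs));
    result = await send();
  }
  return result;
}
```

Keeping the decision in a pure function makes the retry rule easy to unit-test without timers or network stubs.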
Status: 0.1.0. The binary, the four tools, and the stdio + SSE transports are wired up. The SSE transport is feature-complete but not yet exercised by the demo.
Features
- MCP-native: registers four tools with Zod-validated input schemas, conforming to the MCP spec.
- Two transports: stdio (default) for child-process clients, and HTTP+SSE for network clients.
- OpenAI-compatible: non-streaming and streaming chat completions, plus native tool-use passthrough.
- Local token counting: minimax_count_tokens runs entirely locally via gpt-tokenizer's cl100k_base encoding; no upstream call, deterministic, fast.
- Production-grade HTTP: Authorization: Bearer … on every request, configurable per-request timeout, a single retry on 429 after a short back-off, and full HTTP-status → McpError mapping (401/403 → AuthenticationRequired, 429 → RateLimited, 5xx → UpstreamError, other 4xx → InvalidRequest).
- Secret redaction: error messages are scrubbed of sk-… API-key shapes before they leave the server.
- TypeScript-native: strict ES2022 / NodeNext / tsup-bundled CJS with type declarations.
- Tested: vitest with v8 coverage; 80%+ line coverage on the runtime modules.
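The status mapping and the redaction step from the list above could look something like this. It is a sketch under assumptions: the category names come from the feature list, but the `ErrorCategory` type, both function names, and the exact key pattern (`sk-` followed by at least eight key-like characters) are hypothetical.

```typescript
// Category names are taken from the feature list; the real ErrorCategory
// in src/errors.ts may be shaped differently.
type ErrorCategory =
  | "AuthenticationRequired"
  | "RateLimited"
  | "UpstreamError"
  | "InvalidRequest";

// Map an upstream HTTP error status (>= 400) to an error category.
function categorizeStatus(status: number): ErrorCategory {
  if (status === 401 || status === 403) return "AuthenticationRequired";
  if (status === 429) return "RateLimited";
  if (status >= 500) return "UpstreamError";
  return "InvalidRequest"; // every other 4xx
}

// Assumed key shape: "sk-" followed by 8 or more key-like characters.
const API_KEY_PATTERN = /sk-[A-Za-z0-9_-]{8,}/g;

// Replace anything that looks like an API key before the message is returned.
function redactSecrets(message: string): string {
  return message.replace(API_KEY_PATTERN, "sk-***");
}
```

Running redaction as the last step before an error leaves the process means upstream error bodies that echo the key back are covered as well.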
Installation
From npm (recommended)
npm install -g minimax-llm-mcp
This installs the minimax-llm-mcp binary on your PATH, ready for any MCP client to spawn.
From a local checkout
git clone https://github.com/your-org/minimax-llm-mcp.git
cd minimax-llm-mcp
npm install
npm run build
The compiled binary is then at ./dist/index.js. Point your MCP client at it directly (see Usage below).
Prerequisites
- Node.js ≥ 18 (the engines field enforces this).
- A MiniMax API key. Sign up at the MiniMax developer portal and copy the bearer token from your dashboard.
Configuration
The server reads its configuration from environment variables at startup. The schema is validated by Zod; missing or invalid values produce a ConfigError and exit 1 on stdio, or 500 on SSE.
| Variable | Required | Default | Description |
|---|---|---|---|
| MINIMAX_API_KEY | yes | (none) | Bearer token for the MiniMax M3 LLM API. |
| TRANSPORT | no | stdio | Transport the server listens on. One of stdio or sse. |
| REQUEST_TIMEOUT_MS | no | 300000 | Per-request timeout when calling the upstream API (in milliseconds). |
| RETRY_ON_429 | no | true | Whether to retry once on 429 Too Many Requests with a short back-off. |
| MINIMAX_EMBEDDING_ENABLED | no | false | Reserved for a future minimax_embed tool (out of scope in 0.1.0). |
The full set is also documented in .env.example — copy that file to .env and uncomment the lines you want to override:
cp .env.example .env
$EDITOR .env
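To make the validation rules concrete, here is a dependency-free sketch of a loader for the variables in the table above. The real server uses a Zod schema; this hand-rolled version only stands in for it, and `ServerConfig`, `loadConfig`, the field names, and the rule that only the literal string "false" disables retries are assumptions.

```typescript
// Field names are illustrative; the real config object may use different keys.
interface ServerConfig {
  apiKey: string;
  transport: "stdio" | "sse";
  requestTimeoutMs: number;
  retryOn429: boolean;
}

class ConfigError extends Error {}

// Validate an environment-like record and apply the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env.MINIMAX_API_KEY;
  if (!apiKey) throw new ConfigError("MINIMAX_API_KEY is required");

  const transport = env.TRANSPORT ?? "stdio";
  if (transport !== "stdio" && transport !== "sse") {
    throw new ConfigError(`TRANSPORT must be "stdio" or "sse", got "${transport}"`);
  }

  const requestTimeoutMs = Number(env.REQUEST_TIMEOUT_MS ?? "300000");
  if (!Number.isInteger(requestTimeoutMs) || requestTimeoutMs <= 0) {
    throw new ConfigError("REQUEST_TIMEOUT_MS must be a positive integer");
  }

  // Assumption: any value other than the literal "false" keeps retries enabled.
  const retryOn429 = (env.RETRY_ON_429 ?? "true") !== "false";

  return { apiKey, transport, requestTimeoutMs, retryOn429 };
}
```

Passing the environment in as a parameter, rather than reading a global, keeps the loader testable with plain objects.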
Usage
The server is consumed by an MCP client. Below are copy-pasteable configuration snippets for the four most common clients. Replace <your-minimax-api-key> with a real bearer token, or set MINIMAX_API_KEY in the client's environment.
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "minimax-llm-mcp": {
      "command": "npx",
      "args": ["-y", "minimax-llm-mcp"],
      "env": {
        "MINIMAX_API_KEY": "<your-minimax-api-key>"
      }
    }
  }
}
Or, if you have a local build:
{
  "mcpServers": {
    "minimax-llm-mcp": {
      "command": "node",
      "args": ["/absolute/path/to/minimax-llm-mcp/dist/index.js"],
      "env": {
        "MINIMAX_API_KEY": "<your-minimax-api-key>"
      }
    }
  }
}
Cursor
Edit ~/.cursor/mcp.json (or use Settings → MCP → Add new global MCP server):
{
  "mcpServers": {
    "minimax-llm-mcp": {
      "command": "npx",
      "args": ["-y", "minimax-llm-mcp"],
      "env": {
        "MINIMAX_API_KEY": "<your-minimax-api-key>"
      }
    }
  }
}
CyOps
CyOps reads MCP servers from its global config (~/.cyops/mcp.json or the in-app Settings → MCP panel):
{
  "mcpServers": {
    "minimax-llm-mcp": {
      "command": "npx",
      "args": ["-y", "minimax-llm-mcp"],
      "env": {
        "MINIMAX_API_KEY": "<your-minimax-api-key>"
      }
    }
  }
}
Windsurf
Edit ~/.codeium/windsurf/mcp_config.json (or the in-app Settings → Cascade → MCP Servers → Add server form):
{
  "mcpServers": {
    "minimax-llm-mcp": {
      "command": "npx",
      "args": ["-y", "minimax-llm-mcp"],
      "env": {
        "MINIMAX_API_KEY": "<your-minimax-api-key>"
      }
    }
  }
}
Trying it without an MCP client
For a quick smoke test (no real API call required):
# In one terminal, run the server in stdio mode and pipe a JSON-RPC
# `tools/list` request through it:
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
| MINIMAX_API_KEY=demo-key npx minimax-llm-mcp
The server reads the request from stdin, dispatches it, and writes the JSON-RPC response to stdout. You should see the four tool names listed.
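Calling a tool uses the same JSON-RPC framing as the tools/list request above. The sketch below builds a tools/call request for minimax_count_tokens; `JsonRpcRequest` and `buildToolCall` are hypothetical helpers, and a spec-compliant MCP session would send an initialize request first, which is omitted here.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build an MCP `tools/call` request for the given tool name and arguments.
function buildToolCall(
  id: number,
  tool: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const countRequest = buildToolCall(2, "minimax_count_tokens", {
  messages: [{ role: "user", content: "Hello, world" }],
});

// The stdio transport reads one JSON message per line.
console.log(JSON.stringify(countRequest));
```

The printed line can be piped into the server in the same way as the echo command above.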
Available Tools
Every tool's input is validated by a Zod schema; the SDK applies the schema before the handler runs.
minimax_chat
Non-streaming chat completion. Returns the assistant content plus optional usage.
Input:
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | no (default: MiniMax-M3) | |
| messages | array | yes | At least one message. Each has role (system/user/assistant/tool/function), content, and optional name / tool_call_id. |
| temperature | number | no | [0, 2]. |
| top_p | number | no | [0, 1]. |
| n | integer | no | Number of completions. |
| max_tokens | integer | no | ≤ 1,000,000 (hard cap). |
| stop | string or string[] | no | |
| presence_penalty | number | no | [-2, 2]. |
| frequency_penalty | number | no | [-2, 2]. |
| user | string | no | Upstream abuse-tracking identifier. |
Output: { content, finish_reason, model, usage? }.
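As an illustration of the constraints in the table, the following sketch checks a subset of the fields by hand. The server itself relies on a Zod schema; `ChatMessage`, `ChatInput`, and `checkChatInput` are hypothetical names, and only the ranges documented above are checked.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool" | "function";
  content: string;
  name?: string;
  tool_call_id?: string;
}

// Subset of the minimax_chat input, enough to show the documented limits.
interface ChatInput {
  model?: string;
  messages: ChatMessage[];
  temperature?: number;
  top_p?: number;
  max_tokens?: number;
}

// Return a list of human-readable problems; an empty list means the input passes.
function checkChatInput(input: ChatInput): string[] {
  const problems: string[] = [];
  const inRange = (value: number | undefined, lo: number, hi: number): boolean =>
    value === undefined || (value >= lo && value <= hi);

  if (input.messages.length === 0) problems.push("messages must not be empty");
  if (!inRange(input.temperature, 0, 2)) problems.push("temperature must be in [0, 2]");
  if (!inRange(input.top_p, 0, 1)) problems.push("top_p must be in [0, 1]");
  if (input.max_tokens !== undefined && input.max_tokens > 1000000) {
    problems.push("max_tokens must be <= 1,000,000");
  }
  return problems;
}
```

A client can run a check like this before sending a request to get all validation problems at once rather than one error per round trip.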
minimax_complete
Single-turn text completion. Wraps prompt (plus an optional system message) into a single-turn conversation.
Input:
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | no (default: MiniMax-M3) | |
| prompt | string | yes | Non-empty. |
| system | string | no | System message prepended before prompt. |
| (rest) | | | Same as minimax_chat. |
Output: { content, finish_reason, model, usage? }.
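The wrapping step might look like the sketch below. It assumes the prompt is sent with the user role, which the text above does not state explicitly; `CompletionMessage` and `promptToMessages` are hypothetical names.

```typescript
interface CompletionMessage {
  role: "system" | "user";
  content: string;
}

// Turn a prompt (and optional system text) into a chat-style message list.
// Assumption: the prompt becomes a single user message.
function promptToMessages(prompt: string, system?: string): CompletionMessage[] {
  if (prompt.length === 0) throw new Error("prompt must be non-empty");
  const messages: CompletionMessage[] = [];
  if (system !== undefined) messages.push({ role: "system", content: system });
  messages.push({ role: "user", content: prompt });
  return messages;
}
```

For example, `promptToMessages("Summarize this.", "Be brief.")` yields a system message followed by a user message, which can then be passed to the same upstream call that minimax_chat uses.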
minimax_tool_call
M3-native tool-use passthrough. Forwards tools and tool_choice to the upstream verbatim — the server does not validate or transform the function definitions.
Input:
| Field | Type | Required | Notes |
|---|---|---|---|
| (same as minimax_chat) | | | |
| tools | array | yes | Non-empty. Each entry is the OpenAI tool object (e.g. { type: "function", function: { name, description, parameters } }). |
| tool_choice | string or object | no | Standard OpenAI forms: "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}. |
Output: { content, finish_reason, model, usage?, tool_calls?, tool_call_payload? }. When finish_reason === "tool_calls", tool_call_payload is a structured JSON block with the call list — it is the JSON-serialized text content the MCP client renders.
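Because the server forwards tools and tool_choice without validating them, a caller may want its own sanity check before sending. The sketch below shows example arguments and one such check; `get_weather`, `FunctionTool`, `ToolChoice`, and `choiceMatchesTools` are hypothetical and not part of the server.

```typescript
interface FunctionTool {
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters?: Record<string, unknown>;
  };
}

type ToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Hypothetical tool definition in the OpenAI tool-object shape.
const weatherTool: FunctionTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

const forcedChoice: ToolChoice = { type: "function", function: { name: "get_weather" } };

// Example arguments for a minimax_tool_call invocation.
const toolCallArgs = {
  messages: [{ role: "user", content: "What is the weather in Oslo?" }],
  tools: [weatherTool],
  tool_choice: forcedChoice,
};

// A forced tool_choice should name a function that is actually offered.
function choiceMatchesTools(choice: ToolChoice, tools: FunctionTool[]): boolean {
  if (typeof choice === "string") return true; // "auto", "none", "required"
  const wanted = choice.function.name;
  return tools.some((tool) => tool.function.name === wanted);
}
```

A mismatch here would otherwise only surface as an upstream error, so checking locally gives a clearer failure.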
minimax_count_tokens
Local token count using the cl100k_base BPE encoding (the same one OpenAI's tiktoken uses for GPT-3.5/4). It makes no upstream call; counting runs locally inside the server process.
Input:
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | no (default: MiniMax-M3) | Recorded in the result, not used for tokenization. |
| messages | array | yes | At least one message. |
Output: { total, model, encoding, per_message: [{ role, tokens }] }. The counts are deterministic and match gpt-tokenizer's cl100k_base encoding.
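The output shape can be sketched as a per-message count plus a sum. To stay dependency-free, the encoder is injected: the real tool uses gpt-tokenizer's cl100k_base, while the `wordEncoder` below is a whitespace stand-in that does not produce cl100k_base counts. Whether the real tool adds any per-message overhead is not stated, so none is added here; all names are hypothetical.

```typescript
interface CountedMessage {
  role: string;
  content: string;
}

interface TokenCount {
  total: number;
  model: string;
  encoding: string;
  per_message: { role: string; tokens: number }[];
}

// Count tokens per message with the supplied encoder, then sum them.
function countMessageTokens(
  messages: CountedMessage[],
  encode: (text: string) => number[],
  model: string = "MiniMax-M3",
  encoding: string = "cl100k_base",
): TokenCount {
  const per_message = messages.map((m) => ({
    role: m.role,
    tokens: encode(m.content).length,
  }));
  const total = per_message.reduce((sum, m) => sum + m.tokens, 0);
  return { total, model, encoding, per_message };
}

// Stand-in encoder: one "token" per whitespace-separated word. NOT cl100k_base.
const wordEncoder = (text: string): number[] =>
  text.split(/\s+/).filter((word) => word.length > 0).map((_, index) => index);
```

In a real setup, the `encode` argument would be the cl100k_base encoder from gpt-tokenizer rather than the stand-in.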
Development
Setup
git clone https://github.com/your-org/minimax-llm-mcp.git
cd minimax-llm-mcp
npm install
Scripts
| Script | What it does |
|---|---|
| npm run build | Bundle src/index.ts to dist/ via tsup (CJS + .d.ts + sourcemap). |
| npm run dev | Same as build but with --watch. |
| npm run typecheck | tsc --noEmit against tsconfig.json. |
| npm test | Run the vitest suite once. |
| npm run test:watch | vitest --watch. |
| npm run coverage | vitest run --coverage (v8 provider; writes HTML to coverage/). |
Project layout
src/
├── index.ts # CLI entry point (stdio)
├── server.ts # MCP server: registers the four tools
├── client.ts # MiniMax M3 HTTP client (auth, retry, error mapping)
├── config.ts # Zod-validated env config
├── errors.ts # McpError factory + ErrorCategory
├── tools/
│ ├── chat.ts # minimax_chat
│ ├── complete.ts # minimax_complete
│ ├── tool-call.ts # minimax_tool_call
│ └── count-tokens.ts # minimax_count_tokens
└── transports/
    ├── stdio.ts # startStdioServer(config)
    └── sse.ts # startSSEServer(config, options)
tests/ # Mirror of src/, plus a top-level suite
# for the server, the HTTP client, the
# SSE transport, and the error helpers.
TDD workflow
The slices were added in this order: scaffold → config → errors → client → count-tokens → chat → complete → tool-call → server → stdio → SSE. Each slice added the source file(s) and the matching tests/.../*.test.ts, and was verified with npm test + npm run coverage before moving on. When adding a new tool or transport, follow the same pattern: write a failing test, write the implementation, run the suite.
Adding a new tool
- Create src/tools/<name>.ts with a Zod input schema, an X_INPUT_SCHEMA export, and a handleX(client, input, signal?) function. The handler returns a typed result object; the SDK wraps it in { content: [{ type: "text", text: ... }] }.
- Create tests/tools/<name>.test.ts with vi.fn()-based client stubs.
- Register the tool in src/server.ts via server.registerTool(name, { description, inputSchema: X_INPUT_SCHEMA.shape }, async (args) => { ... }).
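The steps above can be sketched for an imaginary echo tool. Everything here is hypothetical (`EchoInput`, `StubClient`, `parseEchoInput`, `handleEcho`, `toTextContent`), and the hand-written parser only stands in for the Zod schema so the example has no dependencies.

```typescript
// All names in this block are hypothetical.
interface EchoInput {
  text: string;
}

// Minimal client interface; the real one lives in src/client.ts.
interface StubClient {
  chat(prompt: string, signal?: AbortSignal): Promise<string>;
}

// Stand-in for the Zod schema: accept only { text: <non-empty string> }.
function parseEchoInput(raw: unknown): EchoInput {
  const text = (raw as { text?: unknown } | null)?.text;
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("text must be a non-empty string");
  }
  return { text };
}

// Handler in the handleX(client, input, signal?) shape: returns a typed result.
async function handleEcho(
  client: StubClient,
  input: EchoInput,
  signal?: AbortSignal,
): Promise<{ content: string }> {
  const content = await client.chat(input.text, signal);
  return { content };
}

// The wrapping step: serialize a typed result into MCP text content.
function toTextContent(result: unknown): { content: { type: "text"; text: string }[] } {
  return { content: [{ type: "text", text: JSON.stringify(result) }] };
}
```

In a test, `StubClient` would be replaced by a vi.fn()-based stub, so the handler can be exercised without any network access.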
Adding a new env var
- Add a Zod schema entry to CONFIG_SCHEMA in src/config.ts (with a default if optional).
- Add a commented-out placeholder to .env.example.
- Add tests in tests/config.test.ts covering the validation paths.
Publishing
The package is npm publish-ready out of the box (the bin entry, files whitelist of ["dist"], engines, main, types, and license are all wired up). A pre-publish checklist:
- Bump version in package.json.
- npm run typecheck: clean.
- npm test: 100% green; coverage ≥ 80% on src/.
- npm run build: dist/index.js has the shebang and is executable.
- npm pack: inspect the tarball. It should include only dist/, package.json, and README.md (npm also adds the LICENSE file automatically).
- npm publish --dry-run: confirm the publish plan.
- npm login (one-time).
- npm publish: tag with latest for production releases.
The pre-publish step in CI should also run npm install in a clean checkout, followed by npm test, to catch any drift between the test environment and the publish artifact.
License
MIT — Copyright (c) 2026 minimax-llm-mcp contributors.
See LICENSE for the full text.