Thinking Agent MCP
Enables thinking models to extend their reasoning by outsourcing parts of the chain of thought to a non-thinking model via the chat_agent tool, with configurable parameters.
Extend your thinking model's chain of thought via MCP tools. This Model Context Protocol server exposes the chat_agent, create_branch, and get_branch_details tools, which let thinking models offload subtasks to non-thinking models and build tree-structured, multi-perspective analyses.
Features
- Chain of Thought Extension: Thinking models can delegate reasoning subtasks to non-thinking models via chat_agent, extending effective reasoning depth beyond single-model token limits
- Tree-Structured Thinking: create_branch enables recursive, multi-perspective exploration with four branch types (drill down / verify / explore / stash)
- Full Traceability: get_branch_details retrieves the complete raw reasoning process of any created branch
- Context Isolation: Tools are stateless and self-contained; all context must be packed into input_text. No conversation history dependency
- Parameter Control: Fine-grained control over tool model output via temperature, top_p, seed, stop, and max_tokens
- Dual API Support: Works with both DeepSeek official API (recommended) and SiliconFlow API
Table of Contents
- Quick Start
- Configuration
- Tools
- Error Handling
- MCP Client Setup
- Testing
- Project Structure
- Development
- License
Quick Start
# Clone and install
git clone https://github.com/ScarletLilith/DeepSeekV4Flash_Thinking_TreeMCP.git meditatorMCP
cd meditatorMCP
npm install
# Configure API (see Configuration section below)
# edit test/config.json or set environment variables
# Start the server
npm run build
npm start
# Or run development mode
npm run dev
Configuration
Configuration is loaded with the following priority: Environment variables > test/config.json
Option 1: DeepSeek Official API (Recommended)
export DEEPSEEK_API_KEY=sk-your-key
export DEEPSEEK_BASE_URL=https://api.deepseek.com
export DEEPSEEK_MODEL=deepseek-v4-pro
Note: DeepSeek's thinking mode uses thinking: {type: "enabled"} (not enable_thinking: true).
Option 2: SiliconFlow API (Fallback)
export SILICONFLOW_API_KEY=sk-your-key
export SILICONFLOW_BASE_URL=https://api.siliconflow.cn/v1
export SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V4-Flash
Config File
Create test/config.json (gitignored automatically):
{
"baseUrl": "https://api.deepseek.com",
"model": "deepseek-v4-pro",
"apiKey": "sk-xxx"
}
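The precedence described in this section (environment variables over test/config.json) can be sketched as follows. This is a minimal illustration, not the server's actual loader: the function name resolveConfig and the per-field merge are assumptions, and only the DEEPSEEK_* variables are shown.

```typescript
// Hypothetical sketch: field names mirror test/config.json,
// but resolveConfig itself is an assumed helper, not the project's code.
interface ApiConfig {
  baseUrl: string;
  model: string;
  apiKey: string;
}

type Env = Record<string, string | undefined>;

// Environment variables win; the config file fills any gaps (assumed per field).
function resolveConfig(env: Env, file: Partial<ApiConfig>): ApiConfig {
  const baseUrl = env["DEEPSEEK_BASE_URL"] ?? file.baseUrl;
  const model = env["DEEPSEEK_MODEL"] ?? file.model;
  const apiKey = env["DEEPSEEK_API_KEY"] ?? file.apiKey;
  if (!baseUrl || !model || !apiKey) {
    throw new Error("Missing API key, base URL, or model configuration");
  }
  return { baseUrl, model, apiKey };
}
```

With this sketch, setting only DEEPSEEK_MODEL in the environment overrides the model from the file while the base URL and key still come from test/config.json.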
Tools
chat_agent
Calls a non-thinking model to execute an independent subtask, extending the thinking model's chain of thought.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| input_text | string | required | Complete, self-contained task description with all context |
| system_prompt | string | optional | System prompt for role/behavior constraints |
| temperature | number | 0.7 | Sampling temperature (0.0 to 2.0). Low = precise, high = creative |
| top_p | number | 0.9 | Nucleus sampling threshold (0.0 to 1.0) |
| max_tokens | number | 4096 | Maximum output tokens (enforced server-side via API) |
| stop | string[] | [] | Stop sequences; empty array = natural completion |
| seed | number | optional | Random seed for reproducible output (with low temperature) |
Parameter Strategies
Verification: temperature=0.1, top_p=0.1, max_tokens=2048, seed=42
Exploration: temperature=1.2, top_p=0.95, max_tokens=4096
Balanced: temperature=0.5, top_p=0.8, max_tokens=4096
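As a rough illustration of how these strategies translate into a call, the sketch below builds the JSON-RPC 2.0 tools/call request that an MCP client sends for chat_agent. The preset values come from the list above; the names PRESETS and buildChatAgentCall are hypothetical and not part of this project.

```typescript
// Illustrative names only; the preset values are the ones listed above.
type Strategy = "verification" | "exploration" | "balanced";

interface SamplingParams {
  temperature: number;
  top_p: number;
  max_tokens: number;
  seed?: number;
}

const PRESETS: Record<Strategy, SamplingParams> = {
  verification: { temperature: 0.1, top_p: 0.1, max_tokens: 2048, seed: 42 },
  exploration: { temperature: 1.2, top_p: 0.95, max_tokens: 4096 },
  balanced: { temperature: 0.5, top_p: 0.8, max_tokens: 4096 },
};

// Builds the MCP "tools/call" request for chat_agent with a chosen preset.
function buildChatAgentCall(id: number, inputText: string, strategy: Strategy) {
  if (inputText.trim().length === 0) {
    throw new Error("input_text must not be empty");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: {
      name: "chat_agent",
      arguments: { input_text: inputText, ...PRESETS[strategy] },
    },
  };
}
```

Because the tool keeps no conversation history, inputText has to carry the whole task, including any background the non-thinking model needs.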
create_branch
Creates a thinking branch node with recursive nesting support for deep multi-perspective analysis.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| session_id | string | required | Session ID, consistent within a single reasoning session |
| input_text | string | required | Self-contained subtask description (at least 30 characters) |
| call_type | string | drill_down | Branch type: drill_down / verify / explore / stash |
| parent_node_id | string | trunk | Parent node ID for tree nesting |
Four Branch Types:
| Type | Temperature | Purpose |
|---|---|---|
| drill_down | 0.2 | Deep-dive into a subproblem with focused precision |
| verify | 0.0 | Verify a conclusion or hypothesis with maximal determinism |
| explore | 1.0 | Divergent thinking from different angles with high creativity |
| stash | 0.6 | Temporarily record intermediate thoughts for later reference |
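The defaults, the 30-character minimum, and the temperature per branch type can be pictured with a small sketch. The values are taken from the two tables above; planBranch and BRANCH_TEMPERATURE are hypothetical names, and the real gatekeeper and strategy engine may apply further checks (top_p mapping, quota) that are not documented here.

```typescript
type CallType = "drill_down" | "verify" | "explore" | "stash";

// Temperatures from the branch-type table above.
const BRANCH_TEMPERATURE: Record<CallType, number> = {
  drill_down: 0.2,
  verify: 0.0,
  explore: 1.0,
  stash: 0.6,
};

interface BranchRequest {
  session_id: string;
  input_text: string;
  call_type?: CallType;
  parent_node_id?: string;
}

// Applies the documented defaults and the 30-character minimum (assumed helper).
function planBranch(req: BranchRequest) {
  if (req.input_text.length < 30) {
    throw new Error("input_text must be at least 30 characters");
  }
  const callType = req.call_type ?? "drill_down";
  return {
    session_id: req.session_id,
    input_text: req.input_text,
    call_type: callType,
    parent_node_id: req.parent_node_id ?? "trunk",
    temperature: BRANCH_TEMPERATURE[callType],
  };
}
```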
Response
{
  "status": "success",
  "node_id": "n_a1b2c3d4",
  "conclusion": "The extracted conclusion text...",
  "confidence": 0.85,
  "remaining_quota": 12,
  "suggestions": [
    "Divergent exploration complete; consider using drill_down to go deeper on promising directions",
    "12 more branches can be created; consider continuing to explore from multiple angles"
  ]
}
get_branch_details
Retrieves the complete raw reasoning process of a previously created branch node.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
session_id |
string |
required | Session ID |
node_id |
string |
required | Branch node ID returned by create_branch |
Response
{
"status": "success",
"node_id": "n_a1b2c3d4",
"raw_process": "The complete raw reasoning output from the model..."
}
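To make the nesting concrete, here is a minimal in-memory sketch of how branch nodes might hang together through parent_node_id, with one tree per session. The SessionTree class is purely illustrative and assumes nothing about how nodeStore.ts is actually implemented.

```typescript
interface BranchNode {
  node_id: string;
  parent_node_id: string; // "trunk" for top-level branches
  raw_process: string;
}

// Hypothetical per-session store; not the project's nodeStore.ts.
class SessionTree {
  private nodes = new Map<string, BranchNode>();

  add(node: BranchNode): void {
    if (this.nodes.has(node.node_id)) {
      throw new Error(`Duplicate node: ${node.node_id}`);
    }
    if (node.parent_node_id !== "trunk" && !this.nodes.has(node.parent_node_id)) {
      throw new Error(`Unknown parent node: ${node.parent_node_id}`);
    }
    this.nodes.set(node.node_id, node);
  }

  // Looks up a stored node, similar in spirit to get_branch_details.
  details(nodeId: string): BranchNode | undefined {
    return this.nodes.get(nodeId);
  }

  // Depth 1 means a direct child of the trunk; unknown IDs give 0.
  depth(nodeId: string): number {
    let depth = 0;
    let current = this.nodes.get(nodeId);
    while (current) {
      depth += 1;
      current =
        current.parent_node_id === "trunk"
          ? undefined
          : this.nodes.get(current.parent_node_id);
    }
    return depth;
  }
}
```

Passing the node_id returned by one create_branch call as the parent_node_id of the next is what produces deeper levels of the tree.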
Error Handling
Tools return structured errors with type and action fields for the thinking model to make informed decisions:
{
"success": false,
"type": "api",
"action": "report",
"error": "API authentication failed (401)",
"status_code": 401
}
| Error Type | action | Trigger |
|---|---|---|
| network | retry | DNS resolution failure, connection refused |
| api | backoff | 429 rate limited |
| api | report | 401 authentication failure |
| api | retry | 5xx server errors |
| validation | fix_input | Empty input_text |
| config | report | Missing API Key / Model configuration |
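A caller can branch on the action field rather than parsing the error message. The sketch below shows one way to do that. The ToolError shape follows the JSON example above, while nextStep and its wording are assumptions for illustration.

```typescript
type ErrorAction = "retry" | "backoff" | "report" | "fix_input";

interface ToolError {
  success: false;
  type: "network" | "api" | "validation" | "config";
  action: ErrorAction;
  error: string;
  status_code?: number;
}

// Maps the action field to a next step for the calling model (hypothetical helper).
function nextStep(err: ToolError): string {
  switch (err.action) {
    case "retry":
      return "call the tool again with the same arguments";
    case "backoff":
      return "wait before calling again; the API is rate limiting";
    case "fix_input":
      return "rewrite input_text, then call again";
    case "report":
      return `stop and surface the error: ${err.error}`;
  }
}
```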
The server-side retry mechanism uses exponential backoff with jitter (1s → 3s → 7s, max 3 retries) for 429 and 5xx errors. Network errors (ENOTFOUND, ECONNREFUSED, ECONNRESET) are not automatically retried.
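The retry schedule can be approximated as below. The base delays of 1s, 3s, and 7s match (2^(n+1) - 1) seconds for attempt n, but that formula, the additive jitter, and the 250 ms jitter cap are assumptions made for this sketch; the server's real implementation may differ.

```typescript
// Base delays 1s, 3s, 7s; the closed form is an assumption that fits those values.
function baseDelayMs(attempt: number): number {
  return (2 ** (attempt + 1) - 1) * 1000;
}

// Only 429 and 5xx responses are retried.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

interface ApiResult<T> {
  status: number;
  body: T;
}

// Retries up to maxRetries times. A thrown network error is not caught here,
// so it propagates to the caller without a retry.
async function withRetry<T>(
  call: () => Promise<ApiResult<T>>,
  maxRetries = 3,
  maxJitterMs = 250, // assumed jitter cap
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<ApiResult<T>> {
  for (let attempt = 0; ; attempt++) {
    const result = await call();
    if (!isRetryable(result.status) || attempt >= maxRetries) {
      return result;
    }
    await sleep(baseDelayMs(attempt) + Math.random() * maxJitterMs);
  }
}
```

With three retries, a request that keeps returning 429 is attempted four times in total before the last response is handed back.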
MCP Client Setup
Claude Desktop
{
"mcpServers": {
"thinking-agent": {
"command": "node",
"args": ["path/to/meditatorMCP/dist/index.js"],
"env": {
"DEEPSEEK_API_KEY": "sk-your-key",
"DEEPSEEK_BASE_URL": "https://api.deepseek.com",
"DEEPSEEK_MODEL": "deepseek-v4-pro"
}
}
}
}
Any MCP-compatible Client
Configure stdio transport to point to node dist/index.js in the project directory, with the required environment variables set.
Testing
The project includes both interactive and automated test frameworks:
# Interactive CLI (with tools mode)
npm run test:with-tool
# Interactive CLI (pure thinking, no tools)
npm run test:without-tool
# Automated comparison test (runs both scenarios + generates report)
npm run test:comparison
# Batch end-to-end tests
npm run test:batch
Test Scripts
| Script | Description |
|---|---|
| test/testFramework.ts | Interactive CLI test framework |
| test/comparisonTest.ts | Automated A/B comparison (with-tool vs without-tool) |
| test/runA.js | Scenario A: thinking model + tools (standalone, DeepSeek) |
| test/runB.js | Scenario B: pure thinking model (standalone, DeepSeek) |
| test/batchTest.ts | Batch end-to-end tests |
Scoring: Each question is evaluated against 10 objective checkpoints (50 total). Evaluation is done by human reviewers, not automated scripts.
Note: The test/config.json file contains your API key and is automatically gitignored.
Project Structure
├── src/
│   ├── index.ts              # MCP Server entry point
│   ├── chatAgentTool.ts      # Tool implementations (chat_agent, create_branch, get_branch_details)
│   ├── gatekeeper.ts         # Input validation and quota enforcement
│   ├── strategyEngine.ts     # Parameter strategy mapping (call_type → temperature/top_p)
│   ├── nodeStore.ts          # Branch node storage and conclusion extraction
│   ├── schemas.ts            # Zod validation schemas and TypeScript types
│   ├── logger.ts             # Structured logging to stderr
│   └── polyfill.ts           # Node 14 fetch polyfill
├── test/
│   ├── comparisonTest.ts     # A/B comparison test
│   ├── testFramework.ts      # Interactive CLI test framework
│   ├── batchTest.ts          # Batch testing
│   ├── runA.js               # Scenario A test (DeepSeek)
│   ├── runB.js               # Scenario B test (DeepSeek)
│   └── config.json           # API configuration (gitignored)
├── .env.example              # Environment variable template
├── blueprint.md              # Project design blueprint (Chinese)
├── package.json
├── tsconfig.json
└── README.md
Development
# Build TypeScript
npm run build
# Start production server
npm run start
# Development mode (ts-node, no build step)
npm run dev
Design Philosophy
- Self-Contained Task Descriptions: All context must be packed into input_text; tools never rely on conversation history
- Context Isolation: Each tool call is stateless and independent, preventing context explosion in the main chain
- Token Cost Optimization: Context is consumed by the cheaper non-thinking model's input tokens, not the thinking model's output tokens
- Tree-Structured Reasoning: Complex problems are decomposed into independent branches, each analyzed separately, then synthesized
Benchmark: MCP Tools Impact on Output Quality
We conducted a controlled experiment comparing 3 approaches across 5 challenging engineering problems (distributed consensus, service mesh, RTOS kernel, columnar storage engine, multi-modal AI agent framework).
Test Groups
| Group | Model | API | Tools |
|---|---|---|---|
| A | GLM-5.2 | SiliconFlow | None |
| B | DeepSeek-V4-Flash | DeepSeek Official | chat_agent + create_branch |
| C | DeepSeek-V4-Flash | DeepSeek Official | None |
Key Results
| Metric | A (GLM-5.2) | B (DS + Tools) | C (DS Pure) |
|---|---|---|---|
| Total Output | 38,888 chars | 169,484 chars | 57,565 chars |
| Total Time | 705s | 1,516s | 239s |
| Total Tokens | 28,459 | 210,482 | 29,184 |
| Total Cost | ¥0.75 | ¥0.21 | ¥0.06 |
| Avg Output/Question | 7,778 chars | 33,897 chars (4.4x) | 11,513 chars |
| Tool Calls | 0 | 30 | 0 |
| Cache Hit Rate | 0% | up to 68% | 0% |
What We Found
- With MCP tools, DeepSeek-V4-Flash produced far more detailed engineering solutions: 4.4x the output of GLM-5.2 and 2.9x the output of its own pure thinking mode, including complete Go-style Raft consensus implementations, assembly-level RTOS scheduler code, and production-ready service mesh configurations
- Tree-structured thinking (create_branch) enabled the model to explore 5-6 levels deep on complex problems, creating subtrees for architecture, implementation, testing, and verification
- Cost comparison: GLM-5.2 costs 12.5x as much as DeepSeek-V4-Flash in pure thinking mode (¥0.75 vs ¥0.06) for comparable output quality. With tools enabled, the DeepSeek-V4-Flash cost rose 3.5x to ¥0.21 due to deeper exploration (210,482 vs 29,184 tokens in the table above, about 7.2x), but remained 3.6x cheaper than GLM-5.2.
- Cache hit rates reached 68% during multi-round tool calls, substantially reducing effective input costs via DeepSeek's prefix caching
Full experiment results and data:
results/comparison/report.md
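The ratios quoted in this section can be recomputed from the Key Results table. The snippet below is only a convenience check: it hard-codes the table's totals, and the ratio helper (rounding to one decimal place) is an illustrative name.

```typescript
// Totals copied from the Key Results table.
const results = {
  A: { chars: 38888, tokens: 28459, costYuan: 0.75 }, // GLM-5.2
  B: { chars: 169484, tokens: 210482, costYuan: 0.21 }, // DeepSeek-V4-Flash + tools
  C: { chars: 57565, tokens: 29184, costYuan: 0.06 }, // DeepSeek-V4-Flash, no tools
};

// Ratio rounded to one decimal place.
const ratio = (x: number, y: number): number => Math.round((x / y) * 10) / 10;

console.log(ratio(results.B.chars, results.A.chars)); // 4.4  output, tools vs GLM-5.2
console.log(ratio(results.B.chars, results.C.chars)); // 2.9  output, tools vs pure DeepSeek
console.log(ratio(results.B.tokens, results.C.tokens)); // 7.2  tokens, tools vs pure DeepSeek
console.log(ratio(results.A.costYuan, results.C.costYuan)); // 12.5 cost, GLM-5.2 vs pure DeepSeek
console.log(ratio(results.A.costYuan, results.B.costYuan)); // 3.6  cost, GLM-5.2 vs tools
console.log(ratio(results.B.costYuan, results.C.costYuan)); // 3.5  cost, tools vs pure DeepSeek
```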
License