Router-MCP

Routes natural language queries to appropriate MCP tools and planners with high-precision semantic matching and safety guardrails. Supports multi-intent detection, task planning, and strict role-based filtering to prevent misexecution or unauthorized access.

Router-MCP High-Precision Routing MVP

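This is a minimal runnable Router-MCP routing service built around one principle: accuracy first, and refuse rather than trigger the wrong action. It is not a general-purpose chatbot but a structured routing layer that routes colloquial user input to an MCP / Skill / Tool / Planner safely and explainably.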

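Current implementation

  • agent-router-mcp: single entry point, routing orchestration, and structured decision output.
  • semantic-router: config-driven candidate recall and multi-field scoring; by default it calls the local embedding service first and automatically falls back to rule-based recall.
  • NeMo-Guardrails-style guard layer: re-checks each decision before execution, controls the quality of clarify / refuse responses, and blocks high-risk mis-execution.
  • Multi-intent splitting: a single utterance can be split into several sub-tasks, each decided separately.
  • planning intent: detects "solution design / task breakdown / step planning / workflow orchestration" requests and routes them to plan.
  • Scope control: filtering by tenant / customer / role / enabled.
  • Explainable trace: each sub-intent keeps audit information for normalization, recall, filtering, scoring, and decision.
  • local_embedding_service: a local embedding service deployed from its own directory; the main routing service calls it over HTTP.
  • API / CLI / MCP Server / OpenClaw thin wrapper / evaluation script / unit tests: ready for a local demo.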
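
The scope control item above can be illustrated with a minimal sketch. The `Capability` class, its field names, and the "empty set means unrestricted" convention are assumptions made for illustration; the real registry schema is not shown in this README.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    # Hypothetical registry entry; field names are illustrative assumptions.
    name: str
    tenants: set = field(default_factory=set)    # empty set = any tenant
    customers: set = field(default_factory=set)  # empty set = any customer
    roles: set = field(default_factory=set)      # empty set = any role
    enabled: bool = True


def in_scope(cap, tenant_id, customer_scope, role):
    """Return True only if the capability is enabled and every scope matches."""
    if not cap.enabled:
        return False
    checks = (
        (cap.tenants, tenant_id),
        (cap.customers, customer_scope),
        (cap.roles, role),
    )
    return all(not allowed or value in allowed for allowed, value in checks)


def filter_candidates(caps, tenant_id, customer_scope, role):
    """Drop out-of-scope capabilities before any scoring happens."""
    return [c for c in caps if in_scope(c, tenant_id, customer_scope, role)]
```

Filtering before scoring means an out-of-scope capability can never win on similarity alone, which matches the stated goal of never allowing privilege-escalating hits.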

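Directory structure

config/
  capabilities.yaml          # example capability registry
  router_settings.yaml       # thresholds, aliases, clarification templates, scoring config
  eval_dataset.yaml          # minimal evaluation set
src/router_mcp/
  agent_router_mcp/          # routing orchestration service
  semantic_router/           # candidate recall and semantic scoring
  guardrails/                # pre-execution guarding and clarify/refuse rules
  pipeline/                  # normalize / slot / rerank / decision
  registry/                  # registry schema, loading, and validation
  eval/                      # batch evaluation scripts
  api.py                     # FastAPI demo API
  cli.py                     # CLI demo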
tests/
  test_router_pipeline.py
  test_api.py

Quick start

python3 -m pip install -e '.[dev]'
pytest
python3 -m pip install -r local_embedding_service/requirements.txt
python3 -m local_embedding_service.app
python3 -m router_mcp.eval.run_eval
python3 -m router_mcp.eval.run_eval --dataset config/eval_dataset.yaml --dataset config/eval_dataset_extended.yaml
python3 -m router_mcp.eval.run_eval --dataset "/Users/zhuhaihua/Downloads/%E9%9D%92%E5%B2%9B%E6%B8%AF_%E9%AB%98%E8%B4%A8%E9%87%8FQuery%E6%B3%9B%E5%8C%96_v2.xlsx" --summary-only
python3 -m router_mcp.cli "帮我查下今天异常流程,再把昨天没跑完的补跑一下" --tenant qingdao_port --customer default --role supervisor --execute
python3 -m router_mcp.cli "先别执行,先规划一下这个需求怎么落地" --tenant qingdao_port --customer default --role supervisor --plan
python3 -m router_mcp.app
python3 -m router_mcp.mcp_server

Once the service is running, the following endpoints are available:

  • GET /capabilities
  • POST /capabilities/validate
  • POST /route
  • POST /route/explain
  • POST /route/plan
  • POST /route/batch

API example

POST /route

{
  "text": "帮我查下今天异常流程,再把昨天没跑完的补跑一下",
  "context": {
    "tenant_id": "qingdao_port",
    "customer_scope": "default",
    "role": "supervisor",
    "allow_execute": true
  },
  "dry_run": true
}

Key response fields:

  • overall_decision
  • decisions[].decision
  • decisions[].trace_id
  • decisions[].reason
  • decisions[].evidence
  • decisions[].goal_type
  • decisions[].risk_level
  • decisions[].selected_capability
  • decisions[].decision_reason
  • decisions[].matched_capabilities
  • decisions[].confidence_breakdown
  • decisions[].missing_slots
  • decisions[].clarify_question
  • decisions[].refuse_reason
  • decisions[].execution_target
  • decisions[].audit_trace
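
As a rough client-side sketch, the request body above can be built and posted with the standard library alone. The base URL is a placeholder assumption (this README does not state the main service's host or port), and `build_route_request` / `post_route` are hypothetical helper names, not part of the project.

```python
import json
import urllib.request

# Placeholder assumption: the main service's host and port are not documented here.
BASE_URL = "http://127.0.0.1:8000"


def build_route_request(text, tenant_id, customer_scope, role,
                        allow_execute=False, dry_run=True):
    """Build the JSON body for POST /route using the fields shown above."""
    return {
        "text": text,
        "context": {
            "tenant_id": tenant_id,
            "customer_scope": customer_scope,
            "role": role,
            "allow_execute": allow_execute,
        },
        "dry_run": dry_run,
    }


def post_route(payload, base_url=BASE_URL, timeout=10.0):
    """Send the payload to /route and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/route",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `post_route(build_route_request(...))["overall_decision"]` would read the top-level decision, and each entry of `decisions` carries the per-intent fields listed above.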

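Core decision principles

  • Execute only on a single high-confidence hit.
  • When several candidates score close together, prefer clarify.
  • When a required slot is missing, always clarify.
  • Route solution design / task breakdown / step planning / multi-stage orchestration requests to plan.
  • On no hit or low confidence, refuse.
  • Force-block high-risk execution that lacks confirmation or sufficient evidence.
  • Filter out capabilities whose customer / tenant / role does not match; privilege-escalating hits are never allowed.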
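
The principles above can be sketched as a small decision function. The thresholds, the `Candidate` class, the rule ordering, and the choice to downgrade blocked high-risk or mid-confidence hits to clarify are all assumptions for illustration; the real values and rules live in config/router_settings.yaml and the pipeline code, which this README does not reproduce.

```python
from dataclasses import dataclass, field

# Illustrative thresholds only; the project's actual values are not given here.
EXECUTE_THRESHOLD = 0.85
MIN_CONFIDENCE = 0.50
AMBIGUITY_MARGIN = 0.05


@dataclass
class Candidate:
    # Hypothetical scored candidate, assumed to be scope-filtered already.
    name: str
    score: float
    missing_slots: list = field(default_factory=list)
    high_risk: bool = False


def decide(candidates, is_planning=False, confirmed=False):
    """Map scored candidates to execute / clarify / refuse / plan."""
    if is_planning:
        return "plan"
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    if not ranked or ranked[0].score < MIN_CONFIDENCE:
        return "refuse"    # no hit or low confidence
    top = ranked[0]
    if len(ranked) > 1 and top.score - ranked[1].score < AMBIGUITY_MARGIN:
        return "clarify"   # candidates too close to call
    if top.missing_slots:
        return "clarify"   # a required slot is missing
    if top.score < EXECUTE_THRESHOLD:
        return "clarify"   # a hit, but not a high-confidence one
    if top.high_risk and not confirmed:
        return "clarify"   # block unconfirmed high-risk execution
    return "execute"
```

Every path other than the last one avoids execution, so a mistake in scoring tends to produce an extra clarify or refuse rather than a false execute.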

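Minimal evaluation set coverage

  • Clear-hit samples
  • Ambiguous-phrasing samples
  • Multi-intent samples
  • No-hit samples
  • Confusion samples among highly similar capabilities
  • Insufficient-permission samples
  • Missing-slot samples

The evaluation output includes at least: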

  • correct_decisions
  • correct_capabilities
  • top1_accuracy
  • topk_recall
  • clarify_count
  • refuse_count
  • execute_count
  • false_execute_count
  • wrong_route_rate
  • direct_execution_rate
  • clarification_precision
  • refusal_precision
  • planning_detection_precision
  • planning_detection_recall

Of these, false_execute_count is the most critical metric for the current MVP.
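
To make the execution-related metrics concrete, here is one possible way to compute them. The definitions are assumptions, not the project's documented ones: a false execute is taken to be any case where the router executed but the expected decision was not execute or the capability differed, and direct_execution_rate is taken to be the share of cases that ended in execute. `EvalRecord` is a hypothetical record type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalRecord:
    # Hypothetical per-case record; field names are illustrative.
    expected_decision: str
    actual_decision: str
    expected_capability: Optional[str] = None
    actual_capability: Optional[str] = None


def summarize(records):
    """Compute execute-related counts under the assumed definitions."""
    total = len(records)
    executes = [r for r in records if r.actual_decision == "execute"]
    false_executes = [
        r for r in executes
        if r.expected_decision != "execute"
        or r.actual_capability != r.expected_capability
    ]
    return {
        "execute_count": len(executes),
        "false_execute_count": len(false_executes),
        "direct_execution_rate": len(executes) / total if total else 0.0,
    }
```

Under these definitions, a case where the router clarifies although execute was expected lowers accuracy but does not count as a false execute.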

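The extended evaluation set is in config/eval_dataset_extended.yaml. It mainly covers high-risk scenarios such as confusion between similar capabilities, privilege-escalating hits, multi-intent inputs that mix known and unknown intents, and query/export ambiguity.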

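Latest tuning results

This round of tuning focused on four things:

  • Stronger slot extraction for colloquial Chinese: 最近三天 (last three days) / 本班 (this shift) / 上一班 (previous shift) / 按货种汇总 (summarize by cargo type) / excel / 发给值班负责人 (send to the duty supervisor) / 录到目标系统 (enter into the target system)
  • Recognition of structured step chains of the form query -> organize -> send, plus a combined rerank score for multi-step chains within the same family
  • Corrected family-level guardrails: the query/generate families can be safely released at family level, the execute family must still be re-checked against the resolved member's slots, and hard_confusion forces clarify
  • Completed plan / refuse: explicit planning intent detection, a planner capability, MCP/OpenClaw thin wrappers, and planning metrics

On the real evaluation set /Users/zhuhaihua/Downloads/%E9%9D%92%E5%B2%9B%E6%B8%AF_%E9%AB%98%E8%B4%A8%E9%87%8FQuery%E6%B3%9B%E5%8C%96_v2.xlsx, the current summary-only metrics are: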

  • decision_accuracy: 0.75
  • execute_count: 66
  • false_execute_count: 0
  • wrong_route_count: 0
  • named_target_accuracy: 0.9667
  • trace_coverage: 1.0
  • reason_coverage: 1.0

Results for key groups:

  • exact_hit: accuracy=0.6667, actual_execute=20, false_execute=0
  • multi_intent: accuracy=0.6333, actual_execute=19, false_execute=0
  • exception_focus: accuracy=0.5, actual_execute=15, false_execute=0
  • hard_confusion: accuracy=1.0, actual_execute=0, false_execute=0

Current results on the minimal YAML evaluation sets:

  • config/eval_dataset.yaml: decision_accuracy=1.0, wrong_route_rate=0.0, direct_execution_rate=0.4444, planning_detection_precision=1.0, planning_detection_recall=1.0
  • config/eval_dataset_extended.yaml: decision_accuracy=0.9, wrong_route_rate=0.0, direct_execution_rate=0.3, planning_detection_precision=1.0, planning_detection_recall=1.0

MCP / OpenClaw

  • MCP Server: src/router_mcp/mcp_server.py
  • MCP tools: route_query, clarify_query, validate_route, list_capabilities, explain_route, plan_task
  • OpenClaw thin wrapper: src/router_mcp/openclaw_skill/bridge.py
  • OpenClaw entry point: route_for_openclaw(...)

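To-do

  • Gradually replace the current rules + keywords approach with a general slot-extraction layer: constrained extraction based on capability.required_slots + slot schema + examples, rather than enumerating ever more keywords.
  • Introduce a model-based reranker, preferably a Chinese-friendly cross-encoder / reranker, for reranking members within a family, separating query / export / execute, and distinguishing hard negatives.
  • Restrict the LLM to "decomposing complex queries + schema-constrained slot extraction/explanation"; execute / clarify / reject still go through the structured decision layer, so that business actions are never executed directly by a black box.
  • Build a continuous evaluation loop that accumulates real queries, clarify follow-up information, the final correct capability, and mis-execution samples, and gradually move the rule-based MVP to embedding recall + model rerank + constrained extraction + structured decision.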

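About semantic-router / NeMo-Guardrails

To keep the repository runnable out of the box and its tests stable, this MVP uses two "thin adapter layers":

  • semantic_router.engine.LocalSemanticRouter: by default calls the standalone local embedding service for semantic recall and fuses the result with deterministic multi-field matching; when the embedding service is unavailable it automatically falls back to pure rule-based recall.
  • guardrails.policy.GuardrailPolicy: by default implements NeMo-Guardrails-style guarding with explicit rules; a real guardrails runtime can be plugged in later.

This gets the "safe, explainable, evaluable" infrastructure layer running first, so the underlying models can be replaced step by step.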
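
The fallback behaviour described above might look roughly like the following. This is not the code of LocalSemanticRouter: `rule_recall`, `recall_with_fallback`, and the dictionary shape of a capability are hypothetical, and the embedding call is injected as a function so the sketch makes no claim about the embedding service's actual HTTP interface. It also shows only the fallback, not the fusion with deterministic matching.

```python
def rule_recall(query, capabilities):
    """Deterministic recall: keep capabilities whose name or alias occurs in the query."""
    return [
        cap["name"]
        for cap in capabilities
        if any(term in query for term in [cap["name"], *cap.get("aliases", [])])
    ]


def recall_with_fallback(query, capabilities, embedding_recall):
    """Try embedding recall first; fall back to rules if the service is unreachable.

    Connection failures, timeouts, and urllib.error.URLError are all
    subclasses of OSError, so a single except clause covers them.
    """
    try:
        return embedding_recall(query, capabilities), "embedding"
    except OSError:
        return rule_recall(query, capabilities), "rules"
```

Returning the recall source alongside the candidates makes it easy to record in the audit trace which path produced a given decision.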

Local embedding service

The standalone service lives in local_embedding_service/. It listens on http://127.0.0.1:8001 by default, and the default model is BAAI/bge-small-zh-v1.5.

Start it with:

python3 -m pip install -r local_embedding_service/requirements.txt
python3 -m local_embedding_service.app

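Optional environment variables:

  • EMBEDDING_MODEL_NAME: custom local model name
  • EMBEDDING_DEVICE: force a device, for example cpu, mps, or cuda
  • EMBEDDING_HOST: listen address
  • EMBEDDING_PORT: listen port

The main service's embedding call settings are in the embedding section of config/router_settings.yaml:

  • enabled: whether embedding recall is enabled
  • service_url: address of the standalone embedding service
  • timeout_seconds: request timeout
  • similarity_weight / lexical_weight: fusion weights for the embedding score and the rule score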
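
One plausible reading of the two fusion weights is a weighted average of the embedding similarity and the rule-based (lexical) score. The default weights below and the normalization by the weight sum are assumptions; this README does not specify the actual formula or values.

```python
def fused_score(similarity, lexical, similarity_weight=0.6, lexical_weight=0.4):
    """Blend an embedding similarity and a lexical score into one value.

    Dividing by the weight sum keeps the result in the same range as the
    inputs even when the weights do not add up to 1.
    """
    total = similarity_weight + lexical_weight
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return (similarity_weight * similarity + lexical_weight * lexical) / total
```

Setting similarity_weight to 0 in this sketch reduces the score to the lexical score alone, which is consistent with the rule-only fallback when embedding recall is disabled.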

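Loading the 500 raw seed capabilities

The repository automatically reads config/qingdaogang_500cap.json and maps it onto the unified registry schema:

  • capability_name -> name
  • aliases -> aliases
  • capability_description -> description
  • generalized_user_query -> examples
  • required_slots -> required_slots
  • action_type -> normalized action_type
  • clarify_when / reject_when -> guardrail hints

As a result, the service by default loads both the hand-written high-precision sample capabilities and the 500 raw seed capabilities.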
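
The field mapping above can be sketched as a single function. The source field names come from the list; the `guardrail_hints` key, the lowercase normalization of action_type, and the handling of a single-string generalized_user_query are assumptions, since the actual loader is not shown here.

```python
def seed_to_registry(seed):
    """Map one raw seed record onto the unified registry schema (sketch)."""
    query = seed.get("generalized_user_query", [])
    # Assumption: a seed may carry one query string or a list of them.
    examples = [query] if isinstance(query, str) else list(query)
    return {
        "name": seed["capability_name"],
        "aliases": list(seed.get("aliases", [])),
        "description": seed.get("capability_description", ""),
        "examples": examples,
        "required_slots": list(seed.get("required_slots", [])),
        # Assumption: normalization here is just trimming and lowercasing.
        "action_type": str(seed.get("action_type", "")).strip().lower(),
        # Assumption: hints are grouped under a hypothetical "guardrail_hints" key.
        "guardrail_hints": {
            "clarify_when": seed.get("clarify_when", []),
            "reject_when": seed.get("reject_when", []),
        },
    }
```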

Development conventions and latest report
