nexus-agents

nexus-agents makes your AI coding tools work together intelligently. It coordinates Claude, Codex, Gemini, and OpenCode — routing each task to the best model using data-driven algorithms, validating outputs through multi-model consensus voting, and continuously improving through outcome-driven learning. Connect it to any MCP-compatible editor (Claude Code, Cursor, VS Code) and it handles the rest.

Nexus Agents

Governance substrate for your AI coding agents — adversarial review, drift-detected rules, immutable audit, closed-loop telemetry


Why Nexus Agents?

Nexus-agents is a governance layer that sits above your AI coding agents — Claude Code, Codex, Gemini, and OpenCode. The agents do the engineering; nexus-agents enforces the rules they have to follow, reviews their work adversarially before it ships, audits everything they touch, and routes the next task based on what actually worked.

What it gives you:

  • Adversarial PR review: pr_review runs 5 voter roles (architect, security, devex, catfish, scope_steward) behind a 4-point verification gate. On the v5 evaluation set (10 PRs) it caught 100% of bugs with a 50% raw false-positive rate; manual triage reclassified most of those "FPs" as legitimate findings the dataset had mislabeled. Full numbers: docs/research/pr-review-experiment-results-v5.md
  • Drift-detected charter: CLAUDE.md + governance:check + blocking CI gates fail the build when documented rules drift from registered behavior (model registry, MCP tools, expert types, skills)
  • Immutable audit trail: every tool call, every voter decision, and every routing choice flows through AuditTrail with structured logging and hash-chained append-only storage; integrity is verifiable via the verify_audit_chain MCP tool
  • Closed-loop routing: OutcomeStore feeds production telemetry back into LinUCB + TOPSIS scoring, so routing learns from what shipped versus what regressed
  • Multi-voter consensus: consensus_vote runs a default 7-role panel (architect, security, devex, ai_ml, pm, catfish, scope_steward; --quick uses 3). Six strategies: simple majority, supermajority, unanimous, higher-order (Bayesian), opinion-wise, proof-of-learning

You:               "Review this PR / orchestrate this task / vote on this proposal"
                    ↓
nexus-agents:       enforce rules → route → adversarial review → audit → learn from outcome
                    ↓
Engineering agents: Claude Code · Codex · Gemini · OpenCode
                    ↓
Code:               actual edits, tests, PRs, issues
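
To make the "hash-chained append-only storage" idea concrete, here is a minimal sketch of how such a chain can be built and checked. It is not the project's AuditTrail code: the `AuditEntry` shape, `appendEntry`, and `verifyChain` are hypothetical names, and hashing a JSON serialization with SHA-256 is an assumption made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real AuditTrail schema may differ.
interface AuditEntry {
  index: number;
  event: string;    // e.g. "tool_call", "vote", "routing"
  payload: string;
  prevHash: string; // hash of the previous entry ("" for the first entry)
  hash: string;     // hash over this entry's fields, including prevHash
}

function hashEntry(index: number, event: string, payload: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([index, event, payload, prevHash]))
    .digest("hex");
}

// Append-only: returns a new array and never rewrites earlier entries.
function appendEntry(chain: readonly AuditEntry[], event: string, payload: string): AuditEntry[] {
  const prevHash = chain[chain.length - 1]?.hash ?? "";
  const index = chain.length;
  return [...chain, { index, event, payload, prevHash, hash: hashEntry(index, event, payload, prevHash) }];
}

// Returns the index of the first entry that fails verification, or -1 if the chain is intact.
function verifyChain(chain: readonly AuditEntry[]): number {
  let prevHash = "";
  for (const [i, entry] of chain.entries()) {
    const expected = hashEntry(entry.index, entry.event, entry.payload, entry.prevHash);
    if (entry.index !== i || entry.prevHash !== prevHash || entry.hash !== expected) {
      return i;
    }
    prevHash = entry.hash;
  }
  return -1;
}

// Usage: build a two-entry chain and confirm it verifies.
let demoChain: AuditEntry[] = [];
demoChain = appendEntry(demoChain, "tool_call", "orchestrate");
demoChain = appendEntry(demoChain, "vote", "architect: approve");
console.log(verifyChain(demoChain) === -1 ? "chain intact" : "chain broken");
```

Because each hash covers the previous entry's hash, editing any earlier record changes every hash after it, which is what lets a verifier detect tampering without trusting the storage layer.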

What this is NOT:

  • Not another autonomous coding agent. OpenHands, SWE-agent, AutoGen, Devin, Factory — those are agents. Nexus-agents is the layer above them. Use whichever agents fit; we govern them
  • Not a chat framework. Nothing here orchestrates conversations. It orchestrates real CLI tool invocations with real file I/O and outcome tracking
  • Not a model API proxy. The value is the rules, the gates, the audit, and the learning. Routing is a side effect of the governance work, not the product

Where nexus-agents sits in your stack

   Human / IDE / CLI
   (Claude Code, Cursor, VS Code, terminal)
            │ MCP Protocol
            ▼
  ┌─────────────────────────────────────────────────────┐
  │  GOVERNANCE SUBSTRATE — what nexus-agents provides   │
  │                                                       │
  │   Charter (drift-checked)   Adversarial PR review    │
  │   Role registry             Multi-voter consensus    │
  │   Immutable audit trail     Closed-loop telemetry    │
  │                                                       │
  │   38 MCP tools · multi-stage CompositeRouter         │
  └────────────────────────┬────────────────────────────┘
                           │
                           ▼ delegates execution to
  ┌─────────────────────────────────────────────────────┐
  │  ENGINEERING AGENTS — what does the actual work      │
  │                                                       │
  │   Claude Code · Codex · Gemini · OpenCode            │
  └────────────────────────┬────────────────────────────┘
                           │
                           ▼ produces
                   Code, tests, PRs, issues

The governance substrate is the layer that catches the mistakes engineering agents would otherwise make — bad code shipped, rules drifting from intent, audit gaps, telemetry-free routing — and routes the next task based on what actually worked the last time.
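
As an illustration of outcome-based routing, the sketch below implements textbook TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), one of the two scoring methods named above. It is a generic version, not the project's router: the criteria, weights, and observed numbers are invented for the example, and the LinUCB bandit is not shown.

```typescript
// Hypothetical criterion descriptor: a weight plus a direction.
interface Criterion {
  name: string;
  weight: number;   // relative importance; assumed to sum to 1 across criteria
  benefit: boolean; // true = higher is better, false = lower is better
}

// Returns one closeness score in [0, 1] per row of `matrix` (higher is better).
function topsisScores(matrix: number[][], criteria: Criterion[]): number[] {
  // 1. Vector-normalize each column, then apply the criterion weight.
  const norms = criteria.map((_, j) =>
    Math.sqrt(matrix.reduce((sum, row) => sum + row[j] * row[j], 0)),
  );
  const weighted = matrix.map((row) =>
    row.map((x, j) => (norms[j] === 0 ? 0 : (x / norms[j]) * criteria[j].weight)),
  );

  // 2. Find the ideal and anti-ideal value for every criterion.
  const column = (j: number): number[] => weighted.map((row) => row[j]);
  const best = criteria.map((c, j) => (c.benefit ? Math.max(...column(j)) : Math.min(...column(j))));
  const worst = criteria.map((c, j) => (c.benefit ? Math.min(...column(j)) : Math.max(...column(j))));

  // 3. Score each alternative by its relative distance from the anti-ideal.
  const dist = (row: number[], ref: number[]): number =>
    Math.sqrt(row.reduce((sum, x, j) => sum + (x - ref[j]) ** 2, 0));
  return weighted.map((row) => {
    const dBest = dist(row, best);
    const dWorst = dist(row, worst);
    return dBest + dWorst === 0 ? 0 : dWorst / (dBest + dWorst);
  });
}

// Usage with made-up telemetry (not measurements from the project).
const criteria: Criterion[] = [
  { name: "successRate", weight: 0.6, benefit: true },
  { name: "latencyMs", weight: 0.4, benefit: false },
];
const clis = ["claude", "codex", "gemini"];
const observed = [
  [0.92, 4200],
  [0.85, 2100],
  [0.78, 1800],
];
const scores = topsisScores(observed, criteria);
clis.forEach((cli, i) => console.log(cli, scores[i].toFixed(3)));
```

In a closed loop, the rows of the matrix would be refreshed from recorded outcomes, so an adapter that starts regressing drifts toward the anti-ideal and is picked less often.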


Quick Start (2 minutes)

1. Install

npm install -g nexus-agents

Or as a Claude Code plugin (single-command install from the official marketplace):

/plugin install nexus-agents

See docs/getting-started/PLUGIN_INSTALL.md for plugin-specific setup, or llms-install.md for the short install guide an AI agent can follow.

2. Verify

nexus-agents doctor

Prints a health table: Node version, configured CLIs (claude / codex / gemini / opencode), and which API keys are present or missing. Read-only; safe to run at any time.

3. See what success looks like (60-second smoke task — no API keys needed)

nexus-agents vote --quick --proposal "Use SQLite over JSON files for the outcome store"

You should see:

Nexus Agents Consensus Vote
============================

Collecting votes from 3 agents (timeout: 60s each)...

Proposal: Use SQLite over JSON files for the outcome store

Votes

  ✓ Software Architect: APPROVE (86%)
  ✓ Security Engineer:  APPROVE (74%)
  ✓ Scope Steward:      APPROVE (91%)

Summary

  Approve:  3
  Reject:   0
  Abstain:  0
  Approval: 100.0%
  Threshold: simple_majority

Result: APPROVED

Completed in ~30s

Three voter roles deliberate via whichever local CLIs you have (Claude, Codex, Gemini) — no API keys required. Per-voter reasoning is recorded; the terminal prints the verdict. Mixed outcomes (some approve / some reject) and graceful error handling are demonstrated on the project site hero with a real 7-voter run.
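
The Summary block in that output can be reproduced with a small tally function. The sketch below is an assumption-laden illustration, not the consensus_vote implementation: `tallyVotes` is a hypothetical name, abstentions are assumed to be excluded from the approval ratio, and the thresholds (more than 50% for simple_majority, at least two thirds for supermajority) are conventional values rather than ones confirmed by the source. Only three of the six strategies are shown.

```typescript
type Decision = "approve" | "reject" | "abstain";
type Strategy = "simple_majority" | "supermajority" | "unanimous";

interface Vote {
  role: string;
  decision: Decision;
  confidence: number; // reported per voter; unused by these simple strategies
}

interface Tally {
  approve: number;
  reject: number;
  abstain: number;
  approvalPct: number;
  approved: boolean;
}

// Assumed thresholds on the approve / (approve + reject) ratio.
const THRESHOLDS: Record<Strategy, (ratio: number) => boolean> = {
  simple_majority: (ratio) => ratio > 0.5,
  supermajority: (ratio) => ratio >= 2 / 3,
  unanimous: (ratio) => ratio === 1,
};

function tallyVotes(votes: readonly Vote[], strategy: Strategy): Tally {
  const count = (d: Decision): number => votes.filter((v) => v.decision === d).length;
  const approve = count("approve");
  const reject = count("reject");
  const abstain = count("abstain");
  const cast = approve + reject; // assumption: abstentions do not count either way
  const ratio = cast === 0 ? 0 : approve / cast;
  return { approve, reject, abstain, approvalPct: ratio * 100, approved: THRESHOLDS[strategy](ratio) };
}

// Usage: the three votes from the sample output above.
const quickPanel: Vote[] = [
  { role: "Software Architect", decision: "approve", confidence: 0.86 },
  { role: "Security Engineer", decision: "approve", confidence: 0.74 },
  { role: "Scope Steward", decision: "approve", confidence: 0.91 },
];
const result = tallyVotes(quickPanel, "simple_majority");
console.log(`Approval: ${result.approvalPct.toFixed(1)}%`, result.approved ? "APPROVED" : "REJECTED");
// prints: Approval: 100.0% APPROVED
```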

4. Wire into your editor

nexus-agents setup   # Auto-configures MCP server in Claude Code, Cursor, etc.

Restart your editor. The 38 MCP tools (orchestrate, consensus_vote, research_synthesize, verify_audit_chain, …) become available to whatever agent you're already using.

What setup configures

By default, setup writes/updates up to seven things in your environment. Each can be skipped with the corresponding --skip-* flag if you don't want it.

| Configured | Where written | Opt-out flag |
| --- | --- | --- |
| MCP server registration (Claude) | ~/.claude/mcp.json / Claude Desktop config | --skip-mcp |
| Project rules | .cursor/rules/ and/or .claude/rules/ | --skip-rules |
| Session hooks | ~/.claude/hooks/ (session-start / pre-tool / etc.) | --skip-hooks |
| OpenCode MCP config | ~/.config/opencode/opencode.json | --skip-opencode |
| Gemini MCP config | ~/.gemini/mcp.json | --skip-gemini |
| Codex MCP config | ~/.codex/config.toml | --skip-codex |
| Project config file | ./nexus-agents.yaml | --skip-config |

Run with --interactive (the default) for a per-step confirm flow, or --no-interactive to accept all defaults.

5. Standalone usage (no editor required)

export ANTHROPIC_API_KEY=your-key
nexus-agents orchestrate "Explain the architecture of this codebase"

Security: In default MCP mode, the server communicates only via stdio with the parent process (no network exposure). The REST API (opt-in) auto-generates an API key on first start. For network-exposed deployments, set NEXUS_AUTH_ENABLED=true. See SECURITY.md.


Capabilities

| Category | Details |
| --- | --- |
| Adversarial PR Review | pr_review MCP tool: 5 voter roles (architect, security, devex, catfish, scope_steward) with 4-point gate. v5 evaluation (n=10): 100% bug-catch, 50% raw FP rate; manual triage reclassified most FPs as legitimate findings (see docs/research/pr-review-experiment-results-v5.md) |
| Consensus Voting | 6 strategies: simple_majority, supermajority, unanimous, higher_order (Bayesian correlation-aware), opinion_wise, proof_of_learning |
| Drift-Detected Charter | CLAUDE.md + inject-governance.ts check enforces single-source registries (model registry, MCP tools, expert types). Blocking CI gate fails build on drift |
| Audit Trail | Structured logging for every tool call, voter decision, and routing choice. Hash-chained immutable storage; integrity verifiable via verify_audit_chain MCP tool (#2281, #2289) |
| Closed-Loop Telemetry | OutcomeStore records production results; LinUCB bandit + TOPSIS scoring + adaptive routing bonuses adjust based on what actually worked. No other framework closes this loop |
| Security Pipeline | Sandboxing (Docker/policy), trust-tiered input handling, SARIF parsing, red-team patterns, ClawGuard access policies (audit/enforce) |
| Multi-Expert Orchestration | 11 built-in expert types coordinated by Orchestrator. Roles bind prompt + tools + memory |
| Development Pipeline | Research → Plan → Vote → Decompose → Implement → QA → Security. Three modes: autonomous, harness (caller implements), dry-run |
| Memory & Learning | 5 user-facing backends (session, belief, agentic, adaptive, typed). Cross-session persistence feeds routing decisions |
| Research System | 9 discovery sources (arXiv, GitHub, Semantic Scholar, etc.). Auto-catalog, quality scoring, synthesis into topic clusters |
| Graph Workflows | DAG-based workflow execution with checkpoint/resume, state reduction, and event hooks |
| 38 MCP Tools | Agent management, workflow execution, research, memory, codebase intelligence, repo analysis, consensus, operations |
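
The Graph Workflows row describes DAG execution with checkpoint/resume. A minimal sketch of the underlying idea follows: order nodes topologically (Kahn's algorithm) and skip any node already recorded as completed. The `Dag` type, `executionOrder`, and the node names are hypothetical, loosely modeled on the pipeline stages listed in the table; the real workflow engine may be structured quite differently.

```typescript
// Hypothetical representation: each node maps to the nodes it depends on.
type Dag = Record<string, string[]>;

// Returns a valid execution order for all nodes not yet in `completed`.
// Passing a checkpoint's completed set resumes the workflow where it stopped.
function executionOrder(dag: Dag, completed: ReadonlySet<string> = new Set<string>()): string[] {
  const pending = Object.keys(dag).filter((node) => !completed.has(node));

  // Count each pending node's unfinished (unique) dependencies.
  const remaining = new Map<string, number>();
  for (const node of pending) {
    const unfinished = new Set(dag[node].filter((dep) => !completed.has(dep)));
    remaining.set(node, unfinished.size);
  }

  const ready = pending.filter((node) => remaining.get(node) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    // Release every pending node that was waiting on this one.
    for (const other of pending) {
      if (dag[other].includes(node)) {
        const left = (remaining.get(other) ?? 0) - 1;
        remaining.set(other, left);
        if (left === 0) ready.push(other);
      }
    }
  }

  if (order.length !== pending.length) {
    throw new Error("cycle detected or dependency missing from the graph");
  }
  return order;
}

// Usage: QA and security both depend on implement, so they fan out at the end.
const pipeline: Dag = {
  research: [],
  plan: ["research"],
  vote: ["plan"],
  implement: ["vote"],
  qa: ["implement"],
  security: ["implement"],
};
console.log(executionOrder(pipeline).join(" -> "));
// Resume from a checkpoint taken after the plan step:
console.log(executionOrder(pipeline, new Set(["research", "plan"])).join(" -> "));
```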

Available Experts

| Expert | Specialization |
| --- | --- |
| Code | Implementation, debugging, optimization |
| Architecture | System design, patterns, scalability |
| Security | Vulnerability analysis, secure coding |
| Testing | Test strategies, coverage, test generation |
| QA | Acceptance criteria, regression checks |
| Documentation | Technical writing, API docs |
| DevOps | CI/CD, deployment, infrastructure |
| Research | Literature review, state-of-the-art analysis |
| PM | Product management, requirements, priorities |
| UX | User experience, usability, accessibility |
| Infrastructure | Server management, bare metal, networking |
| Data Viz | Charts, dashboards, visual data presentation |

Supported CLIs & Providers

Nexus-agents routes tasks through 5 CLI adapters, each connecting to major AI providers:

| CLI | Provider | Best For |
| --- | --- | --- |
| claude | Anthropic (Claude) | Complex reasoning, analysis |
| gemini | Google (Gemini) | Long context, multimodal |
| codex | OpenAI (Codex CLI) | Code generation, reasoning |
| codex-mcp | OpenAI (Codex MCP) | MCP-native Codex integration |
| opencode | Custom OpenAI-compat | Custom endpoints, local models |

CLI Commands

nexus-agents                    # Start MCP server (default)
nexus-agents doctor             # Check installation health
nexus-agents setup              # Configure MCP integration for your editors and CLIs
nexus-agents orchestrate "..."  # Run task with experts
nexus-agents vote --proposal "..."  # Multi-agent consensus voting
nexus-agents review <pr-url>    # Review a GitHub PR
nexus-agents expert list        # List available experts
nexus-agents workflow list      # List workflow templates
nexus-agents config init        # Generate config file
nexus-agents init --portable    # Create workspace-local .nexus-agents/ for sandboxes
nexus-agents init --portable --mcp-config  # Also emit .mcp.json wiring Claude Code to it
nexus-agents init --portable --install --mcp-config  # …and install the binary into the workspace
nexus-agents fitness-audit      # Run fitness score audit
nexus-agents research query     # Query research registry
nexus-agents --help             # Full command list

See docs/ENTRYPOINTS.md for the complete CLI reference (28+ commands).


MCP Tools

When running as an MCP server, the following tools are available:

<!-- Auto-generated below — do not hand-edit between markers. Run pnpm governance:inject to update from packages/nexus-agents/src/mcp/tools/index.ts. See #2269. -->

<!-- GOVERNANCE:README_TOOLS:START -->

| Tool | Description |
| --- | --- |
| orchestrate | Task orchestration with Orchestrator coordination |
| create_expert | Create a specialized expert agent |
| execute_expert | Run a task through a previously-created expert (by expertId) |
| run_workflow | Run a linear workflow template (use run_graph_workflow for DAGs) |
| delegate_to_model | Pick the best-fit existing model for a task (no registry change) |
| list_experts | Inventory of expert ROLES for create_expert |
| list_workflows | Inventory of multi-step TEMPLATES for run_workflow |
| consensus_vote | Multi-model consensus voting on proposals |
| research_query | Query research registry (status, overlap, stats, search) |
| research_add | Add an arXiv PAPER to the registry (for non-paper sources use research_add_source) |
| research_add_source | Add a NON-PAPER source (repo/tool/blog); for arXiv papers use research_add |
| research_discover | Discover papers/repos from external sources |
| research_analyze | Analyze registry for gaps, trends, coverage |
| research_catalog_review | Review auto-cataloged research references |
| research_synthesize | Synthesize registry into topic clusters with themes |
| survey_oss_landscape | Transient OSS project search (license, stars, last-commit) via GitHub |
| vendor_publishing_audit | Look up a vendor's signing infrastructure (GPG keys, URL patterns, signature shape) |
| compare_data_feeds | Diff two YAML/JSON feeds: coverage + per-field axes |
| memory_query | Query across all memory backends |
| memory_stats | Memory system statistics dashboard |
| memory_write | Write to typed memory backends |
| weather_report | Multi-CLI performance weather report |
| issue_triage | Triage GitHub issues with trust classification |
| run_graph_workflow | Run a DAG workflow with checkpoint + rollback (linear → run_workflow) |
| execute_spec | Execute AI software factory spec pipeline |
| registry_import | Draft YAML for a NEW model entry (for picking existing models use delegate_to_model) |
| query_trace | Query execution traces for observability |
| query_task_state | Query the structured task-state log for a task ID |
| verify_audit_chain | Verify hash chain of a FileAuditStorage audit log directory |
| repo_analyze | Analyze GitHub repository structure |
| repo_security_plan | Generate security scanning pipeline for a repo |
| extract_symbols | Tree-sitter AST symbols from a SINGLE file (functions/classes/types) |
| search_codebase | Cross-file ripgrep search for patterns or text (not an AST parser) |
| run_dev_pipeline | Full dev pipeline: research, plan, vote, implement, QA |
| run_pipeline | Execute a pipeline plugin by name with typed input |
| pr_review | Multi-voter PR review with verification gate (experimental) |
| supply_chain_tradeoff_panel | Per-axis tradeoff vote for build-vs-buy / supply-chain decisions |
| improvement_review | Threshold-gated observability loop: surfaces routing/tech-debt/bug/security signals from outcome+fitness data; files candidate issues |

<!-- GOVERNANCE:README_TOOLS:END -->
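
The markers around this table are what make drift detection possible: a check can compare the tool names documented between them against the names registered in code. The sketch below shows one way such a check could work. It is an assumption, not the project's governance script; `documentedTools` and `detectDrift` are hypothetical names, and the row-parsing regex assumes each row starts with a lowercase snake_case tool name.

```typescript
// Marker strings are assembled at runtime so this example does not itself
// contain a second copy of the literal README markers.
const marker = (edge: "START" | "END"): string => `<!-- GOVERNANCE:README_TOOLS:${edge} -->`;

// Extracts tool names from the rows found between the two markers.
function documentedTools(readme: string): string[] {
  const startTag = marker("START");
  const start = readme.indexOf(startTag);
  const end = readme.indexOf(marker("END"));
  if (start === -1 || end === -1 || end < start) {
    throw new Error("governance markers not found");
  }
  const names: string[] = [];
  for (const line of readme.slice(start + startTag.length, end).split("\n")) {
    // Header and separator rows do not begin with a lowercase identifier, so they are skipped.
    const match = /^\|?\s*([a-z][a-z0-9_]*)/.exec(line.trim());
    if (match) names.push(match[1]);
  }
  return names;
}

interface DriftReport {
  undocumented: string[]; // registered in code but missing from the README
  stale: string[];        // listed in the README but no longer registered
}

function detectDrift(registered: readonly string[], documented: readonly string[]): DriftReport {
  const reg = new Set(registered);
  const doc = new Set(documented);
  return {
    undocumented: registered.filter((name) => !doc.has(name)),
    stale: documented.filter((name) => !reg.has(name)),
  };
}

// Usage: a CI gate would fail the build when either list is non-empty.
const sampleReadme = [
  marker("START"),
  "| Tool | Description |",
  "| --- | --- |",
  "| orchestrate | Task orchestration |",
  "| pr_review | Multi-voter PR review |",
  marker("END"),
].join("\n");
const report = detectDrift(["orchestrate", "pr_review", "memory_write"], documentedTools(sampleReadme));
console.log(report);
```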


Configuration

Environment Variables:

| Variable | Description |
| --- | --- |
| ANTHROPIC_API_KEY | Claude API key |
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_AI_API_KEY | Gemini API key |
| NEXUS_LOG_LEVEL | Log level (debug/info/warn/error) |
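
If you want to check these variables from your own scripts, a small helper along the following lines can report which provider keys are set and whether the log level is one of the four listed values. This is an illustrative sketch only; `summarizeEnv` is a hypothetical name and not part of nexus-agents, which already reports key status through `nexus-agents doctor`.

```typescript
type Env = Record<string, string | undefined>;

const PROVIDER_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GOOGLE_AI_API_KEY"] as const;
const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface EnvSummary {
  present: string[];
  missing: string[];
  logLevel: LogLevel | undefined; // undefined when unset or not a recognized level
}

function summarizeEnv(env: Env): EnvSummary {
  // Treat empty or whitespace-only values as missing.
  const isSet = (key: string): boolean => (env[key] ?? "").trim() !== "";
  const raw = env["NEXUS_LOG_LEVEL"];
  return {
    present: PROVIDER_KEYS.filter((key) => isSet(key)),
    missing: PROVIDER_KEYS.filter((key) => !isSet(key)),
    logLevel: LOG_LEVELS.find((level) => level === raw),
  };
}

// Usage with a literal object; in a Node.js script you would pass process.env instead.
const summary = summarizeEnv({ ANTHROPIC_API_KEY: "placeholder-key", NEXUS_LOG_LEVEL: "debug" });
console.log(summary);
```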

Generate config file:

nexus-agents config init   # Creates nexus-agents.yaml

Documentation

| Topic | Link |
| --- | --- |
| Full CLI Reference | docs/ENTRYPOINTS.md |
| Architecture | docs/architecture/README.md |
| Contributing | CONTRIBUTING.md |
| Coding Standards | CODING_STANDARDS.md |
| Quick Start Guide | QUICK_START.md |

Development

git clone https://github.com/nexus-substrate/nexus-agents.git
cd nexus-agents
pnpm install
pnpm build
pnpm test

Requirements: Node.js 22.x LTS, pnpm 9.x


Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feat/amazing-feature)
  3. Commit with conventional commits (feat(scope): add feature)
  4. Open a Pull Request

See CONTRIBUTING.md for details.


License

MIT - See LICENSE


Built with Claude Code
