Spine

Exposes a governed, provenance-grounded autonomous delivery pipeline as an MCP server, so AI coding assistants such as Claude Code or Codex can start requirements-to-PR workflows with human approval gates and a full audit trail.

Governed, provenance-grounded autonomous delivery — turn requirements into reviewed, tested pull requests, with a human in control.

Naming. Spine is the product. It's distributed as the agent-orchestrator package and its command is orchestrator — those names stay in install lines and commands throughout the docs.

Spine reads a requirement (from Confluence, Notion, or a Markdown file), understands your target repo, generates code grounded in that repo's own conventions, writes and runs tests, and opens a pull request for you to review. It pauses for your approval before it starts and before anything merges. Nothing is pushed, merged, or written to your tracker unless you say so.

It's built for teams who want agents that are inspectable, reproducible, and safe to run on real code — not demos.

pip install --extra-index-url https://pypi.org/simple/ agent-orchestrator
orchestrator init && orchestrator doctor                  # scaffold .env, check readiness
orchestrator sdlc feature --source file://./spec.md --safe   # build locally — no pushes, no PRs

The published name collides with an unrelated PyPI project — see the Setup & Install guide for the exact install one-liner.


Documentation

| Guide | Read it for |
| --- | --- |
| Setup & Install | Installing the CLI, the .env, and standing up the full stack (Temporal + Postgres) for the autonomous pipeline. |
| User Guide | A step-by-step walkthrough: from your first local build to a real PR, local models, the web dashboard, and connecting tools (MCP). |
| Features & Capabilities | The capability catalog: everything Spine can do today, its status, the command/flag to use it, and a link to each deep dive. |
| Operations & Developer Guide | How to operate it: deployment modes, the full environment-variable reference, and standing up each advanced capability, including the semantic spine (ontomesh × infodrift). |
| Community brief | A one-page overview to share: what it does, lifecycle coverage, how to try it, and the feedback we're looking for. |

New here? Install → User Guide Steps 1–4. That's the whole everyday workflow in about ten minutes.


Features & capabilities

Requirements → reviewed PR. Point it at a requirements source and a code repo. It extracts a backlog of intents, writes a spec, generates the implementation and tests, gets them green, and opens a PR — with two human gates (before building, before merging). A safe mode builds entirely locally (branch + diff, no external writes) so you can inspect everything first.

Code-grounded understanding. Before generating, it builds a Product Knowledge Graph of your repo (modules, types, functions, call sites, blast radius) and grounds new code in what already exists, so output reads like your team wrote it. Works across Python, Java, and TypeScript. The orchestrator understand command writes a committed, code-true memory-bank/ that your whole team (and any AI tool) can read.

Governed autonomy. The workflow itself is a typed, validated artifact. A planner decomposes the objective, a runtime executes it, and per-edge verifiers check every step against schemas, evidence, and policy. A failed check triggers a replan, a request for human approval, or a clean stop. Every tool call, approval, and decision lands in an append-only audit log, and each run is capped by a spend budget.

Learns across runs. Cross-run semantic memory lets the agent recall conventions, pitfalls, and decisions from past runs — each memory cites the run it came from.

You can see inside it. Live OpenTelemetry tracing covers every LLM call, loop step, and tool call, joined to the audit log — so you can debug a run, not just read its result.

Use it your way. A CLI for scripting and CI, a web dashboard (delegate runs, watch them live, approve gates inline), a terminal UI, and MCP in both directions — consume external MCP tools, or expose the whole pipeline as an MCP server to Claude Code, Codex, or your IDE.

Bring your own model. Multi-provider via LiteLLM (Anthropic, OpenAI, Bedrock), or run fully offline on a local model (Ollama). Mix models per stage.

Durable. Long-running pipelines are checkpointed (Temporal + Postgres) — they survive restarts and resume across human approval pauses.
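
As an illustration of the audit log and spend budget described under "Governed autonomy", here is a minimal Python sketch. It is not Spine's implementation: the class names (RunBudget, AuditLog), the call_tool helper, the dollar amounts, and the record fields are all assumptions chosen for the example. It only shows the general pattern of refusing a charge before it crosses the cap and recording every decision in a log that exposes no update or delete operation.

```python
import json
import time
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push a run past its spend cap."""


@dataclass
class RunBudget:
    # Hypothetical per-run cap in US dollars; not a documented Spine setting.
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        # Refuse the charge before recording it, so the cap is never crossed.
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of {cost_usd:.2f} would exceed cap {self.limit_usd:.2f}"
            )
        self.spent_usd += cost_usd


@dataclass
class AuditLog:
    # Records are kept as serialized JSON strings; the class offers
    # append and read only, with no update or delete method.
    _entries: list = field(default_factory=list)

    def append(self, kind: str, detail: dict) -> None:
        record = {"ts": time.time(), "kind": kind, "detail": detail}
        self._entries.append(json.dumps(record, sort_keys=True))

    def entries(self) -> tuple:
        # Decode fresh copies so callers cannot edit the stored history.
        return tuple(json.loads(e) for e in self._entries)


def call_tool(name: str, cost_usd: float, budget: RunBudget, log: AuditLog) -> bool:
    """Log a tool call if the budget allows it; log the stop otherwise."""
    try:
        budget.charge(cost_usd)
    except BudgetExceeded as exc:
        log.append("budget_stop", {"tool": name, "reason": str(exc)})
        return False
    log.append("tool_call", {"tool": name, "cost_usd": cost_usd})
    return True


budget = RunBudget(limit_usd=1.00)
log = AuditLog()
results = [
    call_tool("run_tests", 0.60, budget, log),
    call_tool("generate_code", 0.60, budget, log),  # would exceed the cap
]
kinds = [entry["kind"] for entry in log.entries()]
```

In this sketch the second call is refused and the refusal itself is logged, so the log explains why the run stopped as well as what it did.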


How it works

  requirement (Confluence / Notion / Markdown)
        │
        ▼
   plan ──► validate ──► generate code ──► run tests ──► review ──► open PR
        │        (grounded in your repo's knowledge graph)        │
        └──────────── per-edge verifiers + audit ────────────────┘
                 human gate 1 ▲                    ▲ human gate 2
                 (before build)                    (before merge)

| Concept | What it is |
| --- | --- |
| Planner → GraphIR | Turns an objective into a typed, validated execution graph (nodes, edges, budgets, approval points). |
| Registry | Versioned agent templates + tool contracts the planner assembles from. |
| Runtime | LangGraph-based executor with Postgres checkpointing and typed state. |
| Verifier chain | Per-edge schema / confidence / evidence / policy checks that gate every handoff. |
| Approval gates | First-class nodes that pause for human review and resume on your decision. |
| Audit log | Append-only record of every tool call, approval, and policy decision. |
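
The verifier chain in the table above can be pictured with a small Python sketch. This is an assumption-laden illustration, not Spine's code: the Outcome values, the Edge type, the payload keys (artifact, evidence, confidence, touches_secrets), the 0.7 threshold, and the stage names are all hypothetical. What it shows is the general idea of running schema, confidence, evidence, and policy checks on each handoff and letting the first objection decide between a replan, a human approval, or a stop.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASS = "pass"
    REPLAN = "replan"
    NEEDS_APPROVAL = "needs_approval"
    STOP = "stop"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str


# Each verifier returns None when satisfied, or the Outcome to take instead.

def schema_check(payload: dict) -> Optional[Outcome]:
    # Hypothetical schema: every handoff carries "artifact" and "evidence".
    required = {"artifact", "evidence"}
    return None if required.issubset(payload) else Outcome.REPLAN


def confidence_check(payload: dict) -> Optional[Outcome]:
    # The 0.7 threshold is an illustrative value only.
    if payload.get("confidence", 0.0) >= 0.7:
        return None
    return Outcome.NEEDS_APPROVAL


def evidence_check(payload: dict) -> Optional[Outcome]:
    # An empty evidence list counts as missing evidence.
    return None if payload.get("evidence") else Outcome.REPLAN


def policy_check(payload: dict) -> Optional[Outcome]:
    # Illustrative policy: never hand off work that touches secrets.
    return Outcome.STOP if payload.get("touches_secrets") else None


DEFAULT_CHAIN = [schema_check, confidence_check, evidence_check, policy_check]

# Per-edge configuration: an edge may use a shorter chain than the default.
EDGE_CHAINS = {
    Edge("plan", "validate"): [schema_check, policy_check],
}


def verify_handoff(edge: Edge, payload: dict) -> Outcome:
    """Run the edge's verifiers in order; the first objection wins."""
    for verifier in EDGE_CHAINS.get(edge, DEFAULT_CHAIN):
        verdict = verifier(payload)
        if verdict is not None:
            return verdict
    return Outcome.PASS


review_to_pr = Edge("review", "open_pr")
good = {"artifact": "diff", "evidence": ["tests passed"], "confidence": 0.9}
low_confidence = dict(good, confidence=0.4)
risky = dict(good, touches_secrets=True)

outcomes = [
    verify_handoff(review_to_pr, good),
    verify_handoff(review_to_pr, low_confidence),
    verify_handoff(review_to_pr, risky),
]
```

Returning an outcome rather than raising keeps the decision in data, which a runtime could then record in the audit log before acting on it.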

FAQ

Does it merge code on its own? No. It opens a PR; a human reviews and merges. There are two approval gates — before building and before merging — and safe mode makes no external writes at all.

Where does my code/data go? To whichever LLM provider you configure — or nowhere external, if you run a local model (Ollama). Generated code stays in a local branch until you choose --live.

Do I need Docker or a database? Not for the everyday path (sdlc feature --safe builds one requirement locally). The autonomous multi-feature pipeline and the web dashboard need Temporal + Postgres; see the Setup guide.

Which languages and models? Code generation and comprehension cover Python, Java, and TypeScript. Models can come from any LiteLLM-supported provider (Anthropic, OpenAI, Bedrock) or from a local Ollama model, and you can set a different model per stage.

How is it safe to run on real repos? Write guards on generated files, allow-listed + write-gated external tools, a per-run spend budget, an append-only audit trail, and human approval before any push or merge.

CLI or web UI? Either — they drive the same engine and the same API. Use the CLI for scripting/CI, the web UI (or terminal UI) for watching runs and approving gates by hand.

Can other tools call it? Yes. It speaks MCP both ways: it can use external MCP servers, and it can run as an MCP server so Claude Code / Codex / your IDE can call the pipeline (with the same gates).
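
The safety answer in this FAQ mentions allow-listed, write-gated external tools and human approval before any push or merge. A minimal Python sketch of that kind of gate follows. It is an illustration under stated assumptions, not Spine's implementation: Tool, ToolGate, ToolDenied, the is_permitted helper, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    writes_externally: bool  # True for pushes, merges, tracker updates


class ToolDenied(PermissionError):
    """Raised when the gate refuses a tool call."""


@dataclass
class ToolGate:
    allowed: frozenset
    safe_mode: bool = True   # safe mode makes no external writes
    approved: bool = False   # set once a human approves the gate

    def check(self, tool: Tool) -> None:
        # Step 1: the tool must be on the allow-list at all.
        if tool.name not in self.allowed:
            raise ToolDenied(f"{tool.name} is not on the allow-list")
        # Step 2: external writes need live mode and a human approval.
        if tool.writes_externally:
            if self.safe_mode:
                raise ToolDenied(f"{tool.name} is blocked in safe mode")
            if not self.approved:
                raise ToolDenied(f"{tool.name} needs human approval first")


def is_permitted(gate: ToolGate, tool: Tool) -> bool:
    """Convenience wrapper that turns the gate's exception into a bool."""
    try:
        gate.check(tool)
    except ToolDenied:
        return False
    return True


read_repo = Tool("read_repo", writes_externally=False)
open_pr = Tool("open_pr", writes_externally=True)
drop_db = Tool("drop_database", writes_externally=True)
allowed = frozenset({"read_repo", "open_pr"})

safe_gate = ToolGate(allowed=allowed)
live_unapproved = ToolGate(allowed=allowed, safe_mode=False)
live_approved = ToolGate(allowed=allowed, safe_mode=False, approved=True)
```

Here read_repo passes every gate, open_pr passes only the live, approved gate, and drop_database is refused everywhere because it is not on the allow-list.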


Contributing

Issues and PRs are welcome. See CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md.

License

Apache License 2.0. See LICENSE.
