rocq-piler
MCP server that bridges LLMs with the Rocq (Coq) proof assistant via LSP, enabling interactive theorem proving with AI.
README
π€ rocq-piler

Let rocq-piler do the heavy lifting for your proofs.
Overview
This project demonstrates how a Language Server Protocol (LSP) interface to the Rocq proof assistant (formerly Coq) enables neurosymbolic programming: the seamless integration of neural networks (LLMs) with symbolic reasoning systems (proof assistants).
What is Neurosymbolic Programming?
Neurosymbolic programming combines:
- Neural/Statistical AI (LLMs): Pattern recognition, natural language understanding, heuristic search
- Symbolic AI (Proof assistants): Logical reasoning, formal verification, guaranteed correctness
This approach leverages the complementary strengths of both paradigms:
| Neural (LLM) | Symbolic (Rocq) |
|---|---|
| Intuition & pattern matching | Rigorous logical deduction |
| Generates candidate solutions | Verifies correctness |
| Probabilistic & approximate | Deterministic & exact |
| Learns from examples | Reasons from axioms |
| Fast but fallible | Slow but infallible |
How LSP Enables Neurosymbolic Rocq Programming
The Traditional Problem
Historically, interacting with proof assistants required:
- Deep knowledge of proof tactics and syntax
- Understanding of internal proof state representations
- Manual, tedious proof construction
- Limited automation capabilities
This created a barrier between AI systems (which excel at pattern matching and generation) and proof assistants (which excel at verification).
The LSP Solution
The Language Server Protocol provides a standardized, machine-friendly interface that:
- Exposes Proof State: Query current goals, hypotheses, and proof context at any position
- Enables Interactive Editing: Apply tactics and immediately observe their effects
- Supports Speculative Execution: Try tactics without committing to file edits (PΓ©tanque API)
- Provides Structured Feedback: Parse errors, type information, and verification results
This transforms the proof assistant into a verification oracle that an LLM can query interactively.
In this implementation, the bridge is also project-aware: it can detect Coq/Rocq project roots, reopen documents under the correct workspace, and keep rocq-lsp aligned with _CoqProject, _RocqProject, or dune-project layouts.
The Neurosymbolic Loop
βββββββββββββββ
β LLM β β Reads: proof goals, context, error messages
β (Neural) β β Generates: candidate tactics, proof strategies
ββββββββ¬βββββββ
β
β (proposes tactics)
ββββββββββββββββ
β LSP Server β β Bridges neural and symbolic systems
β (Interface) β β Translates between LLM and Rocq
ββββββββ¬ββββββββ
β
β (executes & verifies)
ββββββββββββββββ
β Rocq β β Verifies: type-checks, proves correctness
β (Symbolic) β β Returns: success/failure, updated goals
ββββββββββββββββ
Workflow:
- LLM reads proof goal via LSP (
coq_open_goals) - LLM generates candidate tactic(s) based on patterns learned from training
- Tactics are applied via LSP (
coq_insert_tactic) - Rocq verifies the tactic is type-correct and logically sound
- If successful: proof progresses (new subgoals or QED)
- If failed: error message guides LLM to try alternatives
- Loop continues until proof is complete
Why This Is Powerful
This architecture achieves:
Guided Generation: The LLM doesn't need to be "correct" β only creative. Rocq acts as a discriminator that filters out invalid proofs.
Incremental Verification: Each step is verified immediately, preventing cascading errors.
Explainable AI: Every LLM decision is justified by a formal proof that can be audited by humans.
Learning from Mistakes: Structured error feedback allows the LLM to refine its search strategy.
Safe Automation: The LLM can explore aggressively because Rocq guarantees soundness.
Key Capabilities Enabled by LSP
1. Goal-Directed Search
The LLM can:
- Query the current proof goal: "Prove that
forall n, n + 0 = n" - Understand the hypothesis context: "Given:
n : nat" - Generate relevant tactics: "I'll try induction on
n"
LSP Tool: coq_open_goals β returns structured goal representation
2. Speculative Execution
The LLM can:
- Try multiple tactics without modifying the file
- Quickly discard failures
- Only commit tactics that successfully reduce the goal
LSP Tools: coq_get_state_at_pos, coq_run_tactic, coq_goals_for_state (PΓ©tanque API)
3. Interactive Proof Refinement
The LLM can:
- Incrementally build proofs tactic-by-tactic
- Respond to verification failures by adjusting strategy
- Learn which patterns work in different contexts
LSP Tool: coq_insert_tactic β applies tactic and returns updated goals
4. Error-Driven Learning
The LLM can:
- Receive structured error messages from Rocq
- Understand type mismatches, unification failures, etc.
- Adjust tactics based on specific error feedback
LSP Feature: All tools return structured error information
The Hash-Driven Workflow
rocq-piler uses a content-addressed admit system: every open goal in a proof has a unique hash computed from its goal text. This turns proof construction into a clean cycle:
focus_proof β get hashes β insert_tactic(admit_hash=X) β repeat until Qed
Example: Proving (True /\ True) /\ True in 3 calls
1. focus_proof(file="Thm.v", name="deep_conj")
β 1 admit: abc123 L7: (True / True) / True
2. insert_tactic(admit_hash="abc123", tactic="split.\n- admit.\n- admit.")
β 2 admits remaining:
def456 L8: True / True (hyps: empty, goal: True / True)
ghi789 L9: True (hyps: empty, goal: True)
3. insert_tactic(admit_hash="def456", tactic="split; exact I.")
insert_tactic(admit_hash="ghi789", tactic="exact I.")
β Qed applied
Why hashes? Every admit with the same goal text shares the same hash. Close four True admits at once β across any bullet depth β in a single call. The LLM reads hashes from focus_proof, writes tactics for them, and counts down to Qed.
Example: Dependent Type Progress Theorem in 9 calls
See examples/dep_vec.v for a full dependently-typed language (Nat + Vec n) with both preservation and progress theorems proved entirely via hash-driven MCP calls:
1. focus_proof β hash for the theorem goal
2. insert_tactic(admit_hash=X, tactic="{induction + 7 stubs}") β 7 case hashes with hyps+goal inline
3-9. insert_tactic(admit_hash=<case>, tactic="...") Γ 7 β Qed
Each response shows hypotheses + goal per admit, so the LLM writes each case's tactic without extra focus_proof calls.
MCP Tools
| Tool | Description |
|---|---|
focus_proof |
One-stop shop: proof state, script, all open admits with hashes + hyps + goals |
insert_tactic |
Insert a tactic; pass admit_hash to target specific admits by hash; multi-line blocks supported |
try_step |
Speculatively execute a tactic without modifying the file |
open_goals |
Quick goal query with compact/Prev mode options |
proof_state |
Full proof context including name, statements, goal state |
undo_step |
Roll back the last N edit operations |
reset_proof |
Wipe a proof body to fresh Admitted. |
add_lemma / delete_lemma |
Insert/remove lemma stubs |
delete_step |
Remove the last tactic line from a proof |
edit_file |
Find-and-replace text edits with LSP re-sync |
check_file |
Force full document checking |
check_range |
Get diagnostics for a specific line range |
search_lemmas / inspect_term / inspect_about / locate_term / require_lib |
Knowledge & search |
snap_state / exec_tactic / state_goals |
Low-level PΓ©tanque API |
Current Server Capabilities
- Content-addressed admits: every open goal gets a hash β same goal text = same hash, close all matching admits at once across any bullet depth
- Hypotheses + goal per admit:
focus_proofandinsert_tacticresponses include full hypothesis context for each open admit - Multi-line stub insertion: open a proof, insert N bulleted admits, and get N hashes back β all in one
insert_tacticcall - Re-seal with multi-goal expansion: when a tactic produces N > 1 goals, they are optionally expanded to N individually addressable bulleted admits
- Auto-Qed: proof automatically closed when all tactic admits and focused goals are eliminated; guarded against premature firing when background goals remain
- Dynamic workspace switching: automates restarting
rocq-lspunder different project roots - Speculative execution: try tactics safely via
try_stepor PΓ©tanque state APIs before committing - Bullet-aware proof navigation:
focus_proofreports bullet stack depth, given-up counts, and sibling context - Safe undo/reset: tracks edit operations, supports multi-step undo
Benefits of This Approach
For AI/LLM Developers
- Structured Interface: No need to parse Rocq syntax or output
- Immediate Feedback: Know instantly if a tactic succeeds
- Safe Exploration: Can't generate unsound proofs
- Reduced Search Space: Type system eliminates invalid candidates
For Proof Engineers
- AI Assistance: Let LLMs handle tedious proof details
- Guaranteed Soundness: All proofs are formally verified
- Interactive Refinement: Guide AI when it gets stuck
- Explainable Results: Every proof step is auditable
For Researchers
- Benchmark Platform: Test AI proof capabilities
- Formal ML: Train models on verified code
- Safe Code Generation: Generate formally verified programs
- Hybrid Intelligence: Study human-AI-proof collaboration
Real-World Applications
1. Automated Theorem Proving
LLMs can tackle routine lemmas while experts focus on complex proofs.
2. Formally Verified Software
Generate code with machine-checked correctness proofs (e.g., crypto, compilers).
3. Mathematical Formalization
Convert natural language math into Rocq, assisted by LLMs.
4. Education
Interactive tutoring systems that teach both intuition (LLM) and rigor (Rocq).
5. AI Safety Research
Build verifiably safe AI systems with formal guarantees.
Getting Started
Prerequisites
# Install Rocq and rocq-lsp
opam install coq-lsp # rocq-lsp coming soon
# Verify installation
coq-lsp --version # or rocq-lsp --version
Installation
cd rocq-piler
npm install
npm run build
Usage with OpenCode
Add to your MCP settings:
{
"mcpServers": {
"rocq-piler": {
"command": "node",
"args": [
"/path/to/rocq-piler/dist/index.js",
"--workspace-root",
"/path/to/your/rocq/project"
]
}
}
}
The server can start with one workspace root and later switch automatically when a tool call targets a file inside a different Rocq project.
Then in OpenCode:
You: "Help me prove that addition is commutative in Rocq"
OpenCode: [Uses coq_open_goals to see the goal]
OpenCode: [Uses coq_insert_tactic to try "induction n"]
OpenCode: [Iteratively builds the proof using LSP feedback]
OpenCode: "Proof complete! Here's what I did..."
Real Proof Example: Dependent Type System
See examples/dep_vec.v for a complete, working example of a simple dependently-typed language:
Language: Nat + Vec n β base type Nat, dependent Vec n of length n
Theorems proved via MCP tools:
- β
Preservation (7 cases):
has_type t T β step t t' β has_type t' Tβ proved in 9 MCP calls - β
Progress (7 cases):
has_type t T β value t β¨ β t', step t t'β proved in 9 MCP calls
Both proofs use the same pattern: focus_proof β multi-line stub β case hashes with hyps+goal β close by hash one-by-one.
A larger PCF+References benchmark is in benchmarks/complete/pcf_ref.v:
- β 9 helper lemmas + preservation theorem (21 cases)
- β Full heap semantics with store extension, weakening, and substitution
- β
All cases closed with
Qed - β Total cost: 4 cents ($0.0351 actual, 50 API calls, DeepSeek V4)
Benchmarks
| Benchmark | Theorem | Cases | Lemmas | API Calls | Cost |
|---|---|---|---|---|---|
Dep Vec (dep_vec.v) |
Preservation + Progress | 7+7 | 0 | 18 | β |
PCF + Ref (pcf_ref.v) |
Preservation | 21 | 9 | 50 | $0.04 |
PCF + References (Type Preservation)
A PCF language extended with mutable references (allocation, dereference, assignment), heap semantics, and store typing. The preservation theorem establishes that well-typed programs remain well-typed after a step, with the store typing possibly extended.
Auxiliary lemmas proved: extends_refl, extends_trans, nth_error_extends, store_weakening, shift_typing, subst_typing, heap_ok_extends, heap_lookup_has_type, heap_ok_update
Source: benchmarks/incomplete/pcf_ref.v (challenge) β benchmarks/complete/pcf_ref.v (solution)
Additional Examples
examples/dep_vec.vβ Dependent type system (Nat + Vec n) with preservation and progress theorems, proved hash-drivenexamples/test_issues.vβ PCF + References (21-case preservation theorem)examples/example.vβ Simple examples for each MCP tool
The Future of Neurosymbolic Programming
This LSP-based architecture represents a paradigm shift:
From: "AI generates code, humans verify"
To: "AI and proof assistants collaborate in real-time"
From: "LLMs are black boxes"
To: "Every LLM decision has a formal proof"
From: "Formal verification is for experts"
To: "LLMs make formal methods accessible"
What's Already Possible
As demonstrated by the proofs in this repository:
- β
Hash-driven proof construction:
focus_proofβ get hashes βinsert_tactic(admit_hash=X)β repeat until Qed - β Multi-case induction proofs: 7 cases closed in 9 total MCP calls with full hypothesis context per case
- β
Multi-bullet depth: same hash works across
-/+/*bullet levels β close all matching admits in one call - β Dependent type progress + preservation: Nat + Vec n language fully verified
- β
Interactive refinement: speculative execution via
try_stepwith immediate feedback
What's Coming
As LLMs improve and proof assistants expose richer APIs:
- Automated formalization of mathematical papers
- AI-assisted discovery of new theorems
- Verified AI systems with formal safety guarantees
- Natural language to formal proof translation
- Large-scale formal verification of real-world software
Learn More
- Specification: See
MCP_COQ_LSP_SPEC.md - Implementation: See
docs/IMPLEMENTATION.md - Quick Start: See
docs/QUICKSTART.md - Project config detection: See
docs/PROJECT-CONFIG-DETECTION.md - Rocq Documentation: https://rocq-prover.org/
- rocq-lsp Project: https://github.com/ejgallego/coq-lsp
Contributing
This is a research prototype demonstrating neurosymbolic programming principles. Contributions welcome:
- Add support for other proof assistants (Lean, Isabelle)
- Improve tactic suggestion heuristics
- Build benchmark suites for AI proof capabilities
- Develop training datasets from verified code
License
MIT License - See LICENSE file for details
Built with: TypeScript, rocq-lsp, Model Context Protocol
Design: Content-addressed admits, hash-driven workflow, full hypothesis context per goal
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.