Vigentic MCP

A Model Context Protocol server that utilizes a headless Neovim instance as an IDE kernel for advanced code editing and navigation. It provides tools for LSP-powered diagnostics, buffer management, and structural edits using Tree-sitter.

Vigentic

Small proof-of-concept for using a headless Neovim instance as an IDE kernel behind a tool server.

What the POC proves

  • One agent can control one headless Neovim instance through a narrow tool surface.
  • Neovim owns buffers, diagnostics, LSP, and Tree-sitter.
  • A Node bridge can open a TypeScript file, observe diagnostics, edit through Neovim, and run tests.
  • The same bridge can be exposed to Codex as an MCP server.

Repository layout

agent/
  loop.ts
sample-workspace/
  src/user.ts
  src/user.test.ts
tools/
  diagnostics.ts
  editing.ts
  execution.ts
  mcp-server.ts
  navigation.ts
  nvim.ts
  reset-demo.ts
  server.ts
  workspace.ts
init.lua

Local setup

  1. Install dependencies:

    pnpm install
    
  2. Start Neovim headless:

    pnpm run start:nvim
    

    This manual helper always uses the fixed ./tmp/nvim-agent.sock path.

  3. In another shell, start either tool surface:

    JSON-line server for direct local testing:

    pnpm run dev:server
    

    MCP stdio server for Codex:

    pnpm run dev:mcp
    
  4. Optional: scaffold repo-local agent guidance for Vigentic MCP:

    pnpm run setup:agents
    

    Agents can do the same thing through the native MCP tool setup_agents.

POC demo flow

If you are teaching agents how to use this repo's MCP surface, start with docs/agent-usage.md. The rest of this section is the longer reference and demo walkthrough.

Preferred agent loop:

  1. if the repo should carry local Vigentic guidance, call setupAgents once first
  2. discover with findFiles or searchText
  3. open or read with openFile or getBuffer
  4. choose the narrowest edit primitive that fits
  5. for a simple current-buffer edit, use useLatestChangedTick:true; for an explicit path or already-open bufferId, use useLatestSessionChangedTick:true; otherwise chain expectedChangedTick from the most recent nextChangedTick
  6. prefer applyEditBatch for several dependent edits in one same-buffer chain
  7. prefer applyMultiFileBatch when several explicit targets should preflight and checkpoint together
  8. inspect repo-scoped state with workspaceStatus when the session spans multiple files
  9. persist with saveBuffer or saveAllBuffers when the batch did not checkpoint for you
  10. verify with getDiagnostics and verifyWorkspace
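
The changedTick chaining discipline in steps 5 and 6 can be sketched as a small helper. The names here (TickChain, MutationResult) are illustrative, not the server's actual types; the real tool surface returns nextChangedTick on each mutation result, which is the only fact this sketch relies on:

```typescript
// Sketch of changedTick chaining: always pass the most recent
// nextChangedTick as the next mutation's expectedChangedTick.
type MutationResult = { applied: boolean; nextChangedTick: number };

class TickChain {
  private tick: number;

  constructor(initialTick: number) {
    // Seed from the changedTick in the openFile snapshot.
    this.tick = initialTick;
  }

  // The value to pass as expectedChangedTick on the next edit.
  expected(): number {
    return this.tick;
  }

  // Record a mutation result so the chain stays current.
  record(result: MutationResult): void {
    this.tick = result.nextChangedTick;
  }
}
```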

Read-only MCP resources now sit alongside the tool surface. Prefer resources for fast discovery and reusable context, and keep mutations and command execution on tools.

Available fixed resources:

  • vigentic://workspace/status
  • vigentic://workspace/files
  • vigentic://buffers/open
  • vigentic://repo/agents
  • vigentic://workspace/diagnostics

Available resource templates:

  • vigentic://file/{path}
  • vigentic://diagnostics/{path}
  • vigentic://definition/{path}/{line}/{character}
  • vigentic://references/{path}/{line}/{character}
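
Assuming plain path interpolation into the {path} segment (the server's actual escaping rules for unusual characters are not documented here), template instances can be built like this:

```typescript
// Build vigentic:// resource URIs from the templates listed above.
// Plain interpolation is an assumption; verify behavior for paths
// containing characters that need escaping.
function fileUri(path: string): string {
  return `vigentic://file/${path}`;
}

function definitionUri(path: string, line: number, character: number): string {
  return `vigentic://definition/${path}/${line}/${character}`;
}
```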

Strict-read note:

  • resources do not open, switch, save, or edit buffers
  • vigentic://workspace/files is the preferred first-pass discovery snapshot before findFiles
  • file reads can fall back to disk when the file is not already open
  • diagnostics and navigation templates work best on already-open buffers and can return structured unavailable metadata when no live buffer/LSP context exists

Reset the sample to a failing TypeScript state:

pnpm run demo:reset

Exercise the Neovim edit loop with the direct tool bridge:

printf '%s\n' '{"id":1,"method":"findFiles","params":{"root":"sample-workspace","pattern":"**/*.ts","limit":20}}' | pnpm exec tsx tools/server.ts

findFiles now defaults to mode:"project", which hides .git internals unless you explicitly request mode:"raw".

Open the target file and wait for diagnostics:

{ printf '%s\n' '{"id":2,"method":"openFile","params":{"path":"sample-workspace/src/user.ts","reloadFromDisk":true}}'; sleep 3; printf '%s\n' '{"id":3,"method":"getDiagnostics"}'; } | pnpm exec tsx tools/server.ts

openFile, getBuffer, reloadBuffer, and discardBufferChanges return a snapshot with: path, bufferId, changedTick, lineCount, and lines. saveBuffer now returns save metadata with path, bufferId, changedTick, and lineCount, and only includes lines when responseMode is omitted or set to "full". When you pass startLine and/or endLine to getBuffer, the response keeps the total lineCount, returns only the requested lines, and adds readRange.
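
The ranged-read behavior can be modeled as a small sketch. The shape below is a hypothetical reconstruction from the description above, assuming endLine is inclusive; only the "lineCount stays total, lines shrinks, readRange is added" behavior is taken from the source:

```typescript
// Hypothetical model of a ranged getBuffer response.
type RangedRead = {
  lineCount: number; // always the buffer's total line count
  lines: string[];   // only the requested lines
  readRange: { startLine: number; endLine: number };
};

// Simulate the ranged-read rules, assuming an inclusive endLine.
function rangedRead(allLines: string[], startLine: number, endLine: number): RangedRead {
  return {
    lineCount: allLines.length,
    lines: allLines.slice(startLine, endLine + 1),
    readRange: { startLine, endLine },
  };
}
```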

Mutating edits now return one canonical apply result: applied, diff, matchedRanges, summary, nextChangedTick, and optional snapshot. For simple current-buffer edits, you can omit expectedChangedTick and pass useLatestChangedTick:true instead. For explicit path targets and already-open explicit bufferId targets, you can omit expectedChangedTick and pass useLatestSessionChangedTick:true instead. Agents should chain expectedChangedTick from the most recent mutation result whenever they are not using one of the convenience modes.

Set includeSnapshot:false when you only need diff and nextChangedTick. Set responseMode:"compact" or responseMode:"summary" on saveBuffer and saveAllBuffers when you only need save metadata and want quieter responses for large files. The additive summary object includes the tool name, target path, bufferId, preview/apply mode, matched-range count, line stats, and a short excerpt so observer-facing UIs can describe the change without expanding full JSON payloads.

useLatestChangedTick:true is a current-buffer-only convenience mode. Do not combine it with expectedChangedTick, and do not use it with explicit path or bufferId targeting. useLatestSessionChangedTick:true is the explicit-target convenience mode. Use it for explicit path targets or already-open bufferId targets, and do not combine it with expectedChangedTick or useLatestChangedTick:true.
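
The mutual-exclusion rules above can be sketched as a parameter guard. This is an illustrative client-side check, not the server's actual validation code, and the error strings are invented:

```typescript
// Sketch of the tick-mode exclusivity rules: exactly one of
// expectedChangedTick, useLatestChangedTick, useLatestSessionChangedTick,
// and useLatestChangedTick never combines with explicit targeting.
type TickParams = {
  expectedChangedTick?: number;
  useLatestChangedTick?: boolean;
  useLatestSessionChangedTick?: boolean;
  path?: string;
  bufferId?: number;
};

// Returns null when the combination is valid, else a reason string.
function validateTickParams(p: TickParams): string | null {
  const modes = [
    p.expectedChangedTick !== undefined,
    p.useLatestChangedTick === true,
    p.useLatestSessionChangedTick === true,
  ].filter(Boolean).length;
  if (modes !== 1) {
    return "pass exactly one of expectedChangedTick, useLatestChangedTick, useLatestSessionChangedTick";
  }
  if (p.useLatestChangedTick && (p.path !== undefined || p.bufferId !== undefined)) {
    return "useLatestChangedTick is current-buffer-only; use useLatestSessionChangedTick for explicit targets";
  }
  return null;
}
```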

Single-buffer happy path:

  1. openFile
  2. for a simple current-buffer edit, call the mutation with useLatestChangedTick:true
  3. saveBuffer
  4. if the edit chain becomes dependent or multi-step, switch to applyEditBatch

formatBuffer and organizeImports use the same mutation result shape, support useLatestChangedTick:true for current-buffer edits, otherwise require expectedChangedTick, and leave buffers dirty until an explicit saveBuffer or saveAllBuffers checkpoint.

Quick current-buffer convenience edit:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":4,"method":"replaceExact","params":{"oldText":"return name;","newText":"return { id: 1, name };","useLatestChangedTick":true}}
EOF

Preview a semantic edit. Replace <changedTick-from-openFile> with the snapshot value returned by openFile:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":4,"method":"replaceExact","params":{"oldText":"return name;","newText":"return { id: 1, name };","expectedChangedTick":<changedTick-from-openFile>,"previewOnly":true}}
EOF

Apply the exact replacement and save the active buffer:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":5,"method":"replaceExact","params":{"oldText":"return name;","newText":"return { id: 1, name };","expectedChangedTick":<changedTick-from-openFile>}}
{"id":6,"method":"saveBuffer","params":{}}
EOF

For observer-friendly sessions, checkpoint-save after logical edit chunks so git diff and other disk-backed views update while work progresses. Do not save after every primitive edit if the buffer is intentionally in an intermediate broken state.

For longer sessions, prefer explicit buffer targeting instead of relying on mutable editor focus. getBuffer and replaceBuffer now accept either path or bufferId and restore the previously active buffer after the targeted operation:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":7,"method":"getBuffer","params":{"path":"sample-workspace/src/user.ts"}}
{"id":8,"method":"replaceBuffer","params":{"path":"sample-workspace/src/user.ts","text":"export const value = 1;","expectedChangedTick":<changedTick-from-id-7>,"includeSnapshot":false}}
EOF

Fast same-file multi-edit loop:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":9,"method":"applyEditBatch","params":{"expectedChangedTick":<changedTick-from-openFile>,"edits":[{"kind":"replace_exact","oldText":"return name;","newText":"return { id: 1, name };"},{"kind":"replace_lines","startLine":0,"endLine":0,"lines":["export type User = {","  id: number;","  name: string;","};"]}]}}
EOF

applyEditBatch.edits must be an array of raw edit objects. Each edits[i] entry must match one of these shapes:

  • {"kind":"replace_exact","oldText":"...","newText":"..."}
  • {"kind":"replace_lines","startLine":0,"endLine":0,"lines":["..."]}
  • {"kind":"apply_hunk","hunk":"@@ -1,1 +1,1 @@\n-old\n+new"}

Do not pass strings or JSON-stringified edit snippets in edits[]. If a client-generated schema view ever renders edits as string[], treat that as stale or misrendered metadata. The actual contract is object entries, and the server will reject stringified edits with "expected object, received string". One valid payload object is:

{ expectedChangedTick: 7, edits: [{ kind: "replace_exact", oldText: "return name;", newText: "return { id: 1, name };" }, { kind: "replace_lines", startLine: 0, endLine: 0, lines: ["export const value = 1;"] }] }

Use applyEditBatch when several dependent edits belong to one same-buffer chain and you want one initial expectedChangedTick check. The server now preflights the full edit chain before the first mutation, which avoids partially applying earlier edits when a later batch item is invalid. If the edit is a simple current-buffer mutation, useLatestChangedTick:true is the lightest path. If the edits are independent or you need to inspect intermediate state, keep using single edit calls and chain each later expectedChangedTick from the latest nextChangedTick.
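
A client-side pre-check for the edits[] object contract can be sketched like this. The shape checks mirror the three documented edit kinds; the function itself is illustrative and not part of the server:

```typescript
// Sketch of the edits[] contract: entries must be objects matching
// one of the three documented kinds, never strings.
type BatchEdit =
  | { kind: "replace_exact"; oldText: string; newText: string }
  | { kind: "replace_lines"; startLine: number; endLine: number; lines: string[] }
  | { kind: "apply_hunk"; hunk: string };

function isBatchEdit(entry: unknown): entry is BatchEdit {
  // Stringified edit snippets are exactly what the server rejects.
  if (typeof entry !== "object" || entry === null) return false;
  const e = entry as Record<string, unknown>;
  switch (e.kind) {
    case "replace_exact":
      return typeof e.oldText === "string" && typeof e.newText === "string";
    case "replace_lines":
      return typeof e.startLine === "number" && typeof e.endLine === "number" && Array.isArray(e.lines);
    case "apply_hunk":
      return typeof e.hunk === "string";
    default:
      return false;
  }
}
```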

For explicit multi-file work, applyMultiFileBatch preflights every target before any mutation, applies the targets sequentially, and can checkpoint only the touched buffers in the same call:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":10,"method":"applyMultiFileBatch","params":{"mode":"apply","targets":[{"path":"sample-workspace/src/user.ts","expectedChangedTick":<changedTick-from-id-7>,"edits":[{"kind":"replace_exact","oldText":"return name;","newText":"return { id: 1, name };"}]},{"bufferId":12,"expectedChangedTick":4,"edits":[{"kind":"replace_lines","startLine":0,"endLine":0,"lines":["export const value = 1;"]}]}],"checkpoint":{"save":true,"responseMode":"compact"}}}
EOF

For empty or new repos, discovery should be resilient. findFiles now returns an empty files list when the root is valid but no files exist yet. When that happens, scaffold the repo into a live buffer session instead of falling back to a one-shot rewrite:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":11,"method":"createDirectory","params":{"path":"sample-workspace/src/generated","recursive":true}}
{"id":12,"method":"createFiles","params":{"files":[{"path":"sample-workspace/src/generated/example.ts","text":"export const value = 1;\n"},{"path":"sample-workspace/src/generated/index.ts","text":"export * from \"./example\";\n"}]}}
{"id":13,"method":"openFile","params":{"path":"sample-workspace/src/generated/example.ts","reloadFromDisk":true}}
EOF

If you are bootstrapping nested directories and files in one step, createFiles also accepts createParents:true to recursively create missing parent directories for the batch before writing.

searchText is intentionally disk-backed. Unsaved Neovim edits are not included until you save the buffer, so save before searchText when observers or downstream tools need the latest on-disk state.

When replaceExact is the wrong primitive, switch intentionally:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":14,"method":"replaceLines","params":{"startLine":0,"endLine":0,"lines":["export const value = 1;"],"expectedChangedTick":<changedTick-from-openFile>,"previewOnly":true}}
EOF

For full rewrites, replaceBuffer still exists as a fallback, supports previewOnly:true, and returns the same chainable result shape:

printf '%s\n' '{"id":15,"method":"replaceBuffer","params":{"text":"export const value = 1;","expectedChangedTick":<changedTick-from-openFile>}}' | pnpm exec tsx tools/server.ts

All mutating edit calls still enforce strict stale-tick protection. If a mutation fails because the buffer moved, the error reports the requested target, active buffer path and bufferId, expected tick, actual tick, and a compact Recovery JSON: block with the suggested next action.

You can feed that failure back into suggestEditRetry to get a concrete next-step recommendation:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":16,"method":"suggestEditRetry","params":{"tool":"replaceExact","failureMessage":"replace_exact could not find the requested snippet in sample-workspace/src/user.ts. Recovery JSON: {\"code\":\"replace_exact_snippet_not_found\"}","arguments":{"path":"sample-workspace/src/user.ts","oldText":"alpha\nbeta\n","newText":"omega\ntheta\n"},"currentBuffer":{"path":"sample-workspace/src/user.ts","changedTick":12,"lines":["alpha","beta","gamma"]}}}
EOF

When the failed exact match maps cleanly onto whole lines in the fresh buffer, suggestEditRetry will typically recommend replaceLines with previewOnly:true.

Canonical stale-tick retry:

  1. call getBuffer for the same target
  2. take the returned changedTick
  3. retry the mutation with that expectedChangedTick
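
The three retry steps above can be sketched as a control-flow helper. getBuffer and mutate here are stand-ins for real tool calls, simplified to synchronous functions; only the re-read-then-retry shape is the point:

```typescript
// Sketch of the canonical stale-tick retry: attempt once, and on failure
// re-read the target, take its changedTick, and retry with that value.
type Snapshot = { changedTick: number };
type EditResult = { ok: boolean; nextChangedTick?: number };

function retryOnStaleTick(
  getBuffer: () => Snapshot,
  mutate: (expectedChangedTick: number) => EditResult,
  firstTick: number,
): EditResult {
  const first = mutate(firstTick);
  if (first.ok) return first;
  // 1. call getBuffer for the same target
  const fresh = getBuffer();
  // 2-3. take the returned changedTick and retry the mutation with it
  return mutate(fresh.changedTick);
}
```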

Recommended loop:

  1. discover with findFiles or searchText
  2. scaffold with createDirectory + createFile(openAfterCreate=true) if discovery returns no files or createFiles + openFile when you need to bootstrap several files at once
  3. openFile
  4. capture bufferId or use explicit path for non-current work
  5. edit with replaceExact, replaceLines, applyHunk, or replaceBuffer
  6. use useLatestChangedTick:true only for simple current-buffer edits; otherwise chain expectedChangedTick from the previous edit result's nextChangedTick
  7. checkpoint-save with saveBuffer or saveAllBuffers
  8. verify with getDiagnostics and verifyWorkspace

Edit primitive choice:

  • replaceExact: one precise snippet match is stable and easy to identify
  • replaceLines: the change is line-oriented or newline-sensitive
  • applyHunk: surrounding context matters more than an exact snippet
  • replaceBuffer: a full rewrite is simpler than a narrow semantic edit
  • formatBuffer: the target file should be normalized by the configured formatter or LSP formatting
  • organizeImports: a TypeScript or JavaScript buffer needs import cleanup without a broader refactor
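
The primitive-choice guidance can be condensed into a small decision helper. The situation labels are invented for illustration; the tool names and the mapping come from the guidance above:

```typescript
// Illustrative mapping from edit situation to the narrowest primitive.
type EditSituation =
  | "stable-snippet"     // one precise snippet match is stable
  | "line-oriented"      // the change is line- or newline-sensitive
  | "context-sensitive"  // surrounding context matters more than a snippet
  | "full-rewrite";      // a rewrite is simpler than a narrow edit

function chooseEditPrimitive(situation: EditSituation): string {
  switch (situation) {
    case "stable-snippet":
      return "replaceExact";
    case "line-oriented":
      return "replaceLines";
    case "context-sensitive":
      return "applyHunk";
    case "full-rewrite":
      return "replaceBuffer";
  }
}
```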

Buffer lifecycle:

  • getBuffer reads the current buffer by default, or a targeted path / bufferId when you need parallel-safe reads.
  • getBuffer also accepts startLine / endLine for partial reads when full-buffer snapshots would be wasteful.
  • Prefer the narrowest reliable operation. When several approaches are equally safe, favor targeted reads, incremental edits, compact diagnostics, and omit snapshots unless they change the decision.
  • replaceBuffer defaults to the current buffer, but also accepts path / bufferId for explicit full-buffer rewrites.
  • saveBuffer writes the current buffer or an explicit open path / bufferId, with responseMode:"compact" and responseMode:"summary" available when the saved file contents would be noisy.
  • checkpoint-save after applyEditBatch, formatBuffer, organizeImports, after the final edit in a same-file chain, before searchText, and before verifyWorkspace when it needs saved disk state.
  • reloadBuffer reloads a clean buffer from disk and refuses dirty buffers.
  • discardBufferChanges is the explicit destructive path for unsaved edits.
  • deleteFile removes a file from disk and closes matching open buffers first; use force:true to discard unsaved buffer changes.
  • moveFile handles rename/move without dropping to the shell, and reuses an open source buffer when one already exists.
  • listOpenBuffers shows open file-backed buffers with dirty and current state, and accepts an optional root to scope the listing to one workspace.
  • saveAllBuffers saves every dirty file-backed buffer in scope and reports saved, skipped, and failed. Pass root to ignore unrelated repos in the same Neovim session.
  • saveAllBuffers also accepts responseMode:"compact" or responseMode:"summary" to omit saved file contents from the saved entries while keeping save metadata.
  • saveAllBuffers is the preferred multi-file checkpoint when observers should see git diff update across a coordinated change set.
  • applyMultiFileBatch is the preferred explicit-target transaction-like flow when several files should share one preflight and optional checkpoint.
  • Bare current-buffer operations are convenient for single-threaded work, but they are not the safest coordination primitive when multiple reads or edits may overlap. Prefer explicit path or bufferId targeting in longer agent sessions.
  • applyHunk is usually the better retry path when a context-preserving change becomes awkward to express as one exact snippet.
  • When multiple edit strategies are equally safe, prefer replaceExact, replaceLines, applyHunk, or applyEditBatch over replaceBuffer to avoid unnecessary full-buffer payloads.

Scoped multi-file session example:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":16,"method":"workspaceStatus","params":{"root":"sample-workspace","dirtyOnly":true}}
{"id":17,"method":"saveAllBuffers","params":{"root":"sample-workspace"}}
EOF

Project-scoped diagnostics example:

cat <<'EOF' | pnpm exec tsx tools/server.ts
{"id":18,"method":"getDiagnostics","params":{"root":"sample-workspace","changedOnly":true}}
EOF

Troubleshooting:

  • stale changedTick: call getBuffer for the same target, take the returned changedTick, and retry the mutation with that value. The error now includes requestedTarget, activeBuffer, expectedChangedTick, and actualChangedTick.
  • newline-sensitive snippet miss: prefer replaceLines or applyHunk when replaceExact reports a trailing newline mismatch.
  • reload versus discard: use reloadBuffer only for clean buffers; use discardBufferChanges when you want to throw away unsaved edits.
  • multi-file session: call listOpenBuffers(root=...) before targeted saves, and use path / bufferId targeting instead of relying on whichever buffer is currently focused.
  • discovery failure: findFiles and searchText now validate root up front and include resolved root, cwd, and ripgrep command details in their errors.
  • unexpected search results after an edit: searchText only sees on-disk content, so save first if you need the latest buffer contents to appear in discovery results.

Common recovery actions:

  • stale changedTick: call getBuffer for the same target, then retry with the returned changedTick
  • exact snippet keeps drifting: switch from replaceExact to applyHunk or replaceLines
  • dirty buffer blocks reload: choose explicitly between saveBuffer, saveAllBuffers, or discardBufferChanges
  • search misses fresh edits: save before calling searchText because it is disk-backed

Wait a moment for LSP to refresh, then confirm the diagnostics clear:

{ printf '%s\n' '{"id":19,"method":"openFile","params":{"path":"sample-workspace/src/user.ts"}}'; sleep 3; printf '%s\n' '{"id":20,"method":"getDiagnostics"}'; } | pnpm exec tsx tools/server.ts

Run verification with an explicit command list:

printf '%s\n' '{"id":21,"method":"verifyWorkspace","params":{"root":"'"$(pwd)"'","commands":[{"label":"tests","cmd":"pnpm test","login":true}],"includeRepoHygiene":false}}' | pnpm exec tsx tools/server.ts

verifyWorkspace returns sequential command results with cwd, shell, login, stdout, stderr, code, and ok, alongside diagnostics and optional repo hygiene.
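
The per-command result fields listed above suggest a simple model for interpreting verification output. The type below is a hypothetical reconstruction, and allCommandsPassed is an invented helper, not part of the tool surface:

```typescript
// Hypothetical model of one verifyWorkspace command result.
type CommandResult = {
  label: string;
  cwd: string;
  shell: string;
  login: boolean;
  stdout: string;
  stderr: string;
  code: number | null; // exit code, null if the process did not exit normally
  ok: boolean;
};

// Command results are the final build/test gate: every command must report ok.
function allCommandsPassed(results: CommandResult[]): boolean {
  return results.every((r) => r.ok);
}
```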

Codex setup

The supported Codex path for external tools is the Codex bootstrap at tools/codex-mcp.ts, which starts an isolated Neovim runtime when needed and then serves the MCP tool surface over stdio.

  1. Open this repository in the Codex desktop app and use a local environment for the workspace.

  2. Start the headless Neovim process inside that workspace:

    pnpm run start:nvim
    

    This is optional for the Codex-managed path. pnpm run start:nvim now uses the same repo-local XDG runtime layout under tmp/nvim-xdg as the managed bootstrap, so plugin state and caches stay isolated from your global Neovim profile.

  3. Configure Codex to launch the Codex MCP bootstrap from the same workspace:

    pnpm run dev:codex-mcp
    
  4. Once connected, Codex will see these tools: setup_agents, open_file, save_buffer, reload_buffer, discard_buffer_changes, list_open_buffers, workspace_status, save_all_buffers, save_files, claim_files, release_files, get_file_owners, create_checkpoint, get_dirty_owned_files, get_buffer, replace_exact, replace_lines, apply_hunk, apply_edit_batch, apply_multi_file_batch, replace_buffer, format_buffer, organize_imports, suggest_edit_retry, create_file, create_files, create_directory, delete_file, move_file, find_files, search_text, definition, references, get_diagnostics, workspace_hygiene, verify_workspace, verify_checkpoint.

  5. Codex MCP launches now create an isolated Neovim socket under tmp/ automatically when NVIM_SOCKET is unset. Each Codex or MCP server process owns one explicit Neovim session and keeps downstream tool attachment scoped to that session instead of relying on ambient process.env.NVIM_SOCKET lookups. Managed and manual launches also share repo-local XDG paths under tmp/nvim-xdg, so lazy.nvim state, Treesitter parsers, and other plugin caches remain workspace-scoped. Set NVIM_SOCKET yourself only when you want bootstrap to attach to a manually managed runtime. Parallel Codex or MCP server processes stay isolated by socket by default.

    Generated runtimes are capped at 12 per workspace by default. For heavier multi-agent fanout, set VIGENTIC_MAX_GENERATED_NVIM_INSTANCES to a higher value, or to 0 to disable the cap entirely. When the cap is hit, Vigentic will first try to evict the oldest idle generated runtime that has no dirty file-backed buffers, then start a fresh runtime instead of reusing another agent's editor session. Tune idleness with VIGENTIC_GENERATED_NVIM_IDLE_THRESHOLD_MS.

    If old managed runtimes accumulate, run pnpm run cleanup:nvim. It reaps this repo's generated nvim-agent-*.sock sessions once they are older than 24 hours, skips the fixed manual ./tmp/nvim-agent.sock, and leaves generated runtimes alone when they still have dirty listed buffers. Preview with pnpm run cleanup:nvim -- --dry-run, adjust the age gate with VIGENTIC_CLEANUP_MIN_AGE_HOURS=12, and add extra scan roots with VIGENTIC_CLEANUP_SOCKET_DIRS=/path/one:/path/two. The command is intended to be safe to run from cron, launchd, or a Codex automation.
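
The cap-and-evict policy described above can be sketched as a pure selection function. The field names (socket, lastUsedMs, hasDirtyBuffers) are assumptions for illustration; the described behavior is taken from the source: never evict runtimes with dirty file-backed buffers, prefer the oldest idle one, and treat a cap of 0 as disabled:

```typescript
// Illustrative sketch of the generated-runtime eviction policy.
type GeneratedRuntime = {
  socket: string;
  lastUsedMs: number;      // assumed: epoch millis of last tool activity
  hasDirtyBuffers: boolean; // dirty file-backed buffers block eviction
};

// Returns the runtime to evict when at the cap, or null when no eviction
// is needed or no safe candidate exists.
function pickEvictionCandidate(
  runtimes: GeneratedRuntime[],
  maxInstances: number,
  idleThresholdMs: number,
  nowMs: number,
): GeneratedRuntime | null {
  if (maxInstances === 0) return null; // cap disabled
  if (runtimes.length < maxInstances) return null; // under the cap
  const idle = runtimes
    .filter((r) => !r.hasDirtyBuffers && nowMs - r.lastUsedMs >= idleThresholdMs)
    .sort((a, b) => a.lastUsedMs - b.lastUsedMs); // oldest idle first
  return idle[0] ?? null;
}
```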

v2 to v3 migration

  • replace_range is retired from the public MCP surface. Use replace_exact for exact snippet edits or apply_hunk for context-based diff application.
  • insert_line and delete_range are retired from the public MCP surface. Express those changes through apply_hunk or fall back to replace_buffer when a full rewrite is clearer.
  • save_file is removed from the public MCP surface. Use save_buffer, reload_buffer, or discard_buffer_changes explicitly.
  • save_buffer and save_all_buffers now accept responseMode:"compact" and responseMode:"summary" when callers only need save metadata and want quieter responses.
  • save_files, claim_files, release_files, get_file_owners, create_checkpoint, get_dirty_owned_files, and verify_checkpoint add explicit ownership-aware multi-agent coordination primitives.
  • list_open_buffers and save_all_buffers are available for multi-file workflows and accept optional root scoping.
  • workspace_status now provides repo-scoped open-buffer, ownership, dirty-state, buffer-vs-disk match state, current-buffer, and latest save/checkpoint visibility in one response.
  • get_buffer now accepts optional path / bufferId targeting for non-current reads.
  • replace_lines now sits between replace_exact and replace_buffer for line-oriented edits.
  • apply_edit_batch can serialize several same-buffer edits behind one initial expectedChangedTick check and now preflights the full chain before mutating the buffer.
  • apply_multi_file_batch adds a stateless explicit-target preview/apply flow for multi-file batches, with optional checkpoint saves for the touched buffers.
  • useLatestChangedTick:true is available on mutating edit tools as a current-buffer-only convenience mode. It cannot be combined with expectedChangedTick, path, or bufferId.
  • useLatestSessionChangedTick:true is available on mutating edit tools as an explicit-target convenience mode for explicit path targets and already-open bufferId targets. It cannot be combined with expectedChangedTick or useLatestChangedTick:true.
  • apply_edit_batch rejects invalid edits[...] items with examples for the supported replace_exact, replace_lines, and apply_hunk object shapes.
  • edit-tool failure messages now include compact recovery hints that point agents toward get_buffer, replace_lines, apply_hunk, or apply_edit_batch when exact matching drifts.
  • suggest_edit_retry turns a failed edit message, the attempted args, and an optional fresh buffer snapshot into a structured retry recommendation with suggested tool args.
  • setup_agents scaffolds or refreshes a managed Vigentic block in AGENTS.md and creates docs/agent-usage.md only when that guide is missing.
  • replace_buffer now accepts optional path / bufferId, requires expectedChangedTick, and returns the same chainable result shape as the other mutating edit tools.
  • format_buffer and organize_imports join the public MCP surface as explicit semantic mutations. They require expectedChangedTick, return the same observer-friendly mutation shape as the edit tools, and do not save implicitly.
  • Mutating edit tools accept optional includeSnapshot:false when only diff and nextChangedTick are needed.
  • open_file no longer creates missing files as a side effect. Use create_directory plus create_file for one-off scaffolding or create_files for batched scaffolding writes.
  • create_files batches ordered scaffolding writes, preserves create_file's overwrite semantics, and optionally supports createParents:true for recursive parent-directory creation.
  • delete_file and move_file cover the main file lifecycle cleanup path that previously required shell fallback.
  • find_files defaults to project-oriented discovery and accepts mode:"raw" when hidden internals should be returned.
  • Repo exploration and code search should prefer find_files and search_text; operational shell work that does not depend on editor state still belongs in the shell.
  • search_text is disk-backed and does not include unsaved buffer contents.
  • get_diagnostics can target the current buffer, a single path, or open buffers under a root, with changedOnly:true for dirty-buffer checks, plus additive origin / confidence labels and optional low-confidence filtering.
  • verify_workspace is the only public command-execution path. Use its commands list for build/test checks, or use the shell directly for operational work that should not be exposed as a generic tool.
  • verify_workspace command entries default cwd to root, else dirname(path), else the current buffer directory when cwd is omitted.
  • verify_workspace can now enforce saved-state guardrails with agentId, checkpointId, and relevantPaths; it skips command execution when relevant buffers are dirty or checkpoint files are stale on disk.
  • verify_workspace includes repo hygiene by default; those artifact findings come from filesystem scans rather than .gitignore, so expected build output such as dist/ can still appear unless includeRepoHygiene:false is set.
  • workspace_hygiene summarizes modified/untracked files plus common generated artifact leaks such as dist/, .vite/, vite.config.js, and *.tsbuildinfo, while suppressing common dependency trees like node_modules/ by default unless includeDependencyTrees:true is requested.
  • verify_workspace combines diagnostics, sequential command execution, and optional repo hygiene into one verification result. Command results are the final build/test gate; diagnostics localize actionable issues.
  • verify_checkpoint validates a named checkpoint against disk before running commands so verifiers can prove exactly what saved state was checked.
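
The cwd defaulting rule for verify_workspace command entries (cwd, else root, else dirname(path), else the current buffer directory) can be sketched as a small resolver. The function name and options shape are invented for illustration:

```typescript
// Sketch of verify_workspace cwd defaulting: cwd wins, then root,
// then the directory of path, then the current buffer's directory.
import { dirname } from "node:path";

function resolveCommandCwd(opts: {
  cwd?: string;
  root?: string;
  path?: string;
  currentBufferDir: string;
}): string {
  if (opts.cwd !== undefined) return opts.cwd;
  if (opts.root !== undefined) return opts.root;
  if (opts.path !== undefined) return dirname(opts.path);
  return opts.currentBufferDir;
}
```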

Deferred work

  • Frontend runtime inspection is not part of the current MCP surface yet.
  • Browser or DOM-state hooks should be treated as a follow-up scope, not something agents should assume exists today.

What is official versus POC-specific:

  • Official Codex docs cover the app quickstart, local environments, commands, and MCP integration entry points.
  • The repo-specific part here is the command you give Codex to launch the MCP server for this project.
  • There is no separate custom Codex plugin API in this repo; MCP is the integration surface.

Status and next work

  • The POC is verified for open/edit/save/diagnostics/test flow in the sample TypeScript workspace.

  • Discovery is resilient in empty roots, mutating edits are chainable, workspace-scoped buffer operations are supported, file lifecycle operations no longer require shell fallback, and command-based verification is scoped through verify_workspace.

  • A benchmark runner is available with:

    pnpm run bench:run
    
  • Optional benchmark controls:

    BENCH_ITERATIONS=10 BENCH_SESSION_EDITS=5 BENCH_MCP_READ_ITERATIONS=20 pnpm run bench:run
    
  • Benchmark outputs are written to benchmarks/results/latest.json and benchmarks/results/latest.md.

  • The benchmark reports three cases:

    • cold start: fresh Neovim per task
    • warm single edit: one prewarmed Neovim runtime reused across tasks
    • warm multi-edit session: one prewarmed Neovim runtime reused across several edits in the same task
  • The benchmark also reports:

    • MCP resource read latencies with spread and range
    • MCP stdio transport overhead for a representative get_buffer call
    • confidence notes when iteration counts are too low to make strong claims
  • The benchmark script starts and stops its own headless Neovim process, so it does not require you to keep extra terminals open.

  • The benchmark is intended to show direction, not just averages: newer reports include standard deviation and range so false positives are easier to spot.

  • A summary of the current benchmark interpretation lives in benchmarks/FINDINGS.md.

Current Findings

Current benchmark shape from benchmarks/results/latest.md:

  • scenario tables report average ± standard deviation plus min/max range for each metric
  • MCP resource tables report both average latency and spread instead of a bare mean
  • a dedicated MCP transport section reports direct in-process getBuffer, MCP stdio get_buffer, estimated overhead, and connect cost
  • confidence notes explicitly warn when iteration counts are too low to support strong claims

This is the expected outcome for an IDE-kernel design:

  • cold start costs are real
  • warmed Neovim plus warmed LSP is where the model pays off
  • the larger the task, the stronger the advantage from reusing the same runtime
  • transport overhead should be measured separately from warmed-kernel gains

Practical conclusion:

  • MCP transport overhead is usually small relative to LSP, diagnostics, typecheck, and test work
  • Neovim/LSP warmup and tool granularity matter much more than MCP itself
  • do not attribute all warm-path gains to MCP unless the transport section and confidence notes support that conclusion
  • this architecture is better aligned with medium and larger coding tasks than trivial one-shot edits
