astrograph

An MCP server that detects duplicate code using AST graph isomorphism, blocking writes of structurally identical code to reduce redundancy.

ASTrograph

<p align="center"> <img src="astrograph_poster.jpg" alt="ASTrograph" width="400"> </p>

An MCP server that helps AI agents detect duplicate code before writing it. It provides write and edit tools that compare new code against existing functions in your codebase using AST graph isomorphism — powered by algorithms, not LLM tokens. When a structural duplicate is found, the operation is blocked with a pointer to the existing code. Variable names, formatting, and comments are ignored — if two pieces of code share the same abstract structure, ASTrograph flags them as duplicates.

Installation

Add .mcp.json to your project root:

{
  "mcpServers": {
    "astrograph": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i", "--pull", "missing",
        "--add-host", "host.docker.internal:host-gateway",
        "-v", ".:/workspace",
        "thaylo/astrograph:latest"
      ]
    }
  }
}

The image is multi-arch (amd64, arm64). The codebase is indexed at startup. Metadata is stored outside the project directory (in the user data dir) so it never interferes with your codebase.

To update to a new release:

docker pull thaylo/astrograph:latest

The running version is always visible in the MCP serverInfo.version field on connect.

<details> <summary><strong>Claude Desktop</strong></summary>

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "astrograph": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i", "--pull", "missing",
        "--add-host", "host.docker.internal:host-gateway",
        "-v", "/absolute/path/to/project:/workspace",
        "thaylo/astrograph:latest"
      ]
    }
  }
}

</details>

<details> <summary><strong>Codex</strong></summary>

~/.codex/config.toml:

[mcp_servers.astrograph]
command = "docker"
args = [
  "run", "--rm", "-i", "--pull", "missing",
  "--add-host", "host.docker.internal:host-gateway",
  "-v", "/absolute/path/to/project:/workspace",
  "thaylo/astrograph:latest"
]

</details>

<details> <summary><strong>wmark</strong></summary>

~/.config/wmark/.mcp.json (user-level, applies to all projects on macOS):

{
  "mcpServers": {
    "astrograph": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i", "--pull", "missing",
        "--add-host", "host.docker.internal:host-gateway",
        "-v", "/Users:/Users:rw",
        "thaylo/astrograph:latest"
      ]
    }
  }
}

Mounting /Users makes all macOS home paths accessible inside the container unchanged. Call set_workspace with the full host path (e.g. /Users/yourname/project) to index a project.

For Linux, replace /Users:/Users:rw with /home:/home:rw.

</details>

<details> <summary><strong>Without Docker</strong></summary>

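From a checkout of the ASTrograph source, install the package:

pip install .

Then register the server in your MCP config:

{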
  "mcpServers": {
    "astrograph": {
      "command": "python",
      "args": ["-m", "astrograph.server"],
      "cwd": "/path/to/astrograph"
    }
  }
}

</details>

How it works

Your codebase already contains:

# src/math.py
def calculate_sum(a, b):
    return a + b

An AI agent tries to write:

# src/utils.py
def add_numbers(x, y):
    return x + y

ASTrograph detects the duplicate and blocks the write:

BLOCKED: Cannot write - identical code exists at src/math.py:calculate_sum (lines 1-2).
Reuse the existing implementation instead.

Different variable names, identical structure. Source code is converted into labeled directed graphs and compared using Weisfeiler-Leman hashing with VF2 isomorphism verification — all algorithmic, no LLM tokens spent on the search.
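
The idea of hashing structure rather than text can be sketched with Python's standard `ast` module. This is a minimal illustration, not ASTrograph's implementation: the function names `node_label` and `wl_hash` are hypothetical, the labels use node types only (so all identifiers and constant values are ignored, which is coarser than the tool's Exact detection), and the VF2 verification step is omitted. A matching hash here should be read as a candidate match only.

```python
import ast
import hashlib


def node_label(node: ast.AST) -> str:
    """Label a node by its type only, so names and literal values are ignored."""
    return type(node).__name__


def wl_hash(source: str, rounds: int = 3) -> str:
    """Weisfeiler-Leman style hash of the AST of `source` (illustrative only)."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    children = {id(n): [id(c) for c in ast.iter_child_nodes(n)] for n in nodes}
    labels = {id(n): node_label(n) for n in nodes}

    # Each round folds the labels of a node's children into its own label,
    # so after k rounds a label summarizes the subtree down to depth k.
    for _ in range(rounds):
        refined = {}
        for n in nodes:
            child_labels = ",".join(labels[c] for c in children[id(n)])
            combined = f"{labels[id(n)]}({child_labels})"
            refined[id(n)] = hashlib.sha256(combined.encode()).hexdigest()[:16]
        labels = refined

    # The graph hash is the hash of the sorted multiset of final node labels.
    digest = hashlib.sha256("|".join(sorted(labels.values())).encode())
    return digest.hexdigest()


existing = "def calculate_sum(a, b):\n    return a + b\n"
proposed = "def add_numbers(x, y):\n    return x + y\n"
print(wl_hash(existing) == wl_hash(proposed))
```

Because function and argument names are plain string attributes rather than AST nodes, renaming them leaves every label unchanged, while swapping `+` for `-` changes the operator node and therefore the hash.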

Detection types

ASTrograph detects four types of structural duplication:

| Type | What it catches | How it works |
| --- | --- | --- |
| Exact | Identical AST structure with renamed variables or different formatting | WL hash identity + VF2 graph isomorphism verification |
| Pattern | Same control flow with different operators or constants | Operator-normalized graph hashing |
| Block | Duplicate inner blocks (for/if/while/try) within functions | Block-level AST extraction + hash matching |
| Near-duplicate | ~80% structural similarity (copy-paste-modify patterns) | Hierarchy hash prefix matching at 4/5 depth levels |
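
As a rough illustration of what operator normalization for Pattern detection could look like, the sketch below serializes a Python AST into a structural string and optionally collapses every operator node into a single placeholder. The `shape` function and the `OP` placeholder are hypothetical names chosen for this example; ASTrograph's actual graph encoding is not shown in this README.

```python
import ast

# Base classes that all Python operator nodes inherit from.
OPERATOR_TYPES = (ast.operator, ast.unaryop, ast.cmpop, ast.boolop)


def shape(node: ast.AST, normalize_ops: bool = False) -> str:
    """Serialize the node types of a tree, ignoring names and literal values."""
    if normalize_ops and isinstance(node, OPERATOR_TYPES):
        label = "OP"  # every operator collapses to the same placeholder
    else:
        label = type(node).__name__
    inner = ",".join(shape(c, normalize_ops) for c in ast.iter_child_nodes(node))
    return f"{label}({inner})"


plus = ast.parse("total = a + b")
minus = ast.parse("total = a - b")

print(shape(plus) == shape(minus))
print(shape(plus, normalize_ops=True) == shape(minus, normalize_ops=True))
```

Without normalization the two statements differ at the `Add` and `Sub` nodes; with it they serialize identically, while code with different control flow (for example a `for` loop versus a `while` loop) still differs.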

Near-duplicate detection catches Type-3 clones that exact and pattern detection miss. For example, Flask's TagBytes, TagDateTime, TagTuple, and TagUUID classes share 80%+ identical structure but differ in leaf-level details.
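
One possible reading of "hierarchy hash prefix matching at 4/5 depth levels" is sketched below: hash the node types found at each of the first five tree levels, then treat two pieces of code as near-duplicates when at least the first four level hashes agree. This interpretation, along with the names `level_hashes`, `shared_prefix`, and `is_near_duplicate`, is an assumption made for illustration and may differ from how ASTrograph computes its hierarchy hashes.

```python
import ast
import hashlib


def level_hashes(source: str, depth: int = 5) -> list[str]:
    """Hash the sequence of node types at each of the first `depth` tree levels."""
    frontier = [ast.parse(source)]
    hashes = []
    for _ in range(depth):
        kinds = ",".join(type(n).__name__ for n in frontier)
        hashes.append(hashlib.sha256(kinds.encode()).hexdigest()[:12])
        frontier = [c for n in frontier for c in ast.iter_child_nodes(n)]
    return hashes


def shared_prefix(a: list[str], b: list[str]) -> int:
    """Count how many leading level hashes are identical."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def is_near_duplicate(src_a: str, src_b: str, depth: int = 5, required: int = 4) -> bool:
    """True when the top `required` of `depth` levels have matching hashes."""
    prefix = shared_prefix(level_hashes(src_a, depth), level_hashes(src_b, depth))
    return prefix >= required


add = "def f(a, b):\n    return a + b\n"
sub = "def g(a, b):\n    return a - b\n"
loop = "def h(a):\n    for x in a:\n        print(x)\n"

print(is_near_duplicate(add, sub))
print(is_near_duplicate(add, loop))
```

In this sketch `add` and `sub` agree on the top four levels and only diverge at the operator level, which mirrors the "differ in leaf-level details" case described above, while `loop` diverges as soon as the function body is reached.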

Language support

Python, JavaScript, and TypeScript work out of the box. C, C++, Java, and Go attach to an already-running language server over TCP.

| Language | Versions | Mode | Default endpoint |
| --- | --- | --- | --- |
| Python | 3.11–3.14 | bundled | `pylsp` |
| JavaScript | ES2021+, Node 20/22/24 LTS | bundled | `typescript-language-server --stdio` |
| TypeScript | TypeScript 5.x, Node 20/22/24 LTS | bundled | `typescript-language-server --stdio` |
| Go | 1.21–1.25 | attach | `tcp://127.0.0.1:2091` |
| C | C11, C17, C23 | attach | `tcp://127.0.0.1:2087` |
| C++ | C++17, C++20, C++23 | attach | `tcp://127.0.0.1:2088` |
| Java | 11, 17, 21, 25 | attach | `tcp://127.0.0.1:2089` |

The Docker image bundles Python and JS/TS LSP runtimes. For attach-based languages, expose the language server on a TCP port using socat and configure via your MCP JSON:

{
  "mcpServers": {
    "astrograph": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "--add-host", "host.docker.internal:host-gateway", "-v", ".:/workspace", "thaylo/astrograph:latest"],
      "env": {
        "ASTROGRAPH_CPP_LSP_COMMAND": "tcp://host.docker.internal:2088",
        "ASTROGRAPH_GO_LSP_COMMAND": "tcp://host.docker.internal:2091",
        "ASTROGRAPH_JAVA_LSP_COMMAND": "tcp://host.docker.internal:2089",
        "ASTROGRAPH_C_LSP_COMMAND": "tcp://host.docker.internal:2087"
      }
    }
  }
}

| Language | Env var | Socat bridge example |
| --- | --- | --- |
| C | `ASTROGRAPH_C_LSP_COMMAND` | `socat TCP-LISTEN:2087,reuseaddr,fork EXEC:clangd` |
| C++ | `ASTROGRAPH_CPP_LSP_COMMAND` | `socat TCP-LISTEN:2088,reuseaddr,fork EXEC:clangd` |
| Java | `ASTROGRAPH_JAVA_LSP_COMMAND` | `socat TCP-LISTEN:2089,reuseaddr,fork EXEC:jdtls` |
| Go | `ASTROGRAPH_GO_LSP_COMMAND` | `socat TCP-LISTEN:2091,reuseaddr,fork EXEC:"gopls serve"` |
| Python | `ASTROGRAPH_PY_LSP_COMMAND` | (bundled, override if needed) |
| JS | `ASTROGRAPH_JS_LSP_COMMAND` | (bundled, override if needed) |
| TS | `ASTROGRAPH_TS_LSP_COMMAND` | (bundled, override if needed) |

Run lsp_setup(mode='inspect') to see which languages are available and what's missing.
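
Before pointing ASTrograph at an attach-mode endpoint, it can help to confirm from the host that something is actually listening on the port. The sketch below uses only Python's standard `socket` module; the `is_listening` and `check_all` helpers are hypothetical and not part of ASTrograph, and the ports are the default endpoints listed in the table above. Note that with a `socat ... fork EXEC:...` bridge, each probe connection briefly starts a language server process, which exits when the probe disconnects.

```python
import socket

# Default attach endpoints from the language support table.
DEFAULT_ENDPOINTS = {
    "c": ("127.0.0.1", 2087),
    "cpp": ("127.0.0.1", 2088),
    "java": ("127.0.0.1", 2089),
    "go": ("127.0.0.1", 2091),
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_all(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Probe every endpoint and report which ones accept connections."""
    return {lang: is_listening(host, port) for lang, (host, port) in endpoints.items()}


# Example: print a status line per language.
# for lang, ok in check_all(DEFAULT_ENDPOINTS).items():
#     print(f"{lang}: {'listening' if ok else 'not reachable'}")
```

This only checks TCP reachability on the host. It does not verify that the process behind the port speaks LSP, which is what `lsp_setup(mode='inspect')` reports from the server's side.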

Real-world results

Tested on popular open-source projects:

| Project | Language | Files | Code Units | Duplicates Found |
| --- | --- | --- | --- | --- |
| Redis | C | 208 | 18,272 | 556 groups |
| TypeORM | TypeScript | 492 | 7,107 | 511 groups |
| Express.js | JavaScript | 141 | 3,866 | 468 groups |
| nlohmann/json | C++ | 488 | 9,103 | 959 groups |
| Gin | Go | 99 | 1,557 | 141 groups |
| Flask | Python | 24 | 910 | 48 groups |
| Spring PetClinic | Java | 47 | 270 | 17 groups |

Exact, pattern, and block findings are verified via VF2 graph isomorphism. Near-duplicates are matched via hierarchy hash prefix similarity (~80% structural identity).

License

MIT
