Grafyx
Enables AI coding assistants to understand codebase architecture in real time by parsing source code into a relationship graph and exposing call chains, dependencies, class hierarchies, and conventions via MCP tools.
Real-time codebase understanding for AI coding assistants.
What is Grafyx?
AI coding tools read raw files with zero architectural understanding -- they don't know what calls what, which classes inherit from where, or how your modules connect. Grafyx fixes this by parsing your entire codebase into a full relationship graph using Graph-sitter (built on tree-sitter), then exposing that graph to any AI assistant through the Model Context Protocol (MCP). Your assistant can trace call chains, map dependencies, find related code by description, detect conventions, and understand your project's architecture -- all in real time, with a file watcher that keeps the graph current as you edit.
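To make "what calls what" concrete, here is a minimal sketch of a call graph built from source text. It uses Python's standard-library `ast` module rather than Graph-sitter, so it only illustrates the idea and is not Grafyx's implementation; the sample source and the helper names `build_call_graph` and `callers_of` are made up for this example.

```python
import ast

# Hypothetical sample module, used only for this illustration.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    text = load(path)
    return text.split()

def main():
    parse("data.txt")
'''


def build_call_graph(source: str) -> dict:
    """Map each top-level function name to the plain names it calls."""
    graph = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        callees = graph.setdefault(node.name, set())
        for child in ast.walk(node):
            # Only direct calls like f(...); attribute calls such as
            # obj.method(...) are skipped to keep the sketch small.
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                callees.add(child.func.id)
    return graph


def callers_of(graph: dict, target: str) -> list:
    """Return the functions that call `target`, sorted by name."""
    return sorted(name for name, callees in graph.items() if target in callees)


graph = build_call_graph(SOURCE)
print(callers_of(graph, "load"))  # ['parse']
```

A real relationship graph also needs to resolve imports, methods, and inheritance across files, which is the part the parser layer handles.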
Quick Start
Claude Code
# Zero-install (recommended)
claude mcp add --scope user grafyx -- uvx --from grafyx-mcp grafyx
# Or install with pip first
pip install grafyx-mcp
claude mcp add --scope user grafyx -- grafyx
Cursor / Windsurf / Cline
Add to your MCP config file:
- Cursor: .cursor/mcp.json (project) or ~/.cursor/mcp.json (global)
- Windsurf: ~/.codeium/windsurf/mcp_config.json
- Cline: Cline MCP settings in VS Code
{
"mcpServers": {
"grafyx": {
"command": "uvx",
"args": ["--from", "grafyx-mcp", "grafyx"]
}
}
}
VS Code (GitHub Copilot)
Add to .vscode/mcp.json:
{
"servers": {
"grafyx": {
"command": "uvx",
"args": ["--from", "grafyx-mcp", "grafyx"]
}
}
}
Using pip instead of uvx? Replace the command with "command": "grafyx" (no args needed).
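If you would rather script the config change than edit JSON by hand, the sketch below merges a grafyx entry into an existing MCP config file without touching other servers. The helper names `grafyx_entry` and `add_grafyx` are hypothetical; the keys and values are the ones shown above (`mcpServers` for Cursor/Windsurf/Cline, `servers` for VS Code).

```python
import json
from pathlib import Path


def grafyx_entry(use_uvx: bool = True) -> dict:
    """Return the server entry for a uvx-based or pip-based install."""
    if use_uvx:
        return {"command": "uvx", "args": ["--from", "grafyx-mcp", "grafyx"]}
    return {"command": "grafyx"}


def add_grafyx(config_path: Path, top_key: str = "mcpServers",
               use_uvx: bool = True) -> dict:
    """Add or overwrite the grafyx entry under `top_key` in a JSON config."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault(top_key, {})["grafyx"] = grafyx_entry(use_uvx)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_grafyx(Path(".vscode/mcp.json"), top_key="servers")` would produce the VS Code variant, and `add_grafyx(Path(".cursor/mcp.json"), use_uvx=False)` the pip variant for Cursor. The sketch assumes the file, if present, already contains valid JSON.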
Available Tools
| Tool | Description |
|---|---|
| get_project_skeleton | Full project structure with stats per module |
| get_function_context | Everything about a function: callers, callees, deps |
| get_file_context | File contents, imports, dependencies |
| get_class_context | Class methods, inheritance, usages |
| find_related_code | Natural language search across the codebase |
| find_related_files | Find files relevant to a feature by matching symbols |
| get_dependency_graph | Impact analysis: what depends on what |
| get_conventions | Detected coding patterns and conventions |
| get_call_graph | Call chain tracing upstream and downstream |
| refresh_graph | Force re-parse of the codebase |
| get_module_context | Symbols in a directory/package (intermediate zoom) |
| get_subclasses | Inheritance tree for a base class |
| get_unused_symbols | Dead code detection |
| set_project | Switch the served project at runtime |
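Your assistant normally issues these calls for you, but it can help to see what one looks like on the wire. MCP uses JSON-RPC 2.0, and a tool invocation is a `tools/call` request. The sketch below only builds that message; the argument name `function_name` is an assumption for illustration, since the real parameter names come from each tool's schema (available via `tools/list`).

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def tools_call_request(tool: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(message)


# `function_name` is a hypothetical argument name, not taken from Grafyx.
request = tools_call_request("get_function_context", {"function_name": "parse"})
print(request)
```

A real client also performs the `initialize` handshake before calling tools, which an MCP SDK handles for you.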
How It Works
Your AI Assistant
|
| MCP Protocol (stdio)
v
+-----------+
| Grafyx | FastMCP server with 14 tools
| Server |
+-----------+
|
+-----------+ +-----------+ +-------------+
| Graph |---->| Search | | Convention |
| Engine |---->| Engine | | Detector |
+-----------+ +-----------+ +-------------+
|
v
+-----------+
| Graph- | Tree-sitter based parsing
| sitter |
+-----------+
|
+-----------+
| Watchdog | File watcher for live updates
+-----------+
- Startup -- Grafyx detects languages in your project and parses all source files into a semantic graph via Graph-sitter.
- Serving -- The FastMCP server exposes 14 tools over stdio. Your AI assistant calls them as needed.
- Live updates -- Watchdog monitors file changes. When you save, the graph is automatically re-parsed after a short debounce.
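The debounce in the live-update step can be sketched with the standard library alone. This is an illustration of the general technique, not Grafyx's code: the `Debouncer` class and the 0.05 s delay are assumptions, and a real setup would call `trigger()` from a Watchdog event handler.

```python
import threading
import time


class Debouncer:
    """Run `callback` once, `delay` seconds after the last trigger."""

    def __init__(self, delay: float, callback):
        self._delay = delay
        self._callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self) -> None:
        with self._lock:
            # Each new event cancels the pending run and restarts the clock,
            # so a burst of saves results in a single re-parse.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._callback)
            self._timer.daemon = True
            self._timer.start()


reparses = []
debouncer = Debouncer(0.05, lambda: reparses.append("reparse"))
for _ in range(5):       # simulate five rapid file-save events
    debouncer.trigger()
time.sleep(0.3)          # wait for the quiet period to elapse
print(len(reparses))     # 1
```

The trade-off is latency versus churn: a longer delay means fewer re-parses during bulk edits but a slightly staler graph.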
ML-augmented search
Grafyx's find_related_code uses a pretrained code embedding model (default:
jinaai/jina-embeddings-v2-base-code, Apache-2.0, 161M params) running on CPU
via ONNX through fastembed. The model
is downloaded on first use and cached locally — no GPU, no daemon, no cloud
calls.
Since 0.2.1, fastembed is a hard dependency, so the default install
already includes the encoder — no extra needed.
Benchmark (0.2.0, 278 docstring→function queries across FastAPI + Django):
| Encoder | nDCG@10 | MRR@10 | p50 latency |
|---|---|---|---|
| jina-v2 (default) | 0.787 | 0.741 | ~1.5 s |
| coderankembed | 0.663 | 0.623 | ~1.3 s |
| tokens-only (no fastembed) | 0.335 | 0.297 | ~0.9 s |
The default encoder more than doubles retrieval quality over plain source-token search (+135% nDCG@10).
Full breakdown + per-query JSONL: docs/benchmarks/0.2.0/.
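For readers unfamiliar with the metrics, the sketch below shows how nDCG@10 and MRR@10 reduce to simple formulas when each query has exactly one correct result, and how the +135% figure follows from the table. The single-relevant-result assumption is ours (suggested by the docstring→function setup, not confirmed here), and the ranks are made-up sample data; the benchmark scripts in the repository may compute the metrics differently.

```python
import math
from typing import Optional


def reciprocal_rank_at_k(rank: Optional[int], k: int = 10) -> float:
    """1/rank if the correct result is in the top k, else 0."""
    return 1.0 / rank if rank is not None and rank <= k else 0.0


def ndcg_at_k(rank: Optional[int], k: int = 10) -> float:
    """With one relevant result the ideal DCG is 1, so nDCG = 1/log2(rank+1)."""
    return 1.0 / math.log2(rank + 1) if rank is not None and rank <= k else 0.0


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# Hypothetical 1-based ranks of the correct function; None means not retrieved.
ranks = [1, 2, None, 4]
mrr = sum(reciprocal_rank_at_k(r) for r in ranks) / len(ranks)
ndcg = sum(ndcg_at_k(r) for r in ranks) / len(ranks)
print(f"MRR@10 = {mrr:.3f}, nDCG@10 = {ndcg:.3f}")

# nDCG@10 from the table: jina-v2 (0.787) vs tokens-only (0.335).
print(f"gain = {relative_gain(0.787, 0.335):.0%}")  # gain = 135%
```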
Switch encoders via the GRAFYX_ENCODER env var:
- jina-v2 (default): Apache-2.0, fastembed-native, ~150 MB. Wins on accuracy; recommended unless you have a specific reason to switch.
- coderankembed: MIT, 137M, ONNX-int8, ~140 MB. Lower latency but ~12 nDCG@10 points behind jina-v2 in our eval. Hosted at Bilal7Dev/grafyx-coderankembed-onnx.
Supporting numpy-only MLPs (~5 MB total weights, bundled in the wheel):
- M1 Relevance ranker — 33-feature MLP that re-ranks the encoder's top candidates using structural signals (caller count, name overlap, exports).
- M3 Source token filter — suppresses noise tokens (imports, strings, magic methods) from full-text search.
- M4 Symbol importance — weights symbols by caller count, exports, and structural signals.
- Gibberish detector — character-bigram MLP that blocks nonsense queries before they hit the index.
Reproducible benchmarks against FastAPI, Django, and Home Assistant ship in
benchmarks/ (python -m scripts.run_all).
Supported Languages
| Language | Extensions |
|---|---|
| Python | .py, .pyi |
| TypeScript | .ts, .tsx |
| JavaScript | .js, .jsx |
Languages are auto-detected. To specify manually:
grafyx --languages python,typescript
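Auto-detection can be pictured as counting files by extension. The sketch below is a guess at the general approach, not Grafyx's detection logic: the `detect_languages` helper is hypothetical, and the identifier `javascript` is assumed by analogy with the `python` and `typescript` values shown above.

```python
from collections import Counter
from pathlib import PurePosixPath

# Extension table from the Supported Languages section.
EXTENSIONS = {
    ".py": "python", ".pyi": "python",
    ".ts": "typescript", ".tsx": "typescript",
    ".js": "javascript", ".jsx": "javascript",
}


def detect_languages(paths) -> list:
    """Return the languages present in `paths`, most common first."""
    counts = Counter()
    for path in paths:
        language = EXTENSIONS.get(PurePosixPath(path).suffix.lower())
        if language is not None:
            counts[language] += 1
    return [language for language, _ in counts.most_common()]


files = ["app/main.py", "app/types.pyi", "web/index.tsx", "README.md"]
print(",".join(detect_languages(files)))  # python,typescript
```

The joined output has the same shape as the value passed to `--languages`.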
Options
grafyx [OPTIONS]
--project PATH Project to analyze (default: current directory)
--languages LANGS Comma-separated languages (default: auto-detect)
--ignore PATTERNS Additional directories to ignore
--no-watch Disable file watching
--verbose, -v Debug logging
--version Show version
Default ignored: node_modules, .git, __pycache__, .venv, venv, .env, dist, build, .tox, .mypy_cache, .pytest_cache, .ruff_cache, egg-info, .eggs, .next, .nuxt, coverage, .coverage, .nyc_output
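One way to think about the ignore list is as a check on each path component. The sketch below assumes that a path is skipped when any of its components matches an ignored name, and that `egg-info` matches as a suffix (as in `pkg.egg-info`); both matching rules are assumptions, and `is_ignored` is a hypothetical helper rather than part of Grafyx.

```python
from pathlib import PurePosixPath

# Names from the default ignore list above.
DEFAULT_IGNORED = {
    "node_modules", ".git", "__pycache__", ".venv", "venv", ".env", "dist",
    "build", ".tox", ".mypy_cache", ".pytest_cache", ".ruff_cache",
    ".eggs", ".next", ".nuxt", "coverage", ".coverage", ".nyc_output",
}


def is_ignored(path: str, extra=()) -> bool:
    """True if any path component is ignored (assumed matching rule)."""
    ignored = DEFAULT_IGNORED | set(extra)
    for part in PurePosixPath(path).parts:
        # `egg-info` is treated as a suffix so `pkg.egg-info` also matches.
        if part in ignored or part.endswith("egg-info"):
            return True
    return False


print(is_ignored("node_modules/react/index.js"))   # True
print(is_ignored("src/app.py"))                    # False
print(is_ignored("generated/api.py", extra=["generated"]))  # True
```

The `extra` argument plays the role of `--ignore`, adding project-specific directories on top of the defaults.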
Multi-Agent Support
Grafyx works with agent teams. A single Grafyx instance serves all agents connected to the same project. When one agent modifies code, the file watcher updates the graph automatically, so other agents see the changes as soon as the debounced re-parse completes.
Contributing
git clone https://github.com/bilal07karadeniz/Grafyx.git
cd Grafyx
pip install -e ".[dev]"
pytest
Troubleshooting
Windows: Graph-sitter requires Linux. Use WSL and configure your MCP client to launch via wsl:
{
"mcpServers": {
"grafyx": {
"command": "wsl",
"args": ["-e", "bash", "-c", "source ~/your-venv/bin/activate && grafyx"]
}
}
}
License
MIT -- see LICENSE for details.