codeweaver-mcp

Token-efficient MCP server for multi-language project analysis (Java, TypeScript, JavaScript, Markdown, Python) with plugins, semantic search, and static analysis.

CodeWeaver 🕸️

⚠️ Beta Release (v0.6.0) - Static Analysis + Code Cleanup! SpotBugs & Checkstyle integration, all packages updated, codebase fully cleaned. PMD & SonarLint coming soon!

Token-efficient MCP server for Multi-Language project analysis (Java, TypeScript, JavaScript, Markdown, Python)

Weaving Code Intelligence for LLMs - A lightweight Model Context Protocol server that provides token-efficient access to Java, TypeScript, JavaScript, Markdown, and Python files through a multi-agent architecture with language plugins.

⚡ Highlights

  • ✅ Zero Native Dependencies* - Pure Node.js/TypeScript with language-specific parsers
  • ✅ Dual Interface - CLI tool AND MCP server from same codebase
  • ✅ Token-Efficient - Smart file reading with token limits
  • ✅ Multi-Language Support 🆕 - Java, TypeScript, JavaScript, Markdown, AND Python with unified plugin architecture!
    • 🎯 Complete Java Support (Java 8-23) - Records, Sealed Classes, Module System
    • 🎯 Complete TypeScript Support - Classes, Interfaces, Types, Enums, Generics, Decorators
    • 🎯 Complete JavaScript Support - Modern ES6+, JSX, Arrow Functions, Async/Await
    • 🎯 Complete Markdown Support - Headers as Sections, Links as References, Code Blocks
    • 🎯 Complete Python Support 🆕 - Classes, Functions, Methods, Decorators, Type Hints, Async/Await
    • ✅ Class-Level Annotations/Decorators - Spring, JPA, Jakarta EE, TypeScript decorators, Python @decorators
    • ✅ Method Parameters - Names, types, and annotations extracted
    • ✅ Generic Type Parameters - Full signature with bounds (Java, TypeScript)
    • ✅ Language Field - Every symbol tagged with source language
    • ✅ Easy Extensibility - Plugin system for adding new languages
  • ✅ Powerful Search - Keyword, pattern, AND semantic search (AI-powered) 🆕
  • ✅ Semantic Code Search - Find code by meaning/intent using LanceDB + Transformers 🆕
    • ⚡ ONNX Runtime Optimizations - Multi-threading + SIMD for 3x faster embeddings! 🆕
    • ⚡ 16x faster with Batch-Processing - 10k files in ~10 min (was 8h!)
    • 🎯 Multi-Collection Support - Separate indexes for Code AND Docs! 🆕
    • 🔍 File Watcher - Automatic incremental updates on file changes! 🆕
    • 📖 SEMANTIC_SEARCH.md - Comprehensive guide with workflows and best practices
    • 🎯 MULTI_COLLECTION_GUIDE.md - Multi-collection usage guide (Code + Docs)
    • 🔍 FILE_WATCHER_GUIDE.md - Keep your index always up-to-date!
    • 🚀 PERFORMANCE_OPTIMIZATION.md - Future optimizations (GPU acceleration)
  • ✅ Code Quality Analysis - Cyclomatic complexity, LOC metrics, import analysis
  • ✅ Static Analysis 🆕 - SpotBugs (bugs), Checkstyle (style) with plugin architecture!
    • 🔬 SpotBugs - Finds NullPointerExceptions, Resource Leaks, SQL Injections
    • ✅ Checkstyle - Code style enforcement, naming conventions
    • 🔌 Plugin Architecture - Easy to add PMD, SonarLint later
  • ✅ Git Integration - Status, diff, blame, log, branches, compare
  • ✅ Test-Driven - 291 tests passing (100%) - All features fully tested! 🆕

* Core features (Discovery, Symbols, Search, Analysis, VCS) have zero native dependencies. Semantic Search optionally requires LanceDB + ONNX Runtime (native components).


🚀 Quick Start

# Install
npm install

# Build
npm run build

# Use as CLI
npm run dev -- info
npm run dev -- symbols index
npm run dev -- search keyword "CodeWeaver"
npm run dev -- analysis project

# Use as MCP Server
npm run dev -- --mcp

📖 Documentation

📚 Complete Documentation Index - Full navigation of all CodeWeaver documentation

Quick Links:

| Category | Documents | Description |
|----------|-----------|-------------|
| 🚀 Getting Started | QUICKSTART • WORKFLOW • PRODUCTION | Installation, tutorials, deployment |
| 📖 Reference | API • USAGE • PERFORMANCE | API reference, benchmarks, usage guide |
| 🎯 Guides | SEMANTIC SEARCH • MULTI-COLLECTION • FILE WATCHER | Advanced feature guides |
| 🏗️ Architecture | ARCHITECTURE • DATA MODELS • TOKEN MGMT | System design, technical deep-dives |
| 👨‍💻 Development | CONTRIBUTING • TESTING • ROADMAP | For contributors |
| 📦 Project | CHANGELOG • CODE OF CONDUCT | Release notes, governance |
| 📝 Other | GLOSSARY | Terms & acronyms |

New to CodeWeaver? Start with QUICKSTART.md (5 minutes) or DEVELOPER_WORKFLOW.md (20 minutes).


📦 Current Features (Phase 1 + 2 + 3 + 4 Complete)

✅ Implemented

Agents:

  • Project Metadata Agent 🆕 - Multi-language project metadata extraction with plugin architecture
    • Gradle: Java version, dependencies, plugins, modules
    • npm: TypeScript/JavaScript, package manager detection, scripts, workspaces
    • Auto-detection: Automatically detects project type(s)
    • Unified Schema: Language-agnostic metadata format
    • Extensible: Easy to add new build systems (pip, Maven, Cargo, etc.)
  • System Check Agent 🆕 - Dependency validation and system health checks
    • Critical: Node.js (>=18), Git (>=2.0) - required for core features
    • Optional: Python (>=3.8), Gradle (>=7.0), Maven (>=3.6)
    • Auto-check: Quick startup validation in CLI mode
    • Doctor Command: Full system diagnostic with recommendations
  • Cache Agent - Content-addressable caching with SHA-256 hashing
  • Snippets Agent - Token-efficient file reading with line ranges
  • Symbols Agent - Multi-language symbol extraction with plugin architecture 🆕
    • Java: Classes, Interfaces, Enums, Records, Annotation Types, Sealed Classes, Module System
    • TypeScript: Classes, Interfaces, Types, Enums, Functions, Generics, Decorators, Namespaces
    • JavaScript: Classes, Functions, Arrow Functions, Async/Await, ES6+ features
    • Markdown: Headers as Sections, Local Links as References, Code Blocks
    • Python 🆕: Classes, Functions, Methods, Decorators, Type Hints, Async/Await
    • Methods with parameters, generics, and annotations/decorators
    • Fields/Properties with modifiers and visibility
    • Constructors, nested types, enum constants
    • Language-tagged symbols for easy filtering
  • Search Agent - Keyword and pattern search with file filtering
  • Analysis Agent - Cyclomatic complexity, LOC metrics, code quality
  • VCS Agent - Git operations (status, diff, blame, log, branches, compare)
  • Semantic Index Agent - LanceDB vector search with multi-collection support 🆕
  • File Watcher Agent - Automatic incremental index updates on file changes 🆕
  • Symbol Storage - In-memory symbol index with JSON Lines persistence
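
The Cache Agent above is described as content-addressable with SHA-256 hashing. A minimal sketch of that idea follows, assuming a plain in-memory map. The class name `ContentCache` and its methods are hypothetical and are not taken from CodeWeaver's source.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory content-addressable cache (illustration only).
class ContentCache<T> {
  private readonly entries = new Map<string, T>();

  // The cache key is the SHA-256 hex digest of the content itself,
  // so identical content always maps to the same entry.
  static keyFor(content: string): string {
    return createHash("sha256").update(content, "utf8").digest("hex");
  }

  // Returns the cached value for this content, computing it once on a miss.
  getOrCompute(content: string, compute: (content: string) => T): T {
    const key = ContentCache.keyFor(content);
    if (this.entries.has(key)) {
      return this.entries.get(key) as T;
    }
    const value = compute(content);
    this.entries.set(key, value);
    return value;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because the key depends only on content, a renamed or moved file with unchanged content would still hit the cache in this sketch.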

MCP Tools (19 total):

File & Project:

  • project.meta - Get unified project metadata (auto-detects: Gradle, npm, pip, Maven, etc.)
    • Multi-language support with plugin architecture
    • Optional projectType parameter for specific extraction
  • file.read - Read file with optional token limit (default: 10000)
  • file.readRange - Read specific line ranges (1-indexed, inclusive)
  • file.readWithNumbers - Read file with line numbers for reference
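
To illustrate the 1-indexed, inclusive range semantics of file.readRange and the numbered output of file.readWithNumbers, here is a rough sketch. The helper names are hypothetical, and the three-character number padding is an assumption inferred from the sample output in the MCP usage section of this README.

```typescript
// Hypothetical helper: returns lines startLine..endLine (1-indexed, inclusive).
function readRange(content: string, startLine: number, endLine: number): string {
  if (startLine < 1 || endLine < startLine) {
    throw new RangeError(`invalid range ${startLine}-${endLine}`);
  }
  const lines = content.split(/\r?\n/);
  // slice() is 0-indexed and end-exclusive, so only the start shifts by one.
  return lines.slice(startLine - 1, endLine).join("\n");
}

// Hypothetical helper: prefixes each line with a right-aligned line number.
function withLineNumbers(content: string): string {
  return content
    .split(/\r?\n/)
    .map((line, i) => `${String(i + 1).padStart(3)}: ${line}`)
    .join("\n");
}
```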

Symbols:

  • symbols.index - Index entire project and extract symbols
  • symbols.find - Find symbols by name (case-insensitive substring)
  • symbols.findByKind - Find symbols by kind (class/method/field/constructor)
  • symbols.get - Get symbol details by qualified name
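
The lookups above (case-insensitive substring match by name, exact match by qualified name) can be sketched as follows. The `SymbolEntry` shape and function names are assumptions for illustration; CodeWeaver's actual symbol type may differ.

```typescript
// Hypothetical symbol record; field names are assumptions.
interface SymbolEntry {
  qualifiedName: string; // e.g. "com.example.UserService#findById"
  name: string;
  kind: string;
  language: string;
}

// Case-insensitive substring match on the simple name.
function findSymbols(index: readonly SymbolEntry[], query: string): SymbolEntry[] {
  const needle = query.toLowerCase();
  return index.filter((s) => s.name.toLowerCase().includes(needle));
}

// Exact match on the qualified name.
function getSymbol(index: readonly SymbolEntry[], qualifiedName: string): SymbolEntry | undefined {
  return index.find((s) => s.qualifiedName === qualifiedName);
}
```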

Search:

  • search.keyword - Search for keyword in files (grep-like)
  • search.files - Find files by name pattern (glob-like: *.java)
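
A glob-like file pattern such as `*.java` can be matched by translating it into a regular expression. The sketch below handles only `*` and `?` and matches against the file name rather than the full path; both choices, and the function names, are assumptions rather than CodeWeaver's documented behavior.

```typescript
// Converts a simple glob (only * and ?) into an anchored RegExp.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except the two glob wildcards.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`);
}

// Filters paths whose final segment matches the glob.
function matchFiles(paths: readonly string[], pattern: string): string[] {
  const re = globToRegExp(pattern);
  return paths.filter((p) => re.test(p.split("/").pop() ?? p));
}
```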

Analysis:

  • analysis.file - Analyze single file for complexity and metrics
  • analysis.project - Analyze entire project for statistics

Version Control:

  • vcs.status - Get Git repository status
  • vcs.diff - Get diff for file(s)
  • vcs.blame - Get Git blame information for file
  • vcs.log - Get commit history
  • vcs.branches - Get list of all branches
  • vcs.compare - Compare two branches

System:

  • system.check - Check system dependencies (planned for MCP integration)

CLI Commands:

Info & Files:

  • codeweaver info - Display project information
  • codeweaver file read <path> [--limit N] [--numbers] - Read files
  • codeweaver file range <path> <start> <end> - Read line ranges
  • codeweaver file context <path> <line> [-c N] - Get context around line

Symbols:

  • codeweaver symbols index - Index project and extract symbols
  • codeweaver symbols find <name> - Find symbols by name
  • codeweaver symbols get <qualifiedName> - Get symbol details
  • codeweaver symbols list <kind> - List all symbols of a kind

Search:

  • codeweaver search keyword <keyword> [-i] [-m N] [-c N] [-e .ext] - Keyword search
  • codeweaver search files <pattern> - Find files by pattern
  • codeweaver search semantic <query> [--index] [-c collection] [-l N] - Semantic search 🆕

Analysis:

  • codeweaver analysis file <path> - Analyze file complexity and metrics
  • codeweaver analysis project [--top N] - Analyze project statistics
  • codeweaver analysis complexity <path> - Show complexity breakdown

Version Control:

  • codeweaver vcs status - Show Git repository status
  • codeweaver vcs diff [file] - Show diff for file(s)
  • codeweaver vcs blame <file> [-l <range>] - Show Git blame
  • codeweaver vcs log [-n N] [--since] [--author] - Show commit history
  • codeweaver vcs branches - List all branches
  • codeweaver vcs compare <base> <compare> - Compare two branches

File Watching: 🆕

  • codeweaver watch [--debounce N] [--code-only] [--docs-only] - Watch files and auto-update index

System Check: 🆕

  • codeweaver doctor - Check system dependencies (Node.js, Git, Python, Gradle, Maven)
  • codeweaver doctor --quick - Quick check (only critical dependencies)

Infrastructure:

  • Auto-detection (stdio = MCP mode, TTY = CLI mode)
  • Progress tracking (JSON Lines format to .codeweaver/progress.jsonl)
  • Checkpoint/resume capability
  • TypeScript strict mode, ESM modules
  • Vitest test framework (291 tests passing)
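
The auto-detection rule above (stdio = MCP mode, TTY = CLI mode) could look roughly like this. This is a sketch under assumptions: the function name is hypothetical, and giving the `--mcp` flag from the Quick Start priority over TTY detection is a guess at the precedence.

```typescript
type Mode = "cli" | "mcp";

// Hypothetical decision function: an explicit --mcp flag wins,
// otherwise an interactive terminal means CLI and piped stdio means MCP.
function detectMode(argv: readonly string[], stdinIsTTY: boolean): Mode {
  if (argv.includes("--mcp")) {
    return "mcp";
  }
  return stdinIsTTY ? "cli" : "mcp";
}

// Typical call site in a Node.js entry point:
//   const mode = detectMode(process.argv.slice(2), Boolean(process.stdin.isTTY));
```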

🏗️ Architecture

Multi-Agent System

graph TB
    CLI[CLI Interface<br/>7 Command Groups] --> SERVICE[CodeWeaverService]
    MCP[MCP Server<br/>19 Tools] --> SERVICE

    SERVICE --> DISC[Discovery Agent]
    SERVICE --> CACHE[Cache Agent]
    SERVICE --> SNIP[Snippets Agent]
    SERVICE --> SYM[Symbols Agent]
    SERVICE --> SEARCH[Search Agent]
    SERVICE --> ANALYSIS[Analysis Agent]
    SERVICE --> SEMANTIC[Semantic Index Agent]
    SERVICE --> WATCHER[File Watcher Agent]
    SERVICE --> VCS[VCS Agent]
    SERVICE --> STORE[Symbol Storage]

    DISC --> GRADLE[Gradle Parser]
    CACHE --> SHA[SHA-256 Hashing]
    SNIP --> TOKEN[Token Counter]
    SYM --> PLUGINREG[Plugin Registry]
    PLUGINREG --> JAVAPLUGIN[Java Plugin<br/>java-parser]
    PLUGINREG --> TSPLUGIN[TypeScript Plugin<br/>typescript-estree]
    PLUGINREG --> JSPLUGIN[JavaScript Plugin<br/>typescript-estree]
    PLUGINREG --> MDPLUGIN[Markdown Plugin<br/>remark]
    PLUGINREG --> PYPLUGIN[Python Plugin<br/>tree-sitter WASM]
    SEARCH --> REGEX[Regex Matching]
    ANALYSIS --> COMPLEXITY[Cyclomatic<br/>Complexity]
    SEMANTIC --> LANCEDB[LanceDB<br/>Vector Search]
    SEMANTIC --> ONNX[ONNX Runtime<br/>Embeddings]
    WATCHER --> CHOKIDAR[Chokidar<br/>File Watching]
    WATCHER --> SEMANTIC
    VCS --> GIT[Git Operations]
    STORE --> JSONL[JSON Lines]

    style SERVICE fill:#e1f5ff
    style CLI fill:#d4edda
    style MCP fill:#fff3cd
    style SYM fill:#cfe2ff
    style SEARCH fill:#cfe2ff
    style ANALYSIS fill:#cfe2ff
    style SEMANTIC fill:#fff3cd
    style WATCHER fill:#fff3cd
    style VCS fill:#cfe2ff

Directory Structure

src/
├── index.ts                      # Main entry (auto-detection)
├── cli/
│   ├── index.ts                  # CLI entry point
│   └── commands/
│       ├── info.ts               # Info command
│       ├── file.ts               # File commands
│       ├── symbols.ts            # Symbols commands (Phase 2)
│       ├── search.ts             # Search commands (Phase 2)
│       └── analysis.ts           # Analysis commands (Phase 3)
├── mcp/
│   ├── index.ts                  # MCP entry point
│   ├── server.ts                 # MCPServer class
│   └── tools.ts                  # Tool registration
├── core/
│   ├── service.ts                # Shared business logic
│   ├── agents/
│   │   ├── discovery.ts          # Gradle analysis
│   │   ├── cache.ts              # Caching
│   │   ├── snippets.ts           # File reading
│   │   ├── symbols.ts            # Multi-language symbol extraction (Phase 2)
│   │   ├── search.ts             # Keyword/pattern search (Phase 2)
│   │   ├── analysis.ts           # Complexity analysis (Phase 3)
│   │   ├── semantic.ts           # Vector search (Phase 4)
│   │   ├── vcs.ts                # Git operations (Phase 4)
│   │   └── watcher.ts            # File watching (Phase 4)
│   ├── language/
│   │   ├── plugin.ts             # LanguagePlugin interface
│   │   ├── detector.ts           # Language detection
│   │   ├── registry.ts           # Plugin registry
│   │   └── plugins/
│   │       ├── java/
│   │       │   ├── index.ts      # JavaLanguagePlugin
│   │       │   ├── parser.ts     # java-parser wrapper
│   │       │   └── extractor.ts  # Java symbol extraction
│   │       └── typescript/
│   │           ├── index.ts      # TypeScript/JavaScriptLanguagePlugin
│   │           ├── parser.ts     # typescript-estree wrapper
│   │           └── extractor.ts  # TS/JS symbol extraction
│   └── storage/
│       └── json-symbol-store.ts  # Symbol index
├── types/
│   ├── mcp.ts
│   ├── progress.ts
│   ├── project.ts
│   ├── cache.ts
│   ├── symbols.ts
│   └── analysis.ts               # Analysis types (Phase 3)
└── utils/
    ├── progress-writer.ts        # Progress tracking
    └── mode-detector.ts          # CLI vs MCP detection

tests/
├── unit/                         # 184 passing tests
│   ├── mcp/server.test.ts        # 6 tests
│   ├── agents/
│   │   ├── discovery.test.ts     # 4 tests
│   │   ├── cache.test.ts         # 5 tests
│   │   ├── snippets.test.ts      # 7 tests
│   │   ├── symbols.test.ts       # 23 tests (Java)
│   │   ├── search.test.ts        # 11 tests
│   │   ├── analysis.test.ts      # 11 tests
│   │   ├── semantic.test.ts      # Tests for semantic search
│   │   ├── vcs.test.ts           # Tests for VCS operations
│   │   └── watcher.test.ts       # Tests for file watcher
│   ├── language/
│   │   ├── detector.test.ts      # Language detection tests
│   │   ├── registry.test.ts      # Plugin registry tests
│   │   ├── java.test.ts          # Java plugin tests
│   │   └── typescript.test.ts    # 21 tests (TypeScript/JavaScript)
│   └── storage/
│       └── json-symbol-store.test.ts  # 5 tests
├── integration/                  # 17 passing tests
│   ├── smoke.test.ts             # 5 tests
│   └── multi-language.test.ts    # 12 tests (Multi-Language Integration)
└── fixtures/
    ├── gradle-projects/simple/   # Gradle test fixtures
    ├── java/                     # Java test files
    └── typescript/               # TypeScript/JavaScript test files

📖 Usage

As CLI Tool

Project & Files:

# Show project information
codeweaver info

# Read entire file
codeweaver file read src/core/service.ts

# Read file with line numbers
codeweaver file read src/core/service.ts --numbers

# Read file with token limit
codeweaver file read src/core/service.ts --limit 500

# Read specific lines (1-indexed, inclusive)
codeweaver file range src/core/service.ts 10 20

# Get context around line (default: ±5 lines)
codeweaver file context src/core/service.ts 42
codeweaver file context src/core/service.ts 42 --context 10

Symbols (Phase 2):

# Index entire project
codeweaver symbols index

# Find symbols by name (case-insensitive)
codeweaver symbols find "UserService"
codeweaver symbols find "get"  # Finds all getXxx methods

# Get specific symbol by qualified name
codeweaver symbols get "com.example.UserService"
codeweaver symbols get "com.example.UserService#findById"

# List all symbols of a kind
codeweaver symbols list class
codeweaver symbols list method
codeweaver symbols list field

Search (Phase 2):

# Search for keyword
codeweaver search keyword "TODO"
codeweaver search keyword "processData"

# Case-insensitive search
codeweaver search keyword "exception" --case-insensitive
codeweaver search keyword "exception" -i

# Search with context lines
codeweaver search keyword "TODO" --context 3
codeweaver search keyword "TODO" -c 3

# Limit results
codeweaver search keyword "public" --max-results 10
codeweaver search keyword "public" -m 10

# Filter by file extensions
codeweaver search keyword "interface" --extensions .java .ts
codeweaver search keyword "interface" -e .java -e .ts

# Find files by pattern
codeweaver search files "*.java"
codeweaver search files "*Test.java"
codeweaver search files "User*.ts"

As MCP Server

1. Configure MCP Client

Add to your MCP configuration (e.g., Claude Desktop):

{
  "mcpServers": {
    "codeweaver": {
      "command": "node",
      "args": [
        "/absolute/path/to/mcp-workbench/dist/index.js",
        "--mcp"
      ],
      "cwd": "/path/to/your/java/project"
    }
  }
}

Or use npm:

{
  "mcpServers": {
    "codeweaver": {
      "command": "npm",
      "args": ["run", "dev", "--", "--mcp"],
      "cwd": "/absolute/path/to/mcp-workbench"
    }
  }
}

2. Available MCP Tools (10 of 19 documented below)

Project & Files:

project.meta - Get project metadata

// Input: {} (no parameters)
// Output: ProjectMetadata
{
  "name": "my-project",
  "version": "1.0.0",
  "javaVersion": "21",
  "gradleVersion": "8.5",
  "modules": [...],
  "dependencies": [...],
  "plugins": [...]
}

file.read - Read file with token limit

// Input: { filePath: string, maxTokens?: number }
await mcp.call('file.read', {
  filePath: 'src/main/java/com/example/App.java',
  maxTokens: 5000
});

file.readRange - Read specific lines

// Input: { filePath: string, startLine: number, endLine: number }
await mcp.call('file.readRange', {
  filePath: 'src/main/java/com/example/App.java',
  startLine: 10,
  endLine: 30
});

file.readWithNumbers - Read with line numbers

// Input: { filePath: string }
await mcp.call('file.readWithNumbers', {
  filePath: 'src/main/java/com/example/App.java'
});
// Output: "  1: package com.example;\n  2: \n  3: public class App { ... }"

Symbols (Phase 2):

symbols.index - Index entire project

// Input: {} (no parameters)
await mcp.call('symbols.index', {});
// Output: { files: 15, symbols: 234, classes: 12, classList: [...] }

symbols.find - Find symbols by name

// Input: { name: string }
await mcp.call('symbols.find', {
  name: 'UserService'
});
// Output: SymbolDefinition[]

symbols.findByKind - Find symbols by kind

// Input: { kind: 'class' | 'method' | 'field' | 'constructor' }
await mcp.call('symbols.findByKind', {
  kind: 'method'
});
// Output: SymbolDefinition[]

symbols.get - Get symbol by qualified name

// Input: { qualifiedName: string }
await mcp.call('symbols.get', {
  qualifiedName: 'com.example.UserService#findById'
});
// Output: SymbolDefinition

Search (Phase 2):

search.keyword - Search for keyword in files

// Input: { keyword: string, caseSensitive?: boolean, maxResults?: number, contextLines?: number, fileExtensions?: string[] }
await mcp.call('search.keyword', {
  keyword: 'TODO',
  caseSensitive: false,
  maxResults: 50,
  contextLines: 2,
  fileExtensions: ['.java', '.ts']
});
// Output: SearchResult[] with file, line, column, content, beforeContext, afterContext
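
To make the shape of a keyword search result concrete, here is a minimal single-text sketch that produces line, column, content, and before/after context. It omits the `file` field because it searches one string only. The default values and the plain substring matching are assumptions, not CodeWeaver's confirmed behavior.

```typescript
// Illustrative result shape; mirrors the fields listed above except `file`.
interface KeywordHit {
  line: number;   // 1-indexed
  column: number; // 1-indexed
  content: string;
  beforeContext: string[];
  afterContext: string[];
}

interface KeywordOptions {
  caseSensitive?: boolean;
  maxResults?: number;
  contextLines?: number;
}

// Reports the first occurrence of the keyword on each matching line.
function searchKeyword(text: string, keyword: string, opts: KeywordOptions = {}): KeywordHit[] {
  const { caseSensitive = true, maxResults = 100, contextLines = 0 } = opts;
  const lines = text.split(/\r?\n/);
  const needle = caseSensitive ? keyword : keyword.toLowerCase();
  const hits: KeywordHit[] = [];
  for (let i = 0; i < lines.length && hits.length < maxResults; i++) {
    const haystack = caseSensitive ? lines[i] : lines[i].toLowerCase();
    const idx = haystack.indexOf(needle);
    if (idx === -1) continue;
    hits.push({
      line: i + 1,
      column: idx + 1,
      content: lines[i],
      beforeContext: lines.slice(Math.max(0, i - contextLines), i),
      afterContext: lines.slice(i + 1, i + 1 + contextLines),
    });
  }
  return hits;
}
```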

search.files - Find files by pattern

// Input: { pattern: string }
await mcp.call('search.files', {
  pattern: '*Test.java'
});
// Output: string[] (file paths)

🧪 Testing

# Run all tests
npm test

# Run tests in CI mode (no watch)
npm test -- --run

# Run specific test file
npm test -- tests/unit/agents/snippets.test.ts

Test Coverage:

  • ✅ MCP Server (6 tests)
  • ✅ Discovery Agent (4 tests)
  • ✅ Cache Agent (5 tests)
  • ✅ Snippets Agent (7 tests)
  • ✅ Symbol Storage (5 tests)
  • ✅ Symbols Agent (23 tests - Java)
  • ✅ Language Plugins (21 tests - TypeScript/JavaScript, 13 tests - Markdown)
  • ⚠️ Python Plugin (18 tests - skipped due to WASM config)
  • ✅ Search Agent (11 tests)
  • ✅ Analysis Agent (11 tests)
  • ✅ Semantic Agent (tests for vector search)
  • ✅ VCS Agent (tests for Git operations)
  • ✅ File Watcher Agent (tests for file watching)
  • ✅ Integration Tests (17 tests: 5 smoke + 12 multi-language)
  • Total: 256 passing (100%)

🔧 Development

Prerequisites

  • Node.js >= 20.0.0
  • TypeScript 5.7+
  • Java JDK 21 (for target projects)
  • Gradle (optional, wrapper preferred)

Setup

# Clone repository
git clone <repository-url>
cd mcp-workbench

# Install dependencies
npm install

# Build
npm run build

# Development mode (with auto-reload)
npm run build:watch

# Run in dev mode (no build required)
npm run dev

Scripts

npm run build           # Compile TypeScript
npm run build:watch     # Watch mode
npm run dev             # Run with tsx (no build)
npm test                # Run tests (watch mode)
npm run lint            # ESLint
npm run format          # Prettier
npm run clean           # Remove dist & cache
npm run validate-links  # Validate markdown links 🆕

Documentation Tools:

npm run validate-links           # Validate internal links (~2s)
npm run validate-links:external  # Include external links (~30s)
npm run validate-links:verbose   # Detailed output

See LINK_VALIDATION.md for details.


📊 Token Efficiency

CodeWeaver is designed to minimize token usage when providing code context to LLMs:

Strategies

  1. Line Ranges: Only send requested line ranges, not entire files
  2. Token Limits: Automatic truncation to configurable limits (default: 10k)
  3. Smart Truncation: Respects word boundaries when truncating
  4. Token Counting: Simple heuristic (~4 chars/token) for quick estimates
  5. Context Windows: Provide minimal context around specific lines
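
The token-limit strategies above can be sketched as follows, assuming the ~4 characters per token heuristic and the 10,000-token default stated in this README. The function names are hypothetical, and the whitespace-based truncation is only one possible reading of "respects word boundaries".

```typescript
const CHARS_PER_TOKEN = 4; // heuristic from the strategy list

// Quick estimate: roughly one token per four characters.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Truncates text to about maxTokens without cutting a word in half.
function truncateToTokens(text: string, maxTokens = 10_000): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  const cut = text.slice(0, maxChars);
  // If the cut already falls on whitespace, no word was split.
  if (/\s/.test(text.charAt(maxChars))) return cut;
  // Otherwise back up to the last whitespace inside the cut.
  const lastSpace = cut.search(/\s\S*$/);
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}
```

With this heuristic, a 1 KB snippet estimates to about 250 tokens, which matches the Token Estimation figures in this section.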

Token Estimation

| Content Type | Typical Size | Tokens (approx) |
|--------------|--------------|-----------------|
| Small snippet (20 lines) | ~1 KB | ~250 |
| Medium snippet (80 lines) | ~4 KB | ~1000 |
| Large snippet (200 lines) | ~10 KB | ~2500 |
| Project metadata | ~2 KB | ~500 |

Max Response Size: 10,000 tokens (~40 KB text)


🗺️ Roadmap

✅ Phase 1: Foundation (Complete - 100%)

  • ✅ MCP Server skeleton with tool registration
  • ✅ Progress tracking (JSON Lines)
  • ✅ Discovery Agent (Gradle metadata)
  • ✅ Cache Agent (content-addressable storage)
  • ✅ Symbol Storage (JSON Lines persistence)
  • ✅ Core Service (shared logic)
  • ✅ CLI Interface with commands
  • ✅ MCP Interface with stdio
  • ✅ Build & Test Setup (32 tests passing)
  • ✅ Snippets Agent with token limits
  • ✅ Documentation (complete)
  • ✅ Integration Tests (5 smoke tests)

✅ Phase 2: Indexing (Complete - 100%)

  • ✅ Symbols Agent (java-parser, symbol extraction)
  • ✅ Search Agent (keyword + pattern search)
  • ✅ Project-wide indexing (classes, methods, fields, constructors)
  • ✅ Symbol search (by name, kind, qualified name)
  • ✅ File search (glob patterns with * and ?)
  • ✅ Context search (lines before/after matches)
  • ✅ MCP Tools integration (6 new tools)
  • ✅ CLI Commands integration (symbols, search)
  • ✅ Full test coverage (19 new tests)
  • ✅ Documentation update

Note: LanceDB semantic search deferred to later phase as enhancement

✅ Phase 3: Analysis (Complete - 100%)

  • ✅ Analysis Agent (complexity & metrics calculation)
  • ✅ Cyclomatic Complexity calculation (if, loops, catch, &&, ||, ?:)
  • ✅ Code Metrics (LOC, SLOC, comments, blank lines)
  • ✅ Import analysis
  • ✅ Method call detection
  • ✅ Project-wide statistics (total complexity, average, top N files)
  • ✅ MCP Tools integration (2 new tools)
  • ✅ CLI Commands integration (analysis)
  • ✅ Full test coverage (11 new tests)
  • ✅ Documentation update
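
Cyclomatic complexity is commonly computed as one plus the number of decision points. The sketch below approximates that with a text scan over the constructs listed above (if, loops, catch, &&, ||, ?:). It is a rough estimate under assumptions: the function name is hypothetical, it does not parse the code, and constructs such as Java's `List<?>` wildcard would be miscounted.

```typescript
// Rough estimate: 1 + number of decision points found in the text.
function estimateComplexity(source: string): number {
  // Remove comments and string literals in a single left-to-right pass
  // so keywords and operators inside them are not counted.
  const code = source.replace(
    /\/\*[\s\S]*?\*\/|\/\/.*|"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'/g,
    " ",
  );
  // Branch keywords, short-circuit operators, and the ternary "?".
  // The lookarounds skip "??", "?." and "?:" (optional members in TypeScript).
  const decisionPoints = /\b(?:if|for|while|catch)\b|&&|\|\||(?<![?])\?(?![?.:])/g;
  return 1 + (code.match(decisionPoints)?.length ?? 0);
}
```

A parser-based count over the syntax tree would be more accurate; the text scan is shown only because it keeps the counting rule visible.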

Note: Static analysis tools (SpotBugs, Checkstyle) and Gradle runner deferred

✅ Phase 4: VCS Integration (Complete - 100%)

  • ✅ VCS Agent (Git operations)
  • ✅ Repository status (modified, added, deleted, untracked files)
  • ✅ Diff generation (file-level and project-level)
  • ✅ Blame information (line-by-line authorship)
  • ✅ Commit history (with filtering options)
  • ✅ Branch management (list, compare)
  • ✅ MCP Tools integration (6 new tools)
  • ✅ CLI Commands integration (vcs)
  • ✅ Full test coverage (11 new tests)
  • ✅ Documentation update

📋 Phase 5: Orchestration (Planned)

  • Orchestrator Agent (DAG-based pipeline)
  • Parallel task execution
  • Dependency resolution

🐛 Troubleshooting

Tests failing?

# Clean and reinstall
npm run clean
rm -rf node_modules package-lock.json
npm install
npm test -- --run

Build errors?

# Check TypeScript version
npx tsc --version  # Should be 5.7+

# Rebuild
npm run clean
npm run build

MCP server not responding?

# Check if running in MCP mode
npm run dev -- --mcp

# Verify stdio transport
echo '{}' | npm run dev -- --mcp

CLI not working?

# Ensure TTY mode (not piped)
npm run dev -- info

# Check built binary
node dist/index.js info

📝 Progress Tracking

View live progress during implementation:

# Bash/Git Bash
tail -f .codeweaver/progress.jsonl

# PowerShell
Get-Content .codeweaver\progress.jsonl -Wait

# Read checkpoint
cat .codeweaver/checkpoint.json
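
Since the progress file is in JSON Lines format, it can be read by parsing one JSON value per line. A minimal sketch follows. The function name is hypothetical, and no assumption is made about which fields each record contains, because the record schema is not documented in this README.

```typescript
// Parses JSON Lines text: one JSON value per non-empty line.
function parseJsonLines<T = unknown>(text: string): T[] {
  const records: T[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      records.push(JSON.parse(trimmed) as T);
    } catch {
      // The writer may be mid-append; skip a line that is not yet complete JSON.
    }
  }
  return records;
}
```

Skipping unparsable lines is a design choice for tailing a file that is still being written; a stricter reader could throw instead.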

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Write tests for new features
  4. Ensure all tests pass: npm test -- --run
  5. Follow code style (ESLint + Prettier)
  6. Submit a pull request

Code Style

  • TypeScript Strict Mode: All type errors must be resolved
  • ESM Modules: Use .js extensions in imports
  • Test-Driven Development: Write tests first
  • No Unused Variables: Clean code, no warnings

📄 License

MIT License - see LICENSE


📚 Documentation

Full documentation in docs/

🗺️ Quick Access

Complete navigation: docs/INDEX.md - Full documentation index with navigation by role

Popular documents:

🔗 External Links


🎯 Current Status

Beta Release v0.6.0 ⚠️

✅ Working Features:

  • MCP Server with 19 tools (project, files, symbols, search, analysis, vcs)
  • CLI with 7 command groups (includes watch mode)
  • Multi-Language Support - Java, TypeScript, JavaScript, Markdown, Python with plugin architecture
  • Semantic Search with ONNX Runtime optimizations
  • Multi-Collection Support (Code + Docs)
  • File Watcher for automatic index updates
  • Symbol Extraction - Complete support for Java, TypeScript, JavaScript, Markdown, and Python
  • Code Quality Analysis - Cyclomatic complexity, LOC metrics
  • Git Integration - Status, diff, blame, log, branches
  • 291 tests passing (100% - all features tested)

⚠️ Known Limitations:

  • Performance varies on large codebases (>10k files)
  • Semantic search memory usage can be high
  • File watcher may miss rapid changes
  • Documentation is incomplete
  • Breaking changes expected in future releases

🔮 Planned Improvements:

  • Python WASM configuration ✅ COMPLETED
  • GPU acceleration for semantic search
  • Better error messages
  • More language support (Go, Rust, C#, etc.)
  • Performance optimizations
  • Comprehensive documentation

💡 Philosophy

CodeWeaver follows these principles:

  1. Token Efficiency First - Never overwhelm LLMs with entire files
  2. Zero Native Dependencies - Pure Node.js for portability
  3. Test-Driven Development - Tests before implementation
  4. Dual Interface - Same codebase serves CLI and MCP
  5. Progressive Enhancement - Working foundation, build up from there

Built with ❤️ for the LLM-assisted development workflow.
