forgemcp

Quality-aware code intelligence that turns GitHub search into ranked, explainable, import-ready recommendations, enabling developers to find the best code implementations with archetype clustering and provenance-based import.


<div align="center">

🔥 GeniusMCP

Quality-aware code intelligence that turns GitHub search into ranked, explainable, import-ready recommendations.

Node 20+ Tests License MIT MCP

<br/>

Not another grep. An intelligence layer.

Quick Start · How It Works · Soul · Tools · Architecture

</div>


The Problem

Every code search tool answers "where is this string?"

None of them answer "what is the best implementation, why, and can I safely use it?"

When you ask genius.hunt("retry with backoff"), GeniusMCP returns:

Archetype 1 - Minimal inline helper
  ✅ 12 LOC, zero deps, copy-paste ready
  Exemplar: owner/repo - score 0.87 (battle_tested)
  Why: test-adjacent, MIT license, 3 years stable

Archetype 2 - Configurable utility
  ✅ Options-driven, max attempts + jitter strategy
  Exemplar: owner/repo2 - score 0.82
  Why: 14K stars, active maintenance, comprehensive docs

Archetype 3 - Middleware pattern
  ✅ Express/Fastify compatible, interceptor-based
  Exemplar: owner/repo3 - score 0.79
  Tradeoff: framework-coupled

Coverage: 3 sources searched, 2 blind spots, confidence: 0.83

That's the gap GeniusMCP fills.


✨ Key Features

| Feature | What it does |
|---|---|
| 🎯 Archetype Search | Finds 3-5 structural families, not 200 raw matches |
| 📊 6-Bucket Quality Scoring | queryFit · durability · vitality · importability · codeQuality · evidenceConfidence |
| 🔍 Multi-Source Discovery | grep.app (free, 1M repos) + GitHub Code Search (200M repos) + searchcode (75B lines) |
| 🧬 3-Level Dedup | Exact SHA → normalized AST hash → winnowing fingerprint families |
| 📜 Provenance-First Import | License gate · dependency closure · policy checks · attribution |
| 🧠 Persistent Memory | Every search enriches the local evidence graph. Session 50 is smarter than session 1. |
| 🪝 Auto-Capture Hooks | Claude Code hooks capture patterns from every file you read/write |
| 💉 Pre-Prompt Injection | Relevant memories injected BEFORE the AI responds |
| 🏗️ 7 Archetype Categories | minimal · configurable · middleware · context-aware · distributed · enterprise · wrapper |
| 📋 Transparent Uncertainty | Every result shows blind spots + evidence confidence |
| ⚡ Tiered Responses | L1 (80 tokens) / L2 (300) / L3 (2000): adaptive detail level per result count |
| 🛡️ Circuit Breakers | Per-source fault isolation: GitHub/grep.app/searchcode fail independently |
| 🎰 Thompson Sampling | Multi-armed bandit learns which sources produce the best results per query type |
| 🔤 SAC Matching | getUserSession finds get_user_session: cross-convention identifier similarity |
| 📦 Signature Compression | Repomix-style 70% token reduction: strips bodies, keeps signatures |
| 🔍 Dynamic Discovery | forge_discover("search code"): find tools by intent instead of memorizing 28 names |
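
The cross-convention matching idea from the SAC Matching row can be illustrated with a simplified sketch. This is not the project's SAC algorithm, only the underlying intuition: split an identifier into lowercase subwords so that camelCase, snake_case, and kebab-case spellings compare equal. The names `subwords`, `canonical`, and `sameIdentifier` are hypothetical.

```typescript
// Simplified illustration only: NOT the project's SAC implementation.
// Split an identifier into lowercase subwords, whatever its naming convention.
function subwords(identifier: string): string[] {
  return identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // camelCase boundary: getUser -> get User
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1 $2") // acronym boundary: HTTPServer -> HTTP Server
    .split(/[^A-Za-z0-9]+/) // underscores, hyphens, and the spaces added above
    .filter((part) => part.length > 0)
    .map((part) => part.toLowerCase());
}

// Canonical form used for comparison (snake_case chosen arbitrarily).
function canonical(identifier: string): string {
  return subwords(identifier).join("_");
}

// True when two identifiers consist of the same subword sequence.
function sameIdentifier(a: string, b: string): boolean {
  return canonical(a) === canonical(b);
}
```

With this sketch, `sameIdentifier("getUserSession", "get_user_session")` is true, while `getUser` and `getUsers` remain distinct. A real similarity score would presumably also tolerate partial overlap between subword sequences.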

🏆 Why GeniusMCP

| | GitHub MCP | grep.app MCP | DeusData | GeniusMCP |
|---|---|---|---|---|
| Multi-source search | 1 source | 1 source | local only | 3 sources |
| Quality scoring | no | no | no | 6-bucket |
| License verification | no | no | no | yes |
| Import with provenance | no | no | no | yes |
| Persistent memory | no | no | knowledge graph | Bayesian + decay |
| Cross-convention matching | no | no | no | SAC algorithm |
| Fault tolerance | no | no | no | circuit breakers |
| Token efficiency | no | no | no | L1/L2/L3 tiers |

🚀 Quick Start

# 1. Clone and install
git clone https://github.com/geniussigmaskibidi-gif/geniusmcp
cd geniusmcp && pnpm install && pnpm build

# 2. Optional: GitHub auth (enables GitHub Code Search + metadata)
export GITHUB_TOKEN=ghp_your_token

Add to Claude Code (.mcp.json in your project root)

{
  "mcpServers": {
    "forgemcp": {
      "command": "node",
      "args": ["/path/to/forgemcp/apps/mcp-server/dist/index.js"],
      "env": { "GITHUB_TOKEN": "ghp_your_token" }
    }
  }
}

The server auto-indexes your project on start, so code.reach, code.map, and code.symbols work immediately.

Optional: Claude Code Hooks (auto-capture + injection)

{
  "hooks": {
    "PostToolUse": [
      { "matcher": "Read|Write|Edit", "command": "node hooks/genius-capture.js" }
    ],
    "UserPromptSubmit": [
      { "command": "node hooks/genius-inject.js" }
    ]
  }
}

💡 Usage Examples

Find the best implementation of a concept

You: "Find me a good rate limiter implementation"
Agent calls: genius.hunt("rate limiter", language: "typescript", tier: "L1")
→ 5 ranked archetypes in 130 tokens, with stars/license/test signals

Import code with license verification

You: "Import that circuit breaker from the best result"
Agent calls: import.extract("owner/repo", "src/circuit-breaker.ts", symbol: "CircuitBreaker")
→ Full code + MIT license verified + provenance hash + attribution comment

Compare approaches across repos

You: "Should I use Zod or Ajv for validation?"
Agent calls: research.deep_compare("validation", ["colinhacks/zod", "ajv-validator/ajv"])
→ Side-by-side: Zod 42K stars vs Ajv 14K, both MIT+CI, structured quality signals

Remember and recall across sessions

Session 1: genius.hunt("retry backoff") → auto-stores top 3 results
Session 2: memory.recall("retry") → instant recall, no API calls needed

Explore unfamiliar repository

You: "How does Hono handle errors?"
Agent calls: research.archaeology("honojs/hono", "error handling")
→ Found .onError() handler, JWT error middleware, 29K stars, TypeScript

Read soul.md for the complete AI agent reasoning guide: search strategies, anti-patterns, and token budget optimization.


🔄 How It Works

graph TD
    Q["genius.hunt('retry backoff')"] --> C[QueryCompiler]
    C --> |grep.app queries| G[grep.app MCP]
    C --> |GitHub queries| GH[GitHub Code Search]
    C --> |hydration queries| SC[searchcode.com]

    G --> D[Dedup Engine]
    GH --> D
    SC --> D

    D --> |"180 hits → 60 blobs"| E[Symbol Extractor]
    E --> F[Winnowing Fingerprint]
    F --> |"60 → 8 families"| CL[Archetype Classifier]
    CL --> R[Quality Scorer]
    R --> |"6-bucket ranking"| OUT["3-5 ranked archetypes<br/>with explanations"]

    OUT --> MEM[(Evidence Graph<br/>SQLite)]
    MEM --> |"next search: instant recall"| Q
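
The Winnowing Fingerprint step in the graph can be sketched as follows. This is a minimal illustration of the general winnowing technique plus Jaccard similarity, not the hunt-engine code: the function names, the FNV-1a hash, the whitespace and case normalization, and the parameters `k = 5` and `w = 4` are all assumptions made for the example.

```typescript
// 32-bit FNV-1a hash of a string (hash choice is an assumption for this sketch).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Winnowing: hash every k-gram, then keep the minimum hash of each window of w hashes.
function winnow(source: string, k = 5, w = 4): Set<number> {
  // Normalization is deliberately crude here: drop whitespace, lowercase.
  const text = source.replace(/\s+/g, "").toLowerCase();
  const hashes: number[] = [];
  for (let i = 0; i + k <= text.length; i++) {
    hashes.push(fnv1a(text.slice(i, i + k)));
  }
  const fingerprints = new Set<number>();
  if (hashes.length === 0) return fingerprints;
  const windowSize = Math.min(w, hashes.length);
  for (let start = 0; start + windowSize <= hashes.length; start++) {
    fingerprints.add(Math.min(...hashes.slice(start, start + windowSize)));
  }
  return fingerprints;
}

// Jaccard similarity of two fingerprint sets; two empty sets are scored 0 here.
function jaccard(a: Set<number>, b: Set<number>): number {
  let shared = 0;
  a.forEach((value) => {
    if (b.has(value)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

Two blobs whose fingerprint sets exceed some Jaccard threshold would land in the same family. The threshold itself is not stated in this README, so none is assumed here.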

The Magic Loop

Session 1: "Find best rate limiter" → searches 3 sources → 60 unique blobs → 5 archetypes
           → Results cached in evidence graph

Session 2: "Rate limiter for Express" → local memory: 40 instant hits + 20 new
           → Faster, smarter, more relevant

Session 10: "Throttle middleware" → 120 cached patterns, <100ms response
            → Compound intelligence

🛠️ Tools (28 MCP Tools)

🎯 Hunt Intelligence (flagship)

| Tool | Description |
|---|---|
| genius.hunt | Find best implementations with archetype clustering, quality scoring, coverage report |
| genius.explain | Full signal breakdown: why this ranked #1 |
| genius.compare | Head-to-head comparison with bucket deltas |
| genius.import | Policy-aware import with provenance manifest |

🧠 Memory (compound intelligence)

| Tool | Description |
|---|---|
| memory.recall | Search past patterns by concept |
| memory.store | Save pattern to persistent memory |
| memory.evolve | Create improved version linked to parent |
| memory.related | Find connected patterns |
| memory.link | Create relationships between patterns |
| memory.stats | Memory size, coverage, confidence distribution |
| memory.forget | Remove outdated patterns |

🧭 Code Navigation (1 call = 10 Read/Greps)

| Tool | Description |
|---|---|
| code.reach | Jump to symbol with full context: callers, callees, deps |
| code.map | Instant project architecture map |
| code.trace | Call chain between functions |
| code.understand | Compressed module understanding |
| code.symbols | All exports with signatures |

🔬 Research (persistent reasoning chains)

| Tool | Description |
|---|---|
| research.archaeology | Trace code evolution |
| research.deep_compare | Structured comparison with metrics |
| research.start_chain | Begin research thread |
| research.add_step | Record reasoning step |
| research.conclude | Mark chain completed |
| research.recall_chain | Search past research |

🐙 GitHub

| Tool | Description |
|---|---|
| github.search_repos | Search by query, language, stars |
| github.search_code | Code search across GitHub |
| github.repo_overview | Stars, CI, license, health |
| github.repo_file | Get file content |
| github.repo_tree | Recursive file tree |

🏗️ Architecture

┌─────────────────────────────────────────────────────┐
│                  GeniusMCP Server                   │
│                                                     │
│  Layer 1: DISCOVERY                                 │
│    grep.app MCP · GitHub Code Search API            │
│                                                     │
│  Layer 2: HYDRATION                                 │
│    GitHub Trees/Contents · searchcode analysis      │
│                                                     │
│  Layer 3: EVIDENCE GRAPH                            │
│    SourceHit → Blob → SymbolSlice → PatternFamily   │
│                                                     │
│  Layer 4: PATTERN INTELLIGENCE                      │
│    3-level dedup · archetype classifier · scorer    │
│                                                     │
│  Layer 5: IMPORT & POLICY                           │
│    License gate · provenance · dep closure          │
│                                                     │
│  Layer 6: EVALUATION                                │
│    Coverage confidence · blind spots · metrics      │
└─────────────────────────────────────────────────────┘

Quality Scoring (RFC v2)

overall = 0.35 × queryFit + 0.50 × qualityComposite + 0.15 × evidenceConfidence

qualityComposite = weights[preset] × {durability, vitality, importability, codeQuality}

Presets: battle_tested · modern_active · minimal_dependency · teaching_quality

Hard caps: snippet_only → evidence ≤ 0.60 · archived → vitality ≤ 0.20 · license_unknown → importability ≤ 0.20
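
The formula and hard caps above can be read as the following sketch. The 0.35 / 0.50 / 0.15 split and the three caps come from this README. The per-preset weights are not listed here, so the equal weights below are a placeholder assumption, as are all type and function names.

```typescript
interface Buckets {
  queryFit: number;
  durability: number;
  vitality: number;
  importability: number;
  codeQuality: number;
  evidenceConfidence: number;
}

interface Flags {
  snippetOnly?: boolean;
  archived?: boolean;
  licenseUnknown?: boolean;
}

type QualityWeights = Pick<Buckets, "durability" | "vitality" | "importability" | "codeQuality">;

// ASSUMPTION: placeholder weights. The real preset weights are not documented in this README.
const EQUAL_WEIGHTS: QualityWeights = {
  durability: 0.25,
  vitality: 0.25,
  importability: 0.25,
  codeQuality: 0.25,
};

function overallScore(b: Buckets, flags: Flags = {}, weights: QualityWeights = EQUAL_WEIGHTS): number {
  // Hard caps are applied before the buckets are combined.
  const evidence = flags.snippetOnly ? Math.min(b.evidenceConfidence, 0.6) : b.evidenceConfidence;
  const vitality = flags.archived ? Math.min(b.vitality, 0.2) : b.vitality;
  const importability = flags.licenseUnknown ? Math.min(b.importability, 0.2) : b.importability;

  const qualityComposite =
    weights.durability * b.durability +
    weights.vitality * vitality +
    weights.importability * importability +
    weights.codeQuality * b.codeQuality;

  return 0.35 * b.queryFit + 0.5 * qualityComposite + 0.15 * evidence;
}
```

Under these placeholder weights, a candidate with every bucket at 1.0 drops from about 1.0 to about 0.9 once its repository is archived, because vitality is capped at 0.20.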


📦 Monorepo Structure

forgemcp/
  packages/
    core/              - Types, config, errors (Zod-validated)
    db/                - SQLite WAL, blob store, search index, evidence graph
    ast-intelligence/  - Symbol extraction, call graph, architecture detection
    repo-memory/       - Bayesian confidence + Ebbinghaus decay engine
    github-gateway/    - Octokit + 4-bucket rate governor + ETag cache
    data-sources/      - grep.app + searchcode + source orchestrator
    hunt-engine/       - Winnowing, clustering, scoring, archetype classifier
    importer/          - License policy + provenance + style adaptation
  apps/
    mcp-server/        - MCP server + 5 skill modules + hook daemon + dynamic tools
  hooks/               - Claude Code auto-capture scripts
  tests/               - 252 tests (vitest)
  .github/workflows/   - CI (Node 20/22, build + typecheck + test)
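
As a rough illustration of what "Bayesian confidence + Ebbinghaus decay" in repo-memory could mean, the sketch below combines a Beta-posterior mean with an exponential forgetting curve. How the package actually combines the two is not described in this README, so the multiplication, the uniform prior, the 30-day stability constant, and every name here are assumptions.

```typescript
interface MemoryRecord {
  successes: number; // times the pattern proved useful
  failures: number; // times it was rejected or replaced
  lastUsedMs: number; // timestamp of the last use, in milliseconds
}

// Posterior mean of a Beta(1, 1) prior updated with observed successes and failures.
function bayesianConfidence(record: MemoryRecord): number {
  return (record.successes + 1) / (record.successes + record.failures + 2);
}

// Ebbinghaus-style forgetting curve: retention = exp(-elapsed / stability).
function retention(elapsedMs: number, stabilityMs: number): number {
  return Math.exp(-Math.max(0, elapsedMs) / stabilityMs);
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000; // hypothetical stability constant

// ASSUMPTION: confidence and retention are simply multiplied.
function effectiveConfidence(record: MemoryRecord, nowMs: number, stabilityMs: number = THIRTY_DAYS_MS): number {
  return bayesianConfidence(record) * retention(nowMs - record.lastUsedMs, stabilityMs);
}
```

A pattern with no feedback starts at 0.5, and any pattern left unused for one stability period keeps about 37% (1/e) of its confidence under this model.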

🧪 Testing

npx vitest run
# 22 test suites, 252 tests, all passing (<1s)

| Suite | Tests | What it covers |
|---|---|---|
| foundation | 19 | ForgeResult, Logger, Health, Context |
| blob-store | 10 | Content-addressable storage, dedup, file refs |
| blob-lifecycle | 11 | GC, pinning, integrity scrub |
| symbol-extractor | 13 | TypeScript, Python, Go extraction + fingerprinting |
| parser-registry | 6 | Multi-backend precision routing |
| search-index | 4 | FTS5 trigram, BM25, RRF fusion |
| simhash | 14 | Near-duplicate detection, Hamming distance |
| chunker | 8 | Semantic code chunking, symbol boundaries |
| query-planner | 14 | Query classification, lane planning |
| ranking-v2 | 13 | BM25F weights, retrieval scoring, lexical+structural |
| memory-engine | 15 | Store, recall, capture, Bayesian confidence, Ebbinghaus decay |
| memory-v2 | 6 | L1/L2/L3 capsule builder, token estimation |
| call-graph | 9 | 2-pass resolution, BFS reachability, path tracing |
| winnowing | 12 | Fingerprints, Jaccard similarity, clone clustering |
| policy-engine | 11 | 4-mode import policy, license gates, provenance |
| evidence-graph | 8 | v2 schema: query runs, slices, families, versioned scores |
| job-queue | 10 | Durable job queue, priority, backoff, dead-letter |
| circuit-breaker | 19 | Circuit breaker state machine, bulkhead, resilient search |
| token-budget | 22 | Token estimation, tier selection, truncation, compression |
| source-selector | 5 | Thompson Sampling, convergence, discounting |
| early-terminator | 6 | Welford online stats, adaptive saturation |
| sac | 17 | Subword Affine Canonicalization, cross-convention matching |
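
For readers unfamiliar with the circuit breaker state machine that the circuit-breaker suite covers, here is a generic sketch of the pattern with one breaker per source. It is not the project's implementation: the class shape, the failure threshold, the cooldown, and the `breakerFor` helper are hypothetical.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // Threshold and cooldown defaults are arbitrary example values.
  constructor(threshold = 3, cooldownMs = 30000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Open breakers reject calls until the cooldown elapses, then allow a probe (half-open).
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half_open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed probe, or too many consecutive failures, (re)opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half_open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}

// One breaker per source, so a failing source does not block the others.
const breakers = new Map<string, CircuitBreaker>();

function breakerFor(source: string): CircuitBreaker {
  let breaker = breakers.get(source);
  if (!breaker) {
    breaker = new CircuitBreaker();
    breakers.set(source, breaker);
  }
  return breaker;
}
```

A search orchestrator would check `breakerFor("grep.app").allowRequest()` before each call and report the outcome afterwards, skipping the source while its breaker is open.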

🎯 Design Principles

  1. Evidence, not opinions: every score has signals you can inspect
  2. Local-first: works offline for indexed repos
  3. Zero ML in core: lexical + structural, semantic is opt-in
  4. Provenance always: every import traced to source + license
  5. Progressive learning: every search enriches the evidence graph
  6. Transparent uncertainty: blind spots shown, not hidden

📊 Tech Stack

| Component | Technology |
|---|---|
| Protocol | MCP SDK 1.28 (stdio + Streamable HTTP) |
| Database | SQLite (WAL mode, better-sqlite3) |
| Search | FTS5 trigram + BM25F + Reciprocal Rank Fusion |
| AST | Regex multi-language + ast-grep upgrade path |
| Dedup | Winnowing fingerprints (Schleimer 2003) + Jaccard clustering |
| GitHub | Octokit + throttling + retry + 4-bucket rate governor |
| External | grep.app MCP + searchcode.com |
| Validation | Zod |
| Resilience | Circuit breakers + bulkheads + decorrelated jitter |
| Ranking | SAC cross-convention matching + Thompson Sampling source routing |
| Token Efficiency | L1/L2/L3 tiered responses + signature compression |
| Tests | Vitest (252 tests, <1s) |
| Monorepo | pnpm + Turborepo |
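
Reciprocal Rank Fusion, listed in the Search row, merges several ranked lists by summing 1 / (k + rank) for each item. The sketch below shows the generic technique. The constant k = 60 is the value commonly used in the literature, not a value confirmed for this project, and the function name and result shape are hypothetical.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Generic Reciprocal Rank Fusion: each list contributes 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  const fused: FusedResult[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  return fused.sort((a, b) => b.score - a.score);
}
```

For example, fusing `["a", "b", "c"]` with `["b", "c", "a"]` ranks `b` first, because it sits near the top of both lists, followed by `a` and then `c`.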

📄 License

MIT


<div align="center">

Built for AI agents that never forget.

Report Bug · Request Feature · Discussions

</div>
