cc-sensei
Distills Claude Code's 310K-line source into 32 queryable modules via MCP, enabling AI agents to learn and reuse Claude Code's architectural patterns through natural language.
<div align="center">
cc-sensei
Inject every ounce of Claude Code's architectural wisdom into your AI's brain
<img src="assets/social-preview.png" alt="cc-sensei: 310K lines of source distilled into 32 queryable modules via MCP" width="100%" />
</div>
Sound familiar?
"I've already dug deep into Claude Code's source, but when I ask my AI to help me build an Agent, it has no idea what I'm talking about. What I read, it didn't."
"I'd love to build a knowledge base so my AI can learn from Claude Code, but the moment I open it (300K+ lines, 1,250 files), both my AI and I are scared off."
"I've barely looked at Claude Code's source, but I still want an Agent just as powerful. Where do I even begin?"
If any of these hit home, this project is built for you.
What it does for you
Agent Architecture Oracle (called "Oracle" below) distills Claude Code's 310K lines of source across 1,250 files into 32 modules with 40K lines of structured analysis, and serves it directly to your AI over the Model Context Protocol (MCP).
It covers every core capability of Claude Code:
Agent main loop · Tool system & execution pipeline · Permissions & security sandbox · Model API & streaming · Context engineering & compaction · File / Shell / Git tools · MCP protocol · Ink rendering engine · Skills/plugins · Persistent memory · Prompt cache break detection · Sub-agents & task system … (22 core + 10 deep-dive supplements)
It establishes a complete index pipeline: natural language → module → design decision → reusable pattern → corresponding source code:

    You say one thing                  Oracle automatically does five things
    ─────────────────────────          ────────────────────────────────────────
    "How do you compact long    ──►    ① Match natural language to M06 (Context Engineering)
     conversations?"                   ② Return the 4-tier compaction architecture & 9-section summary prompt
                                       ③ List 5 directly-copyable engineering principles
                                       ④ Cross-trace the 12 modules touched by prompt cache
                                       ⑤ Pull up services/compact/microCompact.ts source on demand

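The first step of that pipeline, matching a question to a module, could be sketched roughly as follows. This is a minimal illustration and not the project's actual retrieval engine: the `KeywordIndex` shape, the toy data, and the prefix-matching rule are all assumptions made for the example.

```typescript
// Hypothetical shape of a keyword index: keyword -> module IDs that mention it.
type KeywordIndex = Record<string, string[]>;

// Toy data for illustration only; the real keyword-index.json may look different.
const keywordIndex: KeywordIndex = {
  compact: ["M06"],
  conversation: ["M06", "M12"],
  cache: ["M05", "M06"],
  permission: ["M04"],
};

// Score each module by how many query tokens hit one of its keywords.
function matchModules(
  query: string,
  index: KeywordIndex,
): Array<{ module: string; score: number }> {
  const tokens = query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const scores = new Map<string, number>();
  for (const token of tokens) {
    for (const [keyword, modules] of Object.entries(index)) {
      // Prefix match so "conversations" still hits "conversation".
      if (!token.startsWith(keyword)) continue;
      for (const m of modules) scores.set(m, (scores.get(m) ?? 0) + 1);
    }
  }
  // Highest score first; ties broken by module ID for stable output.
  return Array.from(scores.entries())
    .map(([module, score]) => ({ module, score }))
    .sort((a, b) => b.score - a.score || a.module.localeCompare(b.module));
}

const ranked = matchModules("How do you compact long conversations?", keywordIndex);
console.log(ranked[0]?.module); // "M06" with the toy index above
```

A precomputed keyword map keeps the lookup cheap at query time, which fits the README's description of doing the analysis at build time.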
After plugging it into Claude Desktop / Cursor / Qoder / your own Agent, your AI gains a "Chief Architect of Claude Code" as its consultant. Whether you're cloning the whole thing or just stealing one design, it tells you exactly where to look, why it works, and how to copy it.
In one night, your AI becomes a world-class Agent architect.
Quick Start (3 steps)
Option A: Try instantly via npx (zero install)

    npx cc-sensei

This downloads and runs the server directly. Skip to Step 3 to connect your AI.
Option B: Clone for full control
Step 1: Clone the project

    git clone https://github.com/contradictory-body/cc-sensei.git && cd cc-sensei

Step 2: Install and build
Copy-paste again:

    pnpm install && pnpm build

Don't have pnpm? Run `npm install -g pnpm` first.
This step parses the 32 module analyses, generates the knowledge index, and compiles the server. It only needs to run once.
When done, you should see `module-registry.json (32 modules)` and `Build success`.
Step 3: Connect your AI
Add this snippet to your MCP client's configuration file (replace the path with the one you just cloned to):

    {
      "mcpServers": {
        "cc-sensei": {
          "command": "node",
          "args": ["/your/absolute/path/cc-sensei/dist/server.js"]
        }
      }
    }

Don't know where the config file is? Common locations:
| Client | Config file path |
|---|---|
| Claude Desktop (macOS) | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Claude Desktop (Windows) | %APPDATA%\Claude\claude_desktop_config.json |
| Cursor | Settings → MCP → Add new server |
| Qoder | Settings → MCP → Edit mcpServers |
Restart your client. Done! Your AI just gained 6 new skills.
Try it out
Just say to your AI:
"Use cc-sensei to tell me how Claude Code optimizes prompt cache."
It will call the tools on its own and return a complete analysis with source-code references. That's it.
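Under the hood, an MCP client turns that sentence into a `tools/call` JSON-RPC request. The sketch below shows roughly what such a request looks like on the wire. The envelope follows the MCP specification, but the argument names `query` and `depth` are assumptions, since the README does not list each tool's input schema.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "query" and "depth" are assumed argument names, not confirmed by the README.
const request = buildToolCall(1, "query_architecture", {
  query: "How does Claude Code optimize prompt cache?",
  depth: "standard",
});

// Over the stdio transport, each message is written as one line of JSON.
console.log(JSON.stringify(request));
```

In normal use you never write this by hand. The client builds it after the model decides which tool to call.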
Dive deeper
What the 6 tools do
| Tool | One-line capability | When to use |
|---|---|---|
| `list_modules` | List all 32 modules and their concerns | "Which modules does Claude Code break down into?" |
| `query_architecture` | Natural-language search with 3 depth levels (brief/standard/deep) | "How do they prevent long-conversation context overflow?" |
| `get_module` | Drill into one module's specific section (responsibility / architecture / decisions / principles / relations) | "Show me M06's design principles" |
| `trace_concern` | Trace one concern across all 32 modules, ranked by hit count | "Which modules touch 'prompt cache'?" |
| `search_patterns` | Bulk-extract reusable patterns with built-in size limiting | "Give me every cache-related design I can copy" |
| `get_source_code` | Read Claude Code source directly (with line numbers, ranges, and directory listings) | "Show me services/api/claude.ts:1412-1456" |
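To make "ranked by hit count" concrete, here is a rough sketch of how a `trace_concern`-style ranking could work. It is an illustration under stated assumptions, not the project's implementation: the `ConcernHit` shape, the plain substring matching, and the toy notes are all hypothetical.

```typescript
interface ConcernHit {
  module: string;
  hits: number;
  excerpts: string[];
}

// Count case-insensitive occurrences of a concern in each module's notes and
// keep up to maxExcerpts matching lines, each prefixed with its line number.
function traceConcern(
  concern: string,
  notes: Record<string, string>,
  maxExcerpts = 5,
): ConcernHit[] {
  const needle = concern.trim().toLowerCase();
  if (!needle) return []; // nothing to search for
  const results: ConcernHit[] = [];
  for (const [module, text] of Object.entries(notes)) {
    let hits = 0;
    const excerpts: string[] = [];
    text.split("\n").forEach((line, i) => {
      const lower = line.toLowerCase();
      let from = lower.indexOf(needle);
      if (from === -1) return;
      while (from !== -1) {
        hits += 1;
        from = lower.indexOf(needle, from + needle.length);
      }
      if (excerpts.length < maxExcerpts) excerpts.push(`${i + 1}: ${line.trim()}`);
    });
    if (hits > 0) results.push({ module, hits, excerpts });
  }
  // Most hits first; ties broken by module ID.
  return results.sort((a, b) => b.hits - a.hits || a.module.localeCompare(b.module));
}

// Toy notes for illustration only.
const sampleNotes: Record<string, string> = {
  M05: "Prompt cache keys are stable.\nStreaming uses SSE.",
  M06: "Prompt cache break detection.\nCompaction preserves the prompt cache; prompt cache again.",
  M11: "Ink rendering.",
};

console.log(traceConcern("prompt cache", sampleNotes).map((h) => `${h.module}:${h.hits}`));
```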
Every tool comes with:
- Path-prefix tolerance: both `src/...` and bare paths are accepted, so copy-pasted paths no longer fail
- Result size limiting: 800 chars per section and 12 sections by default, so your AI's context window stays alive
- Path-traversal protection: attacks like `../../../etc/passwd` are rejected outright
- Graceful degradation: bad arguments return friendly hints instead of crashes
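The guarantees above can be sketched in a few lines. This is a hedged illustration rather than the project's `validatePath`: the function names `resolveSafePath` and `limitSections` are hypothetical, and the prefix-stripping rule assumes the source root itself is the `src` directory.

```typescript
import * as path from "node:path";

// Resolve a user-supplied path against the source root, tolerating an optional
// leading "src/" prefix, and reject anything that escapes the root.
function resolveSafePath(root: string, requested: string): string {
  const normalizedRoot = path.resolve(root);
  // Assumption: the root already points at src/, so a leading "src/" is redundant.
  const withoutPrefix = requested.replace(/^(\.\/)?src\//, "");
  const resolved = path.resolve(normalizedRoot, withoutPrefix);
  const relative = path.relative(normalizedRoot, resolved);
  const escapes =
    relative === ".." || relative.startsWith(".." + path.sep) || path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes source root: ${requested}`);
  }
  return resolved;
}

// Cap both the number of sections and the length of each one.
function limitSections(sections: string[], maxChars = 800, maxSections = 12): string[] {
  return sections
    .slice(0, maxSections)
    .map((s) => (s.length > maxChars ? `${s.slice(0, maxChars)}... [truncated]` : s));
}

const root = "/tmp/cc-sensei/claude-code-main/src";
console.log(resolveSafePath(root, "src/services/compact/microCompact.ts"));
console.log(limitSections(["short", "x".repeat(2000)]).map((s) => s.length));
```

Checking the relative path after resolution, rather than scanning the input for `..`, also catches absolute paths and normalized variants of the same attack.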
Real scenario: "I want to add context compaction to my Agent"

    You: "How do you compact conversation history to avoid hitting the token limit?"
    Oracle (query_architecture, 103ms):
      → Hit M06: Context Engineering
      → Returns the 4-tier compaction architecture comparison table
        (microCompact / sessionMemoryCompact / autoCompact / reactiveCompact)
      → Trigger conditions, whether LLM is invoked, key file locations: all listed

    You: "Show me M06's key design principles"
    Oracle (get_module section=principles, 102ms):
      → "Sticky-on Beta Headers: performance over simplicity"
        "Single message-level cache_control marker (max 4)"
        "9-section summary structure: eval-driven prompt"
      → Each principle includes: where it lives, code snippet, why, how to reuse, the trade-off

    You: "How many modules does 'prompt cache' span?"
    Oracle (trace_concern, 118ms):
      → 12 modules ranked by hit count
      → Each with 5 lines of file:line excerpts

    You: "Show me src/services/compact/microCompact.ts"
    Oracle (get_source_code, 99ms):
      → Source with line numbers + total line count

A complete "understand → copy → ship" loop, averaging ~110ms.
Knowledge scale

    32 module analyses (40K lines of distilled insight covering 310K lines of source)
    ├── M01-M22   Core modules
    │     M01 Process bootstrap & lifecycle · M02 Agent main loop · M03 Tool system
    │     M04 Permissions & security · M05 Model API & streaming · M06 Context engineering
    │     M07 File/Shell/Git tools · M08 MCP protocol · M09 LSP integration
    │     M10 Bridge/IPC · M11 Ink rendering engine · M12 Message rendering
    │     M13 Input system · M14 Sub-agents & tasks · M15 Skills/plugins
    │     M16 Command system · M17 Configuration & settings · M18 Telemetry & analytics
    │     M19 State management · M20 Hotkeys & focus · M21 Buddy/voice · M22 Testability
    └── SUPP-*    10 deep-dive supplements
          AgentSummary · SessionMemory · autoDream · ColorDiff
          extractMemories · fileIndex · largeFiles · speculation
          teamMemorySync · yogaLayout

    Indexes: 390 keywords / 104 concerns / full section-line mappings
    Full source code: claude-code-main/src/ available for direct local reads

Engineering philosophy
This project dogfoods its own subject: it is built using the methods Claude Code teaches, and made to serve Claude.
| Claude Code's design philosophy | How this project applies it |
|---|---|
| Build-time indexing + zero runtime analysis | Section line ranges, keywords, and concerns are all computed in `pnpm build:index`; runtime just looks up JSON |
| Path-traversal gatekeeping | `validatePath` runs before every file read |
| Graceful degradation > unhandled crashes | Section not found? Return the "available types" hint |
| Structured prompt-as-spec | All 32 `MODULE_NOTES` strictly follow a 10-section template |
| Size-limiting by default | `search_patterns` defaults to 800 chars per section with a 12-section cap, both configurable |
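Two rows of this table, build-time indexing and graceful degradation, pair naturally. The sketch below shows one way the idea could look: a build step records section line ranges once, and the runtime lookup returns a hint instead of throwing. It assumes sections are marked by `## ` headings, which the README does not confirm, and the names `indexSections` and `findSection` are hypothetical.

```typescript
interface SectionRange {
  title: string;
  startLine: number; // 1-based line of the heading
  endLine: number; // 1-based, inclusive
}

// Build step: scan a notes file once and record where each "## " section lives.
function indexSections(markdown: string): SectionRange[] {
  const lines = markdown.split("\n");
  const ranges: SectionRange[] = [];
  lines.forEach((line, i) => {
    const match = /^##\s+(.+)$/.exec(line);
    if (!match) return;
    const previous = ranges[ranges.length - 1];
    // The previous section ends on the line just before this heading.
    if (previous) previous.endLine = i;
    ranges.push({
      title: (match[1] ?? "").trim().toLowerCase(),
      startLine: i + 1,
      endLine: lines.length,
    });
  });
  return ranges;
}

type Lookup = { ok: true; range: SectionRange } | { ok: false; hint: string };

// Runtime step: a plain lookup that degrades to a helpful hint on a miss.
function findSection(index: SectionRange[], wanted: string): Lookup {
  const range = index.find((r) => r.title === wanted.toLowerCase());
  if (range) return { ok: true, range };
  const available = index.map((r) => r.title).join(", ");
  return { ok: false, hint: `Unknown section "${wanted}". Available: ${available}` };
}

// Toy notes file for illustration only.
const sampleModule = [
  "# M06",
  "## Responsibility",
  "Manages context.",
  "## Principles",
  "Keep caches warm.",
  "More.",
].join("\n");

const sectionIndex = indexSections(sampleModule);
console.log(findSection(sectionIndex, "principles"));
console.log(findSection(sectionIndex, "performance"));
```

Returning a tagged result instead of throwing lets the tool pass the hint straight back to the model, which can then retry with a valid section name.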
Performance
p50 = 105ms p95 = 118ms max = 125ms
Average response size: 7K chars
Fully local, 0 network dependency
Tests

    pnpm exec tsx scripts/test-suite.ts          # 30 baseline functional tests
    pnpm exec tsx scripts/regression-3fixes.ts   # 11 UX-fix regression tests
    pnpm exec tsx scripts/e2e-comprehensive.ts   # 24 real-scenario E2E tests

| Test suite | Cases | Result |
|---|---|---|
| Baseline functionality (6 tools × multiple branches) | 30 | 30/30 passed |
| UX-fix regression | 11 | 11/11 passed |
| Real user-scenario E2E (3 personas) | 24 | 24/24 passed |
| Total | 65 | 65/65 passed |
Project structure

    cc-sensei/
    ├── src/
    │   ├── server.ts                # MCP server entry
    │   ├── retrieval/engine.ts      # Retrieval engine
    │   ├── source-reader.ts         # Source reading + path safety
    │   ├── taxonomy.ts              # 32-module mapping table
    │   └── tools/                   # 6 MCP tools
    ├── scripts/
    │   ├── build-index.ts           # Parses MODULE_NOTES into indexes
    │   ├── test-suite.ts
    │   ├── regression-3fixes.ts
    │   └── e2e-comprehensive.ts
    ├── knowledge/                   # Build artifacts
    │   ├── module-registry.json
    │   ├── keyword-index.json
    │   └── concern-map.json
    ├── claude-code-main/            # Claude Code source + 32 module analyses
    │   ├── src/                     # ← Source
    │   └── MODULE_NOTES/            # ← Analyses
    └── dist/server.js               # Bundle output

Advanced configuration
| Env var | Purpose | Default |
|---|---|---|
| `CC_SOURCE_ROOT` | Claude Code source root | `<project>/claude-code-main/src` |
| `MODULE_NOTES_ROOT` | Module-analysis directory | `<project>/claude-code-main/MODULE_NOTES` |

Point these anywhere else and Oracle becomes a knowledge server for any project, as long as that project has analyses written to the same MODULE_NOTES template.
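As a rough sketch of how those two variables might be resolved, assuming the defaults in the table and treating an empty value as unset: the `resolveRoots` function and `Roots` type are hypothetical names, not the server's actual code.

```typescript
import * as path from "node:path";

interface Roots {
  sourceRoot: string;
  notesRoot: string;
}

// Fall back to the documented defaults when a variable is unset or empty.
function resolveRoots(env: Record<string, string | undefined>, projectDir: string): Roots {
  const defaultSource = path.join(projectDir, "claude-code-main", "src");
  const defaultNotes = path.join(projectDir, "claude-code-main", "MODULE_NOTES");
  return {
    sourceRoot: path.resolve(env.CC_SOURCE_ROOT || defaultSource),
    notesRoot: path.resolve(env.MODULE_NOTES_ROOT || defaultNotes),
  };
}

// Example: point the server at another project's source and analyses.
const custom = resolveRoots(
  { CC_SOURCE_ROOT: "/data/other-project/src", MODULE_NOTES_ROOT: "/data/other-project/MODULE_NOTES" },
  "/opt/cc-sensei",
);
console.log(custom);
```

In an MCP client config, these variables would typically be passed through the server entry's `env` field rather than exported in a shell.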
Roadmap
- [ ] Incremental indexing (only re-parse changed MODULE_NOTES)
- [ ] More section types (performance / security / observability)
- [ ] HTTP / SSE transport (in addition to stdio)
- [ ] Multi-codebase support (mount several projects at once)
License
MIT
<div align="center">
Can't read 310K lines of source? Let 40K lines of distilled analysis read it for you, and let your AI copy from it.
If this project helped you, drop a star so more Agent developers can find it.
</div>