Flaiwheel
Self-hosted memory & governance layer for AI coding agents. Turn every bug fix into permanent knowledge. Zero cloud. Zero lock-in.
Why Flaiwheel Exists
AI coding agents forget everything between sessions. That leads to repeated bugs, lost architectural decisions, and knowledge decay.
Flaiwheel ensures:
- Agents search before coding
- Agents document after fixing
- Commits automatically capture knowledge
- Memory compounds over time
Every bug fixed makes the next bug cheaper.
How Flaiwheel Is Different
- Persistent AI Memory That Compounds – knowledge doesn't reset between sessions.
- Git-Native Automation – commits automatically become structured knowledge.
- Governance, Not Just Storage – quality gates + enforced documentation.
- Hybrid Search + Reranking – high-precision context for real codebases.
- Fully Self-Hosted – single Docker container, no external infrastructure.
- Zero Lock-In – all knowledge stored as structured flat files in Git.
Who Flaiwheel Is For
- Engineering teams using AI coding assistants in real projects
- Codebases where repeated bugs are expensive
- Teams requiring full data control
- AI-native development environments
Not For
- Small hobby projects under a few thousand lines
- Developers who just want better autocomplete
- Pure SaaS workflows with no interest in self-hosting
Where Flaiwheel Fits
- AI coding tools generate code.
- RAG tools retrieve documents.
- Flaiwheel governs and compounds structured engineering knowledge inside your own infrastructure.
It does not replace your AI assistant. It makes it reliable at scale.
Whitepaper (PDF) – Vision, architecture, and design in depth.
Key Technical Features
Flaiwheel is a self-contained Docker service that operates on three levels:
Pull – agents search before they code (`search_docs`, `get_file_context`)
Push – agents document as they work (`write_bugfix_summary`, `write_architecture_doc`, …)
Capture – git commits auto-capture knowledge via a post-commit hook, even without an AI agent
- Indexes your project documentation (`.md`, `.pdf`, `.html`, `.docx`, `.rst`, `.txt`, `.json`, `.yaml`, `.csv`) into a vector database
- Provides an MCP server that AI agents (Cursor, Claude Code, VS Code Copilot) connect to
- Hybrid search – combines semantic vector search with BM25 keyword search via Reciprocal Rank Fusion (RRF) for best-of-both-worlds retrieval
- Cross-encoder reranker – optional reranking step that rescores candidates with a cross-encoder model for significantly higher precision on vocabulary-mismatch queries
- Behavioral Directives – AI agents silently search Flaiwheel before every response, auto-document after every task, and reuse before recreating, all without being asked
- `get_file_context(filename)` – pre-loads spatial knowledge for any file the agent is about to edit (complements `get_recent_sessions` for full temporal + spatial context)
- Post-commit git hook – captures every `fix:`, `feat:`, `refactor:`, `perf:`, `docs:` commit as a structured knowledge doc automatically
- Living Architecture – AI agents are instructed to maintain self-updating Mermaid.js diagrams for system components and flows
- Executable Test Flows – test scenarios are documented in machine-readable BDD/Gherkin format (`Given`, `When`, `Then`) for QA automation
- Learns from bugfixes – agents write bugfix summaries that are instantly indexed
- Structured write tools – 7 category-specific tools (bugfix, architecture, API, best-practice, setup, changelog, test case) that enforce quality at the source
- Pre-commit validation – `validate_doc()` checks freeform markdown before it enters the knowledge base
- Ingest quality gate – files with critical issues are automatically skipped during indexing (never deleted – you own your files)
- Auto-syncs via Git – pulls AND pushes to a dedicated knowledge repo
- Tool telemetry (persistent) – tracks every MCP call per project (searches, writes, misses, patterns), detects knowledge gaps, and nudges agents to document; persisted across restarts and visible in the Web UI
- Impact metrics API – `/api/impact-metrics` computes estimated time saved + regressions avoided; CI pipelines can post guardrail outcomes to `/api/telemetry/ci-guardrail-report`
- Proactive quality checks – automatically validates the knowledge base after every reindex
- Knowledge Bootstrap – "This is the Way": analyse messy repos, classify files, detect duplicates, propose a cleanup plan, execute with user approval (never deletes files)
- Cold-Start Codebase Analyzer – `analyze_codebase(path)` scans a source code directory entirely server-side (zero tokens, zero cloud). Uses Python's built-in `ast` module for Python, regex for TypeScript/JavaScript, and the existing MiniLM embedding model for classification and duplicate detection. Returns a single `bootstrap_report.md` with language distribution, category map, top 20 files to document first ranked by documentability score, duplicate pairs, and coverage gaps. Reduces cold-start token cost by ~90% on legacy codebases.
- Multi-project support – one container manages multiple knowledge repos with per-project isolation
- Includes a Web UI for configuration, monitoring, and testing
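The "documentability score" idea behind the cold-start analyzer can be sketched with the same built-in `ast` module the README mentions. The heuristic below is illustrative only, and `documentability_score` is a hypothetical name; Flaiwheel's actual ranking also uses regex scanning for TS/JS and embedding-based classification.

```python
import ast

def documentability_score(source: str) -> float:
    """Toy heuristic: prioritize files with lots of public,
    undocumented surface area (functions/classes without docstrings)."""
    tree = ast.parse(source)
    defs = [n for n in ast.walk(tree)
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]
    public = [d for d in defs if not d.name.startswith("_")]
    undocumented = [d for d in public if ast.get_docstring(d) is None]
    if not public:
        return 0.0
    # More public symbols with fewer docstrings -> document this file first.
    return len(undocumented) / len(public) * len(public) ** 0.5
```

Ranking every `.py` file by such a score yields a "top files to document first" list like the one in `bootstrap_report.md`.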
What's New in v3.9.28
- Glama / MCP stdio fix – all diagnostic output moved to stderr; stdout is now JSON-RPC only. Glama Inspector now detects all 28 tools correctly.
- Improved cold-start detection – stdio cold-start logic handles empty Docker volumes correctly (no bootstrap / model download during Glama inspection).
Previous: v3.9.27
- License cleanup – one `LICENSE` file (BSL 1.1) for correct GitHub/Glama detection; all docs and headers point to `LICENSE` (not `LICENSE.md`).
- Glama / stdio inspection – optional `[inspect]` deps and cold-start stdio path for lightweight MCP directory builds.
Previous: v3.9.26
- Claude Cowork skill – the Flaiwheel workflow is now distributed as a native Claude skill. The installer writes `.skills/skills/flaiwheel/SKILL.md` to your project. When you open the project in Claude (Cowork), the skill is auto-available; no extra setup needed. The skill drives session-start context restore, pre-coding knowledge search, mandatory post-bugfix documentation, and session-end summarisation.
- Skill source also committed to `skills/flaiwheel/SKILL.md` in this repo for reference and manual install.
Previous: v3.9.25
- WSL2 automatic pre-flight setup – WSL2 is now detected automatically and a dedicated pre-flight block runs before the main installer flow. No manual steps required:
  - Switches `iptables` to the legacy backend (fixes Docker networking / DNAT errors)
  - Adds the current user to the `docker` group (no more `permission denied`)
  - Starts the Docker daemon via `service` (no systemd on WSL2)
  - Adds a Docker auto-start snippet to `~/.bashrc` (idempotent, runs on every WSL2 login)
- Scattered WSL2 checks throughout the script consolidated into the single pre-flight block.
Previous: v3.9.24
- Fix: auto-install python3 if missing – the installer uses `python3` extensively for JSON manipulation. On minimal Linux/WSL2 systems without python3, config file writes silently failed (`/dev/fd/63: line N: python3: command not found`). python3 is now checked as prerequisite #0 and auto-installed via apt/dnf/yum/pacman/brew if missing.
Previous: v3.9.23
- Fix: Docker daemon start on WSL2 with iptables-legacy – Docker on WSL2 often fails to start silently because the default `iptables-nft` backend is not supported. The installer now switches to `iptables-legacy` via `update-alternatives` before starting Docker. It also adds the current user to the `docker` group automatically.
- All install commands updated to `bash <(curl ...)` – every displayed install/re-run command throughout the script (error messages, AGENTS.md, Cursor rules, etc.) now uses process substitution to avoid WSL2 pipe issues.
Previous: v3.9.22
- Fix: `curl | bash` pipe write failures on WSL2 – `curl | bash` can fail with `curl: (23) Failure writing output` on WSL2 due to pipe/tmp permission issues. The primary install command in the README is now `bash <(curl ...)` (process substitution), which avoids the pipe entirely. The re-exec block also tries `$HOME` as a fallback temp dir when `/tmp` writes fail. The error message explicitly recommends the `bash <(curl ...)` form.
Previous: v3.9.21
- Fix: sudo guard moved before re-exec block – when `sudo curl | bash` was used, the `curl: (23)` pipe error truncated the script before the previous sudo guard (which was after colors/functions) was ever reached. The guard is now the very first executable line (`set -euo pipefail` aside), so it fires even on a truncated download. The duplicate guard after colors was removed.
Previous: v3.9.20
- Fix: Docker daemon startup poll on WSL2 – instead of a fixed 5-second sleep, the installer now polls `docker info` every 2 seconds for up to 30 seconds after `service docker start`. It also shows the actual output of `service docker start` so startup errors are visible instead of silently swallowed.
Previous: v3.9.19
- Fix: Docker daemon start on WSL2 – WSL2 typically has no `systemd`, so `systemctl start docker` silently failed. The installer now detects WSL2 via `/proc/version` and uses `sudo service docker start` instead. If Docker still isn't running after install, a clear WSL2-specific error is shown with the exact fix command and a tip to add it to `~/.bashrc` for auto-start on login.
Previous: v3.9.18
- Fix: block `sudo curl | bash` and `sudo bash install.sh` – running the installer as root via `sudo` breaks GitHub CLI authentication: `gh auth` stores credentials in `/root/.config/gh/` instead of the real user's home, making every subsequent `gh` call fail. It also caused `curl: (23) Failure writing output` pipe errors on WSL. The installer now detects `SUDO_USER` at startup and exits immediately with a clear message telling the user to re-run without `sudo`. Privilege escalation for package installs is handled internally.
Previous: v3.9.17
- Fix: `gh auth login` must not be run with sudo – after auto-installing `gh` on Linux/WSL, the installer now explicitly tells the user to run `gh auth login` without `sudo`. If auth was previously done with `sudo`, credentials ended up in `/root/.config/gh/` and were invisible to the current user, causing the auth check to fail. The error messages at both the post-install and the auth-check step now clearly warn: do not use sudo for `gh auth`.
Previous: v3.9.16
- Fix: installer works on WSL and non-root Linux – all Linux package manager commands (`apt-get`, `dnf`, `yum`, `zypper`, `pacman`), the Docker convenience script, and `systemctl` calls now automatically use `sudo` when the installer is not running as root. Root installs are unaffected. Fixes `Permission denied` / lock file errors on WSL and for standard Linux desktop users.
Previous: v3.9.15
- Cold-start report cached in `/data/` – `analyze_codebase()` saves the report to `/data/coldstart-<project>.md` after the first run. Subsequent calls return the cached report instantly (<1s). The installer also writes the cache during install so the very first MCP call by any agent is instant. Call with `force=True` to regenerate after major codebase changes.
- `analyze_codebase()` in all agent Session Setup templates – `AGENTS.md`, `.cursor/rules/flaiwheel.mdc`, `CLAUDE.md`, and `.github/copilot-instructions.md` all now include it as step 3 of Session Setup. Agents automatically get the codebase overview before starting work.
- Cold-start prompt asked before Docker rebuild – all interactive questions (embedding model + cold-start) are now batched upfront, then the rebuild runs unattended.
- Fix: used `docker exec` for cold-start – replaced broken HTTP calls to the MCP SSE endpoint with direct `docker exec python3`. Analysis now works reliably in ~20s.
Previous: v3.9.14
- Fix: fast-path always prompts for cold-start – no more silent skip when a cached report exists.
Previous: v3.9.13
- Improved cold-start classification – two-pass classifier: path heuristics first, code-specific embedding templates as fallback.
Previous: v3.9.12
- Fix: y/n answer respected before cache check – an explicit `y` now always re-runs analysis even when a cached report exists.
Previous: v3.9.11
- Fix: coldstart functions in global scope – moved `_run_coldstart` / `_do_coldstart_analysis` to the top of the script so the fast-path can call them.
Previous: v3.9.10
- Fix: version check – `LATEST_VERSION` now uses `_FW_VERSION` directly, no CDN fetch.
Previous: v3.9.9
- Fix: cold-start on all paths – `_run_coldstart()` called from fast-path, update, and fresh install. Smart cache detection.
Previous: v3.9.8
- Cold-start report caching – `analyze_codebase()` cached to `/data/coldstart-<project>.md` for instant reads. New `force=True` param.
Previous: v3.9.7
- Agent Session Setup – all instruction templates now include `analyze_codebase()` as a first-session step.
Previous: v3.9.6
- Fix: use docker exec – replaced broken HTTP calls to the MCP SSE endpoint with direct `docker exec python3` invocation. The cold-start report now actually works (~20s).
Previous: v3.9.5
- Fix: warm up embedding model – added model warm-up before cold-start analysis (superseded by v3.9.6).
Previous: v3.9.4
- Fix: cold-start retries while model loads – the installer now retries `analyze_codebase()` for up to 90s after the container starts.
Previous: v3.9.3
- Fix: update detection always checks `main` – `LATEST_VERSION` is now fetched from the `main` branch so stale cached installers no longer silently skip updates.
Previous: v3.9.2
- Cold-start prompt moved before Docker rebuild – all interactive questions now batched upfront.
Previous: v3.9.1
- Cold-start prompt moved upfront – the `install.sh` cold-start question is now asked right after the embedding model selection (before the Docker rebuild), so all interactive questions are gathered first and the user never misses the prompt after a long rebuild.
Previous: v3.9.0
- `analyze_codebase(path)` – new 28th MCP tool for zero-token cold-start analysis of legacy codebases. Runs entirely server-side in Docker. Uses Python `ast`, regex, MiniLM embeddings, and nearest-centroid classification. Returns a ranked `bootstrap_report.md` with language distribution, category map, top 20 files by documentability score, near-duplicate pairs, and recommended next steps. Reduces cold-start token cost by ~90%.
Previous: v3.8.3
- No auto-index on project add – adding a project via the web UI no longer immediately pulls and embeds the knowledge repo. Indexing is now deferred until explicitly triggered ("Git Pull + Reindex" or the `reindex()` MCP tool), keeping the vector DB clean until the repo has been reviewed.
Previous: v3.6.x
- VS Code / GitHub Copilot support – installer writes `.vscode/mcp.json` and `.github/copilot-instructions.md`.
- Claude Desktop support – installer auto-configures Claude Desktop via `mcp-remote`.
- Web UI Client Configuration panel – VS Code and Claude Code CLI tabs added.
Previous: v3.5.x
- Claude Desktop + Claude Code CLI support added.
- README strategically rewritten with positioning, target audience, and competitive framing.
Previous: v3.4.x
- Search miss rate fix – `search_bugfixes` calls no longer inflate the miss rate above 100%.
- Classification consistency – `_path_category_hint` unified to a token-based approach across all categories.
- `CHANGELOG.md` added to the repo root.
Quick Start – One Command (recommended)
Prerequisites: GitHub CLI authenticated (`gh auth login`), Docker running.
Platform support: macOS and Linux work out of the box. On Windows, run the installer from WSL or Git Bash (Docker Desktop must be running with WSL 2 backend enabled).
Run this from inside your project directory:
bash <(curl -sSL https://raw.githubusercontent.com/dl4rce/flaiwheel/main/scripts/install.sh)
WSL2 / Linux note: Use the `bash <(curl ...)` form above – it avoids `curl: (23)` pipe write errors that occur with `curl | bash` on some WSL2 setups. Never prefix with `sudo`.
That's it. The installer automatically:
- Detects your project name and GitHub org from the git remote
- Creates a private `<project>-knowledge` repo with the standard folder structure
- Starts the Flaiwheel Docker container pointed at that repo
- Configures Cursor – writes `.cursor/mcp.json` and `.cursor/rules/flaiwheel.mdc`
- Configures VS Code / GitHub Copilot – writes `.vscode/mcp.json` (native SSE, VS Code 1.99+) and `.github/copilot-instructions.md`
- Configures Claude Desktop (macOS app) – writes `claude_desktop_config.json` via the `mcp-remote` bridge (requires Node.js)
- Configures Claude Code CLI – writes `.mcp.json` + `CLAUDE.md` and runs `claude mcp add` automatically if the CLI is on PATH
- Installs Claude Cowork skill – writes `.skills/skills/flaiwheel/SKILL.md` so the full Flaiwheel workflow is available as a native Claude skill
- Writes `AGENTS.md` for all other agents
- If existing `.md` docs are found, creates a migration guide – the AI will offer to organize them into the knowledge repo
After install:
| Agent | What to do |
|---|---|
| Cursor | Restart Cursor → Settings → MCP → enable flaiwheel toggle |
| Claude Desktop (macOS app) | Quit and reopen Claude for Mac – hammer icon appears when connected |
| Claude Code CLI | Already registered automatically – run `/mcp` inside Claude Code to verify |
| VS Code | Open project → Command Palette → MCP: List Servers → start flaiwheel |
| Claude (Cowork) | Skill auto-loads from `.skills/skills/flaiwheel/SKILL.md` – no further action needed |
The installer also sets up a post-commit git hook that automatically captures every `fix:`, `feat:`, `refactor:`, `perf:`, and `docs:` commit as a structured knowledge doc – no agent or manual action required.
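The hook's commit-message filter can be approximated with a conventional-commits regex. A minimal sketch, assuming the hook reads the subject line from `git log -1 --pretty=%s`; the function and field names here are hypothetical, not Flaiwheel's internals:

```python
import re
from typing import Optional

# Only subjects with these conventional prefixes become knowledge docs
# (per the README); anything else is ignored by the hook.
CAPTURE = re.compile(r"^(fix|feat|refactor|perf|docs)(\([^)]*\))?:\s*(.+)", re.S)

def capture_commit(subject: str) -> Optional[dict]:
    m = CAPTURE.match(subject)
    if m is None:
        return None  # e.g. chore:, merge commits
    kind, scope, summary = m.groups()
    return {
        "category": kind,
        "scope": (scope or "").strip("()"),
        "summary": summary.strip(),
    }
```

A matched commit would then be written into the knowledge repo (e.g. under `bugfix-log/` for `fix:` commits) and indexed on the next sync.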
Once connected, the AI has access to all Flaiwheel tools. If you have existing docs, tell the AI: "migrate docs".
Updating
Run the same install command again from your project directory:
bash <(curl -sSL https://raw.githubusercontent.com/dl4rce/flaiwheel/main/scripts/install.sh)
The installer detects the existing container, asks for confirmation, then:
- Rebuilds the Docker image with the latest code
- Recreates the container (preserves your data volume + config)
- Refreshes all agent configs and guides
Your knowledge base, index, and credentials are preserved – only the code is updated.
Manual Setup
<details> <summary>Click to expand manual steps</summary>
1. Create a knowledge repo
# On GitHub, create: <your-project>-knowledge (private repo)
mkdir -p architecture api bugfix-log best-practices setup changelog
echo "# Project Knowledge Base" > README.md
git add -A && git commit -m "init" && git push
2. Build and start Flaiwheel
git clone https://github.com/dl4rce/flaiwheel.git /tmp/flaiwheel-build
docker build -t flaiwheel:latest /tmp/flaiwheel-build
docker run -d \
--name flaiwheel \
-p 8080:8080 \
-p 8081:8081 \
-e MCP_GIT_REPO_URL=https://github.com/you/yourproject-knowledge.git \
-e MCP_GIT_TOKEN=ghp_your_token \
-v flaiwheel-data:/data \
flaiwheel:latest
3. Connect your AI agent
Cursor β add to .cursor/mcp.json:
{
"mcpServers": {
"flaiwheel": {
"type": "sse",
"url": "http://localhost:8081/sse"
}
}
}
VS Code / GitHub Copilot (1.99+) β add to .vscode/mcp.json:
{
"servers": {
"flaiwheel": {
"type": "sse",
"url": "http://localhost:8081/sse"
}
}
}
Then: Command Palette β MCP: List Servers β start flaiwheel.
Claude Desktop (macOS app) β add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"flaiwheel": {
"command": "npx",
"args": ["-y", "mcp-remote", "http://localhost:8081/sse"]
}
}
}
Requires Node.js. Restart Claude for Mac after editing.
Claude Code CLI β run once in your project directory:
claude mcp add --transport sse --scope project flaiwheel http://localhost:8081/sse
4. Done. Start coding.
</details>
Knowledge Repo Structure
yourproject-knowledge/
├── README.md          ← overview / index
├── architecture/      ← system design, decisions, diagrams
├── api/               ← endpoint docs, contracts, schemas
├── bugfix-log/        ← auto-generated bugfix summaries
│   └── 2026-02-25-fix-payment-retry.md
├── best-practices/    ← coding standards, patterns
├── setup/             ← deployment, environment setup
├── changelog/         ← release notes
└── tests/             ← test cases, scenarios, regression patterns
Supported Input Formats
Flaiwheel indexes 9 file formats. All non-markdown files are converted to markdown-like text in memory at index time – no generated files on disk, no repo clutter.
| Format | Extension(s) | How it works |
|---|---|---|
| Markdown | `.md` | Native (pass-through) |
| Plain text | `.txt` | Wrapped in `# filename` heading |
| PDF | `.pdf` | Text extracted per page via pypdf |
| HTML | `.html`, `.htm` | Headings/lists/code converted to markdown, scripts stripped |
| reStructuredText | `.rst` | Heading underlines converted to `#` levels, code blocks preserved |
| Word | `.docx` | Paragraphs + heading styles mapped to markdown |
| JSON | `.json` | Pretty-printed in fenced `json` code block |
| YAML | `.yaml`, `.yml` | Wrapped in fenced `yaml` code block |
| CSV | `.csv` | Converted to markdown table |
Quality checks (structure, completeness, bugfix format) apply only to `.md` files. Other formats are indexed as-is.
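As an example of the in-memory conversion step, a CSV file can be turned into a markdown table in a few lines. This is a minimal sketch of the idea, not Flaiwheel's actual converter:

```python
import csv
import io

def csv_to_markdown(text: str) -> str:
    """Render CSV text as a markdown table, entirely in memory
    (first row is treated as the header)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "|" + "---|" * len(header)]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)
```

The resulting markdown-like text is what gets chunked and embedded; the original `.csv` on disk is never touched.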
Configuration
All config via environment variables (`MCP_` prefix), the Web UI (http://localhost:8080), or a `.env` file.
| Variable | Default | Description |
|---|---|---|
| `MCP_DOCS_PATH` | `/docs` | Path to `.md` files inside container |
| `MCP_EMBEDDING_PROVIDER` | `local` | `local` (free, private) or `openai` |
| `MCP_EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Embedding model name |
| `MCP_CHUNK_STRATEGY` | `heading` | `heading`, `fixed`, or `hybrid` |
| `MCP_RERANKER_ENABLED` | `false` | Enable cross-encoder reranker for higher precision |
| `MCP_RERANKER_MODEL` | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Reranker model name |
| `MCP_RRF_K` | `60` | RRF k parameter (lower = more weight on top ranks) |
| `MCP_RRF_VECTOR_WEIGHT` | `1.0` | Vector search weight in RRF fusion |
| `MCP_RRF_BM25_WEIGHT` | `1.0` | BM25 keyword search weight in RRF fusion |
| `MCP_MIN_RELEVANCE` | `0` | Minimum relevance % to return (0 = no filter) |
| `MCP_GIT_REPO_URL` | | Knowledge repo URL (enables git sync) |
| `MCP_GIT_BRANCH` | `main` | Branch to sync |
| `MCP_GIT_TOKEN` | | GitHub token for private repos |
| `MCP_GIT_SYNC_INTERVAL` | `300` | Pull interval in seconds (0 = disabled) |
| `MCP_GIT_AUTO_PUSH` | `true` | Auto-commit + push bugfix summaries |
| `MCP_WEBHOOK_SECRET` | | GitHub webhook secret (enables `/webhook/github` HMAC verification) |
| `MCP_TRANSPORT` | `sse` | MCP transport: `sse` or `stdio` |
| `MCP_SSE_PORT` | `8081` | MCP SSE endpoint port |
| `MCP_WEB_PORT` | `8080` | Web UI port |
Multi-Repo Support
A single Flaiwheel container can manage multiple knowledge repositories – one per project. Each project gets its own ChromaDB collection, git watcher, index lock, health tracker, and quality checker, while sharing one embedding model in RAM and one MCP/Web endpoint.
How it works:
- The first `install.sh` run creates the Flaiwheel container with project A
- Subsequent `install.sh` runs from other project directories detect the running container and register the new project via the API – no additional containers
- All MCP tools accept an optional `project` parameter (e.g., `search_docs("query", project="my-app")`)
- Call `set_project("my-app")` at the start of every conversation to bind all subsequent calls to that project (sticky session)
- Without an explicit `project` parameter, the active project (set via `set_project`) is used; if none is set, the first project is used
- The Web UI has a project selector dropdown to switch between projects
- Use `list_projects()` via MCP to see all registered projects (shows active marker)
Adding/removing projects:
- Via AI agent: call `setup_project(name="my-app", git_repo_url="...")` – registers, clones, indexes, and auto-binds
- Via install script: run `install.sh` from a new project directory (auto-registers)
- Via Web UI: click "Add Project" in the project selector bar
- Via API: `POST /api/projects` with `{name, git_repo_url, git_branch, git_token}`
- Remove: `DELETE /api/projects/{name}` or the "Remove" button in the Web UI
Backward compatibility: existing single-project setups continue to work without changes. If no `projects.json` exists but `MCP_GIT_REPO_URL` is set, Flaiwheel auto-creates a single project from the env vars.
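The resolution order described above (explicit `project` argument, then the sticky `set_project` binding, then the first registered project) can be sketched as a tiny registry. Class and method names are illustrative, not Flaiwheel's internals:

```python
class ProjectRegistry:
    """Toy model of per-call project resolution for a multi-project server."""

    def __init__(self, projects):
        self.projects = list(projects)  # registration order is preserved
        self.active = None              # sticky binding set by set_project()

    def set_project(self, name):
        if name not in self.projects:
            raise KeyError(name)
        self.active = name

    def resolve(self, project=None):
        # explicit argument > sticky active project > first registered
        return project or self.active or self.projects[0]
```

Every MCP tool would call `resolve()` once per invocation, so an agent that ran `set_project("my-app")` never has to repeat the project name.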
Embedding Model Hot-Swap
When you change the embedding model via the Web UI, Flaiwheel re-embeds all documents in the background using a shadow collection. Search remains fully available on the old model while the migration runs. Once complete, the new index atomically replaces the old one – zero downtime.
The Web UI shows a live progress bar with file count and percentage. You can cancel at any time.
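The shadow-collection mechanics reduce to: readers follow a single live reference while the migration builds a replacement, and the cut-over is one reference assignment. A toy model under that assumption, with plain lists standing in for ChromaDB collections:

```python
class HotSwapIndex:
    """Toy sketch of zero-downtime reindexing via a shadow collection."""

    def __init__(self, collection):
        self._live = collection  # readers only ever touch this reference

    def search(self, query):
        # Serving continues against the old index during migration.
        return [doc for doc in self._live if query in doc]

    def rebuild(self, docs, embed):
        shadow = [embed(d) for d in docs]  # long-running re-embedding
        self._live = shadow                # atomic reference swap at the end
```

Because the swap is a single assignment, a search never observes a half-built index; cancelling mid-migration just discards the shadow.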
Embedding Models (local, free)
| Model | RAM | Quality | Best for |
|---|---|---|---|
| `all-MiniLM-L6-v2` | 90MB | 78% | Large repos, low RAM |
| `nomic-ai/nomic-embed-text-v1.5` | 520MB | 87% | Best English quality |
| `BAAI/bge-m3` | 2.2GB | 86% | Multilingual (DE/EN) |
Select via Web UI or the `MCP_EMBEDDING_MODEL` env var. Full list in the Web UI.
Cross-Encoder Reranker (optional)
The reranker is a second-stage model that rescores the top candidates from hybrid search. It reads the full (query, document) pair together, which produces much more accurate relevance scores than independent embeddings – especially for vocabulary-mismatch queries where the user and the document use different words for the same concept.
How it works:
- Hybrid search (vector + BM25) retrieves a wider candidate pool (`top_k × 5`)
- RRF merges and ranks the candidates
- The cross-encoder rescores the top candidates and returns only the best `top_k`
Enable via Web UI (Search & Retrieval card) or environment variable:
docker run -d \
-e MCP_RERANKER_ENABLED=true \
-e MCP_RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2 \
...
| Reranker Model | RAM | Speed | Quality |
|---|---|---|---|
| `cross-encoder/ms-marco-MiniLM-L-6-v2` | 90MB | Fast | Good – best speed/quality balance |
| `cross-encoder/ms-marco-MiniLM-L-12-v2` | 130MB | Medium | Better – higher precision |
| `BAAI/bge-reranker-base` | 420MB | Slower | Best – state-of-the-art accuracy |
The reranker is off by default (zero overhead). When enabled, it adds ~50ms latency per search but typically improves precision by 10-25% on vocabulary-mismatch queries.
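The two-stage flow boils down to "rescore a wide candidate pool, keep the best top_k". In the sketch below a toy word-overlap function stands in for the cross-encoder; a real deployment would score each (query, doc) pair jointly with a model such as the ones in the table above:

```python
def rerank(query, candidates, score_pair, top_k):
    """Second-stage rerank: candidates were already fetched by hybrid
    search (top_k * 5 of them); keep only the best top_k after rescoring."""
    ranked = sorted(candidates, key=lambda doc: score_pair(query, doc), reverse=True)
    return ranked[:top_k]

def overlap_score(query, doc):
    # Toy stand-in scorer; a cross-encoder reads the full pair jointly.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)
```

The extra latency comes entirely from `score_pair` running once per candidate, which is why the pool is capped at `top_k × 5`.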
GitHub Webhook (instant reindex)
Instead of waiting for the 300s polling interval, configure a GitHub webhook for instant reindex on push:
- In your knowledge repo on GitHub: Settings → Webhooks → Add webhook
- Payload URL: `http://your-server:8080/webhook/github`
- Content type: `application/json`
- Secret: set the same value as `MCP_WEBHOOK_SECRET`
- Events: select "Just the push event"
The webhook endpoint verifies the HMAC signature if `MCP_WEBHOOK_SECRET` is set. Without a secret, any POST triggers a pull + reindex.
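The signature check follows GitHub's standard scheme: the `X-Hub-Signature-256` header carries an HMAC-SHA256 hex digest of the raw request body, prefixed with `sha256=`. A minimal sketch of that verification (not Flaiwheel's actual handler):

```python
import hashlib
import hmac

def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True iff signature_header matches HMAC-SHA256(secret, body).
    Uses compare_digest for a constant-time comparison."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

Only after this returns True should the endpoint trigger the pull + reindex.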
CI Guardrail Telemetry (ROI tracking)
Track non-vanity engineering impact directly in Flaiwheel:
- POST `/api/telemetry/ci-guardrail-report` – CI reports guardrail findings/fixes per PR
- GET `/api/impact-metrics?project=<name>&days=30` – returns estimated time saved + regressions avoided
Example payload:
{
"project": "my-app",
"violations_found": 4,
"violations_blocking": 1,
"violations_fixed_before_merge": 2,
"cycle_time_baseline_minutes": 58,
"cycle_time_actual_minutes": 43,
"pr_number": 127,
"branch": "feature/payment-fix",
"commit_sha": "abc1234",
"source": "github-actions"
}
Flaiwheel persists telemetry on disk (`<vectorstore>/telemetry`) so metrics survive container restarts and updates.
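How such payloads could feed the metrics is easy to illustrate: cycle-time deltas aggregate into minutes saved, and pre-merge fixes count as avoided regressions. The formula below is a guess for illustration only; Flaiwheel's actual `/api/impact-metrics` computation is not documented here and may differ:

```python
def impact_estimate(reports):
    """Aggregate guardrail reports (dicts shaped like the example payload)
    into rough impact numbers. Illustrative heuristic, not Flaiwheel's."""
    minutes_saved = sum(
        max(r["cycle_time_baseline_minutes"] - r["cycle_time_actual_minutes"], 0)
        for r in reports
    )
    regressions_avoided = sum(r["violations_fixed_before_merge"] for r in reports)
    return {"minutes_saved": minutes_saved,
            "regressions_avoided": regressions_avoided}
```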
Diff-aware Reindexing
Reindexing is incremental by default – only files whose content changed since the last run are re-embedded. On a 500-file repo, this means a typical reindex after a single-file push takes <1s instead of re-embedding everything.
Use `reindex(force=True)` via MCP or the Web UI "Reindex" button to force a full rebuild (e.g. after changing the embedding model).
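Change detection of this kind is typically a hash manifest: store a content digest per file and re-embed only when the digest differs. A minimal sketch under that assumption (not Flaiwheel's actual implementation):

```python
import hashlib

def changed_files(files: dict, manifest: dict) -> list:
    """Return paths whose content hash differs from the stored manifest.
    `files` maps path -> content; `manifest` maps path -> previous digest
    and is updated in place for the next run."""
    dirty = []
    for path, content in files.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if manifest.get(path) != digest:
            dirty.append(path)
            manifest[path] = digest
    return dirty
```

A forced rebuild is then simply "ignore the manifest and re-embed everything".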
Architecture
┌──────────────────────────────────────────────────────────────┐
│  Docker Container (single process, N projects)               │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Web-UI (FastAPI)                       Port 8080       │  │
│  │ Project CRUD, config, monitoring, search, health       │  │
│  └───────────────────────┬────────────────────────────────┘  │
│                          │ shared state (ProjectRegistry)    │
│  ┌───────────────────────┴────────────────────────────────┐  │
│  │ MCP Server (FastMCP)                   Port 8081       │  │
│  │ 28 tools (search, write, classify, manage, projects)   │  │
│  └───────────────────────┬────────────────────────────────┘  │
│                          │                                   │
│  ┌───────────────────────┴────────────────────────────────┐  │
│  │ Shared Embedding Model (1× in RAM)                     │  │
│  └───────────────────────┬────────────────────────────────┘  │
│                          │                                   │
│  ┌───────────────────────┴────────────────────────────────┐  │
│  │ Per-Project Contexts (isolated)                        │  │
│  │ ┌───────────┐  ┌───────────┐  ┌───────────┐            │  │
│  │ │ Project A │  │ Project B │  │ Project C │            │  │
│  │ │ collection│  │ collection│  │ collection│            │  │
│  │ │ watcher   │  │ watcher   │  │ watcher   │            │  │
│  │ │ lock      │  │ lock      │  │ lock      │            │  │
│  │ │ health    │  │ health    │  │ health    │            │  │
│  │ │ quality   │  │ quality   │  │ quality   │            │  │
│  │ └───────────┘  └───────────┘  └───────────┘            │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  /docs/{project}/ ← per-project knowledge repos              │
│  /data/           ← shared vectorstore + config + projects   │
└──────────────────────────────────────────────────────────────┘
Search Pipeline
query
  │
  ├──▶ Vector Search (ChromaDB/HNSW, cosine similarity)
  │       fetch top_k (or top_k×5 if reranker enabled)
  │
  ├──▶ BM25 Keyword Search (bm25s, English stopwords)
  │       fetch top_k (or top_k×5 if reranker enabled)
  │
  ├──▶ RRF Fusion (configurable k, vector/BM25 weights)
  │       merge + rank candidates
  │
  ├──▶ [optional] Cross-Encoder Reranker
  │       rescore (query, doc) pairs for higher precision
  │
  ├──▶ Min Relevance Filter (configurable threshold)
  │
  └──▶ Return top_k results with relevance scores
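The RRF fusion step has a compact standard form: each document's fused score is the sum of `weight / (k + rank)` over every ranked list it appears in. A sketch whose defaults mirror `MCP_RRF_K`, `MCP_RRF_VECTOR_WEIGHT`, and `MCP_RRF_BM25_WEIGHT` (the function name is illustrative):

```python
def rrf_fuse(vector_ranked, bm25_ranked, k=60, w_vec=1.0, w_bm25=1.0):
    """Reciprocal Rank Fusion over two ranked document lists.
    Lower k gives more weight to the top ranks of each list."""
    scores = {}
    for weight, ranking in ((w_vec, vector_ranked), (w_bm25, bm25_ranked)):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears in both lists (like `"b"` below) outranks one that tops only a single list, which is exactly the best-of-both-worlds behavior the pipeline relies on.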
Web UI
Access at http://localhost:8080 (HTTP Basic Auth – credentials shown on first start).
Features:
- System health panel: last index, last git pull, git commit, version, search metrics, quality score, skipped files count
- Index status and statistics (including reranker status)
- Embedding model selection (visual picker)
- Search & Retrieval tuning: cross-encoder reranker toggle + model picker, RRF weights, minimum relevance threshold
- Chunking strategy configuration
- Git sync settings (URL, branch, auto-push toggle)
- Test search interface
- Knowledge quality checker (also runs automatically after every reindex)
- Search metrics (hits/total, miss rate, per-tool breakdown)
- Skipped files indicator (files excluded from indexing due to critical quality issues)
- "This is the Way" β Knowledge Bootstrap: agent-driven project classification + in-repo cleanup (Web UI shows guidance + advanced scan)
- Multi-project switcher (manage multiple repos from one instance)
- Client configuration snippets (Cursor, Claude Desktop, Docker)
- Password management
Development
# Clone
git clone https://github.com/dl4rce/flaiwheel.git
cd flaiwheel
# Install
pip install -e ".[dev]"
# Run tests (259 tests covering readers, quality checker, indexer, reranker, health tracker, MCP tools, model migration, multi-project, bootstrap, classification, file-context, cold-start analyzer)
pytest
# Run locally (needs /docs and /data directories)
mkdir -p /tmp/flaiwheel-docs /tmp/flaiwheel-data
MCP_DOCS_PATH=/tmp/flaiwheel-docs MCP_VECTORSTORE_PATH=/tmp/flaiwheel-data python -m flaiwheel
License
Business Source License 1.1 (BSL 1.1)
Flaiwheel is source-available under the Business Source License 1.1.
You may use Flaiwheel for free if:
- Your use is non-commercial (personal, educational, no revenue), or
- Your organization has no more than 10 individuals using it
Commercial use beyond these limits (e.g., teams of 11+ or commercial deployment) requires a paid license.
- Effective 2030-02-25, this version converts to Apache License 2.0 (fully open source)
- Commercial licenses: info@4rce.com | https://4rce.com
See LICENSE for full terms.