LBrain MCP Server
Enables AI agents to perform hybrid semantic+keyword search over structured markdown lairs and memory files, with tools for protocol checks and anti-pattern detection.
NeurAInetic Layered Brain Harness
LBrain — AI-native engineering memory on the Lair Protocol. By Metavolve Labs.
LBrain indexes structured markdown lairs and memory files and gives an AI agent fast hybrid (semantic + keyword) search over everything you've chosen to remember — surfacing what's stored, never editorializing.
Why
Existing "RAG" tools index text and forget structure. The Lair Protocol is structure: priority hierarchies, wikilink graphs, frontmatter types, governance cadence. LBrain reads those signals and treats them as first-class retrieval inputs. The protocol is the product; the search engine just respects it.
What it does
- Hybrid retrieval: BM25 (SQLite FTS5) + cosine (sqlite-vec) fused by Reciprocal Rank Fusion, then wikilink graph boost + priority-folder boost + supersession-aware de-ranking + frontmatter-type filter.
- Call-when-needed precision / recency: opt-in per query (rerank=True for precise lookups; recency=True for "latest on X"). Off by default. Both are situational (measured: rerank helps precise lookups but hurts broad coverage), so they're enabled per call, not globally.
- Always-on core memory: an optional curated, user-authored block injected ahead of results (the essentials are always present); gated and token-budgeted by the AMP layer.
- Prompt-injection containment: retrieved note text is fenced and framed as untrusted data, never instructions, before it reaches the agent.
- Lair Protocol check: should_commit_to_lair(text) decides what's worth saving so you don't have to think about it.
- Anti-pattern detection: cross-checks proposed actions against your saved feedback_*.md rules.
- Onboarding flow: three-minute questionnaire scaffolds CLAUDE.md + starter priority lairs.
- MCP server: direct integration with Claude Code (claude mcp add -s user lbrain -- /path/to/lbrain-mcp).
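The fusion step named in the hybrid retrieval bullet can be illustrated with a minimal Reciprocal Rank Fusion sketch. This is not LBrain's search.py: the function name, the file names, and the constant k=60 (a commonly used default) are assumptions for illustration, and the graph, priority, and supersession adjustments are left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the keyword and vector sides.
bm25_hits = ["auth.md", "signing.md", "style.md"]
vector_hits = ["signing.md", "c2pa.md", "auth.md"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF uses only rank positions, BM25 scores and cosine distances never need to be normalized against each other. A document ranked well by both retrievers rises above one ranked first by only one of them.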
Stack
- Python 3.10+
- SQLite + sqlite-vec + FTS5 (native, no WASM, no daemon)
- OpenAI text-embedding-3-small (~$0.12 per 6M-token corpus; pennies on updates)
- the official mcp SDK (FastMCP) for the MCP server
- ~2,750 LOC core + ~1,370 LOC optional Tier-2 archive subpackage. No moving parts.
Install
cd lbrain
pip install -e . # lean core (index → embed → search → MCP)
# pip install -e ".[rerank]" # + call-when-needed cross-encoder precision pass
# pip install -e ".[archive]" # + encrypted Tier-2 archive
# pip install -e ".[arweave]" # + real permaweb (Arweave L1) writes
# Initialize config + DB
lbrain init --api-key=$OPENAI_API_KEY \
--source=/path/to/your/lairs \
--source=/path/to/your/memory
# Walk + ingest
lbrain import
# Embed
lbrain embed --stale
Use
# Hybrid semantic search
lbrain query "how do we sign C2PA"
# Filter by frontmatter type
lbrain query "code style" --type feedback
# Priority lairs only
lbrain query "current quarter goals" --priority
# Pure keyword (no embedding call, sub-50ms)
lbrain search "snake_case lock"
# "Should I save this?"
lbrain commit-check "user said: don't auto-format imports in this repo"
# "Does this action conflict with anything I've been told?"
lbrain check-action "going to mock the database for these tests"
# Brain stats
lbrain stats
Onboard a new project
lbrain onboard ~/repos/new-project
Three minutes of opinionated questions → working CLAUDE.md + three priority lairs + LAIR_RULES.md.
Register MCP with Claude Code
chmod +x /path/to/lbrain/scripts/lbrain-mcp
claude mcp add -s user lbrain -- /path/to/lbrain/scripts/lbrain-mcp
Tools surfaced: lair_query, lair_search, lair_protocol_check, lair_check_action, lair_stats.
Containerized deployment — for autonomous agents
For agents running in containers / Kubernetes / outside Claude Code, run LBrain as an HTTP MCP service:
# Local (no container): bind 127.0.0.1 unless you front it with authenticated ingress
# (the server has no built-in auth). Use --host 0.0.0.0 only inside a trusted network.
lbrain mcp --transport streamable-http --host 127.0.0.1 --port 7370
# Docker:
docker build -t lbrain .
# Use a NAMED volume (brain-data) — the container runs as non-root (uid 10001) and a
# host bind mount would inherit host ownership, breaking writes to brain.db. Bind the
# published port to localhost — the MCP server has NO built-in auth, so never publish it
# on a public interface (`-p 7370:7370`); for remote access put an authenticated,
# TLS-terminating reverse proxy in front.
docker run --rm -p 127.0.0.1:7370:7370 -v brain-data:/data \
-e OPENAI_API_KEY=$OPENAI_API_KEY lbrain
# docker-compose (Kite Apprentice / Maestro pattern):
docker compose -f docker-compose.kite.yml up
⚠️ The streamable-http MCP server exposes the full tool surface (the whole memory corpus is readable) with no authentication. Run it only inside a trusted container network or behind authenticated ingress — never directly on the public internet.
The agent's MCP client connects to http://lbrain:7370/mcp and gets the same 5 tools. Use this for:
- Recall for autonomous loops: agent calls lair_query at decision points to pull the relevant stored context (no editorializing, just what's saved).
- Anti-pattern guarding: agent calls lair_check_action before destructive actions; the saved feedback_*.md rules become an automatic safety net.
- Lair-Protocol output capture: agent calls lair_protocol_check on session outputs to decide what's save-worthy, eliminating the "remind me to remember this" round trip with the user.
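An agent-side call to the HTTP service might look like the sketch below, which uses the official mcp Python SDK's streamable HTTP client. The endpoint URL and the tool name come from this README. The argument name "action" is an assumption, so inspect the schema returned by list_tools() before relying on it. The SDK import path may also differ between SDK versions.

```python
import asyncio

MCP_URL = "http://lbrain:7370/mcp"  # endpoint named in this README


async def check_action(action, url=MCP_URL):
    """Ask LBrain whether a proposed action conflicts with saved feedback rules."""
    # Imported lazily so this module still loads where the mcp SDK is absent.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(url) as (read, write, _get_session_id):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # ASSUMPTION: the tool takes a single argument named "action".
            # Check `await session.list_tools()` for the real input schema.
            return await session.call_tool("lair_check_action", {"action": action})


def check_action_sync(action, url=MCP_URL):
    """Blocking wrapper for agents that are not already inside an event loop."""
    return asyncio.run(check_action(action, url))


# Usage (needs a reachable server and `pip install mcp`):
#   result = check_action_sync("going to mock the database for these tests")
#   print(result.content)
```

The same session can call lair_query or lair_protocol_check by changing the tool name and arguments. Since the server has no built-in auth, the URL should only resolve inside the trusted network described above.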
Two-brain pattern (recommended for production agents)
- Shared brain (read-only mount of your full corpus) — global recall + anti-pattern coverage.
- Task-specific brain (per-agent, curated subset) — focused, cheaper, faster. E.g. a Buildathon agent gets only the relevant submission/spec/integration lairs.
Both can run in the same container with different brain.db files via LBRAIN_HOME switching, or as separate containers the agent queries in parallel.
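The LBRAIN_HOME switching mentioned above could be driven from Python roughly as follows. This sketch assumes the lbrain CLI reads LBRAIN_HOME at startup to locate its config and brain.db. The two directory paths and the helper names are hypothetical.

```python
import os
import subprocess


def brain_env(home, base=None):
    """Return a copy of the environment with LBRAIN_HOME pointing at one brain."""
    env = dict(os.environ if base is None else base)
    env["LBRAIN_HOME"] = str(home)
    return env


def query_brain(home, text):
    """Run `lbrain query` against the brain stored under `home`; return stdout."""
    done = subprocess.run(
        ["lbrain", "query", text],
        env=brain_env(home),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero
    )
    return done.stdout


# Hypothetical layout: one shared brain, one curated per-task brain.
SHARED_HOME = "/data/shared-brain"
TASK_HOME = "/data/buildathon-brain"


def recall(text):
    """Query the focused task brain and the shared brain, keeping results apart."""
    return {
        "task": query_brain(TASK_HOME, text),
        "shared": query_brain(SHARED_HOME, text),
    }
```

Keeping the two result sets separate lets the agent prefer the curated task brain and fall back to the shared corpus only when the task brain comes up empty.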
See docker-compose.kite.yml for a full Apprentice + LBrain wiring example.
Architecture
lbrain/
├── index.py File walker + frontmatter + chunker + wikilink extractor
├── embed.py OpenAI embeddings client (batched, stateless)
├── store.py SQLite + sqlite-vec + FTS5 storage layer
├── search.py Hybrid BM25 + cosine (RRF) + graph/priority/supersession boosts
├── rerank.py Optional cross-encoder precision pass (call-when-needed)
├── amp.py Injection gating, token budgeting, provenance, core memory + fence
├── lair_protocol.py commit-check heuristic + feedback anti-pattern detector
├── onboard.py Interactive scaffolding for new projects
├── mcp_server.py MCP tool surface (FastMCP)
├── cli.py click CLI entry point
├── config.py ~/.lbrain/config.toml
└── archive/ OPTIONAL Tier-2 subpackage (install via lbrain[archive])
├── archiver.py encrypt → transport (local/Arweave) → snapshot → index
├── crypto.py AES-256-GCM + Argon2id envelopes + per-item crypto-shred
├── storage.py archive tables + queries (lazy schema, shared connection)
├── cli.py archive/capture/recall/retrieve/shred commands (register hook)
└── mcp.py lair_deep_recall tool (register hook)
The archive/ subpackage has a strict one-way dependency on the core (it imports core;
core never imports it except through guarded, lazy registration). pip install lbrain
gives the lean retrieval engine; pip install lbrain[archive] adds the encrypted Tier-2
archive; pip install lbrain[arweave] adds real permaweb writes. Drop the extra (or the
directory) and the core runs unchanged — the archive CLI commands and the
lair_deep_recall MCP tool simply don't register.
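The guarded, lazy registration described above can be approximated with an optional-import pattern. This is a sketch of the general technique, not the project's actual hook: the function name, the "register" hook name, the registry dict, and the module name used in the demo are all assumptions.

```python
import importlib


def register_optional(module_name, registry, hook="register"):
    """Call `module.register(registry)` if the module imports; otherwise skip.

    Returns True when the optional component registered, False when absent.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # extra not installed: the core keeps running unchanged
    getattr(module, hook)(registry)
    return True


tools = {"lair_query": "core"}  # stand-in for the core tool registry

# "lbrain_archive_missing" is a made-up name that should not be importable,
# mimicking an install without the archive extra.
loaded = register_optional("lbrain_archive_missing", tools)
print(loaded, sorted(tools))
```

The core only ever names the optional package as a string, so removing the extra cannot raise an import error at startup. The optional tools are simply missing from the registry.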
Truth hierarchy
Source files (markdown lairs and memory entries) are authoritative. The SQLite index is a derivative cache. If they disagree, trust the file and run lbrain import && lbrain embed --stale.
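One way to notice that files and index disagree is to compare content digests, as in the sketch below. It assumes a mapping from relative path to a digest recorded at import time. That mapping and both function names are hypothetical, and how LBrain itself tracks staleness is not stated here.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used as a cheap change detector."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_stale(source_dir, indexed_digests):
    """Return markdown files whose content differs from the indexed digest.

    `indexed_digests` maps relative path -> digest recorded at import time.
    Files missing from the map count as stale because they were never indexed.
    """
    root = Path(source_dir)
    stale = []
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root).as_posix()
        if indexed_digests.get(rel) != file_digest(path):
            stale.append(rel)
    return stale


with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "a.md"
    note.write_text("v1")
    index = {"a.md": file_digest(note)}     # digest captured at "import" time
    note.write_text("v2")                   # edited after indexing
    (Path(tmp) / "b.md").write_text("new")  # never indexed
    stale = find_stale(tmp, index)

print(stale)
```

Any path this returns is a case where the file wins, and re-running lbrain import && lbrain embed --stale brings the cache back in line.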
Acknowledgements
The injection layer (amp.py) implements patterns from Tate Berenbaum's AMP — Augmented Memory Protocol.
License
BSD 3-Clause — see LICENSE. Copyright (c) 2026 Metavolve Labs, Inc.
Metavolve Labs, Inc. — Build the infrastructure of memory for the AI age.