# BrainLayer
Persistent memory and knowledge graph for AI agents — 9 MCP tools, real-time indexing hooks, and a native macOS daemon for always-on recall across every conversation.
224,000+ chunks indexed · 1,002 Python + 28 Swift tests · Real-time indexing hooks · 9 MCP tools · BrainBar daemon (209KB) · Zero cloud dependencies
Your AI agent forgets everything between sessions. Every architecture decision, every debugging session, every preference you've expressed — gone. You repeat yourself constantly.
BrainLayer fixes this. It's a local-first memory layer that gives any MCP-compatible AI agent the ability to remember, think, and recall across conversations. Includes BrainBar — a 209KB native macOS daemon that provides always-on memory access.
- "What approach did I use for auth last month?" → `brain_search`
- "Show me everything about this file's history" → `brain_recall`
- "What was I working on yesterday?" → `brain_recall`
- "Remember this decision for later" → `brain_store`
- "Ingest this meeting transcript" → `brain_digest`
- "What do we know about this person?" → `brain_get_person`
- "Look up the Domica project entity" → `brain_entity`
## Quick Start

```bash
pip install brainlayer
brainlayer init    # Interactive setup wizard
brainlayer index   # Index your Claude Code conversations
```
Then add to your editor's MCP config:

Claude Code (`~/.claude.json`):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```
<details>
<summary>Other editors (Cursor, Zed, VS Code)</summary>

Cursor (MCP settings):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

Zed (`settings.json`):

```json
{
  "context_servers": {
    "brainlayer": {
      "command": { "path": "brainlayer-mcp" }
    }
  }
}
```

VS Code (`.vscode/mcp.json`):

```json
{
  "servers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

</details>
That's it. Your agent now has persistent memory across every conversation.
## Architecture

```mermaid
graph LR
    A["Claude Code / Cursor / Zed"] -->|MCP| B["BrainLayer MCP Server<br/>9 tools"]
    B --> C["Hybrid Search<br/>semantic + keyword (RRF)"]
    C --> D["SQLite + sqlite-vec<br/>single .db file"]
    B --> KG["Knowledge Graph<br/>entities + relations"]
    KG --> D
    E["Claude Code JSONL<br/>conversations"] --> F["Pipeline"]
    F -->|extract → classify → chunk → embed| D
    G["Local LLM<br/>Ollama / MLX"] -->|enrich| D
    H["Real-time Hooks"] -->|live per-message| D
    I["BrainBar<br/>macOS daemon"] -->|Unix socket MCP| B
```
Everything runs locally. No cloud accounts, no API keys, no Docker, no database servers.
| Component | Implementation |
|---|---|
| Storage | SQLite + sqlite-vec (single .db file, WAL mode) |
| Embeddings | bge-large-en-v1.5 via sentence-transformers (1024 dims, runs on CPU/MPS) |
| Search | Hybrid: vector similarity + FTS5 keyword, merged with Reciprocal Rank Fusion |
| Enrichment | Local LLM via Ollama or MLX — 10-field metadata per chunk |
| MCP Server | stdio-based, MCP SDK v1.26+, compatible with any MCP client |
| Clustering | Leiden + UMAP for brain graph visualization (optional) |
| BrainBar | Native macOS daemon (209KB Swift binary) — always-on MCP over Unix socket |
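The search row above mentions Reciprocal Rank Fusion. As a rough illustration of how two ranked result lists (vector similarity and FTS5 keyword) can be merged, here is a generic RRF sketch — the function name and the `k=60` constant are illustrative conventions, not BrainLayer's actual internals:

```python
def rrf_merge(vector_hits, keyword_hits, k=60):
    """Merge two best-first lists of chunk IDs with Reciprocal Rank Fusion.

    A chunk near the top of either list gets a larger 1/(k + rank)
    contribution; chunks ranking well in both lists rise to the top.
    """
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, chunk_id in enumerate(hits, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# Chunk 42 tops both lists, so it wins overall
merged = rrf_merge(vector_hits=[42, 7, 99], keyword_hits=[42, 13, 7])
```

The virtue of RRF is that it needs only ranks, so cosine distances and BM25 scores never have to be put on a common scale.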
## MCP Tools (9)

### Core (4)

| Tool | Description |
|---|---|
| `brain_search` | Semantic search — unified search across query, file_path, chunk_id, and filters. |
| `brain_store` | Persist memories — ideas, decisions, learnings, mistakes. Auto-type and auto-importance. |
| `brain_recall` | Proactive retrieval — current context, sessions, session summaries. |
| `brain_tags` | Browse and filter by tag — discover what's in memory without a search query. |
### Knowledge Graph (5)

| Tool | Description |
|---|---|
| `brain_digest` | Ingest raw content — entity extraction, relations, sentiment, action items. |
| `brain_entity` | Look up entities in the knowledge graph — type, relations, evidence. |
| `brain_expand` | Expand a chunk_id with N surrounding chunks for full context. |
| `brain_update` | Update, archive, or merge existing memories. |
| `brain_get_person` | Person lookup — entity details, interactions, preferences (~200-500ms). |
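`brain_expand`'s behavior can be pictured as simple window expansion over an ordered list of chunks. A hypothetical sketch (`expand_window` and its arguments are illustrative, not the real tool's API):

```python
def expand_window(chunk_ids, target_id, n=2):
    """Return the target chunk plus up to n neighbours on each side.

    Assumes chunk_ids is already in document order; slicing is clamped
    at the start so the window never underflows the list.
    """
    i = chunk_ids.index(target_id)
    return chunk_ids[max(0, i - n): i + n + 1]

# "c3" with two neighbours on each side
context = expand_window(["c1", "c2", "c3", "c4", "c5", "c6"], "c3", n=2)
```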
### Backward Compatibility

All 14 old `brainlayer_*` names still work as aliases.
## Enrichment

BrainLayer enriches each chunk with 10 structured metadata fields using a local LLM:

| Field | Example |
|---|---|
| `summary` | "Debugging Telegram bot message drops under load" |
| `tags` | "telegram, debugging, performance" |
| `importance` | 8 (architectural decision) vs. 2 (directory listing) |
| `intent` | debugging, designing, implementing, configuring, deciding, reviewing |
| `primary_symbols` | "TelegramBot, handleMessage, grammy" |
| `resolved_query` | "How does the Telegram bot handle rate limiting?" |
| `epistemic_level` | hypothesis, substantiated, validated |
| `version_scope` | "grammy 1.32, Node 22" |
| `debt_impact` | introduction, resolution, none |
| `external_deps` | "grammy, Supabase, Railway" |
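Because enrichment comes back from an LLM, the 10 fields typically arrive as JSON that must be validated before storage. A hedged sketch of that parse-and-default step (the field defaults and the importance clamping rule are assumptions, not BrainLayer's actual schema handling):

```python
import json

FIELDS = ["summary", "tags", "importance", "intent", "primary_symbols",
          "resolved_query", "epistemic_level", "version_scope",
          "debt_impact", "external_deps"]

def parse_enrichment(raw: str) -> dict:
    """Parse an LLM enrichment response, defaulting any missing field."""
    data = json.loads(raw)
    out = {field: data.get(field, "") for field in FIELDS}
    # Importance is a numeric score; clamp to a 1-10 range
    out["importance"] = min(10, max(1, int(data.get("importance", 1))))
    return out

meta = parse_enrichment('{"summary": "Debugging bot drops", "importance": 8}')
```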
Three enrichment backends are supported (auto-detected in the order MLX → Ollama → Groq; override via `BRAINLAYER_ENRICH_BACKEND`):
| Backend | Best for | Speed |
|---|---|---|
| Groq (cloud) | When local LLMs are unavailable | ~1-2s/chunk |
| MLX (Apple Silicon) | M1/M2/M3 Macs (preferred) | 21-87% faster than Ollama |
| Ollama | Any platform | ~1s/chunk (short), ~13s (long) |
```bash
brainlayer enrich    # Default backend (auto-detects)
BRAINLAYER_ENRICH_BACKEND=groq brainlayer enrich --batch-size=100
```
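The auto-detect order described above (env override first, else MLX → Ollama → Groq) could be sketched like this — the availability flags passed in are placeholders for real server probes, not BrainLayer's actual detection code:

```python
import os

def pick_backend(mlx_up, ollama_up, groq_key, env=None):
    """Choose an enrichment backend: explicit override, else MLX → Ollama → Groq."""
    env = os.environ if env is None else env
    override = env.get("BRAINLAYER_ENRICH_BACKEND")
    if override:
        return override
    if mlx_up:
        return "mlx"
    if ollama_up:
        return "ollama"
    if groq_key:
        return "groq"
    raise RuntimeError("no enrichment backend available")

# No MLX server, Ollama running, Groq key present → Ollama wins
backend = pick_backend(mlx_up=False, ollama_up=True, groq_key="demo-key", env={})
```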
## Why BrainLayer?

| | BrainLayer | Mem0 | Zep/Graphiti | Letta | LangChain Memory |
|---|---|---|---|---|---|
| MCP native | 9 tools | 1 server | 1 server | No | No |
| Think / Recall | Yes | No | No | No | No |
| Local-first | SQLite | Cloud-first | Cloud-only | Docker+PG | Framework |
| Zero infra | `pip install` | API key | API key | Docker | Multiple deps |
| Multi-source | 7 sources | API only | API only | API only | API only |
| Enrichment | 10 fields | Basic | Temporal | Self-write | None |
| Session analysis | Yes | No | No | No | No |
| Real-time | Per-message hooks | No | No | No | No |
| Open source | Apache 2.0 | Apache 2.0 | Source-available | Apache 2.0 | MIT |
BrainLayer is the only memory layer that:
- Thinks before answering — categorizes past knowledge by intent (decisions, bugs, patterns) instead of raw search results
- Runs on a single file — no database servers, no Docker, no cloud accounts
- Works with every MCP client — 9 tools, instant integration, zero SDK
- Ships a knowledge graph — entities, relations, and person lookup across all indexed data
## CLI Reference

```bash
brainlayer init              # Interactive setup wizard
brainlayer index             # Index new conversations
brainlayer search "query"    # Semantic + keyword search
brainlayer enrich            # Run LLM enrichment on new chunks
brainlayer enrich-sessions   # Session-level analysis (decisions, learnings)
brainlayer stats             # Database statistics
brainlayer brain-export      # Generate brain graph JSON
brainlayer export-obsidian   # Export to Obsidian vault
brainlayer dashboard         # Interactive TUI dashboard
```
## Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|---|---|---|
| `BRAINLAYER_DB` | `~/.local/share/brainlayer/brainlayer.db` | Database file path |
| `BRAINLAYER_ENRICH_BACKEND` | auto-detect (MLX → Ollama → Groq) | Enrichment LLM backend (`mlx`, `ollama`, or `groq`) |
| `BRAINLAYER_ENRICH_MODEL` | `glm-4.7-flash` | Ollama model name |
| `BRAINLAYER_MLX_MODEL` | `mlx-community/Qwen2.5-Coder-14B-Instruct-4bit` | MLX model identifier |
| `BRAINLAYER_OLLAMA_URL` | `http://127.0.0.1:11434/api/generate` | Ollama API endpoint |
| `BRAINLAYER_MLX_URL` | `http://127.0.0.1:8080/v1/chat/completions` | MLX server endpoint |
| `BRAINLAYER_STALL_TIMEOUT` | `300` | Seconds before killing a stuck enrichment chunk |
| `BRAINLAYER_HEARTBEAT_INTERVAL` | `25` | Log progress every N chunks during enrichment |
| `BRAINLAYER_SANITIZE_EXTRA_NAMES` | (empty) | Comma-separated names to redact from indexed content |
| `BRAINLAYER_SANITIZE_USE_SPACY` | `true` | Use spaCy NER for PII detection |
| `GROQ_API_KEY` | (unset) | Groq API key for cloud enrichment backend |
| `BRAINLAYER_GROQ_URL` | `https://api.groq.com/openai/v1/chat/completions` | Groq API endpoint |
| `BRAINLAYER_GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model for enrichment |
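Reading these variables follows the usual env-with-default pattern. An illustrative sketch for two of them — `load_config` is a hypothetical helper, and parsing `BRAINLAYER_SANITIZE_EXTRA_NAMES` into a stripped list is an assumption about how the comma-separated value is consumed:

```python
import os
from pathlib import Path

def load_config(env=None):
    """Resolve a couple of BrainLayer settings from environment variables."""
    env = os.environ if env is None else env
    db_default = Path.home() / ".local/share/brainlayer/brainlayer.db"
    return {
        "db_path": Path(env.get("BRAINLAYER_DB", db_default)),
        # Comma-separated names to redact, e.g. "Alice, Bob"
        "extra_names": [name.strip()
                        for name in env.get("BRAINLAYER_SANITIZE_EXTRA_NAMES", "").split(",")
                        if name.strip()],
    }

cfg = load_config({"BRAINLAYER_SANITIZE_EXTRA_NAMES": "Alice, Bob"})
```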
## Optional Extras

```bash
pip install "brainlayer[brain]"     # Brain graph visualization (Leiden + UMAP) + FAISS
pip install "brainlayer[cloud]"     # Cloud backfill (Gemini Batch API)
pip install "brainlayer[youtube]"   # YouTube transcript indexing
pip install "brainlayer[ast]"       # AST-aware code chunking (tree-sitter)
pip install "brainlayer[kg]"        # GliNER entity extraction (209M params, EN+HE)
pip install "brainlayer[style]"     # ChromaDB vector store (alternative backend)
pip install "brainlayer[dev]"       # Development: pytest, ruff
```
## Data Sources

BrainLayer can index conversations from multiple sources:

| Source | Format | Indexer |
|---|---|---|
| Claude Code | JSONL (`~/.claude/projects/`) | `brainlayer index` |
| Claude Desktop | JSON export | `brainlayer index --source desktop` |
| WhatsApp | Exported .txt chat | `brainlayer index --source whatsapp` |
| YouTube | Transcripts via yt-dlp | `brainlayer index --source youtube` |
| Codex CLI | JSONL (`~/.codex/sessions`) | `brainlayer ingest-codex` |
| Markdown | Any .md files | `brainlayer index --source markdown` |
| Manual | Via MCP tool | `brain_store` |
| Real-time | Claude Code hooks | Live per-message indexing (zero-lag) |
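Claude Code stores each conversation as JSONL under `~/.claude/projects/`, one JSON object per line. A minimal reader sketch — the `role`/`text` field names in the sample are assumptions about the record shape, not a documented schema:

```python
import json

def read_jsonl(text):
    """Yield one record per non-empty JSONL line, skipping malformed lines."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or partially written lines

sample = '{"role": "user", "text": "hi"}\n\n{"role": "assistant", "text": "hello"}'
records = list(read_jsonl(sample))
```

Skipping unparseable lines matters for live files: a session still being written may end mid-record.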
## Testing

```bash
pip install -e ".[dev]"
pytest tests/                        # Full suite (1,002 Python tests)
pytest tests/ -m "not integration"   # Unit tests only (fast)
ruff check src/                      # Linting
# BrainBar (Swift): 28 tests via Xcode
```
## Roadmap
See docs/roadmap.md for planned features including boot context loading, compact search, pinned memories, and MCP Registry listing.
## Contributing
Contributions welcome! See CONTRIBUTING.md for dev setup, testing, and PR guidelines.
## License
Apache 2.0 — see LICENSE.
## Origin
BrainLayer was originally developed as "Zikaron" (Hebrew: memory) inside a personal AI agent ecosystem. It was extracted into a standalone project because every developer deserves persistent AI memory — not just the ones building their own agent systems.