# neuromcp

Semantic memory for AI agents — local-first MCP server with hybrid search, governance, and consolidation.
```sh
npx neuromcp
```
## Why
AI agents forget everything between sessions. The default MCP memory server stores flat key-value pairs with keyword search — fine for "remember my name is Bob", useless for "what was the architectural decision we made about authentication last week?"
neuromcp solves this with hybrid search (vector embeddings + full-text), memory governance (namespaces, trust levels, lineage tracking), and automatic consolidation (dedup, decay, prune) — all running locally in a single SQLite file. No cloud, no API keys, no infrastructure.
## Before & After

| | Without neuromcp | With neuromcp |
|---|---|---|
| Session memory | Gone when you close the terminal | Persisted, searchable, ranked by relevance |
| Search | Exact keyword match | Semantic — "auth architecture" finds "JWT validation middleware" |
| Duplicates | Same fact stored 50 times | Content-hash dedup + similarity-based merge |
| Stale memories | Accumulate forever | Automatic decay, pruning, and TTL sweeps |
| Multi-project | Everything in one pile | Namespace isolation per project |
| Trust | All memories equal | Trust levels (high/medium/low) + source tracking |
| Setup | API keys, cloud accounts, config files | npx neuromcp — zero config, runs locally |
## How It Works

```
Query: "how does auth work in this project?"
          │
          ▼
┌───────────────────────┐
│     Hybrid Search     │
│                       │
│  Vector (semantic)    │──→ "JWT tokens validated in middleware" (0.87)
│  FTS (keyword)        │──→ "auth module uses passport.js" (0.91)
│                       │
│      RRF Fusion       │──→ Combined ranking, best results first
└───────────────────────┘
          │
          ▼
Filters: namespace, category, trust, date range
          │
          ▼
Top results returned to agent
```
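The RRF fusion step above can be sketched in a few lines. This is an illustrative sketch, not neuromcp's internals: the constant `k = 60` (a common default for Reciprocal Rank Fusion) and the toy result lists are assumptions.

```typescript
// Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
// so items ranked well by BOTH vector and FTS search float to the top.
type Ranked = string[]; // memory IDs, best first

function rrfFuse(lists: Ranked[], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return scores;
}

// Toy results from the two retrieval legs (hypothetical IDs):
const vectorHits = ["jwt-middleware", "passport-module", "session-store"];
const ftsHits = ["jwt-middleware", "oauth-notes", "passport-module"];

const fused = [...rrfFuse([vectorHits, ftsHits]).entries()]
  .sort((a, b) => b[1] - a[1])
  .map(([id]) => id);

console.log(fused); // "jwt-middleware" leads: it tops both lists
```

Because RRF works on ranks rather than raw scores, it combines the two legs without having to normalize vector similarities against FTS relevance scores.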
**Embeddings:** Auto-detects Ollama at localhost:11434 for real semantic search (nomic-embed-text). Falls back to a built-in ONNX model (bge-small-en-v1.5) for basic similarity. Zero config either way.
**Storage:** Single SQLite file (`~/.neuromcp/memory.db`), WAL mode for concurrent access, content-hash deduplication at write time.
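Write-time dedup can be sketched as follows. The key format (namespace plus a separator plus content) and the toy in-memory store are assumptions for illustration; only the SHA-256 hashing and "same content returns the existing memory" behavior come from the docs above.

```typescript
import { createHash } from "node:crypto";

// Content-hash dedup sketch: identical text in the same namespace maps to
// the same SHA-256 key, so a second write returns the existing record.
function contentKey(namespace: string, content: string): string {
  return createHash("sha256").update(`${namespace}\0${content}`).digest("hex");
}

const store = new Map<string, string>(); // key -> memory ID (toy store)

function storeMemory(namespace: string, content: string): { id: string; deduped: boolean } {
  const key = contentKey(namespace, content);
  const existing = store.get(key);
  if (existing) return { id: existing, deduped: true };
  const id = `mem-${store.size + 1}`;
  store.set(key, id);
  return { id, deduped: false };
}

const a = storeMemory("my-project", "We chose Stripe over PayPal.");
const b = storeMemory("my-project", "We chose Stripe over PayPal.");
const c = storeMemory("other-project", "We chose Stripe over PayPal.");

console.log(a, b, c); // b reuses a's ID; c is a fresh row in its own namespace
```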
**Consolidation:** Plan-then-commit pipeline — preview what will change before applying. Deduplicates near-identical memories, decays the importance of old memories over time, prunes those that fall below the threshold, and sweeps tombstoned records past their TTL.
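The decay-and-prune step can be sketched with the defaults from the configuration table (`NEUROMCP_DECAY_LAMBDA = 0.01`, `NEUROMCP_MIN_IMPORTANCE = 0.05`). The exact decay formula neuromcp uses is not documented here; exponential decay over age in days is an assumed shape.

```typescript
// Assumed decay shape: importance * exp(-lambda * ageDays).
// Memories whose decayed importance falls below the floor are pruned.
interface Memory { id: string; importance: number; ageDays: number }

const LAMBDA = 0.01;          // NEUROMCP_DECAY_LAMBDA default
const MIN_IMPORTANCE = 0.05;  // NEUROMCP_MIN_IMPORTANCE default

function decayed(m: Memory): number {
  return m.importance * Math.exp(-LAMBDA * m.ageDays);
}

const memories: Memory[] = [
  { id: "fresh-decision", importance: 0.9, ageDays: 7 },    // ~0.84 after decay
  { id: "old-trivia", importance: 0.08, ageDays: 400 },     // ~0.0015 after decay
];

const kept = memories.filter((m) => decayed(m) >= MIN_IMPORTANCE);
console.log(kept.map((m) => m.id)); // → ["fresh-decision"]
```

In plan mode (`commit=false`) the pipeline would report that `old-trivia` is a prune candidate without touching the database.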
## Real-World Example

```
You: Where were we with the payment integration?

Claude: [searches neuromcp for "payment integration"]

  Found 3 relevant memories:
  - (0.92) "Decided on Stripe over PayPal — better webhook reliability,
           session from 2026-03-28"
  - (0.87) "Payment webhook handler at src/api/webhooks/stripe.ts needs
           idempotency keys — known bug, not yet fixed"
  - (0.81) "Test coverage for payment module is at 64%, target is 80%"

  Based on our previous sessions: we chose Stripe, the webhook handler
  needs idempotency keys (that's still open), and test coverage needs
  work. Want me to start with the idempotency fix?
```
## Quick Start

```sh
npx neuromcp
```

Creates `~/.neuromcp/memory.db` on first run and downloads the ONNX model automatically.
**Recommended:** add Ollama for real semantic search.

```sh
# Install Ollama from https://ollama.com, then:
ollama pull nomic-embed-text
```

neuromcp auto-detects it. No config needed.
| Provider | Semantic Quality | Setup |
|---|---|---|
| Ollama + nomic-embed-text | Excellent — real semantic understanding, 8K context | ollama pull nomic-embed-text |
| ONNX (built-in fallback) | Basic — keyword overlap, no deep semantics | Zero config |
## Installation

### Claude Code

```jsonc
// ~/.claude.json → mcpServers
{
  "neuromcp": {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "neuromcp"]
  }
}
```
### Claude Desktop

```jsonc
// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "neuromcp": {
      "command": "npx",
      "args": ["-y", "neuromcp"]
    }
  }
}
```
### Cursor / Windsurf / Cline
Same format — add to your editor's MCP settings.
### Per-project isolation

```jsonc
// .mcp.json in project root
{
  "mcpServers": {
    "neuromcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "neuromcp"],
      "env": {
        "NEUROMCP_DB_PATH": ".neuromcp/memory.db",
        "NEUROMCP_NAMESPACE": "my-project"
      }
    }
  }
}
```
## MCP Surface

### Tools (8)

| Tool | Description |
|---|---|
| `store_memory` | Store with semantic dedup. Returns ID and match status. |
| `search_memory` | Hybrid vector + FTS search with RRF ranking. Filters by namespace, category, tags, trust, date. |
| `recall_memory` | Retrieve by ID, namespace, category, or tags — no semantic search. |
| `forget_memory` | Soft-delete (tombstone). Supports `dry_run`. |
| `consolidate` | Dedup, decay, prune, sweep. `commit=false` for preview, `true` to apply. |
| `memory_stats` | Counts, categories, trust distribution, DB size. |
| `export_memories` | Export as JSONL or JSON. |
| `import_memories` | Import with content-hash dedup. |
### Resources (13)

| URI | Description |
|---|---|
| `memory://stats` | Global statistics |
| `memory://recent` | Last 20 memories |
| `memory://namespaces` | All namespaces with counts |
| `memory://health` | Server health + metrics |
| `memory://stats/{namespace}` | Per-namespace stats |
| `memory://recent/{namespace}` | Recent in namespace |
| `memory://id/{id}` | Single memory by ID |
| `memory://tag/{tag}` | Memories by tag |
| `memory://tag/{namespace}/{tag}` | Tag within namespace |
| `memory://namespace/{ns}` | All in namespace (max 100) |
| `memory://consolidation/log` | Recent consolidation entries |
| `memory://consolidation/log/{id}` | Specific operation log |
| `memory://operations` | Active/recent operations |
### Prompts (3)

| Prompt | Description |
|---|---|
| `memory_context_for_task` | Search relevant memories and format as LLM context |
| `review_memory_candidate` | Show proposed memory alongside near-duplicates |
| `consolidation_dry_run` | Preview consolidation without applying |
## Memory Governance

**Namespaces** isolate memories by project, agent, or domain. Each memory belongs to exactly one namespace. Set the `NEUROMCP_NAMESPACE` env var or specify a namespace per operation.

**Trust levels** (`high`, `medium`, `low`, `unverified`) indicate confidence in the source. High-trust memories rank higher in search results and resist decay.

**Soft delete** tombstones memories instead of removing them. Tombstoned records survive for `NEUROMCP_TOMBSTONE_TTL_DAYS` (default 30) — recoverable until the next consolidation sweep.

**Content hashing** (SHA-256) deduplicates at write time. Identical content in the same namespace returns the existing memory instead of creating a duplicate.

**Lineage tracking** records the source (`user`, `auto`, `consolidation`, `claude-code`, `error`), project ID, and agent ID per memory. Full audit trail for governance.
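To build intuition for trust-weighted ranking, here is one plausible shape for it. This is purely illustrative: neuromcp's actual weighting is internal, and the multipliers below are made up.

```typescript
// Hypothetical trust multipliers (NOT neuromcp's real values): higher trust
// boosts a memory's search score, lower trust discounts it.
const trustWeight: Record<string, number> = {
  high: 1.2,
  medium: 1.0,
  low: 0.85,
  unverified: 0.7,
};

function weightedScore(similarity: number, trust: string): number {
  return similarity * (trustWeight[trust] ?? 1.0);
}

// Two memories with identical raw similarity rank differently by trust:
const fromUser = weightedScore(0.8, "high");
const scraped = weightedScore(0.8, "unverified");
console.log(fromUser > scraped); // → true
```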
## Configuration

All via environment variables. Defaults work for most setups.

| Variable | Default | Description |
|---|---|---|
| `NEUROMCP_DB_PATH` | `~/.neuromcp/memory.db` | Database file path |
| `NEUROMCP_MAX_DB_SIZE_MB` | `500` | Max database size |
| `NEUROMCP_EMBEDDING_PROVIDER` | `auto` | `auto`, `onnx`, `ollama`, `openai` |
| `NEUROMCP_EMBEDDING_MODEL` | `auto` | Model name (auto-detected) |
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama server URL |
| `NEUROMCP_DEFAULT_NAMESPACE` | `default` | Default namespace |
| `NEUROMCP_TOMBSTONE_TTL_DAYS` | `30` | Days before permanent sweep |
| `NEUROMCP_AUTO_CONSOLIDATE` | `false` | Enable periodic consolidation |
| `NEUROMCP_CONSOLIDATE_INTERVAL_HOURS` | `24` | Consolidation frequency |
| `NEUROMCP_DECAY_LAMBDA` | `0.01` | Importance decay rate |
| `NEUROMCP_DEDUP_THRESHOLD` | `0.92` | Cosine similarity for dedup |
| `NEUROMCP_MIN_IMPORTANCE` | `0.05` | Prune threshold |
| `NEUROMCP_AUTO_COMMIT_SIMILARITY` | `0.95` | Auto-merge threshold |
| `NEUROMCP_SWEEP_INTERVAL_HOURS` | `6` | TTL sweep frequency |
| `NEUROMCP_LOG_LEVEL` | `info` | `debug`, `info`, `warn`, `error` |
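To build intuition for the two similarity thresholds (`NEUROMCP_DEDUP_THRESHOLD = 0.92`, `NEUROMCP_AUTO_COMMIT_SIMILARITY = 0.95`), here is the standard cosine-similarity computation they are compared against. The 3-d toy vectors are illustrative; real embeddings have hundreds of dimensions.

```typescript
// Cosine similarity: dot product of the vectors divided by the product of
// their magnitudes. Ranges from -1 (opposite) to 1 (identical direction).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const nearDuplicate = cosine([0.9, 0.1, 0.0], [0.88, 0.12, 0.01]);
const unrelated = cosine([0.9, 0.1, 0.0], [0.0, 0.2, 0.95]);

console.log(nearDuplicate > 0.95); // true  → auto-merge candidate
console.log(unrelated < 0.92);     // true  → kept as separate memories
```

Raising `NEUROMCP_DEDUP_THRESHOLD` makes dedup more conservative (fewer merges); lowering it merges more aggressively at the risk of collapsing distinct facts.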
## Comparison
| Feature | neuromcp | @modelcontextprotocol/server-memory | mem0 | cortex-mcp |
|---|---|---|---|---|
| Search | Hybrid (vector + FTS + RRF) | Keyword only | Vector only | Vector only |
| Embeddings | Built-in ONNX (zero config) | None | External API | External API |
| Governance | Namespaces, trust, soft delete | None | None | Basic |
| Consolidation | Plan-then-commit | None | None | Manual |
| Storage | SQLite (single file) | JSON file | Cloud / Postgres | SQLite |
| Infrastructure | Zero | Zero | Cloud account | Zero |
| MCP surface | 8 tools, 13 resources, 3 prompts | 5 tools | N/A | 4 tools |
## Contributing

See CONTRIBUTING.md for development setup and guidelines.

## License

MIT