<p align="center"> <img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/banner.svg" alt="Context Mem — persistent memory for AI agents" width="100%"/> </p>
<div align="center">

# Context Mem

**Your AI coding assistant forgets everything between sessions. This fixes that.**

</div>
## The Problem
Every time you start a new AI session, your assistant has zero memory of what you built yesterday. The architecture decisions, the bugs you fixed, the preferences you stated — all gone. You spend the first 10 minutes re-explaining context.
## The Fix
context-mem runs in the background, captures everything automatically, and retrieves exactly the right context when you need it:
- Longer sessions without losing context (99% token savings)
- Instant continuity — new sessions pick up where you left off
- Automatic — no manual saving, no commands to remember
- Fully local — your code never leaves your machine
- Free — no API keys, no subscription, no cloud
```bash
npm i context-mem && npx context-mem init
```
One command. Works with Claude Code, Cursor, Windsurf, VS Code, Cline, and Roo Code.
## Retrieval Benchmarks
Tested on 4 academic benchmarks. All scores are session-level retrieval recall (did the correct session appear in top-k?), not end-to-end QA accuracy.
### Pure Local (zero API calls, fully free)
| Benchmark | Retrieval Recall | Questions | Sessions per conversation | Metric |
|---|---|---|---|---|
| LongMemEval | 97.8% R@5 | 500 | ~53 | Session R@5 |
| LoCoMo | 98.1% R@10 | 1,977 | 19-35 | Session R@10 |
| MemBench | 98.0% R@5 | 500 | — | Hybrid top-5 |
| ConvoMem | 97.7% R@10 | 250 | — | Session R@10 |
### With Optional LLM Reranking (Haiku, ~$1 per 500 queries)
| Benchmark | Retrieval Recall |
|---|---|
| LongMemEval | 100.0% R@5 (500/500) |
### vs MemPalace (same methodology — session-level retrieval recall)
| Benchmark | Context Mem | MemPalace |
|---|---|---|
| LongMemEval R@5 | 97.8% | 96.6% |
| LoCoMo R@10 | 98.1% | 60.3% |
Both systems reach 100% on LongMemEval with optional Haiku reranking. The MemPalace comparison uses identical methodology (session-level retrieval recall, same datasets).
<details> <summary>Benchmark methodology notes</summary>
- Metric: Session-level retrieval recall — a hit is scored if any correct evidence session appears in the top-k results. This is different from end-to-end QA accuracy (retrieve + generate answer + judge), which would be lower for any system.
- Granularity: Sessions (all dialog turns joined per session). LoCoMo has 19-35 sessions per conversation, so R@10 selects roughly a third of the candidate pool.
- Ingestion: The LoCoMo adapter appends dataset-provided metadata (session_summary, observation, event_summary) to session documents. The production system performs similar enrichment via its summarizers and entity extraction.
- Synonym expansions: Core query-builder includes general synonyms (movie→film, sibling→brother). Benchmark adapter adds ~50 additional domain-specific expansions derived from failure analysis. Core-only results are ~1-2% lower.
- Benchmark code: Fully open in `benchmarks/` — run them yourself with `npm run bench`.
</details>
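To make the metric concrete, here is a minimal runnable sketch of session-level recall@k under the methodology above; the names and types are illustrative, not the actual `benchmarks/` harness:

```typescript
// Session-level recall@k: a question is a hit if ANY correct evidence
// session appears in the top-k retrieved sessions (names are assumptions).
type Retrieved = { sessionId: string; score: number };

function hitAtK(results: Retrieved[], evidence: Set<string>, k: number): boolean {
  return results.slice(0, k).some((r) => evidence.has(r.sessionId));
}

// Benchmark-level recall: fraction of questions with a top-k hit.
function sessionRecall(hits: boolean[]): number {
  return hits.filter(Boolean).length / hits.length;
}

const q1 = hitAtK([{ sessionId: "s17", score: 0.91 }], new Set(["s17"]), 5);
console.log(sessionRecall([q1, false])); // 0.5
```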
## How It Works
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/architecture.svg" alt="Observation Pipeline" width="100%"/>
Every tool output flows through the pipeline: privacy screening (9 secret detectors) → parallel extraction (entities, importance, topics) → 14 content summarizers → triple storage (verbatim archive, SQLite summaries, knowledge graph) → adaptive compression over time.
Full coding session (50 tool outputs): 365 KB → 3.2 KB (99% savings).
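As a toy sketch of those stages (every helper below is a simplified stand-in, not context-mem's real implementation):

```typescript
// Toy pipeline: screen secrets -> extract entities -> summarize -> store.
// All regexes and thresholds here are illustrative stand-ins.
const SECRET = /(api[_-]?key|token|password)\s*[:=]\s*\S+/gi; // one of several detector passes

function screenSecrets(text: string): string {
  return text.replace(SECRET, "[REDACTED]");
}

function extractEntities(text: string): string[] {
  // CamelCase and ALL_CAPS tokens, per the entity rules described below.
  return text.match(/\b(?:[A-Z][a-z]+){2,}\b|\b[A-Z_]{3,}\b/g) ?? [];
}

function summarize(text: string): string {
  // Stand-in for the 14 content-aware summarizers.
  return text.length > 120 ? text.slice(0, 120) + "..." : text;
}

function ingest(toolOutput: string) {
  const clean = screenSecrets(toolOutput);
  const entities = extractEntities(clean); // runs in parallel in the real pipeline
  const summary = summarize(clean);
  // The real system writes to triple storage: verbatim archive, SQLite, graph.
  return { verbatim: clean, summary, entities };
}

console.log(ingest("API_KEY=sk123 NullPointerException in UserService.login"));
```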
## What it is (and isn't)
context-mem is:
- A retrieval-first memory system (not a chatbot wrapper)
- A context compression engine (14 content-aware summarizers)
- Infrastructure for AI agents (44 MCP tools)
context-mem is not:
- Chat history storage (it extracts meaning, not raw logs)
- An LLM wrapper (works without any API keys)
- A cloud service (fully local SQLite)
## Quick Start

```bash
npm i context-mem && npx context-mem init
```

`init` auto-detects your editor:
| Editor | What gets created |
|---|---|
| Claude Code | .mcp.json + hooks (8 hooks incl. context-triggered injection) + CLAUDE.md |
| Cursor | .cursor/mcp.json + .cursor/rules/context-mem.mdc |
| Windsurf | .windsurf/mcp.json + .windsurf/rules/context-mem.md |
| VS Code / Copilot | .vscode/mcp.json + .github/copilot-instructions.md |
| Cline | .cline/mcp_settings.json + .clinerules/context-mem.md |
| Roo Code | .roo-code/mcp_settings.json + .roo/rules/context-mem.md |
## Real-World Examples
```text
You: "Why did we choose Postgres?"
→ recall returns the exact verbatim quote from March 15, importance 0.95,
  with the full evidence chain: error → file_read → search → decision

You: "What did Sarah work on last sprint?"
→ browse by person shows 14 observations mentioning Sarah,
  grouped by topic (auth, database, deployment)

You: "Generate a PR description"
→ context-mem story --format pr assembles changes, decisions, resolved
  issues, and test plan from the current session

You: "What are we about to forget?"
→ predict_loss shows 8 entries at risk: low importance, 45+ days old,
  never accessed. Pin the critical ones before they decay.
```
## Search Architecture
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/search-architecture.svg" alt="Hybrid Parallel Search" width="100%"/>
BM25 (8 strategies + synonym expansion) and vector search run independently in parallel, then fuse via intent-adaptive weights with IDF-weighted content reranking. Optional LLM judge reranker pushes accuracy to 100%. Fully local by default.
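A minimal sketch of the fusion step, assuming per-retriever scores are already normalized to [0, 1]. The weights mirror the `search_weights` defaults in the configuration below; the formula itself is an illustration, not the exact implementation:

```typescript
// Weighted score fusion across retrievers. The real fusion is intent-adaptive
// and adds IDF-weighted content reranking on top of this.
type Scores = Map<string, number>; // observation id -> normalized score

function fuse(
  bm25: Scores,
  vector: Scores,
  weights = { bm25: 0.45, vector: 0.35 }, // defaults from .context-mem.json
): [string, number][] {
  const fused = new Map<string, number>();
  for (const [id, s] of bm25) fused.set(id, (fused.get(id) ?? 0) + weights.bm25 * s);
  for (const [id, s] of vector) fused.set(id, (fused.get(id) ?? 0) + weights.vector * s);
  return [...fused.entries()].sort((a, b) => b[1] - a[1]);
}

const ranked = fuse(
  new Map([["obs-12", 0.9], ["obs-7", 0.4]]),
  new Map([["obs-7", 0.8], ["obs-3", 0.6]]),
);
console.log(ranked[0]); // ["obs-7", ~0.46]: strong in both retrievers wins
```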
## Core Features
| Capability | Description |
|---|---|
| Importance Scoring | Every observation scored 0.0–1.0 with 6 significance flags: DECISION, ORIGIN, PIVOT, CORE, MILESTONE, PROBLEM. Auto-pin for decisions and milestones. |
| Verbatim Recall | Surface original content (not summaries) via recall tool. Dedicated FTS5 index. Importance, type, time, and flag filters. |
| Adaptive Compression | 4-tier progressive: verbatim (0-7d) → light (7-30d) → medium (30-90d) → distilled (90d+). Pinned entries stay verbatim forever. (Sketched in code after this table.) |
| Entity Intelligence | Auto-detect technologies, people, file paths, CamelCase, ALL_CAPS. 100+ aliases (React.js → React). Knowledge graph storage. |
| Temporal Facts | valid_from/valid_to on knowledge. Supersession chains. temporal_query: "what was true about X at time T?" |
| Wake-Up Primer | Token-budgeted context at session start. 4 layers: profile (15%), critical knowledge (40%), decisions (30%), entities (15%). |
| Decision Trails | Evidence chain reconstruction. explain_decision walks events backward: file reads → errors → searches → decision. |
| Session Narratives | 4 templates: PR description, standup update, ADR, onboarding guide. CLI: context-mem story --format pr. |
| Hybrid Search | BM25 (8 strategies + synonym expansion) + vector (nomic-embed 768-dim) parallel fusion. Optional LLM judge reranker. Sub-millisecond. |
| Temporal Resolver | Deterministic date parsing for relative time queries ("3 days ago", "last Saturday"). Zero LLM cost. |
| Per-Prompt Injection | UserPromptSubmit hook auto-injects relevant memories on every user message. Rate-limited, topic-deduplicated. |
| Knowledge Graph | Entity-relationship model: files, modules, patterns, decisions, bugs, people, libraries, services, APIs, configs. |
| Multi-Agent | Register, claim files, check status, broadcast. Shared memory prevents duplicate work and merge conflicts. |
| Privacy Engine | Fully local. <private> tag stripping, custom regex, 9 secret detectors. No telemetry, no cloud. |
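The compression schedule above is deterministic enough to sketch directly from the table; the function name and signature here are assumptions, not context-mem's actual API:

```typescript
// 4-tier compression schedule from the Adaptive Compression row above.
// Illustrative only: the real system applies it during background maintenance.
type Tier = "verbatim" | "light" | "medium" | "distilled";

function compressionTier(ageDays: number, pinned: boolean): Tier {
  if (pinned) return "verbatim"; // pinned entries never compress
  if (ageDays < 7) return "verbatim";
  if (ageDays < 30) return "light";
  if (ageDays < 90) return "medium";
  return "distilled";
}

console.log(compressionTier(45, false)); // "medium"
console.log(compressionTier(45, true));  // "verbatim"
```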
## Intelligence Dashboard

Real-time web UI with 6 pages; launch it with `context-mem dashboard`:
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/screenshots/dashboard-hero.png" alt="Dashboard — Intelligence Overview" width="100%"/>
<details> <summary>More dashboard pages</summary>
Knowledge Graph — force-directed entity visualization with type filtering and depth control:
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/screenshots/dashboard-graph-page.png" alt="Dashboard — Knowledge Graph" width="100%"/>
Topics — topic cloud with observation counts and cross-project tunnels:
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/screenshots/dashboard-topics.png" alt="Dashboard — Topics" width="100%"/>
Timeline — chronological observations with importance badges, flags, and verbatim mode:
<img src="https://raw.githubusercontent.com/JubaKitiashvili/context-mem/main/docs/screenshots/dashboard-timeline.png" alt="Dashboard — Timeline" width="100%"/>
</details>
## How It Compares

| | Context Mem v3.2 | MemPalace | claude-mem |
|---|---|---|---|
| Retrieval Recall | 98%+ session recall (4 benchmarks) | 96.6% LME, 60.3% LoCoMo | Not benchmarked |
| Token Savings | 99% (benchmarked) | 0% (stores everything) | ~95% (claimed) |
| Search | BM25 (8 strategies) + Vector + LLM Judge | ChromaDB | Basic recall |
| Entity Intelligence | Auto-detect + 100 aliases + graph | No | No |
| Importance Scoring | 0.0-1.0 with 6 significance flags | No | No |
| Decision Trails | Evidence chain reconstruction | No | No |
| Session Narratives | PR/Standup/ADR/Onboarding | No | No |
| Cross-Project Memory | Global store + topic tunnels | No | No |
| LLM Dependency | Optional (free by default) | 100% LME requires paid API | Required (~$57/mo) |
| Privacy | Fully local, 9 secret detectors | Local | Local |
| License | MIT | Proprietary | AGPL-3.0 |
## Performance

All operations are sub-millisecond with zero LLM dependency:
| Operation | Speed | Latency |
|---|---|---|
| Importance Classification | 556K ops/s | 0.002ms |
| Entity Extraction | 179K ops/s | 0.006ms |
| Topic Detection | 162K ops/s | 0.006ms |
| Compression Tier Calc | 3M ops/s | <0.001ms |
| Verbatim FTS Search | 50K ops/s | 0.020ms |
| BM25 Search | 3.3K ops/s | 0.3ms |
| Wake-Up Primer Assembly | 9K ops/s | 0.111ms |
| Narrative Generation | 6K ops/s | 0.164ms |
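For a sense of how numbers like these are produced, a generic throughput loop looks roughly like this (illustrative only, not the project's actual benchmark harness):

```typescript
// Measure ops/s by timing N iterations of a synchronous operation.
function opsPerSecond(fn: () => void, iterations = 100_000): number {
  const start = performance.now(); // global in Node 16+
  for (let i = 0; i < iterations; i++) fn();
  const elapsedSec = (performance.now() - start) / 1000;
  return Math.round(iterations / elapsedSec);
}

// Example: time a trivial JSON serialization.
console.log(opsPerSecond(() => JSON.stringify({ a: 1 })));
```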
## MCP Tools (44)
<details> <summary>Click to see all 44 tools</summary>
| Tool | Description |
|---|---|
| **Core** | |
| `observe` | Store observation with auto-summarization + importance scoring |
| `search` | Hybrid search with optional verbatim mode |
| `get` | Retrieve full observation by ID |
| `timeline` | Reverse-chronological list with importance badges |
| `stats` | Token economics for current session |
| `summarize` | Summarize content without storing |
| `configure` | Update runtime configuration |
| `execute` | Run code (JS, TS, Python, Shell, Ruby, Go, Rust, PHP, Perl, R, Elixir) |
| **Content** | |
| `index_content` | Index with code-aware chunking |
| `search_content` | Search indexed chunks |
| **Knowledge** | |
| `save_knowledge` | Save with contradiction detection + temporal validity |
| `search_knowledge` | Search (filters superseded by default) |
| `promote_knowledge` | Promote to global cross-project store |
| `global_search` | Search across all projects |
| `resolve_contradiction` | Resolve conflicts (supersede/merge/keep/archive) |
| `merge_suggestions` | View cross-project duplicate suggestions |
| **Graph** | |
| `graph_query` | Traverse entity relationships |
| `add_relationship` | Link entities |
| `graph_neighbors` | Find connected entities |
| **Session** | |
| `update_profile` | Project profile |
| `budget_status` / `budget_configure` | Token budget management |
| `restore_session` | Restore from snapshot |
| `handoff_session` | Cross-session continuity |
| **Events** | |
| `emit_event` / `query_events` | P1-P4 event tracking |
| **Agents** | |
| `agent_register` / `agent_status` / `claim_files` / `agent_broadcast` | Multi-agent coordination |
| **Intelligence** | |
| `time_travel` | Compare project state at any point in time |
| `ask` | Natural language question answering |
| **Total Recall** | |
| `recall` | Verbatim memory retrieval with importance/flag/time filters |
| `wake_up` | Generate scored session primer (4-layer context) |
| `entity_detect` | Extract entities from text |
| `list_people` | Person entities with relationship counts |
| `temporal_query` | Knowledge valid at specific timestamp |
| `browse` | Navigate by topic, person, or time |
| `list_topics` | Topic list with observation counts |
| `find_tunnels` | Cross-project topic bridges |
| `import_conversations` | Import ChatGPT/Claude/Slack/text conversations |
| `explain_decision` | Decision trail evidence chain |
| `generate_story` | Narrative (PR/standup/ADR/onboarding) |
| `predict_loss` | Memory pressure prediction |
</details>
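Because these are standard MCP tools, any MCP client can invoke them with a `tools/call` request. The argument names below are assumptions; the server's `tools/list` response is authoritative:

```typescript
// Wire shape of a JSON-RPC 2.0 tools/call request for the observe tool.
const request = {
  jsonrpc: "2.0" as const,
  id: 1,
  method: "tools/call",
  params: {
    name: "observe",
    arguments: {
      content: "Chose Postgres over MySQL for JSONB support", // assumed parameter name
    },
  },
};
console.log(JSON.stringify(request, null, 2));
```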
## CLI Commands

```bash
context-mem init                    # Initialize in current project
context-mem serve                   # Start MCP server (stdio)
context-mem status                  # Show database stats
context-mem doctor                  # Run health checks
context-mem dashboard               # Open web dashboard (6 pages)
context-mem why <query>             # Decision trail — why was X decided?
context-mem story --format pr       # Generate narrative (pr/standup/adr/onboarding)
context-mem import-convos <path>    # Import conversations (auto-detect format)
context-mem export                  # Export as JSON
context-mem import                  # Import from JSON
context-mem plugin add|remove|list  # Manage summarizer plugins
```
## Configuration
<details> <summary>.context-mem.json</summary>
```json
{
  "storage": "auto",
  "plugins": {
    "summarizers": ["shell", "json", "error", "log", "code"],
    "search": ["bm25", "trigram", "vector"],
    "runtimes": ["javascript", "python"]
  },
  "search_weights": { "bm25": 0.45, "trigram": 0.15, "levenshtein": 0.05, "vector": 0.35 },
  "privacy": { "strip_tags": true, "redact_patterns": [] },
  "lifecycle": { "ttl_days": 30, "max_db_size_mb": 500, "max_observations": 50000 },
  "ai_curation": { "enabled": false, "provider": "auto" }
}
```
</details>
## Platform Support
| Platform | Auto-Setup |
|---|---|
| Claude Code, Cursor, Windsurf, VS Code/Copilot, Cline, Roo Code | context-mem init |
| Gemini CLI, Antigravity, Goose, OpenClaw, CrewAI, LangChain | See `configs/` |
## Documentation
| Doc | Description |
|---|---|
| Benchmark Results | Compression + retrieval benchmarks |
| Contributing | How to contribute |
## License
MIT — Juba Kitiashvili
<div align="center">

### Get Started

```bash
npm i context-mem && npx context-mem init
```

Read the Docs · View Benchmarks · Report a Bug · Contributing

Context Mem v3.2 — 98%+ retrieval recall on every benchmark. Your AI never forgets.

</div>