<img src="assets/muninn_banner.jpeg" alt="Muninn — Persistent Memory MCP" width="100%"/>
# Muninn

> "Muninn flies each day over the world to bring Odin knowledge of what happens." — Prose Edda
Local-first persistent memory infrastructure for coding agents and MCP-compatible tools.
Muninn provides deterministic, explainable memory retrieval with robust transport behavior and production-grade operational controls. Designed for long-running development workflows where continuity, auditability, and measurable quality matter — across sessions, across assistants, and across projects.
## 🚩 Status

**Current Version:** v3.24.0 (Phase 26 COMPLETE) · **Stability:** Production Beta · **Test Suite:** 1422+ passing, 0 failing
### What's New in v3.24.0
- Cognitive Architecture (CoALA): Integration of a proactive reasoning loop bridging memory with active decision-making.
- Knowledge Distillation: Background synthesis of episodic memories into structured semantic manuals for long-term wisdom.
- Epistemic Foraging: Active inference-driven search to resolve ambiguities and fill information gaps autonomously.
- Omission Filtering: Automated detection of missing context required for successful task execution.
- Elo-Rated SNIPS Governance: Dynamic memory retention system mapping retrieval success to Elo ratings for usage-driven decay.
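The Elo-rated retention idea can be illustrated with the standard Elo update rule: each retrieval is treated as a "match", and a memory's rating rises when it contributes to a successful outcome and decays when it does not. The sketch below is illustrative only — the function name, K-factor, and seed rating are assumptions, not Muninn's actual implementation.

```python
def elo_update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Standard Elo update: move rating toward actual outcome minus expected score."""
    expected = 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))
    return rating + k * ((1.0 if won else 0.0) - expected)

# A memory that keeps "winning" retrievals gains rating; ignored or unhelpful
# memories drift downward until a retention threshold prunes or shadows them.
memory_rating = 1200.0
memory_rating = elo_update(memory_rating, opponent=1200.0, won=True)
print(memory_rating)  # 1216.0
```

Against an equally rated opponent the expected score is 0.5, so a win adds exactly half the K-factor.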
### Previous Milestones
| Version | Phase | Key Feature |
|---|---|---|
| v3.24.0 | 26 | Cognitive Architecture Complete |
| v3.23.0 | 23 | Elo-Rated SNIPS Governance |
| v3.22.0 | 22 | Temporal Knowledge Graph |
| v3.19.0 | 20 | Multimodal Hive Mind Operations |
| v3.18.3 | 19 | Bulk legacy import, NLI conflict detection, uncapped discovery |
| v3.18.1 | 19 | Scout synthesis, hunt mode |
## 🚀 Features

### Core Memory Engine
- Local-First: Zero cloud dependency — all data stays on your machine
- Multimodal: Native support for Text, Image, Audio, Video, and Sensor data
- 5-Signal Hybrid Retrieval: Dense vector · BM25 lexical · Graph traversal · Temporal relevance · Goal relevance
- Explainable Recall Traces: Per-signal score attribution on every search result
- Bi-Temporal Reasoning: Support for "Valid Time" vs "Transaction Time" via Temporal Knowledge Graph
- Project Isolation: `scope="project"` memories never cross repo boundaries; `scope="global"` memories are always available
- Cross-Session Continuity: Memories survive session ends, assistant switches, and tool restarts
- Bi-Temporal Records: `created_at` (real-world event time) vs `ingested_at` (system intake time)
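The 5-signal hybrid score and per-signal recall trace can be pictured as a weighted fusion of the five retrieval signals. The weights, field names, and linear combination below are purely illustrative — Muninn's real scoring is adaptively calibrated, and this sketch only shows why a per-signal trace makes each result explainable.

```python
from dataclasses import dataclass

@dataclass
class RecallTrace:
    dense: float      # dense vector similarity
    bm25: float       # BM25 lexical match
    graph: float      # graph-traversal proximity
    temporal: float   # temporal relevance
    goal: float       # alignment with the active project goal

# Hypothetical static weights — the actual system calibrates these from feedback.
WEIGHTS = {"dense": 0.4, "bm25": 0.25, "graph": 0.15, "temporal": 0.1, "goal": 0.1}

def hybrid_score(trace: RecallTrace) -> float:
    """Fuse per-signal scores into one ranking score; the trace itself is kept
    alongside the result so users can see which signal drove the match."""
    return sum(WEIGHTS[name] * getattr(trace, name) for name in WEIGHTS)

trace = RecallTrace(dense=0.9, bm25=0.6, graph=0.3, temporal=0.8, goal=0.5)
print(round(hybrid_score(trace), 3))  # 0.685
```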
### Memory Lifecycle
- Elo-Rated Governance: Dynamic retention driven by retrieval feedback (SNIPS) and usage statistics
- Consolidation Daemon: Background process for decay, deduplication, promotion, and shadowing — inspired by sleep consolidation
- Zero-Trust Ingestion: Isolated subprocess parsing for PDF/DOCX to neutralize document-based exploits
- ColBERT Multi-Vector: Native Qdrant multi-vector storage for MaxSim scoring
- NL Temporal Query Expansion: Natural-language time phrases ("last week", "before the refactor") parsed into structured time ranges
- Goal Compass: Retrieval signal for project objectives and constraint drift
- NLI Conflict Detection: Transformer-based contradiction detection (`cross-encoder/nli-deberta-v3-small`) for memory integrity
- Bulk Legacy Import: One-click ingestion of all discovered legacy sources (batched, error-isolated) via dashboard or API
### Operational Controls
- MCP Transport Hardening: Framed + line JSON-RPC, timeout-window guardrails, protocol negotiation
- Runtime Profile Control: `get_model_profiles` / `set_model_profiles` for dynamic model routing
- Profile Audit Log: Immutable event ledger for profile policy mutations
- Browser Control Center: Web UI for search, ingestion, consolidation, and admin at `http://localhost:42069`
- OpenTelemetry: GenAI semantic convention tracing (feature-gated via `MUNINN_OTEL_ENABLED`)
### Multi-Assistant Interop
- Handoff Bundles: Export/import memory checkpoints with checksum verification and idempotent replay
- Legacy Migration: Discover and import memories from prior assistant sessions (JSONL chat history, SQLite state) — uncapped provider limits
- Bulk Import: `POST /ingest/legacy/import-all` ingests all discovered sources in batches of 50 with per-batch error isolation
- Hive Mind Federation: Push-based low-latency memory synchronization across assistant runtimes
- MCP 2025-11 Compliant: Full protocol negotiation, lifecycle gating, schema annotations
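The checksum verification and idempotent replay behavior of handoff bundles can be sketched in a few lines. The bundle format, function names, and in-memory store below are assumptions for illustration — the real bundle schema lives in the handoff implementation.

```python
import hashlib
import json

def bundle_checksum(memories: list[dict]) -> str:
    """SHA-256 over a canonical JSON serialization (hypothetical bundle format)."""
    canonical = json.dumps(memories, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def import_bundle(store: dict, memories: list[dict], claimed: str) -> int:
    """Verify the claimed checksum, then import only memories not already present."""
    if bundle_checksum(memories) != claimed:
        raise ValueError("checksum mismatch: bundle corrupted or tampered")
    imported = 0
    for m in memories:
        if m["id"] not in store:   # idempotent: replaying a bundle is a no-op
            store[m["id"]] = m
            imported += 1
    return imported

memories = [{"id": "m1", "content": "use typed Pydantic models"}]
claimed = bundle_checksum(memories)
store: dict = {}
print(import_bundle(store, memories, claimed))  # 1
print(import_bundle(store, memories, claimed))  # 0 (replay imports nothing)
```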
## Quick Start

```bash
git clone https://github.com/wjohns989/Muninn.git
cd Muninn
pip install -e .
```
Set the auth token (shared between server and MCP wrapper):
```bash
# Windows (persists across sessions)
setx MUNINN_AUTH_TOKEN "your-token-here"

# Linux/macOS
export MUNINN_AUTH_TOKEN="your-token-here"
```
Start the backend:
```bash
python server.py
```
Verify it's running:
```bash
curl http://localhost:42069/health
# {"status":"ok","memory_count":0,...,"backend":"muninn-native"}
```
## Runtime Modes

| Mode | Command | Description |
|---|---|---|
| Muninn MCP | `python mcp_wrapper.py` | stdio MCP server for active assistant/IDE sessions |
| Huginn Standalone | `python muninn_standalone.py` | Browser-first UX for direct ingestion/search/admin |
| REST API | `python server.py` | FastAPI backend at `http://localhost:42069` |
| Packaged App | `python scripts/build_standalone.py` | PyInstaller executable (Huginn Control Center) |
All modes use the same memory engine and data directory.
## MCP Client Configuration
Claude Code (recommended — bakes auth token into registration):
```bash
claude mcp add -s user muninn \
  -e MUNINN_AUTH_TOKEN="your-token-here" \
  -- python /absolute/path/to/mcp_wrapper.py
```
Generic MCP client (`claude_desktop_config.json` or equivalent):
```json
{
  "mcpServers": {
    "muninn": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_wrapper.py"],
      "env": {
        "MUNINN_AUTH_TOKEN": "your-token-here"
      }
    }
  }
}
```
Important: Both `server.py` and `mcp_wrapper.py` must share the same `MUNINN_AUTH_TOKEN`. If either process generates a random token (when the env var is unset), all MCP tool calls fail with 401.
## MCP Tools

| Tool | Description |
|---|---|
| `add_memory` | Store a memory with optional scope, project, namespace, media_type |
| `search_memory` | Hybrid 5-signal search with media_type filtering and recall traces |
| `get_all_memories` | Paginated memory listing with filters |
| `update_memory` | Update content or metadata of an existing memory |
| `delete_memory` | Remove a memory by ID |
| `set_project_goal` | Set the current project's objective and constraints |
| `get_project_goal` | Retrieve the active project goal |
| `set_project_instruction` | Store a project-scoped rule (scope="project" by default) |
| `get_model_profiles` | Get active model routing profiles |
| `set_model_profiles` | Update model routing profiles |
| `get_model_profile_events` | Audit log for profile policy changes |
| `export_handoff` | Export a memory handoff bundle |
| `import_handoff` | Import a handoff bundle (idempotent) |
| `ingest_sources` | Ingest files/folders into memory |
| `discover_legacy_sources` | Find prior assistant session files for migration |
| `ingest_legacy_sources` | Import discovered legacy memories |
| `record_retrieval_feedback` | Submit outcome signal for adaptive calibration |
## Python SDK

```python
from muninn import Memory

# Sync client
client = Memory(base_url="http://127.0.0.1:42069", auth_token="your-token-here")

client.add(
    content="Always use typed Pydantic models for API payloads",
    metadata={"project": "muninn", "scope": "project"}
)

results = client.search("API payload patterns", limit=5)
for r in results:
    print(r.content, r.recall_trace)
```
Async client:

```python
import asyncio

from muninn import AsyncMemory

async def main():
    async with AsyncMemory(base_url="http://127.0.0.1:42069", auth_token="your-token-here") as client:
        await client.add(content="...", metadata={})
        results = await client.search("...", limit=5)

asyncio.run(main())
```
## REST API

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Server health + memory/vector/graph counts |
| POST | `/add` | Add a memory (supports media_type) |
| POST | `/search` | Hybrid search (supports media_type filtering) |
| GET | `/get_all` | Paginated memory listing |
| PUT | `/update` | Update a memory |
| DELETE | `/delete/{memory_id}` | Delete a memory |
| POST | `/ingest` | Ingest files/folders |
| POST | `/ingest/legacy/discover` | Discover legacy session files |
| POST | `/ingest/legacy/import` | Import selected legacy memories |
| POST | `/ingest/legacy/import-all` | Discover and import ALL legacy sources (batched) |
| GET | `/ingest/legacy/status` | Legacy discovery scheduler status |
| GET | `/ingest/legacy/catalog` | Paginated cached catalog of discovered sources |
| GET | `/profiles/model` | Get model routing profiles |
| POST | `/profiles/model` | Set model routing profiles |
| GET | `/profiles/model/events` | Profile audit log |
| GET | `/profile/user/get` | Get user profile |
| POST | `/profile/user/set` | Update user profile |
| POST | `/handoff/export` | Export handoff bundle |
| POST | `/handoff/import` | Import handoff bundle |
| POST | `/feedback/retrieval` | Submit retrieval feedback |
| GET | `/goal/get` | Get project goal |
| POST | `/goal/set` | Set project goal |
Auth: `Authorization: Bearer <MUNINN_AUTH_TOKEN>` required on all non-health endpoints.
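A minimal stdlib-only sketch of calling the API with Bearer auth, using the `/add` and `/search` endpoints from the table above. The payload field names are assumptions inferred from the SDK example — consult the server's schema for the exact request shapes.

```python
import json
import os
import urllib.request

BASE = "http://localhost:42069"
TOKEN = os.environ.get("MUNINN_AUTH_TOKEN", "")
HEADERS = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Illustrative payloads — field names are assumptions, not the canonical schema.
add_payload = {"content": "Prefer typed Pydantic models", "metadata": {"project": "muninn"}}
search_payload = {"query": "API payload patterns", "limit": 5}

def post(path: str, payload: dict) -> dict:
    """POST a JSON payload with the shared auth token and decode the response."""
    req = urllib.request.Request(
        f"{BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers=HEADERS,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())

if TOKEN:  # only hit the network when a token is actually configured
    post("/add", add_payload)
    results = post("/search", search_payload)
```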
## Configuration

Key environment variables:

| Variable | Default | Description |
|---|---|---|
| `MUNINN_AUTH_TOKEN` | random | Shared secret between server and MCP wrapper |
| `MUNINN_SERVER_URL` | `http://localhost:42069` | Backend URL for MCP wrapper |
| `MUNINN_PROJECT_SCOPE_STRICT` | off | `=1` disables cross-project fallback entirely |
| `MUNINN_MCP_SEARCH_PROJECT_FALLBACK` | off | `=1` enables global-scope fallback on empty results |
| `MUNINN_OPERATOR_MODEL_PROFILE` | `balanced` | Default model routing profile |
| `MUNINN_OTEL_ENABLED` | off | `=1` enables OpenTelemetry tracing |
| `MUNINN_OTEL_ENDPOINT` | `http://localhost:4318` | OTLP HTTP endpoint for trace export |
| `MUNINN_CHAINS_ENABLED` | off | `=1` enables graph memory chain detection (PRECEDES/CAUSES edges) |
| `MUNINN_COLBERT_MULTIVEC` | off | `=1` enables native ColBERT multi-vector storage |
| `MUNINN_FEDERATION_ENABLED` | off | `=1` enables P2P memory synchronization |
| `MUNINN_FEDERATION_PEERS` | - | Comma-separated list of peer base URLs |
| `MUNINN_FEDERATION_SYNC_ON_ADD` | off | `=1` enables real-time push-on-add to peers |
| `MUNINN_TEMPORAL_QUERY_EXPANSION` | off | `=1` enables NL time-phrase parsing in search |
## Evaluation & Quality Gates

Muninn includes an evaluation toolchain for measurable quality enforcement:

```bash
# Run full benchmark dev-cycle
python -m eval.ollama_local_benchmark dev-cycle

# Check phase hygiene gates
python -m eval.phase_hygiene

# Emit SOTA+ signed verdict artifact
python -m eval.ollama_local_benchmark sota-verdict \
  --longmemeval-report path/to/lme_report.json \
  --min-longmemeval-ndcg 0.60 \
  --min-longmemeval-recall 0.65 \
  --signing-key "$SOTA_SIGNING_KEY"

# Run LongMemEval adapter selftest (no server needed)
python eval/longmemeval_adapter.py --selftest

# Run StructMemEval adapter selftest (no server needed)
python eval/structmemeval_adapter.py --selftest

# Run StructMemEval against a live server
python eval/structmemeval_adapter.py \
  --dataset path/to/structmemeval.jsonl \
  --server-url http://localhost:42069 \
  --auth-token "$MUNINN_AUTH_TOKEN"
```
Metrics tracked: nDCG@k, Recall@k, MRR@k, Exact Match, token-F1, p50/p95 latency, significance testing (Bonferroni/BH correction), effect-size analysis.
The `sota-verdict` command emits a signed JSON artifact with `commit_sha`, SHA256 file hashes, and an HMAC-SHA256 `promotion_signature`, enabling auditable, commit-pinned SOTA+ evidence.
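Among the tracked metrics, nDCG@k is the least self-explanatory; a minimal reference implementation shows what the gates above are measuring. This is the textbook formula (DCG with a log2 position discount, normalized by the ideal ranking), not Muninn's eval code.

```python
import math

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
    `relevances` is the graded relevance of each result in ranked order."""
    def dcg(rels: list[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Ranking the one relevant result second instead of first costs ~37% at k=2.
print(round(ndcg_at_k([0, 1], k=2), 3))  # 0.631
print(ndcg_at_k([1, 0], k=2))            # 1.0
```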
## Data & Security

- Default data dir: `~/.local/share/AntigravityLabs/muninn/` (Linux/macOS) · `%LOCALAPPDATA%\AntigravityLabs\muninn\` (Windows)
- Storage: SQLite (metadata) + Qdrant (vectors) + KuzuDB (memory chains graph)
- No cloud dependency: All data local by default
- Auth: Bearer token required on all API calls; token shared via env var
- Namespace isolation: `user_id` + `namespace` + `project` boundaries enforced at every retrieval layer
## Documentation Index

| Document | Description |
|---|---|
| `SOTA_PLUS_PLAN.md` | Active development phases and roadmap |
| `HANDOFF.md` | Operational setup, auth flow, known issues |
| `docs/ARCHITECTURE.md` | System architecture deep-dive |
| `docs/MUNINN_COMPREHENSIVE_ROADMAP.md` | Full feature roadmap (v3.1→v3.3+) |
| `docs/AGENT_CONTINUATION_RUNBOOK.md` | How to resume development across sessions |
| `docs/PYTHON_SDK.md` | Python SDK reference |
| `docs/INGESTION_PIPELINE.md` | Ingestion pipeline internals |
| `docs/OTEL_GENAI_OBSERVABILITY.md` | OpenTelemetry integration guide |
| `docs/PLAN_GAP_EVALUATION.md` | Gap analysis against SOTA memory systems |
## Licensing

- Code: Apache License 2.0 (`LICENSE`)
- Third-party dependency licenses remain with their respective owners
- Attribution: See `NOTICE`