
<img src="assets/muninn_banner.jpeg" alt="Muninn — Persistent Memory MCP" width="100%"/>

Muninn

"Muninn flies each day over the world to bring Odin knowledge of what happens." — Prose Edda

Local-first persistent memory infrastructure for coding agents and MCP-compatible tools.

Muninn provides deterministic, explainable memory retrieval with robust transport behavior and production-grade operational controls. Designed for long-running development workflows where continuity, auditability, and measurable quality matter — across sessions, across assistants, and across projects.


🚩 Status

Current Version: v3.24.0 (Phase 26 COMPLETE) · Stability: Production Beta · Test Suite: 1422+ passing, 0 failing

What's New in v3.24.0

  • Cognitive Architecture (CoALA): Integration of a proactive reasoning loop bridging memory with active decision-making.
  • Knowledge Distillation: Background synthesis of episodic memories into structured semantic manuals for long-term wisdom.
  • Epistemic Foraging: Active inference-driven search to resolve ambiguities and fill information gaps autonomously.
  • Omission Filtering: Automated detection of missing context required for successful task execution.
  • Elo-Rated SNIPS Governance: Dynamic memory retention system mapping retrieval success to Elo ratings for usage-driven decay.
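The Elo-rated governance idea above can be sketched in a few lines. Note that the K-factor, baseline rating, and the "win against a fixed baseline opponent" framing here are illustrative assumptions for demonstration, not Muninn's actual parameters:

```python
# Illustrative sketch of Elo-style memory governance (parameters are assumptions).
# Each memory holds a rating; a successful retrieval counts as a "win" and
# low-rated memories become the first candidates for usage-driven decay.

K = 24           # assumed update step (K-factor)
BASELINE = 1200  # assumed starting rating for new memories

def expected_score(rating: float, opponent: float = BASELINE) -> float:
    """Standard Elo expectation of a win against an opponent rating."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))

def update_rating(rating: float, retrieval_succeeded: bool) -> float:
    """Move the rating toward wins (useful retrievals) or losses (misses)."""
    outcome = 1.0 if retrieval_succeeded else 0.0
    return rating + K * (outcome - expected_score(rating))

rating = BASELINE
for ok in [True, True, False, True]:
    rating = update_rating(rating, ok)
```

Under a scheme like this, a memory that keeps winning retrievals drifts above the baseline, so retention can simply evict the lowest-rated entries first.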

Previous Milestones

| Version | Phase | Key Feature |
|---------|-------|-------------|
| v3.24.0 | 26 | Cognitive Architecture Complete |
| v3.23.0 | 23 | Elo-Rated SNIPS Governance |
| v3.22.0 | 22 | Temporal Knowledge Graph |
| v3.19.0 | 20 | Multimodal Hive Mind Operations |
| v3.18.3 | 19 | Bulk legacy import, NLI conflict detection, uncapped discovery |
| v3.18.1 | 19 | Scout synthesis, hunt mode |

🚀 Features

Core Memory Engine

  • Local-First: Zero cloud dependency — all data stays on your machine
  • Multimodal: Native support for Text, Image, Audio, Video, and Sensor data
  • 5-Signal Hybrid Retrieval: Dense vector · BM25 lexical · Graph traversal · Temporal relevance · Goal relevance
  • Explainable Recall Traces: Per-signal score attribution on every search result
  • Bi-Temporal Reasoning: Support for "Valid Time" vs "Transaction Time" via Temporal Knowledge Graph
  • Project Isolation: scope="project" memories never cross repo boundaries; scope="global" memories are always available
  • Cross-Session Continuity: Memories survive session ends, assistant switches, and tool restarts
  • Bi-Temporal Records: created_at (real-world event time) vs ingested_at (system intake time)
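The 5-signal retrieval with per-signal recall traces described above can be sketched as a weighted score fusion. The weights and trace field names below are illustrative assumptions, not Muninn's actual fusion formula:

```python
# Illustrative sketch: fuse five retrieval signals into one score while
# keeping per-signal attribution (the "recall trace"). Weights are assumptions.

SIGNAL_WEIGHTS = {
    "dense": 0.35,      # dense vector similarity
    "bm25": 0.25,       # lexical match
    "graph": 0.15,      # graph traversal proximity
    "temporal": 0.15,   # temporal relevance
    "goal": 0.10,       # goal relevance
}

def fuse(signals: dict) -> dict:
    """Return the combined score plus the per-signal contributions."""
    trace = {name: SIGNAL_WEIGHTS[name] * signals.get(name, 0.0)
             for name in SIGNAL_WEIGHTS}
    return {"score": sum(trace.values()), "recall_trace": trace}

result = fuse({"dense": 0.9, "bm25": 0.6, "graph": 0.2, "temporal": 0.8, "goal": 0.5})
```

Keeping the per-signal contributions alongside the final score is what makes a result explainable: you can see whether a hit came from lexical overlap, graph proximity, or goal alignment.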

Memory Lifecycle

  • Elo-Rated Governance: Dynamic retention driven by retrieval feedback (SNIPS) and usage statistics
  • Consolidation Daemon: Background process for decay, deduplication, promotion, and shadowing — inspired by sleep consolidation
  • Zero-Trust Ingestion: Isolated subprocess parsing for PDF/DOCX to neutralize document-based exploits
  • ColBERT Multi-Vector: Native Qdrant multi-vector storage for MaxSim scoring
  • NL Temporal Query Expansion: Natural-language time phrases ("last week", "before the refactor") parsed into structured time ranges
  • Goal Compass: Retrieval signal for project objectives and constraint drift
  • NLI Conflict Detection: Transformer-based contradiction detection (cross-encoder/nli-deberta-v3-small) for memory integrity
  • Bulk Legacy Import: One-click ingestion of all discovered legacy sources (batched, error-isolated) via dashboard or API
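NL temporal query expansion, as in the list above, maps a time phrase to a structured range. A minimal sketch follows; the phrase table and function name are assumptions for demonstration, and Muninn's real parser handles far richer phrases:

```python
# Illustrative sketch of NL temporal query expansion: translate a phrase like
# "last week" into a concrete (start, end) range. Phrase table is an assumption.
from datetime import datetime, timedelta

def expand_time_phrase(phrase: str, now: datetime) -> tuple:
    """Translate a natural-language time phrase into a concrete time range."""
    if phrase == "last week":
        return now - timedelta(days=7), now
    if phrase == "yesterday":
        start = (now - timedelta(days=1)).replace(hour=0, minute=0,
                                                  second=0, microsecond=0)
        return start, start + timedelta(days=1)
    raise ValueError(f"unrecognized phrase: {phrase!r}")

now = datetime(2026, 1, 15, 12, 0)
start, end = expand_time_phrase("last week", now)
```

Once the phrase is resolved to a range, it can feed the temporal relevance signal as an ordinary structured filter.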

Operational Controls

  • MCP Transport Hardening: Framed + line JSON-RPC, timeout-window guardrails, protocol negotiation
  • Runtime Profile Control: get_model_profiles / set_model_profiles for dynamic model routing
  • Profile Audit Log: Immutable event ledger for profile policy mutations
  • Browser Control Center: Web UI for search, ingestion, consolidation, and admin at http://localhost:42069
  • OpenTelemetry: GenAI semantic convention tracing (feature-gated via MUNINN_OTEL_ENABLED)
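An immutable event ledger of the kind used for profile policy mutations can be sketched as a hash-chained append-only log, where each entry commits to its predecessor. This is a concept sketch only, not Muninn's actual storage format:

```python
# Illustrative hash-chained append-only ledger: every event commits to the
# previous entry's hash, so any silent edit breaks the chain on verification.
import hashlib
import json

def append_event(ledger: list, event: dict) -> None:
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    ledger.append({"prev": prev_hash, "event": event,
                   "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_chain(ledger: list) -> bool:
    prev_hash = "0" * 64
    for entry in ledger:
        body = json.dumps({"prev": prev_hash, "event": entry["event"]},
                          sort_keys=True)
        if entry["prev"] != prev_hash or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True

ledger: list = []
append_event(ledger, {"op": "set_model_profiles", "profile": "balanced"})
append_event(ledger, {"op": "set_model_profiles", "profile": "fast"})
```

Verification walks the chain from the genesis hash, so tampering with any earlier entry invalidates every entry after it.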

Multi-Assistant Interop

  • Handoff Bundles: Export/import memory checkpoints with checksum verification and idempotent replay
  • Legacy Migration: Discover and import memories from prior assistant sessions (JSONL chat history, SQLite state) — uncapped provider limits
  • Bulk Import: POST /ingest/legacy/import-all ingests all discovered sources in batches of 50 with per-batch error isolation
  • Hive Mind Federation: Push-based low-latency memory synchronization across assistant runtimes
  • MCP 2025-11 Compliant: Full protocol negotiation, lifecycle gating, schema annotations
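Checksum verification and idempotent replay, as used by handoff bundles, can be sketched as follows. The bundle layout and function names here are assumptions; only the SHA-256 checksum and replay-as-no-op behavior come from the feature description:

```python
# Illustrative sketch of handoff-bundle verification and idempotent replay:
# verify a checksum over the serialized bundle, then skip already-seen IDs.
import hashlib
import json

def bundle_checksum(memories: list) -> str:
    payload = json.dumps(memories, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def import_bundle(store: dict, memories: list, checksum: str) -> int:
    """Import into `store`, returning how many new memories were added."""
    if bundle_checksum(memories) != checksum:
        raise ValueError("checksum mismatch: bundle rejected")
    added = 0
    for mem in memories:
        if mem["id"] not in store:   # idempotent: replays are no-ops
            store[mem["id"]] = mem
            added += 1
    return added

bundle = [{"id": "m1", "content": "prefer pytest"},
          {"id": "m2", "content": "use ruff"}]
checksum = bundle_checksum(bundle)
store: dict = {}
first = import_bundle(store, bundle, checksum)
second = import_bundle(store, bundle, checksum)  # replay adds nothing
```

Because imports are keyed by memory ID, re-importing the same bundle after a failed or interrupted handoff is safe.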

Quick Start

git clone https://github.com/wjohns989/Muninn.git
cd Muninn
pip install -e .

Set the auth token (shared between server and MCP wrapper):

# Windows (persists across sessions)
setx MUNINN_AUTH_TOKEN "your-token-here"

# Linux/macOS
export MUNINN_AUTH_TOKEN="your-token-here"

Start the backend:

python server.py

Verify it's running:

curl http://localhost:42069/health
# {"status":"ok","memory_count":0,...,"backend":"muninn-native"}

Runtime Modes

| Mode | Command | Description |
|------|---------|-------------|
| Muninn MCP | `python mcp_wrapper.py stdio` | MCP server for active assistant/IDE sessions |
| Huginn Standalone | `python muninn_standalone.py` | Browser-first UX for direct ingestion/search/admin |
| REST API | `python server.py` | FastAPI backend at http://localhost:42069 |
| Packaged App | `python scripts/build_standalone.py` | PyInstaller executable (Huginn Control Center) |

All modes use the same memory engine and data directory.


MCP Client Configuration

Claude Code (recommended — bakes auth token into registration):

claude mcp add -s user muninn \
  -e MUNINN_AUTH_TOKEN="your-token-here" \
  -- python /absolute/path/to/mcp_wrapper.py

Generic MCP client (claude_desktop_config.json or equivalent):

{
  "mcpServers": {
    "muninn": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_wrapper.py"],
      "env": {
        "MUNINN_AUTH_TOKEN": "your-token-here"
      }
    }
  }
}

Important: Both server.py and mcp_wrapper.py must share the same MUNINN_AUTH_TOKEN. If either process generates a random token (when the env var is unset), all MCP tool calls fail with 401.


MCP Tools

| Tool | Description |
|------|-------------|
| `add_memory` | Store a memory with optional scope, project, namespace, media_type |
| `search_memory` | Hybrid 5-signal search with media_type filtering and recall traces |
| `get_all_memories` | Paginated memory listing with filters |
| `update_memory` | Update content or metadata of an existing memory |
| `delete_memory` | Remove a memory by ID |
| `set_project_goal` | Set the current project's objective and constraints |
| `get_project_goal` | Retrieve the active project goal |
| `set_project_instruction` | Store a project-scoped rule (scope="project" by default) |
| `get_model_profiles` | Get active model routing profiles |
| `set_model_profiles` | Update model routing profiles |
| `get_model_profile_events` | Audit log for profile policy changes |
| `export_handoff` | Export a memory handoff bundle |
| `import_handoff` | Import a handoff bundle (idempotent) |
| `ingest_sources` | Ingest files/folders into memory |
| `discover_legacy_sources` | Find prior assistant session files for migration |
| `ingest_legacy_sources` | Import discovered legacy memories |
| `record_retrieval_feedback` | Submit outcome signal for adaptive calibration |
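Over MCP, a tool invocation travels as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a frame for `add_memory`; the argument names mirror the table above, but the exact envelope your MCP client emits may differ:

```python
# Illustrative JSON-RPC 2.0 frame for calling the add_memory tool over MCP.
# Argument names follow the tool table; treat this as a sketch, not the
# exact wire format a given MCP client produces.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "add_memory",
        "arguments": {
            "content": "Always use typed Pydantic models for API payloads",
            "scope": "project",
            "project": "muninn",
        },
    },
}

# Line-delimited JSON-RPC: one compact JSON object per line.
frame = json.dumps(request, separators=(",", ":")) + "\n"
```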

Python SDK

from muninn import Memory

# Sync client
client = Memory(base_url="http://127.0.0.1:42069", auth_token="your-token-here")
client.add(
    content="Always use typed Pydantic models for API payloads",
    metadata={"project": "muninn", "scope": "project"}
)

results = client.search("API payload patterns", limit=5)
for r in results:
    print(r.content, r.recall_trace)

Async client:

import asyncio

from muninn import AsyncMemory

async def main():
    async with AsyncMemory(base_url="http://127.0.0.1:42069", auth_token="your-token-here") as client:
        await client.add(content="...", metadata={})
        results = await client.search("...", limit=5)

asyncio.run(main())

REST API

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Server health + memory/vector/graph counts |
| POST | `/add` | Add a memory (supports media_type) |
| POST | `/search` | Hybrid search (supports media_type filtering) |
| GET | `/get_all` | Paginated memory listing |
| PUT | `/update` | Update a memory |
| DELETE | `/delete/{memory_id}` | Delete a memory |
| POST | `/ingest` | Ingest files/folders |
| POST | `/ingest/legacy/discover` | Discover legacy session files |
| POST | `/ingest/legacy/import` | Import selected legacy memories |
| POST | `/ingest/legacy/import-all` | Discover and import ALL legacy sources (batched) |
| GET | `/ingest/legacy/status` | Legacy discovery scheduler status |
| GET | `/ingest/legacy/catalog` | Paginated cached catalog of discovered sources |
| GET | `/profiles/model` | Get model routing profiles |
| POST | `/profiles/model` | Set model routing profiles |
| GET | `/profiles/model/events` | Profile audit log |
| GET | `/profile/user/get` | Get user profile |
| POST | `/profile/user/set` | Update user profile |
| POST | `/handoff/export` | Export handoff bundle |
| POST | `/handoff/import` | Import handoff bundle |
| POST | `/feedback/retrieval` | Submit retrieval feedback |
| GET | `/goal/get` | Get project goal |
| POST | `/goal/set` | Set project goal |

Auth: Authorization: Bearer <MUNINN_AUTH_TOKEN> required on all non-health endpoints.
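A minimal sketch of building such an authenticated request with the standard library follows; the `/add` payload fields are assumptions based on the SDK example, and the request is only constructed here, not sent:

```python
# Illustrative sketch: construct an authenticated POST /add request.
# The request is built but not sent; payload field names are assumptions.
import json
import urllib.request

AUTH_TOKEN = "your-token-here"
payload = {
    "content": "Always use typed Pydantic models for API payloads",
    "metadata": {"project": "muninn", "scope": "project"},
}

req = urllib.request.Request(
    "http://localhost:42069/add",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {AUTH_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it to a running server.
```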


Configuration

Key environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MUNINN_AUTH_TOKEN` | random | Shared secret between server and MCP wrapper |
| `MUNINN_SERVER_URL` | `http://localhost:42069` | Backend URL for MCP wrapper |
| `MUNINN_PROJECT_SCOPE_STRICT` | off | `=1` disables cross-project fallback entirely |
| `MUNINN_MCP_SEARCH_PROJECT_FALLBACK` | off | `=1` enables global-scope fallback on empty results |
| `MUNINN_OPERATOR_MODEL_PROFILE` | balanced | Default model routing profile |
| `MUNINN_OTEL_ENABLED` | off | `=1` enables OpenTelemetry tracing |
| `MUNINN_OTEL_ENDPOINT` | `http://localhost:4318` | OTLP HTTP endpoint for trace export |
| `MUNINN_CHAINS_ENABLED` | off | `=1` enables graph memory chain detection (PRECEDES/CAUSES edges) |
| `MUNINN_COLBERT_MULTIVEC` | off | `=1` enables native ColBERT multi-vector storage |
| `MUNINN_FEDERATION_ENABLED` | off | `=1` enables P2P memory synchronization |
| `MUNINN_FEDERATION_PEERS` | - | Comma-separated list of peer base URLs |
| `MUNINN_FEDERATION_SYNC_ON_ADD` | off | `=1` enables real-time push-on-add to peers |
| `MUNINN_TEMPORAL_QUERY_EXPANSION` | off | `=1` enables NL time-phrase parsing in search |
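Most of these switches follow the same "set to `1` to enable" convention, and `MUNINN_FEDERATION_PEERS` is a comma-separated list. A small sketch of how such values can be interpreted (the helper names are mine, not Muninn's):

```python
# Illustrative helpers for the "=1 enables" env-var convention and the
# comma-separated MUNINN_FEDERATION_PEERS list. Helper names are assumptions.
import os

def env_flag(name: str, environ=os.environ) -> bool:
    """True only when the variable is set to exactly "1"."""
    return environ.get(name, "") == "1"

def env_peers(name: str = "MUNINN_FEDERATION_PEERS", environ=os.environ) -> list:
    """Split a comma-separated peer list, dropping blanks and whitespace."""
    raw = environ.get(name, "")
    return [p.strip() for p in raw.split(",") if p.strip()]

fake_env = {
    "MUNINN_OTEL_ENABLED": "1",
    "MUNINN_FEDERATION_PEERS": "http://host-a:42069, http://host-b:42069",
}
```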

Evaluation & Quality Gates

Muninn includes an evaluation toolchain for measurable quality enforcement:

# Run full benchmark dev-cycle
python -m eval.ollama_local_benchmark dev-cycle

# Check phase hygiene gates
python -m eval.phase_hygiene

# Emit SOTA+ signed verdict artifact
python -m eval.ollama_local_benchmark sota-verdict \
  --longmemeval-report path/to/lme_report.json \
  --min-longmemeval-ndcg 0.60 \
  --min-longmemeval-recall 0.65 \
  --signing-key "$SOTA_SIGNING_KEY"

# Run LongMemEval adapter selftest (no server needed)
python eval/longmemeval_adapter.py --selftest

# Run StructMemEval adapter selftest (no server needed)
python eval/structmemeval_adapter.py --selftest

# Run StructMemEval against a live server
python eval/structmemeval_adapter.py \
  --dataset path/to/structmemeval.jsonl \
  --server-url http://localhost:42069 \
  --auth-token "$MUNINN_AUTH_TOKEN"

Metrics tracked: nDCG@k, Recall@k, MRR@k, Exact Match, token-F1, p50/p95 latency, significance testing (Bonferroni/BH correction), effect-size analysis.
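For reference, two of the tracked metrics can be computed as follows (standard definitions over binary relevance labels, not Muninn's evaluation code):

```python
# Illustrative implementations of nDCG@k and Recall@k over binary relevance.
import math

def dcg_at_k(rels: list, k: int) -> float:
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels: list, k: int) -> float:
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def recall_at_k(rels: list, k: int, total_relevant: int) -> float:
    return sum(rels[:k]) / total_relevant if total_relevant else 0.0

# One ranked result list: relevant docs at ranks 1 and 3, one missed entirely.
rels = [1, 0, 1, 0, 0]
```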

The sota-verdict command emits a signed JSON artifact with commit_sha, SHA256 file hashes, and HMAC-SHA256 promotion_signature — enabling auditable, commit-pinned SOTA+ evidence.
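The signing step can be sketched with the standard library; the artifact fields beyond `commit_sha` and `promotion_signature` are assumptions, as is the canonical-JSON choice:

```python
# Illustrative sketch of a signed verdict artifact: SHA-256 content hash plus
# an HMAC-SHA256 promotion signature over canonical JSON. Fields beyond those
# named in the text are assumptions.
import hashlib
import hmac
import json

def sign_verdict(verdict: dict, signing_key: bytes) -> dict:
    canonical = json.dumps(verdict, sort_keys=True).encode()
    signature = hmac.new(signing_key, canonical, hashlib.sha256).hexdigest()
    return {**verdict,
            "sha256": hashlib.sha256(canonical).hexdigest(),
            "promotion_signature": signature}

def verify_verdict(signed: dict, signing_key: bytes) -> bool:
    body = {k: v for k, v in signed.items()
            if k not in ("sha256", "promotion_signature")}
    canonical = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(signing_key, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["promotion_signature"])

artifact = sign_verdict({"commit_sha": "abc123", "ndcg": 0.62}, b"secret-key")
```

Because the signature is keyed, anyone with the signing key can re-verify the artifact later, and any change to the metrics or commit pin invalidates it.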


Data & Security

  • Default data dir: ~/.local/share/AntigravityLabs/muninn/ (Linux/macOS) · %LOCALAPPDATA%\AntigravityLabs\muninn\ (Windows)
  • Storage: SQLite (metadata) + Qdrant (vectors) + KuzuDB (memory chains graph)
  • No cloud dependency: All data local by default
  • Auth: Bearer token required on all API calls; token shared via env var
  • Namespace isolation: user_id + namespace + project boundaries enforced at every retrieval layer

Documentation Index

| Document | Description |
|----------|-------------|
| SOTA_PLUS_PLAN.md | Active development phases and roadmap |
| HANDOFF.md | Operational setup, auth flow, known issues |
| docs/ARCHITECTURE.md | System architecture deep-dive |
| docs/MUNINN_COMPREHENSIVE_ROADMAP.md | Full feature roadmap (v3.1→v3.3+) |
| docs/AGENT_CONTINUATION_RUNBOOK.md | How to resume development across sessions |
| docs/PYTHON_SDK.md | Python SDK reference |
| docs/INGESTION_PIPELINE.md | Ingestion pipeline internals |
| docs/OTEL_GENAI_OBSERVABILITY.md | OpenTelemetry integration guide |
| docs/PLAN_GAP_EVALUATION.md | Gap analysis against SOTA memory systems |

Licensing

  • Code: Apache License 2.0 (LICENSE)
  • Third-party dependency licenses remain with their respective owners
  • Attribution: See NOTICE
