<img src="assets/muninn_banner.jpeg" alt="Muninn — Persistent Memory MCP" width="100%"/>
# Muninn

> "Muninn flies each day over the world to bring Odin knowledge of what happens." — Prose Edda
Local-first persistent memory infrastructure for coding agents and MCP-compatible tools.
Muninn provides deterministic, explainable memory retrieval with robust transport behavior and production-grade operational controls. Designed for long-running development workflows where continuity, auditability, and measurable quality matter — across sessions, across assistants, and across projects.
## 🚩 Status

**Current Version:** v3.24.0 (Phase 26 COMPLETE) · **Stability:** Production Beta · **Test Suite:** 1422+ passing, 0 failing
### What's New in v3.24.0
- Cognitive Architecture (CoALA): Integration of a proactive reasoning loop bridging memory with active decision-making.
- Knowledge Distillation: Background synthesis of episodic memories into structured semantic manuals for long-term wisdom.
- Epistemic Foraging: Active inference-driven search to resolve ambiguities and fill information gaps autonomously.
- Omission Filtering: Automated detection of missing context required for successful task execution.
- Elo-Rated SNIPS Governance: Dynamic memory retention system mapping retrieval success to Elo ratings for usage-driven decay.
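The Elo-rated retention idea can be illustrated with the standard Elo update rule: each retrieval is treated as a "match", and a memory's rating rises when it contributes to a successful outcome and decays when it does not. The sketch below is illustrative only — the function name, K-factor, and seed rating are assumptions, not Muninn's actual implementation.

```python
def elo_update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Standard Elo update: move rating toward actual outcome minus expected score."""
    expected = 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))
    return rating + k * ((1.0 if won else 0.0) - expected)

# A memory that keeps "winning" retrievals gains rating; ignored or unhelpful
# memories drift downward until a retention threshold prunes or shadows them.
memory_rating = 1200.0
memory_rating = elo_update(memory_rating, opponent=1200.0, won=True)
print(memory_rating)  # 1216.0
```

Against an equally rated opponent the expected score is 0.5, so a win adds exactly half the K-factor.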
### Previous Milestones
| Version | Phase | Key Feature |
|---|---|---|
| v3.24.0 | 26 | Cognitive Architecture Complete |
| v3.23.0 | 23 | Elo-Rated SNIPS Governance |
| v3.22.0 | 22 | Temporal Knowledge Graph |
| v3.19.0 | 20 | Multimodal Hive Mind Operations |
| v3.18.3 | 19 | Bulk legacy import, NLI conflict detection, uncapped discovery |
| v3.18.1 | 19 | Scout synthesis, hunt mode |
## 🚀 Features

### Core Memory Engine
- Local-First: Zero cloud dependency — all data stays on your machine
- Multimodal: Native support for Text, Image, Audio, Video, and Sensor data
- 5-Signal Hybrid Retrieval: Dense vector · BM25 lexical · Graph traversal · Temporal relevance · Goal relevance
- Explainable Recall Traces: Per-signal score attribution on every search result
- Bi-Temporal Reasoning: Support for "Valid Time" vs "Transaction Time" via Temporal Knowledge Graph
- Project Isolation: `scope="project"` memories never cross repo boundaries; `scope="global"` memories are always available
- Cross-Session Continuity: Memories survive session ends, assistant switches, and tool restarts
- Bi-Temporal Records: `created_at` (real-world event time) vs `ingested_at` (system intake time)
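The 5-signal hybrid score and per-signal recall trace can be pictured as a weighted fusion of the five retrieval signals. The weights, field names, and linear combination below are purely illustrative — Muninn's real scoring is adaptively calibrated, and this sketch only shows why a per-signal trace makes each result explainable.

```python
from dataclasses import dataclass

@dataclass
class RecallTrace:
    dense: float      # dense vector similarity
    bm25: float       # BM25 lexical match
    graph: float      # graph-traversal proximity
    temporal: float   # temporal relevance
    goal: float       # alignment with the active project goal

# Hypothetical static weights — the actual system calibrates these from feedback.
WEIGHTS = {"dense": 0.4, "bm25": 0.25, "graph": 0.15, "temporal": 0.1, "goal": 0.1}

def hybrid_score(trace: RecallTrace) -> float:
    """Fuse per-signal scores into one ranking score; the trace itself is kept
    alongside the result so users can see which signal drove the match."""
    return sum(WEIGHTS[name] * getattr(trace, name) for name in WEIGHTS)

trace = RecallTrace(dense=0.9, bm25=0.6, graph=0.3, temporal=0.8, goal=0.5)
print(round(hybrid_score(trace), 3))  # 0.685
```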
### Memory Lifecycle
- Elo-Rated Governance: Dynamic retention driven by retrieval feedback (SNIPS) and usage statistics
- Consolidation Daemon: Background process for decay, deduplication, promotion, and shadowing — inspired by sleep consolidation
- Zero-Trust Ingestion: Isolated subprocess parsing for PDF/DOCX to neutralize document-based exploits
- ColBERT Multi-Vector: Native Qdrant multi-vector storage for MaxSim scoring
- NL Temporal Query Expansion: Natural-language time phrases ("last week", "before the refactor") parsed into structured time ranges
- Goal Compass: Retrieval signal for project objectives and constraint drift
- NLI Conflict Detection: Transformer-based contradiction detection (`cross-encoder/nli-deberta-v3-small`) for memory integrity
- Bulk Legacy Import: One-click ingestion of all discovered legacy sources (batched, error-isolated) via dashboard or API
### Operational Controls
- MCP Transport Hardening: Framed + line JSON-RPC, timeout-window guardrails, protocol negotiation
- Runtime Profile Control: `get_model_profiles` / `set_model_profiles` for dynamic model routing
- Profile Audit Log: Immutable event ledger for profile policy mutations
- Browser Control Center: Web UI for search, ingestion, consolidation, and admin at `http://localhost:42069`
- OpenTelemetry: GenAI semantic convention tracing (feature-gated via `MUNINN_OTEL_ENABLED`)
### Multi-Assistant Interop
- Handoff Bundles: Export/import memory checkpoints with checksum verification and idempotent replay
- Legacy Migration: Discover and import memories from prior assistant sessions (JSONL chat history, SQLite state) — uncapped provider limits
- Bulk Import: `POST /ingest/legacy/import-all` ingests all discovered sources in batches of 50 with per-batch error isolation
- Hive Mind Federation: Push-based low-latency memory synchronization across assistant runtimes
- MCP 2025-11 Compliant: Full protocol negotiation, lifecycle gating, schema annotations
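The checksum verification and idempotent replay behavior of handoff bundles can be sketched in a few lines. The bundle format, function names, and in-memory store below are assumptions for illustration — the real bundle schema lives in the handoff implementation.

```python
import hashlib
import json

def bundle_checksum(memories: list[dict]) -> str:
    """SHA-256 over a canonical JSON serialization (hypothetical bundle format)."""
    canonical = json.dumps(memories, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def import_bundle(store: dict, memories: list[dict], claimed: str) -> int:
    """Verify the claimed checksum, then import only memories not already present."""
    if bundle_checksum(memories) != claimed:
        raise ValueError("checksum mismatch: bundle corrupted or tampered")
    imported = 0
    for m in memories:
        if m["id"] not in store:   # idempotent: replaying a bundle is a no-op
            store[m["id"]] = m
            imported += 1
    return imported

memories = [{"id": "m1", "content": "use typed Pydantic models"}]
claimed = bundle_checksum(memories)
store: dict = {}
print(import_bundle(store, memories, claimed))  # 1
print(import_bundle(store, memories, claimed))  # 0 (replay imports nothing)
```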
## Quick Start

```bash
git clone https://github.com/wjohns989/Muninn.git
cd Muninn
pip install -e .
```
Set the auth token (shared between server and MCP wrapper):
```bash
# Windows (persists across sessions)
setx MUNINN_AUTH_TOKEN "your-token-here"

# Linux/macOS
export MUNINN_AUTH_TOKEN="your-token-here"
```
Start the backend:
```bash
python server.py
```
Verify it's running:
```bash
curl http://localhost:42069/health
# {"status":"ok","memory_count":0,...,"backend":"muninn-native"}
```
## Runtime Modes

| Mode | Command | Description |
|---|---|---|
| Muninn MCP | `python mcp_wrapper.py` | stdio MCP server for active assistant/IDE sessions |
| Huginn Standalone | `python muninn_standalone.py` | Browser-first UX for direct ingestion/search/admin |
| REST API | `python server.py` | FastAPI backend at `http://localhost:42069` |
| Packaged App | `python scripts/build_standalone.py` | PyInstaller executable (Huginn Control Center) |
All modes use the same memory engine and data directory.
## MCP Client Configuration
Claude Code (recommended — bakes auth token into registration):
```bash
claude mcp add -s user muninn \
  -e MUNINN_AUTH_TOKEN="your-token-here" \
  -- python /absolute/path/to/mcp_wrapper.py
```
Generic MCP client (`claude_desktop_config.json` or equivalent):
```json
{
  "mcpServers": {
    "muninn": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_wrapper.py"],
      "env": {
        "MUNINN_AUTH_TOKEN": "your-token-here"
      }
    }
  }
}
```
Important: Both `server.py` and `mcp_wrapper.py` must share the same `MUNINN_AUTH_TOKEN`. If either process generates a random token (when the env var is unset), all MCP tool calls fail with 401.
## MCP Tools

| Tool | Description |
|---|---|
| `add_memory` | Store a memory with optional scope, project, namespace, media_type |
| `search_memory` | Hybrid 5-signal search with media_type filtering and recall traces |
| `get_all_memories` | Paginated memory listing with filters |
| `update_memory` | Update content or metadata of an existing memory |
| `delete_memory` | Remove a memory by ID |
| `set_project_goal` | Set the current project's objective and constraints |
| `get_project_goal` | Retrieve the active project goal |
| `set_project_instruction` | Store a project-scoped rule (scope="project" by default) |
| `get_model_profiles` | Get active model routing profiles |
| `set_model_profiles` | Update model routing profiles |
| `get_model_profile_events` | Audit log for profile policy changes |
| `export_handoff` | Export a memory handoff bundle |
| `import_handoff` | Import a handoff bundle (idempotent) |
| `ingest_sources` | Ingest files/folders into memory |
| `discover_legacy_sources` | Find prior assistant session files for migration |
| `ingest_legacy_sources` | Import discovered legacy memories |
| `record_retrieval_feedback` | Submit outcome signal for adaptive calibration |
## Python SDK

```python
from muninn import Memory

# Sync client
client = Memory(base_url="http://127.0.0.1:42069", auth_token="your-token-here")

client.add(
    content="Always use typed Pydantic models for API payloads",
    metadata={"project": "muninn", "scope": "project"}
)

results = client.search("API payload patterns", limit=5)
for r in results:
    print(r.content, r.recall_trace)
```
Async client:

```python
import asyncio

from muninn import AsyncMemory

async def main():
    async with AsyncMemory(base_url="http://127.0.0.1:42069", auth_token="your-token-here") as client:
        await client.add(content="...", metadata={})
        results = await client.search("...", limit=5)

asyncio.run(main())
```
## REST API

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Server health + memory/vector/graph counts |
| POST | `/add` | Add a memory (supports media_type) |
| POST | `/search` | Hybrid search (supports media_type filtering) |
| GET | `/get_all` | Paginated memory listing |
| PUT | `/update` | Update a memory |
| DELETE | `/delete/{memory_id}` | Delete a memory |
| POST | `/ingest` | Ingest files/folders |
| POST | `/ingest/legacy/discover` | Discover legacy session files |
| POST | `/ingest/legacy/import` | Import selected legacy memories |
| POST | `/ingest/legacy/import-all` | Discover and import ALL legacy sources (batched) |
| GET | `/ingest/legacy/status` | Legacy discovery scheduler status |
| GET | `/ingest/legacy/catalog` | Paginated cached catalog of discovered sources |
| GET | `/profiles/model` | Get model routing profiles |
| POST | `/profiles/model` | Set model routing profiles |
| GET | `/profiles/model/events` | Profile audit log |
| GET | `/profile/user/get` | Get user profile |
| POST | `/profile/user/set` | Update user profile |
| POST | `/handoff/export` | Export handoff bundle |
| POST | `/handoff/import` | Import handoff bundle |
| POST | `/feedback/retrieval` | Submit retrieval feedback |
| GET | `/goal/get` | Get project goal |
| POST | `/goal/set` | Set project goal |
Auth: `Authorization: Bearer <MUNINN_AUTH_TOKEN>` required on all non-health endpoints.
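A minimal stdlib-only sketch of calling the API with Bearer auth, using the `/add` and `/search` endpoints from the table above. The payload field names are assumptions inferred from the SDK example — consult the server's schema for the exact request shapes.

```python
import json
import os
import urllib.request

BASE = "http://localhost:42069"
TOKEN = os.environ.get("MUNINN_AUTH_TOKEN", "")
HEADERS = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Illustrative payloads — field names are assumptions, not the canonical schema.
add_payload = {"content": "Prefer typed Pydantic models", "metadata": {"project": "muninn"}}
search_payload = {"query": "API payload patterns", "limit": 5}

def post(path: str, payload: dict) -> dict:
    """POST a JSON payload with the shared auth token and decode the response."""
    req = urllib.request.Request(
        f"{BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers=HEADERS,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())

if TOKEN:  # only hit the network when a token is actually configured
    post("/add", add_payload)
    results = post("/search", search_payload)
```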
## Configuration

Key environment variables:

| Variable | Default | Description |
|---|---|---|
| `MUNINN_AUTH_TOKEN` | random | Shared secret between server and MCP wrapper |
| `MUNINN_SERVER_URL` | `http://localhost:42069` | Backend URL for MCP wrapper |
| `MUNINN_PROJECT_SCOPE_STRICT` | off | `=1` disables cross-project fallback entirely |
| `MUNINN_MCP_SEARCH_PROJECT_FALLBACK` | off | `=1` enables global-scope fallback on empty results |
| `MUNINN_OPERATOR_MODEL_PROFILE` | `balanced` | Default model routing profile |
| `MUNINN_OTEL_ENABLED` | off | `=1` enables OpenTelemetry tracing |
| `MUNINN_OTEL_ENDPOINT` | `http://localhost:4318` | OTLP HTTP endpoint for trace export |
| `MUNINN_CHAINS_ENABLED` | off | `=1` enables graph memory chain detection (PRECEDES/CAUSES edges) |
| `MUNINN_COLBERT_MULTIVEC` | off | `=1` enables native ColBERT multi-vector storage |
| `MUNINN_FEDERATION_ENABLED` | off | `=1` enables P2P memory synchronization |
| `MUNINN_FEDERATION_PEERS` | - | Comma-separated list of peer base URLs |
| `MUNINN_FEDERATION_SYNC_ON_ADD` | off | `=1` enables real-time push-on-add to peers |
| `MUNINN_TEMPORAL_QUERY_EXPANSION` | off | `=1` enables NL time-phrase parsing in search |
## Evaluation & Quality Gates

Muninn includes an evaluation toolchain for measurable quality enforcement:

```bash
# Run full benchmark dev-cycle
python -m eval.ollama_local_benchmark dev-cycle

# Check phase hygiene gates
python -m eval.phase_hygiene

# Emit SOTA+ signed verdict artifact
python -m eval.ollama_local_benchmark sota-verdict \
  --longmemeval-report path/to/lme_report.json \
  --min-longmemeval-ndcg 0.60 \
  --min-longmemeval-recall 0.65 \
  --signing-key "$SOTA_SIGNING_KEY"

# Run LongMemEval adapter selftest (no server needed)
python eval/longmemeval_adapter.py --selftest

# Run StructMemEval adapter selftest (no server needed)
python eval/structmemeval_adapter.py --selftest

# Run StructMemEval against a live server
python eval/structmemeval_adapter.py \
  --dataset path/to/structmemeval.jsonl \
  --server-url http://localhost:42069 \
  --auth-token "$MUNINN_AUTH_TOKEN"
```
Metrics tracked: nDCG@k, Recall@k, MRR@k, Exact Match, token-F1, p50/p95 latency, significance testing (Bonferroni/BH correction), effect-size analysis.
The `sota-verdict` command emits a signed JSON artifact with `commit_sha`, SHA256 file hashes, and an HMAC-SHA256 `promotion_signature`, enabling auditable, commit-pinned SOTA+ evidence.
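Among the tracked metrics, nDCG@k is the least self-explanatory; a minimal reference implementation shows what the gates above are measuring. This is the textbook formula (DCG with a log2 position discount, normalized by the ideal ranking), not Muninn's eval code.

```python
import math

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
    `relevances` is the graded relevance of each result in ranked order."""
    def dcg(rels: list[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Ranking the one relevant result second instead of first costs ~37% at k=2.
print(round(ndcg_at_k([0, 1], k=2), 3))  # 0.631
print(ndcg_at_k([1, 0], k=2))            # 1.0
```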
## Data & Security

- Default data dir: `~/.local/share/AntigravityLabs/muninn/` (Linux/macOS) · `%LOCALAPPDATA%\AntigravityLabs\muninn\` (Windows)
- Storage: SQLite (metadata) + Qdrant (vectors) + KuzuDB (memory chains graph)
- No cloud dependency: All data local by default
- Auth: Bearer token required on all API calls; token shared via env var
- Namespace isolation: `user_id` + `namespace` + `project` boundaries enforced at every retrieval layer
## Documentation Index

| Document | Description |
|---|---|
| `SOTA_PLUS_PLAN.md` | Active development phases and roadmap |
| `HANDOFF.md` | Operational setup, auth flow, known issues |
| `docs/ARCHITECTURE.md` | System architecture deep-dive |
| `docs/MUNINN_COMPREHENSIVE_ROADMAP.md` | Full feature roadmap (v3.1→v3.3+) |
| `docs/AGENT_CONTINUATION_RUNBOOK.md` | How to resume development across sessions |
| `docs/PYTHON_SDK.md` | Python SDK reference |
| `docs/INGESTION_PIPELINE.md` | Ingestion pipeline internals |
| `docs/OTEL_GENAI_OBSERVABILITY.md` | OpenTelemetry integration guide |
| `docs/PLAN_GAP_EVALUATION.md` | Gap analysis against SOTA memory systems |
## Licensing

- Code: Apache License 2.0 (`LICENSE`)
- Third-party dependency licenses remain with their respective owners
- Attribution: See `NOTICE`