karve

karve

Provides persistent semantic memory for Claude Code via local embeddings and six MCP tools, enabling context storage and retrieval across sessions without cloud dependencies.

Category
Visit Server

README

<p align="center"> <img src="karve.webp" alt="karve — Viking semantic memory for Claude Code" width="100%"> </p>

<h1 align="center">karve</h1>

<p align="center"> <img alt="Python 3.11+" src="https://img.shields.io/badge/python-3.11%2B-blue"> <img alt="Platform: Apple Silicon" src="https://img.shields.io/badge/platform-Apple%20Silicon-silver"> <img alt="Coverage: 100%" src="https://img.shields.io/badge/coverage-100%25-brightgreen"> <img alt="Tests: 64" src="https://img.shields.io/badge/tests-64-green"> <img alt="mypy: strict" src="https://img.shields.io/badge/mypy-strict-blue"> </p>

<p align="center"> <em>Persistent semantic memory for Claude Code — local embeddings, zero cloud, six tools.</em> </p>


What is karve?

Claude Code is powerful but stateless. Every session starts cold: no memory of past decisions, preferred patterns, or project context. You re-explain, re-discover, and re-decide the same things.

karve gives Claude Code persistent, searchable memory that lives entirely on your Mac.

It runs two local servers — a 4B-parameter embedding model on Apple Silicon (via MLX) and an agent-native context database (OpenViking) — and exposes them to Claude as six MCP tools. Claude stores notes, searches past context, and retrieves project knowledge across sessions. Nothing leaves your machine.

Named after the karve, a light, fast class of Viking longship.


Why local semantic memory?

Cloud AI tools that promise "memory" route your context through remote servers. If your notes contain code decisions, architectural choices, or proprietary system designs, that's a meaningful privacy exposure.

karve is local-first:

  • Embeddings computed on-device by Qwen3-Embedding-4B-mxfp8 via MLX — Apple Silicon native, no GPU rental
  • Storage in OpenViking, an open-source context database that runs entirely on localhost
  • Retrieval by semantic similarity — not keyword matching, not brittle file search

OpenViking isn't a vector store you query with scripts. It uses a file-system interface (viking://user/memory/, viking://resources/, etc.) that Claude navigates autonomously. Think of it as a filesystem your AI can search by meaning.


Is karve right for you?

Scenario Fit
macOS Apple Silicon (M1 / M2 / M3 / M4) ✅ Required
Claude Code as your primary AI client ✅ Required
Single-user, local-only workflow ✅ Ideal
Intel Mac, Linux, or Windows ❌ MLX won't run
Teams sharing memory across machines ❌ Local stack only
Other AI clients (Cursor, Windsurf, etc.) ❌ MCP server targets Claude Code
Real-time or very large-scale retrieval ❌ Single-user, not designed for this

Quick Start

Prerequisites: macOS Apple Silicon · Python 3.11+ · uv

1. Clone and install

git clone <repo-url>
cd karve
uv sync

2. Create credentials

cp credentials.yml.dist credentials.yml

Edit credentials.yml — any string works for local use:

openviking:
  api_key: my-local-key

3. Start the stack

./scripts/start_openviking.sh

The first run downloads ~4 GB of model weights — allow ~5 minutes. Subsequent starts take a few seconds. Logs go to logs/embedding.log and logs/openviking.log.

4. Register the MCP with Claude Code

See MCP Registration below, then restart Claude Code.

5. Verify the connection

In any Claude Code session, ask Claude to run viking_status(). A healthy response confirms the stack is reachable.


MCP Registration

Add karve to your .mcp.json. For user-wide registration, create or edit ~/.claude/.mcp.json:

{
  "mcpServers": {
    "openviking": {
      "command": "uv",
      "args": ["--project", "/path/to/karve", "run", "python", "-m", "src.openviking_mcp_server"]
    }
  }
}

Replace /path/to/karve with the absolute path to your cloned repository. Restart Claude Code after saving.

Alternatively, register via the CLI:

claude mcp add openviking -s user -- uv --project /path/to/karve run python -m src.openviking_mcp_server

Project-scoped memory

By default all tools use the global viking:// namespace, so memories from different projects can mix. To isolate memory per project, add a KARVE_PROJECT env var in a project-level .mcp.json at your project root:

{
  "mcpServers": {
    "openviking": {
      "command": "uv",
      "args": ["--project", "/path/to/karve", "run", "python", "-m", "src.openviking_mcp_server"],
      "env": {
        "KARVE_PROJECT": "my-project-name"
      }
    }
  }
}

When KARVE_PROJECT is set:

  • Searches default to viking://user/projects/my-project-name/ instead of viking://
  • viking_remember stores at viking://user/projects/my-project-name/<category>/
  • Global search is still available by passing uri="viking://" explicitly

Without KARVE_PROJECT, all tools use the global viking:// namespace (original behaviour).


MCP Tools

Six tools become available once Claude Code restarts with the MCP registered:

Tool Purpose Key Parameters
viking_search Fast semantic similarity search query, uri="viking://", limit=5
viking_deep_search Intent-aware search with query expansion query, uri="viking://", limit=5
viking_read Read content at a specific URI uri, depth="overview"
viking_list Browse the context filesystem uri="viking://"
viking_remember Store text for future retrieval text, category="memory", name=""
viking_status Health check — returns server details

Depth levels for viking_read

Depth Approx. tokens Use when
abstract ~100 Quick triage — is this the right resource?
overview ~2000 Default — good balance of context
full complete Full document needed

URI scoping

All search and list tools accept a uri parameter to scope the query:

viking://               # everything
viking://user/          # all user-owned content
viking://user/memory/   # stored memories only
viking://resources/     # indexed resources

Architecture

┌──────────────────────────────────────────────────────┐
│  Claude Code                                          │
│                                                       │
│  viking_search  viking_deep_search  viking_read       │
│  viking_list    viking_remember     viking_status     │
└──────────────────────┬───────────────────────────────┘
                       │ stdio  (FastMCP subprocess)
                       ▼
┌──────────────────────────────────────────────────────┐
│  src/openviking_mcp_server.py                         │
│  FastMCP wrapper — thin HTTP bridge, no local state   │
└──────────────────────┬───────────────────────────────┘
                       │ HTTP  localhost:1933
                       ▼
┌──────────────────────────────────────────────────────┐
│  OpenViking server                                    │
│  Agent-native context database                        │
│  File-system interface: viking:// URIs                │
│  Three-tier loading: L0 abstract · L1 overview · L2   │
└──────────────────────┬───────────────────────────────┘
                       │ HTTP  localhost:8000
                       ▼
┌──────────────────────────────────────────────────────┐
│  MLX embedding server  (mlx-openai-server)            │
│  mlx-community/Qwen3-Embedding-4B-mxfp8               │
│  OpenAI-compatible API · Apple Silicon native         │
└──────────────────────────────────────────────────────┘

All components run on localhost. No external network calls.
Active ports written to ~/.openviking/runtime.json on each startup.

Configuration

config.yml — non-secret settings

Key Default Notes
embedding.model mlx-community/Qwen3-Embedding-4B-mxfp8 MLX model path
embedding.base_port 8000 Scans upward if occupied
openviking.base_port 1933 Scans upward if occupied
logging.level INFO DEBUG | INFO | WARNING | ERROR

Ports are dynamic: if a base port is occupied, the startup script finds the next free port. The MCP wrapper reads ~/.openviking/runtime.json at startup to locate the current ports — so restarting the stack never breaks the MCP connection.

credentials.yml — secrets (gitignored)

openviking:
  api_key: your-key-here   # any string — local auth only

Copy from credentials.yml.dist. Never commit this file.


Dashboard

When Claude Code spawns the MCP server, a status dashboard automatically opens in your browser. It polls the OpenViking REST API every 5 seconds and displays:

  • Server health and system status
  • Observer component health (queue, vikingdb, transaction)
  • Embedding server status and active model name
  • Active session count
  • Context filesystem root listing

The dashboard is a single static dashboard.html file — no build step, no web server required. It runs entirely client-side.


Development

uv sync                    # install all deps including dev tools
uv run pytest              # 64 tests, 100% coverage

Quality gates (all passing):

Tool Result
ruff zero violations
mypy --strict zero errors
pytest 64 tests, 100% coverage
bandit no security issues
interrogate 100% docstring coverage
pylint 9.77 / 10
radon all grade B or better
xenon max-absolute B

Acknowledgments

  • OpenViking — open-source agent-native context database by ByteDance Volcano Engine; the core storage and retrieval engine powering karve
  • FastMCP — the MCP server framework used here; v3.0 released January 2026, powers 70% of MCP servers with 1M+ downloads/day
  • MLX — Apple's array framework for fast on-device inference; makes local 4B-parameter embeddings practical on consumer hardware

<p align="center"> <sub>karve — light and fast, like the ship it's named after.</sub> </p>

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured