Smara MCP Server
A vendor-neutral, user-sovereign memory layer for AI agents and tools, providing persistent, cross-tool memory that users fully own and control.
Smara: Sovereign Universal Memory MCP
Smara (स्मर) — from the Sanskrit root smṛ, meaning "to remember." In Vedic tradition, smaraṇa is the act of remembrance that preserves knowledge across time. Smara is memory that endures.
A Vendor-Neutral, User-Sovereign Memory Layer for AI Agents and Tools
Your mind. Your tools. Your memory.
Smara is an MCP (Model Context Protocol) server that gives any AI client — Claude, Cursor, custom agents — access to a persistent, cross-tool memory system that you fully own and control.
Why This Exists
AI memory is fragmented and vendor-locked. Every AI tool maintains its own siloed memory. You cannot carry context across tools, export your accumulated knowledge, or control what each tool can access. Smara solves this by providing a single memory layer that:
- You own — data lives on your machine, not a vendor's cloud
- Works everywhere — any MCP-compatible client connects instantly
- Runs locally — zero external API dependencies by default
- Stays private — encryption at rest, scoped access, full audit trail
- Exports freely — JSON, JSONL, Markdown — no lock-in
Quick Start
Option 1: npm (recommended)
# Install globally
npm install -g smara-mcp
# Start the persistent daemon (enables hooks + HTTP API)
smara-daemon start
# Verify
smara-daemon status
# → Daemon is running (PID ...) and healthy on port 3100.
Option 2: From Source
# Clone and install
git clone https://github.com/nnaveenraju/smara-mcp.git
cd smara-mcp
npm install
# Build and run
npm run build
node dist/index.js
Option 3: Docker
# Production
docker compose -f docker/docker-compose.yml up
# Development (hot-reload)
docker compose -f docker/docker-compose.yml --profile dev up
Connect to Your AI Client
Claude Desktop
// macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
// Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"smara": {
"command": "node",
"args": ["/absolute/path/to/smara-mcp/dist/index.js"]
}
}
}
Once configured, ask Claude: "Remember that this project uses Aurora PostgreSQL 15" — Claude will call smara.store automatically. Later: "What database does this project use?" — Claude will call smara.recall.
Claude Code
Create a .mcp.json at your project root (project-scoped) or ~/.claude/mcp_servers.json (global). Same schema as Claude Desktop:
{
"mcpServers": {
"smara": {
"command": "node",
"args": ["/absolute/path/to/smara-mcp/dist/index.js"]
}
}
}
Gemini CLI
// ~/.gemini/settings.json
{
"mcpServers": {
"smara": {
"command": "node",
"args": ["/absolute/path/to/smara-mcp/dist/index.js"]
}
}
}
Once configured, ask Gemini: "Remember that the go-live date is Q3 2026" — Gemini will call smara.store. Memories stored in any tool are recalled by any other tool.
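All three clients above take the same `mcpServers` entry, so adding Smara to an existing config file is a small merge. The sketch below is an illustration only: `withSmaraServer` and the `theme` setting are hypothetical names, not part of Smara or any MCP client, and reading or writing the file itself is left out.

```typescript
// Hypothetical helper for illustration; not shipped with Smara.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other client settings are passed through
}

// Returns a copy of `config` with a "smara" entry under mcpServers,
// leaving other servers and unrelated settings untouched.
function withSmaraServer(config: McpClientConfig, entryPoint: string): McpClientConfig {
  const isAbsolute = entryPoint.startsWith("/") || /^[A-Za-z]:[\\/]/.test(entryPoint);
  if (!isAbsolute) {
    throw new Error(`Expected an absolute path, got: ${entryPoint}`);
  }
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      smara: { command: "node", args: [entryPoint] },
    },
  };
}

// Example: a config that already lists another (made-up) server.
const existingConfig: McpClientConfig = {
  theme: "dark",
  mcpServers: { other: { command: "npx", args: ["other-server"] } },
};
const mergedConfig = withSmaraServer(
  existingConfig,
  "/absolute/path/to/smara-mcp/dist/index.js",
);
console.log(JSON.stringify(mergedConfig, null, 2));
```

Rejecting relative paths up front mirrors the configs above, which all use an absolute path to `dist/index.js`.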
Note on native binaries: On first install, native addons (`better-sqlite3`, `sqlite-vec`, `sharp`) are compiled for your platform. If you hit errors at startup about missing bindings, run `npm rebuild better-sqlite3`, or remove `node_modules` and `package-lock.json` and then re-run `npm install`.
Daemon mode (new in v0.2.0)
Smara now runs as a persistent HTTP daemon on `127.0.0.1:3100`. This enables:
- Fast hooks — shell adapters call the daemon via HTTP in ~20ms (no cold starts)
- Session context — add `source hooks/session-start.sh` to your shell rc to auto-load project memories on `cd`
- Direct API — `curl` the daemon from any script or workflow

MCP clients (Claude, Cursor, Gemini) still connect via stdio as before. The daemon shares the same SQLite database.
See `hooks/README.md` for hook setup and `IMPROVEMENT-PLAN.md` for the roadmap.
MCP Tools
| Tool | Description |
|---|---|
| `smara.store` | Store a new memory with category, tags, and confidence |
| `smara.recall` | Hybrid semantic + keyword search across all memory |
| `smara.update` | Update an existing memory (auto-bumps version) |
| `smara.forget` | Soft-delete memories with full audit trail |
| `smara.context` | Assemble relevant context for a task (the killer feature) |
| `smara.export` | Export memories as JSON, JSONL, or Markdown |
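To make the difference between the export formats concrete, here is a minimal sketch of rendering the same records as JSONL and as Markdown. The `MemoryRecord` shape is an assumption based on the fields used elsewhere in this README (`content`, `category`, `tags`), and the `id` field and both function names are hypothetical. The layout Smara actually emits may differ.

```typescript
// Assumed record shape for illustration; Smara's real export schema may differ.
interface MemoryRecord {
  id: string;
  content: string;
  category: string;
  tags: string[];
}

// JSONL: one JSON object per line, convenient for streaming and line-based tools.
function toJsonl(records: MemoryRecord[]): string {
  return records.map((record) => JSON.stringify(record)).join("\n");
}

// Markdown: one bullet per memory, meant for human reading.
function toMarkdown(records: MemoryRecord[]): string {
  return records
    .map((r) => `- **${r.category}**: ${r.content} (tags: ${r.tags.join(", ")})`)
    .join("\n");
}

const sampleRecords: MemoryRecord[] = [
  { id: "m1", content: "Go-live date is Q3 2026.", category: "domain", tags: ["timeline", "go-live"] },
  { id: "m2", content: "API gateway is Kong.", category: "domain", tags: ["architecture", "kong"] },
];

console.log(toJsonl(sampleRecords));
console.log(toMarkdown(sampleRecords));
```

Because `JSON.stringify` escapes newlines inside strings, each record stays on a single line in the JSONL output even when the content spans several lines.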
Example: A Day with Smara
This walkthrough follows a realistic scenario — a morning session in Claude Code, an afternoon in Cursor, and a quick Gemini CLI check the next day. You store memories explicitly by asking your AI client; each tool calls smara.store or smara.recall on your behalf.
Morning — Claude Code, setting up the backend
You: "Remember that this project uses a microservices architecture on AWS EKS,
the API gateway is Kong, and the primary database is Aurora PostgreSQL 15"
Claude calls:
→ smara.store({
content: "Project uses microservices on AWS EKS. API gateway is Kong. Primary database is Aurora PostgreSQL 15.",
category: "domain",
tags: ["architecture", "aws", "eks", "kong", "aurora-postgresql"],
confidence: 0.9,
source: { tool: "claude-code", context: "UserPrompt:explicit" }
})
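The payload in the call above can be written down as a TypeScript type. The field names come straight from that call. The checks in `checkStoreInput` (a hypothetical helper) are assumptions, in particular that confidence is meant to lie between 0 and 1, which the values shown here suggest but do not confirm.

```typescript
// Field names mirror the smara.store call above; the validation rules are assumptions.
interface StoreSource {
  tool: string; // e.g. "claude-code", "cursor"
  context: string; // e.g. "UserPrompt:explicit"
}

interface StoreInput {
  content: string;
  category: string;
  tags: string[];
  confidence: number;
  source: StoreSource;
}

// Returns a list of problems; an empty list means the payload looks well-formed.
function checkStoreInput(input: StoreInput): string[] {
  const problems: string[] = [];
  if (input.content.trim() === "") {
    problems.push("content is empty");
  }
  // Written this way so that NaN is also rejected.
  if (!(input.confidence >= 0 && input.confidence <= 1)) {
    problems.push("confidence outside [0, 1]");
  }
  if (new Set(input.tags).size !== input.tags.length) {
    problems.push("tags contain duplicates");
  }
  return problems;
}

const architectureNote: StoreInput = {
  content:
    "Project uses microservices on AWS EKS. API gateway is Kong. Primary database is Aurora PostgreSQL 15.",
  category: "domain",
  tags: ["architecture", "aws", "eks", "kong", "aurora-postgresql"],
  confidence: 0.9,
  source: { tool: "claude-code", context: "UserPrompt:explicit" },
};

console.log(checkStoreInput(architectureNote)); // prints []
```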
Later you hit a build error. Claude fixes it; you ask it to capture the fix:
→ smara.store({
content: "Kong Ingress Controller v3.1 requires annotation 'konghq.com/strip-path' instead of deprecated 'strip.path'. Fix: update all Ingress manifests.",
category: "process",
tags: ["error-fix", "kong", "kubernetes", "ingress"],
confidence: 0.85,
source: { tool: "claude-code", context: "UserPrompt:explicit" }
})
Smara links the two entries because they share the kong tag:
{ "sourceId": "019577a2-...", "targetId": "019577b1-...", "relation": "related_to", "strength": 0.75 }
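How Smara arrives at a link strength is not described here. As a rough illustration of tag-based linking, the sketch below finds the shared tags of the two entries and scores the overlap with Jaccard similarity. That measure is a stand-in chosen for simplicity, not Smara's formula, and it does not reproduce the 0.75 shown above. Both function names are hypothetical.

```typescript
// Tags that appear in both entries (duplicates within an entry are ignored).
function sharedTags(a: string[], b: string[]): string[] {
  const inB = new Set(b);
  return Array.from(new Set(a)).filter((tag) => inB.has(tag));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
// A stand-in measure for illustration, not Smara's actual strength formula.
function tagOverlap(a: string[], b: string[]): number {
  const unionSize = new Set(a.concat(b)).size;
  return unionSize === 0 ? 0 : sharedTags(a, b).length / unionSize;
}

const architectureTags = ["architecture", "aws", "eks", "kong", "aurora-postgresql"];
const kongFixTags = ["error-fix", "kong", "kubernetes", "ingress"];

console.log(sharedTags(architectureTags, kongFixTags).join(", ")); // prints kong
console.log(tagOverlap(architectureTags, kongFixTags)); // prints 0.125
```

One tag in common out of eight distinct tags gives 1/8. A real linker would presumably weigh more than tag overlap, for example content similarity or how close together the entries were stored.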
Afternoon — switching to Cursor
You open Cursor and ask: "What do you know about this project?"
Cursor calls:
→ smara.recall({ query: "payments-api project architecture preferences", limit: 10 })
The response surfaces both entries from the morning — the architecture and the Kong fix — ranked by hybrid search (FTS + semantic similarity + recency). Cursor answers with full context without you re-explaining anything.
You add the timeline:
→ smara.store({
content: "Go-live date is Q3 2026. Deployment freeze starts June 15, 2026.",
category: "domain",
tags: ["timeline", "go-live", "deployment-freeze"],
confidence: 0.9,
source: { tool: "cursor", context: "UserPrompt:explicit" }
})
Next morning — quick check in Gemini CLI
You: "What's the deployment situation for this project?"
Gemini calls:
→ smara.recall({ query: "deployment timeline architecture", limit: 10 })
Ranked results — hybrid search (semantic × keyword × recency):
{
"results": [
{ "content": "Go-live Q3 2026. Freeze June 15.", "score": 0.94, "matchType": "hybrid", "source": { "tool": "cursor" } },
{ "content": "Microservices on EKS, Kong gateway, Aurora PG 15.", "score": 0.87, "matchType": "hybrid", "source": { "tool": "claude-code" } },
{ "content": "Kong v3.1: use konghq.com/strip-path annotation.", "score": 0.71, "matchType": "semantic", "source": { "tool": "claude-code" } }
],
"searchStrategy": "hybrid:rrf"
}
The timeline was stored in Cursor, the architecture and fix in Claude Code — Gemini assembled all of it from one recall call.
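The `"searchStrategy": "hybrid:rrf"` field in the response suggests reciprocal rank fusion (RRF), a standard way to merge several ranked lists. The sketch below shows the textbook form of RRF, under the assumption that Smara fuses a keyword ranking with a semantic ranking. The memory ids and the constant `k = 60` are illustrative, and the fused scores are on a different scale from the `score` values in the response above.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal rank fusion: each list contributes 1 / (k + rank) for every item
// it contains, with rank starting at 1. Items found by several lists add up.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  const fused: FusedResult[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  return fused.sort((a, b) => b.score - a.score);
}

// Illustrative rankings: the Kong fix is found only by the semantic search.
const keywordRanking = ["timeline", "architecture"];
const semanticRanking = ["timeline", "kong-fix", "architecture"];

const fusedResults = reciprocalRankFusion([keywordRanking, semanticRanking]);
console.log(fusedResults.map((r) => r.id).join(" > "));
// prints timeline > architecture > kong-fix
```

RRF uses only rank positions, so keyword scores and vector distances never need to be normalized against each other. An entry found by a single search, like the Kong fix here, still appears but ranks below entries that both searches agree on.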
What makes this different
- Cross-tool continuity — memories stored in Claude are recalled in Cursor and surfaced in Gemini. No copy-paste, no re-explaining.
- Full provenance — every entry records which tool created it, when, and why.
- Memory links — related entries connect to each other; linked context surfaces together.
- Smart decay — episodic memories decay at 0.15/day, domain knowledge at 0.05/day. Identity preferences never decay.
- Confidence ranking — explicit stores score 0.9; higher confidence surfaces first in search.
- You own everything — `smara.export({ format: "json" })` gives you a full dump. Local SQLite, no cloud, no vendor lock-in.
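The decay rates in the list above can be read in more than one way. The sketch below assumes exponential decay with the stated rate per day, so that a memory's weight after `t` days is `exp(-rate * t)`. The functional form, the per-day unit for domain knowledge, and the names used here are all assumptions, since the README gives only the rates.

```typescript
// Rates taken from the list above; rate 0 encodes "never decays".
const DECAY_RATE_PER_DAY = {
  episodic: 0.15,
  domain: 0.05,
  identity: 0,
};

// Assumed form: exponential decay, weight = exp(-rate * ageDays).
// Negative ages are clamped to 0 so the weight never exceeds 1.
function decayedWeight(ratePerDay: number, ageDays: number): number {
  return Math.exp(-ratePerDay * Math.max(0, ageDays));
}

// After one week, under this assumption:
console.log(decayedWeight(DECAY_RATE_PER_DAY.episodic, 7).toFixed(2)); // prints 0.35
console.log(decayedWeight(DECAY_RATE_PER_DAY.domain, 7).toFixed(2)); // prints 0.70
console.log(decayedWeight(DECAY_RATE_PER_DAY.identity, 7).toFixed(2)); // prints 1.00
```

Under this reading, a week-old episodic memory carries about a third of its original weight while domain knowledge keeps about 70%. That matches the intent of the list: session details fade quickly and project facts fade slowly.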
Architecture
See ARCHITECTURE.md for the full architecture documentation including the provider abstraction layer, interface specifications, and implementation guide.
Configuration
Create ~/.smara/config.toml:
[server]
transport = "stdio" # "stdio" for Claude Desktop, "http" for daemon, "sse" for Docker
[database]
provider = "sqlite" # Pluggable: "sqlite", "redis", "postgres"...
path = "~/.smara/memory.db"
[vector]
provider = "sqlite-vec" # Pluggable: "sqlite-vec", "pinecone", "qdrant"...
dimensions = 384
[embeddings]
provider = "local" # Pluggable: "local", "openai", "cohere"...
model = "Xenova/all-MiniLM-L6-v2"
[search]
provider = "hybrid"
fts_weight = 0.4
vector_weight = 0.6
All settings can also be set via environment variables (see docker/.env.example).
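The `fts_weight` and `vector_weight` settings control how much keyword and semantic matches each count toward a result. How Smara applies them internally is not documented here. The simplest reading, sketched below, is a weighted average of two scores already normalized to the range 0 to 1. `blendScores` and its parameter names are hypothetical.

```typescript
interface SearchWeights {
  ftsWeight: number; // corresponds to fts_weight in config.toml
  vectorWeight: number; // corresponds to vector_weight in config.toml
}

// Weighted average of a keyword (FTS) score and a vector-similarity score.
// Assumes both inputs are already normalized to [0, 1]. Dividing by the
// weight total keeps the result in [0, 1] even if the weights do not sum to 1.
function blendScores(ftsScore: number, vectorScore: number, weights: SearchWeights): number {
  const total = weights.ftsWeight + weights.vectorWeight;
  if (total <= 0) {
    throw new Error("fts_weight + vector_weight must be positive");
  }
  return (weights.ftsWeight * ftsScore + weights.vectorWeight * vectorScore) / total;
}

const defaultWeights: SearchWeights = { ftsWeight: 0.4, vectorWeight: 0.6 };

// A strong semantic match with a middling keyword match...
console.log(blendScores(0.5, 1.0, defaultWeights).toFixed(2));
// ...outranks the reverse, because vector_weight is the larger of the two.
console.log(blendScores(1.0, 0.5, defaultWeights).toFixed(2));
```

Raising `fts_weight` favors exact term matches such as identifiers and version numbers. Raising `vector_weight` favors paraphrased or loosely related queries.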
License
Dual-licensed under MIT or Apache 2.0, at your option.
Copyright 2026 Naveen Nadimpalli. See NOTICE for attribution details.