memory-lancedb-mcp

Enables AI agents to store and recall persistent long-term memories across sessions using LanceDB, with semantic search, automatic linking, conflict detection, and maintenance tools.

Persistent, intelligent long-term memory for any MCP-compatible AI agent.

Before / After

Without memory, every session starts from zero. With memory-lancedb-mcp, your agent accumulates knowledge across sessions — automatically.

Before — agent has no context:

User: "Use the same animation style as last time"
Agent: "I don't have any context about previous animations. Could you describe what you'd like?"

After — agent recalls past decisions:

<memories>
1. Remotion spring animation: use duration >= 20, damping 12-15 for smooth easing
2. Video export preset: 1080p, 30fps for social, 60fps for demo
</memories>
<refs>#1=6352a7d2 #2=bed148f0</refs>

Store responses are minimal — no noise, just confirmation:

Stored. [topic: remotion]
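
The recall response in the After example is compact enough to parse with two regular expressions. The following is a minimal sketch, assuming the response always has the `<memories>` and `<refs>` shape shown above; `parseRecallResponse` and `RecalledMemory` are hypothetical names, not part of the server.

```typescript
interface RecalledMemory {
  index: number;          // position in the <memories> list
  text: string;           // memory text
  shortId: string | null; // short ID from <refs>, if one was listed
}

function parseRecallResponse(response: string): RecalledMemory[] {
  const memoriesBlock = /<memories>([\s\S]*?)<\/memories>/.exec(response);
  const refsBlock = /<refs>([\s\S]*?)<\/refs>/.exec(response);

  // Map list index -> short ID, e.g. "#1=6352a7d2"
  const idsByIndex: Record<number, string> = {};
  if (refsBlock) {
    const refPattern = /#(\d+)=([0-9a-zA-Z]+)/g;
    let ref: RegExpExecArray | null;
    while ((ref = refPattern.exec(refsBlock[1])) !== null) {
      idsByIndex[Number(ref[1])] = ref[2];
    }
  }

  const memories: RecalledMemory[] = [];
  if (!memoriesBlock) return memories;
  for (const line of memoriesBlock[1].split("\n")) {
    // Lines look like "1. Some memory text"
    const item = /^\s*(\d+)\.\s+(.*\S)\s*$/.exec(line);
    if (!item) continue;
    const index = Number(item[1]);
    memories.push({ index, text: item[2], shortId: idsByIndex[index] ?? null });
  }
  return memories;
}

const sampleResponse = [
  "<memories>",
  "1. Remotion spring animation: use duration >= 20, damping 12-15 for smooth easing",
  "2. Video export preset: 1080p, 30fps for social, 60fps for demo",
  "</memories>",
  "<refs>#1=6352a7d2 #2=bed148f0</refs>",
].join("\n");
const recalled = parseRecallResponse(sampleResponse);
```

Here `recalled` holds two entries, each pairing the memory text with its short ID, so a client could pass that ID on to a follow-up tool call.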

Quick Start

1. Install

npm install -g @cablate/memory-lancedb-mcp

2. Configure

Add the server to your MCP client settings (for example, Claude Desktop's `claude_desktop_config.json`):

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@cablate/memory-lancedb-mcp"],
      "env": {
        "EMBEDDING_API_KEY": "your-api-key",
        "EMBEDDING_MODEL": "text-embedding-3-small"
      }
    }
  }
}

<details> <summary>Advanced: use a config file for full control</summary>

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@cablate/memory-lancedb-mcp"],
      "env": {
        "MEMORY_LANCEDB_CONFIG": "/path/to/config.json"
      }
    }
  }
}

See config.example.json for all options.

</details>

How It Works

          store                          recall
            │                              │
   ┌────────▼────────┐           ┌────────▼────────┐
   │  Filter junk     │           │ Search by meaning │
   │  Save + embed    │           │   AND keywords    │
   │  Link related    │           │ Re-rank results   │
   │  Flag conflicts  │           │ Fade stale ones   │
   │  Tag topic       │           │ Pull in related   │
   └────────┬────────┘           │ Merge duplicates  │
            │                    └────────┬────────┘
            ▼                             ▼
   ┌─────────────────────────────────────────────┐
   │          LanceDB (local, zero-config)        │
   └─────────────────────────────────────────────┘

Every `memory_store` call saves to a local database, links related memories automatically, flags contradictions, and assigns a topic label; no extra API calls are needed. Every `memory_recall` call searches by both meaning and keywords, pulls in related memories that the main search might miss, and includes maintenance hints so the agent can keep its own knowledge base clean.

Features

Retrieval

- Finds the right memory even when you use different words: searches by meaning and exact keywords simultaneously, then combines the best of both.
- More precise results, not just surface matches: an optional second pass re-ranks results by actual relevance (6 providers supported).
- Search multiple topics at once: pass a `queries` array to search several keywords in one call; results are deduplicated, and memories that match multiple queries rank higher.
- Finding A automatically surfaces related B: when a memory is found, its linked neighbors are pulled in too, even if they use completely different words.
- Minimal token overhead: responses use compact XML tags (`<memories>`, `<hints>`, `<refs>`) with short IDs and no category/scope noise.
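
The batch-query behavior (deduplicate, then rank multi-query matches higher) can be sketched as follows. This is an illustration under stated assumptions, not the server's actual merge logic: the `Hit` shape, the function name `mergeQueryResults`, and the flat `multiMatchBonus` per extra matching query are all hypothetical.

```typescript
interface Hit {
  id: string;
  score: number;
}

// Merge per-query result lists into one deduplicated ranking.
// Assumption: a memory keeps its best score and gains a small bonus
// for every additional query it matched.
function mergeQueryResults(resultsPerQuery: Hit[][], multiMatchBonus = 0.1): Hit[] {
  const merged: Record<string, { best: number; matches: number }> = {};

  for (const hits of resultsPerQuery) {
    // Count each memory at most once per query
    const seenInQuery: Record<string, boolean> = {};
    for (const hit of hits) {
      if (seenInQuery[hit.id]) continue;
      seenInQuery[hit.id] = true;
      const entry = merged[hit.id];
      if (entry) {
        entry.best = Math.max(entry.best, hit.score);
        entry.matches += 1;
      } else {
        merged[hit.id] = { best: hit.score, matches: 1 };
      }
    }
  }

  return Object.keys(merged)
    .map((id) => ({
      id,
      score: merged[id].best + multiMatchBonus * (merged[id].matches - 1),
    }))
    .sort((a, b) => b.score - a.score);
}

const mergedHits = mergeQueryResults([
  [{ id: "a", score: 0.6 }, { id: "b", score: 0.65 }],
  [{ id: "a", score: 0.5 }, { id: "c", score: 0.4 }],
]);
```

In this example memory `a` matches both queries, so it overtakes `b` even though `b` has the higher single-query score.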

Storage

- Related memories link themselves: when you store something new, it automatically creates bidirectional links to similar existing memories.
- Conflicts get flagged: if a new memory contradicts an existing one, you get a warning, so nothing is silently overwritten.
- Topics assigned automatically: each memory gets a topic label inferred from its content and neighbors; you can also set it explicitly.
- Junk gets filtered out: greetings, refusals, and meta-questions are rejected before they waste storage.
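
A junk filter of the kind described in the last item could look roughly like this. The patterns, the minimum length, and the name `isNoise` are assumptions made for illustration; the server's real rules are not documented here and may differ.

```typescript
// Hypothetical patterns for the three junk classes named above
const NOISE_PATTERNS: RegExp[] = [
  // Greetings and bare acknowledgements
  /^(hi|hello|hey|thanks|thank you|ok|okay)\b[\s!.,]*$/i,
  // Refusals
  /^(i'm sorry|i am sorry|sorry),? (but )?i (can't|cannot|am unable to)\b/i,
  // Meta-questions about the agent itself
  /^(what|who) are you\b|^what can you do\b|^do you remember\b/i,
];

// Returns true when the text should be rejected before it is embedded and stored.
function isNoise(text: string, minLength = 8): boolean {
  const trimmed = text.trim();
  if (trimmed.length < minLength) return true; // too short to carry a fact
  return NOISE_PATTERNS.some((pattern) => pattern.test(trimmed));
}

const keepsRealMemory = !isNoise("Remotion spring animation: use duration >= 20");
const rejectsRefusal = isNoise("I'm sorry, but I can't help with that.");
```

Filtering before the embedding call means rejected text costs neither an API request nor storage.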

Lifecycle

- Frequently used memories stay sharp, stale ones fade: a decay model balances how recent, how often accessed, and how important each memory is.
- Memories earn their keep: there are three tiers (Peripheral → Working → Core), and the more a memory gets used, the faster it is promoted.
- Full version history: when you update a memory, the old version is preserved in a chain you can trace with `memory_history`.
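
One way to combine recency, access frequency, and importance into a single lifecycle score is sketched below. The field names mirror the `decay` block of the configuration example in this README, but the formula itself is an assumption: the Weibull shape parameter, the saturation constant for access counts, and the rule that recency takes the remaining weight are all hypothetical.

```typescript
interface DecaySettings {
  recencyHalfLifeDays: number;
  frequencyWeight: number;
  intrinsicWeight: number;
}

interface MemoryUsage {
  ageDays: number;
  accessCount: number;
  importance: number; // 0-1
}

// Weibull-style freshness, scaled so the value is 0.5 at the half-life.
// shape = 1 reduces to plain exponential decay.
function weibullFreshness(ageDays: number, halfLifeDays: number, shape = 1): number {
  return Math.exp(-Math.LN2 * Math.pow(ageDays / halfLifeDays, shape));
}

function lifecycleScore(memory: MemoryUsage, settings: DecaySettings): number {
  const recency = weibullFreshness(memory.ageDays, settings.recencyHalfLifeDays);
  // Saturating frequency term: approaches 1 as accesses accumulate (5 is arbitrary)
  const frequency = 1 - Math.exp(-memory.accessCount / 5);
  // Assumption: recency receives whatever weight the other two leave over
  const recencyWeight = 1 - settings.frequencyWeight - settings.intrinsicWeight;
  return (
    recencyWeight * recency +
    settings.frequencyWeight * frequency +
    settings.intrinsicWeight * memory.importance
  );
}

const decaySettings: DecaySettings = { recencyHalfLifeDays: 30, frequencyWeight: 0.3, intrinsicWeight: 0.3 };
const freshScore = lifecycleScore({ ageDays: 1, accessCount: 10, importance: 0.9 }, decaySettings);
const staleScore = lifecycleScore({ ageDays: 180, accessCount: 0, importance: 0.2 }, decaySettings);
```

With these inputs the recent, frequently accessed memory scores well above the six-month-old one that was never recalled.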

Maintenance

- The agent maintains itself: recall results include inline hints about duplicates, dormant memories, and contradictions.
- Health checks on demand: `memory_lint` finds orphaned memories, stale entries, and missing links, then fixes what it can.
- Merge duplicates: `memory_merge` combines two redundant memories into one; originals are marked as superseded.
- See your memory space: `memory_visualize` generates an interactive HTML graph you can open in any browser.

Visualization

Run memory_visualize to generate an interactive knowledge graph of your memory space:

- Automatic clustering: related memories group together visually
- Similarity edges, duplicate detection, importance sizing
- Time filter, growth animation, cluster view
- Self-contained HTML: open in any browser

<details> <summary>Scoring Pipeline (technical details)</summary>

Query → embedQuery() ─┐
                       ├─→ RRF Fusion → Rerank → Lifecycle Decay → Length Norm → Filter
Query → BM25 FTS ─────┘

| Stage | Effect |
| --- | --- |
| RRF Fusion | Combines semantic and exact-match recall |
| Cross-Encoder Rerank | Promotes semantically precise hits |
| Lifecycle Decay | Weibull freshness + access frequency + importance |
| Length Normalization | Prevents long entries from dominating (anchor: 500 chars) |
| Hard Min Score | Removes irrelevant results (default: 0.35) |
| MMR Diversity | Demotes near-duplicates (cosine similarity > 0.85) |
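
The fusion stage can be illustrated with standard Reciprocal Rank Fusion, where each result earns `1 / (k + rank)` from every ranking it appears in. This is a sketch of the general technique, not the server's code: the function name `rrfFuse` is hypothetical, and `k = 60` is the conventional constant rather than a value confirmed by this README.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion over any number of ranked ID lists.
// Each list is ordered best-first; ranks are 1-based.
function rrfFuse(rankings: string[][], k = 60): FusedResult[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, position) => {
      scores[id] = (scores[id] || 0) + 1 / (k + position + 1);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

const vectorRanking = ["a", "b", "c"]; // semantic search, best first
const bm25Ranking = ["c", "a", "d"];   // keyword search, best first
const fused = rrfFuse([vectorRanking, bm25Ranking]);
```

RRF uses only ranks, so vector distances and BM25 scores never need to be put on a common scale. In this example `a` and `c` appear in both lists and therefore rise above `b` and `d`.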

</details>

Configuration

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| `EMBEDDING_API_KEY` | Yes | API key for embedding provider |
| `EMBEDDING_MODEL` | No | Model name (default: `text-embedding-3-small`) |
| `EMBEDDING_BASE_URL` | No | Custom base URL for non-OpenAI providers |
| `MEMORY_DB_PATH` | No | LanceDB storage directory |
| `MEMORY_LANCEDB_CONFIG` | No | Path to JSON config file |

<details> <summary>Full configuration example</summary>

{
  "embedding": {
    "apiKey": "${EMBEDDING_API_KEY}",
    "model": "jina-embeddings-v5-text-small",
    "baseURL": "https://api.jina.ai/v1",
    "dimensions": 1024,
    "taskQuery": "retrieval.query",
    "taskPassage": "retrieval.passage",
    "normalized": true
  },
  "dbPath": "./memory-data",
  "retrieval": {
    "mode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "minScore": 0.3,
    "rerank": "cross-encoder",
    "rerankApiKey": "${JINA_API_KEY}",
    "rerankModel": "jina-reranker-v3",
    "rerankEndpoint": "https://api.jina.ai/v1/rerank",
    "rerankProvider": "jina",
    "candidatePoolSize": 20,
    "hardMinScore": 0.35,
    "filterNoise": true
  },
  "enableManagementTools": true,
  "enableSelfImprovementTools": false,
  "enableVisualizationTools": true,
  "scopes": {
    "default": "global",
    "definitions": {
      "global": { "description": "Shared knowledge" },
      "agent:my-bot": { "description": "Private to my-bot" }
    },
    "agentAccess": {
      "my-bot": ["global", "agent:my-bot"]
    }
  },
  "decay": {
    "recencyHalfLifeDays": 30,
    "frequencyWeight": 0.3,
    "intrinsicWeight": 0.3
  }
}
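
The `${EMBEDDING_API_KEY}` and `${JINA_API_KEY}` values in the example suggest that placeholders are filled in from the environment when the config is loaded. A minimal sketch of such a substitution pass follows; `resolveEnvPlaceholders` is a hypothetical helper, and failing on an unset variable is an assumption about desirable behavior, not documented server behavior.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Replace ${VAR} placeholders in every string of a parsed config object.
// An unset variable throws, so a missing key fails at load time.
function resolveEnvPlaceholders(
  value: JsonValue,
  env: Record<string, string | undefined>
): JsonValue {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, varName: string) => {
      const resolved = env[varName];
      if (resolved === undefined) {
        throw new Error(`Missing environment variable: ${varName}`);
      }
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolveEnvPlaceholders(item, env));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: JsonValue } = {};
    for (const key of Object.keys(value)) {
      out[key] = resolveEnvPlaceholders(value[key], env);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

const resolvedConfig = resolveEnvPlaceholders(
  { embedding: { apiKey: "${EMBEDDING_API_KEY}", dimensions: 1024 } },
  { EMBEDDING_API_KEY: "sk-test" }
);
```

In a Node process the second argument would typically be `process.env`.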

</details>

<details> <summary>Embedding providers</summary>

Works with any OpenAI-compatible embedding API:

| Provider | Model | Base URL | Dimensions |
| --- | --- | --- | --- |
| OpenAI | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 |
| Jina | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 |
| DeepInfra | `Qwen/Qwen3-Embedding-8B` | `https://api.deepinfra.com/v1/openai` | 1024 |
| Google Gemini | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 |
| Ollama (local) | `nomic-embed-text` | `http://localhost:11434/v1` | varies |
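
"OpenAI-compatible" here means the provider accepts a `POST` to `<base URL>/embeddings` with a JSON body containing `model` and `input`. The sketch below builds such a request without sending it; `buildEmbeddingRequest` is a hypothetical helper, and whether a given provider honors the optional `dimensions` field is an assumption to verify per provider.

```typescript
interface EmbeddingRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildEmbeddingRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  input: string[],
  dimensions?: number
): EmbeddingRequest {
  // Strip trailing slashes so base URLs with and without one both work
  const url = baseURL.replace(/\/+$/, "") + "/embeddings";
  const payload: { model: string; input: string[]; dimensions?: number } = { model, input };
  if (dimensions !== undefined) payload.dimensions = dimensions;
  return {
    url,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(payload),
  };
}

const openAiRequest = buildEmbeddingRequest(
  "https://api.openai.com/v1",
  "your-api-key",
  "text-embedding-3-small",
  ["Video export preset: 1080p, 30fps for social"]
);
// To send it: fetch(openAiRequest.url, { method: "POST", headers: openAiRequest.headers, body: openAiRequest.body })
```

Switching providers then only changes the base URL, model name, and key, which is what `EMBEDDING_BASE_URL`, `EMBEDDING_MODEL`, and `EMBEDDING_API_KEY` express.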

</details>

<details> <summary>Rerank providers</summary>

| Provider | `rerankProvider` | Endpoint | Example Model |
| --- | --- | --- | --- |
| Jina | `jina` | `https://api.jina.ai/v1/rerank` | `jina-reranker-v3` |
| Hugging Face TEI | `tei` | `http://host:8081/rerank` | `BAAI/bge-reranker-v2-m3` |
| SiliconFlow | `siliconflow` | `https://api.siliconflow.com/v1/rerank` | `BAAI/bge-reranker-v2-m3` |
| Voyage AI | `voyage` | `https://api.voyageai.com/v1/rerank` | `rerank-2.5` |
| Pinecone | `pinecone` | `https://api.pinecone.io/rerank` | `bge-reranker-v2-m3` |
| DashScope | `dashscope` | `https://dashscope.aliyuncs.com/api/v1/services/rerank` | `gte-rerank` |

</details>

<details> <summary>Tools Reference</summary>

Core Tools

| Tool | Description |
| --- | --- |
| `memory_recall` | Search memories: supports batch queries, relation expansion, topic filtering, and inline maintenance hints |
| `memory_store` | Save a memory: auto-links related ones, flags contradictions, infers topic, filters junk |
| `memory_forget` | Delete by ID or search query |
| `memory_update` | Update a memory; the old version is preserved in a version chain |
| `memory_merge` | Merge two memories into one |
| `memory_history` | Trace version history through update/merge chains |
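
MCP clients invoke these tools through JSON-RPC `tools/call` requests. The sketch below builds one for `memory_recall` using the `queries` array described under Retrieval; `buildToolCall` is a hypothetical helper, and any argument other than `queries` would need to be checked against the tool's published input schema.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Build a JSON-RPC 2.0 request that asks an MCP server to run one tool.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const recallRequest = buildToolCall(1, "memory_recall", {
  queries: ["remotion animation", "video export"],
});
const recallRequestJson = JSON.stringify(recallRequest);
```

In practice an MCP client library constructs and sends this message for you; the sketch only shows what travels over the wire.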

Management Tools (opt-in)

| Tool | Description |
| --- | --- |
| `memory_stats` | Usage statistics by scope and category |
| `memory_list` | List recent memories with filtering |
| `memory_lint` | Health checks + auto-fix missing relations |

Enable: `"enableManagementTools": true`

Self-Improvement Tools (opt-in)

| Tool | Description |
| --- | --- |
| `self_improvement_log` | Log structured learning/error entries |
| `self_improvement_extract_skill` | Create skill scaffolds from learnings |
| `self_improvement_review` | Summarize governance backlog |

Enable: `"enableSelfImprovementTools": true`

Visualization Tools (on by default)

| Tool | Description |
| --- | --- |
| `memory_visualize` | Generate interactive HTML memory graph |

Params: `output_path`, `scope`, `threshold` (default: 0.65), `max_neighbors` (default: 4)

Disable: `"enableVisualizationTools": false`
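
The `threshold` and `max_neighbors` parameters plausibly control which similarity edges appear in the graph. The sketch below shows one reading: keep an edge only when cosine similarity reaches the threshold, and cap each node at its closest few neighbors. That interpretation, along with the names `GraphNode`, `GraphEdge`, and `buildSimilarityEdges`, is an assumption and not taken from the server's source.

```typescript
interface GraphNode {
  id: string;
  vector: number[];
}

interface GraphEdge {
  source: string;
  target: string;
  similarity: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function buildSimilarityEdges(nodes: GraphNode[], threshold = 0.65, maxNeighbors = 4): GraphEdge[] {
  const edges: GraphEdge[] = [];
  const seenPairs: Record<string, boolean> = {};
  for (const node of nodes) {
    // Closest neighbors above the threshold, best first
    const neighbors = nodes
      .filter((other) => other.id !== node.id)
      .map((other) => ({ id: other.id, similarity: cosineSimilarity(node.vector, other.vector) }))
      .filter((candidate) => candidate.similarity >= threshold)
      .sort((x, y) => y.similarity - x.similarity)
      .slice(0, maxNeighbors);
    for (const neighbor of neighbors) {
      // Undirected edge: record each pair once, keyed by sorted IDs
      const pairKey = node.id < neighbor.id ? `${node.id}|${neighbor.id}` : `${neighbor.id}|${node.id}`;
      if (seenPairs[pairKey]) continue;
      seenPairs[pairKey] = true;
      edges.push({ source: node.id, target: neighbor.id, similarity: neighbor.similarity });
    }
  }
  return edges;
}

const graphEdges = buildSimilarityEdges([
  { id: "a", vector: [1, 0] },
  { id: "b", vector: [0.9, 0.1] },
  { id: "c", vector: [0, 1] },
]);
```

Here only `a` and `b` are similar enough to be connected, so `c` would render as an isolated node.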

</details>

<details> <summary>Database Schema</summary>

LanceDB table `memories`:

| Field | Type | Description |
| --- | --- | --- |
| `id` | string (UUID) | Primary key |
| `text` | string | Memory text (FTS indexed) |
| `vector` | float[] | Embedding vector |
| `category` | string | `preference` / `fact` / `decision` / `entity` / `skill` / `lesson` / `other` |
| `scope` | string | Scope identifier |
| `importance` | float | Importance score 0-1 |
| `timestamp` | int64 | Creation timestamp (ms) |
| `metadata` | string (JSON) | Extended metadata (tier, access_count, relations, topic, etc.) |
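
The schema maps naturally onto a TypeScript type, with `metadata` decoded separately because it is stored as a JSON string. This is a sketch for readers writing their own tooling against the table: the interface names and `parseMetadata` are hypothetical, the metadata value types are guesses based on the field names above, and `timestamp` is typed as `number` on the assumption that the client returns int64 values that way.

```typescript
type MemoryCategory =
  | "preference"
  | "fact"
  | "decision"
  | "entity"
  | "skill"
  | "lesson"
  | "other";

interface MemoryRow {
  id: string;          // UUID primary key
  text: string;        // memory text (FTS indexed)
  vector: number[];    // embedding vector
  category: MemoryCategory;
  scope: string;
  importance: number;  // 0-1
  timestamp: number;   // creation time in ms (assumed to fit a JS number)
  metadata: string;    // JSON-encoded extended metadata
}

// Only the fields named in the schema table; value types are assumptions.
interface MemoryMetadata {
  tier?: string;
  access_count?: number;
  relations?: unknown[];
  topic?: string;
  [key: string]: unknown;
}

// Decode the metadata column, falling back to {} on malformed JSON.
function parseMetadata(row: MemoryRow): MemoryMetadata {
  try {
    const parsed: unknown = JSON.parse(row.metadata);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as MemoryMetadata;
    }
  } catch {
    // Malformed JSON: treat as no metadata
  }
  return {};
}

const exampleRow: MemoryRow = {
  id: "00000000-0000-0000-0000-000000000000",
  text: "Video export preset: 1080p, 30fps for social, 60fps for demo",
  vector: [0.1, 0.2, 0.3],
  category: "preference",
  scope: "global",
  importance: 0.7,
  timestamp: 1700000000000,
  metadata: '{"tier":"working","access_count":3,"topic":"remotion"}',
};
const exampleMetadata = parseMetadata(exampleRow);
```

The row values, including the `tier` string, are made up for illustration.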

</details>

Development

git clone https://github.com/cablate/memory-lancedb-mcp.git
cd memory-lancedb-mcp
npm install
npm test

Run locally:

EMBEDDING_API_KEY=your-key npx tsx server.ts

Credits

Built on CortexReach/memory-lancedb-pro — original work by win4r and contributors.

License

MIT — see LICENSE for details.
