HKC Memory Server
An MCP server for managing persistent AI memory using hybrid search (keyword + semantic vector) with SQLite storage and offline-first local embeddings.
A Model Context Protocol (MCP) server for managing persistent AI memory using hybrid search (keyword plus semantic vector search). It gives AI assistants a memory store they can write to, query, and prune, so contextual information survives between conversations.
Features
- Hybrid Search: Combines SQLite FTS5 (full-text search) with semantic vector embeddings for optimal retrieval
- UCMF Format: Ultra-Compact Memory Format for efficient memory serialization
- Persistent Storage: SQLite database with optimized WAL mode
- Offline-First: Uses a local embedding model (sentence-transformers); no network access is needed once the model is on disk
- Memory Management: Tools for saving, retrieving, optimizing, and pruning memories
- Confidence Scoring: Weighted importance system for memory prioritization
Requirements
System Requirements
- Python 3.8 or higher
- SQLite 3.x (with FTS5 support)
- 2GB+ RAM recommended for embedding model
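Whether FTS5 is available depends on how the SQLite library bundled with your Python was compiled. The following is a minimal sketch for checking the requirements above before installing anything else; the helper names `has_fts5` and `python_ok` are illustrative and not part of the server.

```python
import sqlite3
import sys


def has_fts5() -> bool:
    """Return True if the SQLite linked into this Python can create FTS5 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        # Creating a throwaway FTS5 virtual table fails with OperationalError
        # when the FTS5 extension was not compiled in.
        conn.execute("CREATE VIRTUAL TABLE fts5_probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


def python_ok(minimum=(3, 8)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0], "(ok)" if python_ok() else "(too old)")
    print("SQLite version:", sqlite3.sqlite_version)
    print("FTS5 available:", has_fts5())
```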
Python Dependencies
All dependencies are listed in requirements.txt:
mcp
fastmcp
aiofiles
langchain-huggingface
sentence-transformers
numpy
Embedding Model
The server uses sentence-transformers/all-MiniLM-L6-v2 (384-dimensional embeddings). The model will be automatically downloaded on first run to:
- Windows: %USERPROFILE%\.cache\huggingface\hub\
- Linux/Mac: ~/.cache/huggingface/hub/
Alternatively, you can:
- Set the HKC_EMBED_MODEL_PATH environment variable to point to a local model directory
- Place the model in a ./models/all-MiniLM-L6-v2/ folder relative to the server script
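The three model locations described above could be checked in sequence. The sketch below shows one plausible lookup order (environment variable first, then the bundled folder, then the Hugging Face model name). The precedence and the function name `resolve_model_location` are assumptions for illustration; the server's actual logic may differ.

```python
import os
from pathlib import Path

# Model name from this README; passing it to sentence-transformers makes the
# library look in the Hugging Face cache.
DEFAULT_MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_model_location(script_dir, env=None):
    """Pick a model location. The precedence here is assumed, not confirmed."""
    env = os.environ if env is None else env

    # 1. Explicit override via HKC_EMBED_MODEL_PATH, if it points at a directory.
    override = env.get("HKC_EMBED_MODEL_PATH")
    if override and Path(override).is_dir():
        return str(Path(override))

    # 2. A model folder shipped next to the server script.
    bundled = Path(script_dir) / "models" / "all-MiniLM-L6-v2"
    if bundled.is_dir():
        return str(bundled)

    # 3. Fall back to the model name and let the library resolve it.
    return DEFAULT_MODEL
```

The returned string can be handed to `SentenceTransformer(...)`, which accepts either a local directory or a model name.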
Installation
1. Clone or Download
git clone <your-repo-url>
cd hkc-memory-server
2. Create Virtual Environment (Recommended)
# Windows
python -m venv .venv
.venv\Scripts\activate
# Linux/Mac
python3 -m venv .venv
source .venv/bin/activate
3. Install Dependencies
pip install -r requirements.txt
4. Download Embedding Model (Optional - Pre-download)
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
LM Studio MCP Configuration
To use this server with LM Studio, add the following configuration to your LM Studio MCP settings JSON file:
Windows Configuration
{
  "mcpServers": {
    "hkc-memory": {
      "command": "C:\\path\\to\\your\\.venv\\Scripts\\python.exe",
      "args": [
        "C:\\path\\to\\your\\hkc-memory-server\\hkc_memory_server.py"
      ],
      "env": {
        "VIRTUAL_ENV": "C:\\path\\to\\your\\.venv"
      }
    }
  }
}
Linux/Mac Configuration
{
  "mcpServers": {
    "hkc-memory": {
      "command": "/path/to/your/.venv/bin/python",
      "args": [
        "/path/to/your/hkc-memory-server/hkc_memory_server.py"
      ],
      "env": {
        "VIRTUAL_ENV": "/path/to/your/.venv"
      }
    }
  }
}
Important: Replace the paths with your actual installation paths.
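Hand-editing escaped Windows paths in JSON is error-prone, so it can help to generate the entry instead. This is a small sketch that builds the same structure as the configurations above from a virtual environment directory and a script path; `build_lmstudio_entry` is a hypothetical helper, not something shipped with the server.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_lmstudio_entry(venv_dir: str, server_script: str, windows: bool) -> dict:
    """Build the "mcpServers" entry shown above for one platform."""
    if windows:
        python = PureWindowsPath(venv_dir) / "Scripts" / "python.exe"
    else:
        python = PurePosixPath(venv_dir) / "bin" / "python"
    return {
        "mcpServers": {
            "hkc-memory": {
                "command": str(python),
                "args": [server_script],
                "env": {"VIRTUAL_ENV": venv_dir},
            }
        }
    }


if __name__ == "__main__":
    entry = build_lmstudio_entry(
        "/path/to/your/.venv",
        "/path/to/your/hkc-memory-server/hkc_memory_server.py",
        windows=False,
    )
    # json.dumps takes care of escaping backslashes in Windows paths.
    print(json.dumps(entry, indent=2))
```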
Usage
Starting the Server Manually
python hkc_memory_server.py
The server will:
- Initialize the SQLite database (memory_index.db)
- Create necessary directories (backups/, conversations/)
- Generate the UCMF legend file if not present
- Start the MCP server and wait for connections
Available MCP Tools
1. save_memory
Saves a new memory to the database.
Parameters:
- memory_detail (string): The content of the memory
- category (string): Category tag (e.g., "preferences", "goals", "relationships")
- importance (float): Confidence score between 0.0 and 1.0
- reasoning (string, optional): Why this memory is important
- who (string, optional): Person associated with the memory (default: "User")
Example:
save_memory(
    memory_detail="Prefers dark mode for all applications",
    category="preferences",
    importance=0.8,
    reasoning="User explicitly mentioned multiple times"
)
2. retrieve_memories
Retrieves relevant memories using hybrid search.
Parameters:
- context_keywords (list of strings): Keywords for semantic search
- categories (list of strings, optional): Filter by categories
- limit (int, optional): Maximum results to return (default: 10)
Example:
retrieve_memories(
    context_keywords=["dark mode", "UI preferences"],
    categories=["preferences"],
    limit=5
)
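To give a feel for what "hybrid" ranking means, here is a toy sketch that blends a keyword score with cosine similarity. It is not the server's implementation: substring matching stands in for FTS5, two-dimensional vectors stand in for the 384-dimensional embeddings, and the `alpha` weight is a made-up parameter. This README does not document the real scoring formula.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(keywords, text):
    """Fraction of keywords that occur in the text, case-insensitively."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords)


def hybrid_rank(keywords, query_vec, memories, alpha=0.5, limit=10):
    """Rank (text, vector) pairs by a weighted sum of both scores.

    alpha is a hypothetical blend weight: 1.0 is keyword-only, 0.0 is vector-only.
    """
    scored = []
    for text, vec in memories:
        score = alpha * keyword_score(keywords, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]
```

A memory that matches on both signals ranks above one that matches on only one, which is the practical benefit of combining the two.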
3. pack_context
Retrieves memories and formats them in UCMF (Ultra-Compact Memory Format) for system prompts.
Parameters:
- context_keywords (list of strings): Keywords for semantic search
- categories (list of strings, optional): Filter by categories
- max_lines (int, optional): Maximum memory lines to include (default: 40)
Example:
pack_context(
    context_keywords=["user preferences"],
    max_lines=20
)
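The effect of max_lines can be pictured as a cap on how many UCMF lines reach the prompt. The sketch below assumes lines are ordered by their trailing confidence field before truncation; that ordering, and the helper name `pack_lines`, are assumptions rather than documented behavior.

```python
def pack_lines(ucmf_lines, max_lines=40):
    """Keep the highest-confidence UCMF lines, up to max_lines, as one block."""

    def confidence(line):
        # The confidence score is the last pipe-separated field of a UCMF line.
        try:
            return float(line.rsplit("|", 1)[-1])
        except ValueError:
            return 0.0

    ranked = sorted(ucmf_lines, key=confidence, reverse=True)
    return "\n".join(ranked[:max_lines])
```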
4. get_memory_stats
Returns statistics about the memory store.
Returns: JSON with total fact count, counts by type, and database path.
5. optimize_memories
Prunes low-confidence memories to keep the database clean.
Parameters:
- aggressive (bool, optional): If True, removes memories with confidence < 0.3; otherwise < 0.1
Example:
optimize_memories(aggressive=False)
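The pruning rule is a threshold delete. The sketch below applies the documented thresholds (0.3 when aggressive, 0.1 otherwise) to a stand-in table; the table name `facts_demo` and its columns are hypothetical and do not reflect the server's real schema.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical, simplified table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts_demo (what TEXT, confidence REAL)")
    conn.executemany(
        "INSERT INTO facts_demo VALUES (?, ?)",
        [("stale guess", 0.05), ("weak hint", 0.2), ("stated preference", 0.9)],
    )
    conn.commit()
    return conn


def prune(conn, aggressive=False):
    """Delete rows below the threshold and return how many were removed."""
    threshold = 0.3 if aggressive else 0.1
    cur = conn.execute("DELETE FROM facts_demo WHERE confidence < ?", (threshold,))
    conn.commit()
    return cur.rowcount
```

With the demo data, a default run removes only the 0.05 entry, and a subsequent aggressive run also removes the 0.2 entry.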
UCMF Format
Memories are stored in an Ultra-Compact Memory Format (UCMF) with the following fields:
id|t|who|what|why|whn|whr|tags|c
- id: Unique identifier (hash)
- t: Type (P=person, Proj=project, pref=preference, int=interest, rel=relationship, goal=goal, note=note, Σ=summary)
- who: Person/entity associated with the memory
- what: Description of the memory
- why: Reasoning/importance
- whn: When (timestamp in YYYYMMDD or YYYYMMDDTHHMM format, '-' if unknown)
- whr: Where (location, '-' if unknown)
- tags: Comma-separated tags
- c: Confidence score (0.0 to 1.0)
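As an illustration of the field layout, here is a minimal parser and serializer for a single UCMF line. It assumes that field values never contain the `|` character, since this README does not describe an escaping rule. The class and function names, and the sample id used in the usage note, are made up for the example.

```python
from dataclasses import astuple, dataclass

FIELDS = ("id", "t", "who", "what", "why", "whn", "whr", "tags", "c")


@dataclass
class UcmfRecord:
    id: str
    t: str
    who: str
    what: str
    why: str
    whn: str
    whr: str
    tags: str
    c: float

    def to_line(self) -> str:
        """Serialize back to the pipe-separated form."""
        return "|".join(str(value) for value in astuple(self))


def parse_ucmf(line: str) -> UcmfRecord:
    """Split one UCMF line into its nine fields and validate the confidence."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    *text_fields, raw_confidence = parts
    confidence = float(raw_confidence)
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return UcmfRecord(*text_fields, confidence)
```

For example, `parse_ucmf("a1b2|pref|User|Prefers dark mode|Stated explicitly|-|-|ui,theme|0.8")` yields a record whose `t` is `"pref"` and whose `c` is `0.8`.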
Database Structure
The server creates and manages the following SQLite tables:
- facts: Main memory storage (normalized structure)
- ucmf: Compact UCMF format for each memory
- facts_fts: Full-text search index (FTS5)
- vectors: Semantic embedding vectors (384-dim)
- meta: Server metadata and migration flags
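This README does not say how the 384-dimensional vectors are encoded in the vectors table. One common approach, shown here purely as an assumption, is to store each vector as a BLOB of little-endian 32-bit floats, which takes 1536 bytes per memory.

```python
import struct

DIM = 384  # embedding size of all-MiniLM-L6-v2
_FORMAT = f"<{DIM}f"  # little-endian float32; an assumed encoding


def pack_vector(vec):
    """Encode a 384-element sequence of floats as bytes suitable for a BLOB."""
    if len(vec) != DIM:
        raise ValueError(f"expected {DIM} values, got {len(vec)}")
    return struct.pack(_FORMAT, *vec)


def unpack_vector(blob):
    """Decode bytes produced by pack_vector back into a list of floats."""
    return list(struct.unpack(_FORMAT, blob))
```

Values are rounded to float32 precision on the way in, so a round trip is exact only for values representable in 32 bits.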
Configuration
Environment Variables
- HKC_EMBED_MODEL_PATH: Override the embedding model path
- HF_HOME: Hugging Face cache directory
- HF_HUB_OFFLINE: Set to "1" for offline mode (default in this server)
- TRANSFORMERS_OFFLINE: Set to "1" for offline mode (default in this server)
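Offline defaults like these are typically applied with `setdefault`, so that a value exported in the shell still wins. The sketch below shows that pattern; `enable_offline_mode` is a hypothetical helper, and the server's own startup code may do this differently. The Hugging Face libraries generally read these variables at import time, so such a call would need to run before importing them.

```python
import os


def enable_offline_mode(env=None, cache_dir=None):
    """Apply offline defaults without overriding values the user already set."""
    env = os.environ if env is None else env
    env.setdefault("HF_HUB_OFFLINE", "1")
    env.setdefault("TRANSFORMERS_OFFLINE", "1")
    if cache_dir is not None:
        # Point the Hugging Face cache at a custom directory.
        env["HF_HOME"] = str(cache_dir)
    return env
```

To allow a one-time download, export `HF_HUB_OFFLINE=0` and `TRANSFORMERS_OFFLINE=0` before starting the process; with the pattern above those values are left untouched.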
Troubleshooting
"Model not found" Error
The embedding model hasn't been downloaded. Either:
- Run the pre-download command in the installation section
- Allow internet access on first run for automatic download
- Manually download and specify HKC_EMBED_MODEL_PATH
Database Lock Errors
The server uses WAL mode, which should prevent most lock issues. If problems persist:
- Ensure no other processes are accessing memory_index.db
- Check file permissions
- As a last resort, stop the server, delete the memory_index.db-wal and memory_index.db-shm files, and restart (any changes not yet checkpointed into memory_index.db are lost)
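Before deleting any files, it may be worth asking SQLite to fold the write-ahead log back into the main database. This sketch uses the standard `PRAGMA journal_mode` and `PRAGMA wal_checkpoint(TRUNCATE)` statements; the function name is illustrative, and it should be run while the server is stopped.

```python
import sqlite3


def checkpoint_wal(db_path):
    """Report the journal mode and try to checkpoint and truncate the WAL.

    Returns (journal_mode, busy). busy is 1 if another connection blocked
    the checkpoint from completing, otherwise 0.
    """
    conn = sqlite3.connect(db_path, timeout=5.0)
    try:
        mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
        # wal_checkpoint returns a single row: (busy, log_frames, checkpointed_frames).
        busy, _log_frames, _checkpointed = conn.execute(
            "PRAGMA wal_checkpoint(TRUNCATE)"
        ).fetchone()
        return mode, busy
    finally:
        conn.close()
```

A result of `("wal", 0)` means the log was written back successfully. A busy value of 1 suggests another process still holds the database open.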
Memory/Performance Issues
- The embedding model requires ~200MB RAM
- Consider using optimize_memories() regularly to prune low-confidence entries
- SQLite databases over 1GB may benefit from periodic VACUUM operations
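A periodic VACUUM can be scripted with the standard library. The following sketch rebuilds the database file only when it exceeds a size threshold; the helper name and the default threshold of 1 GiB are illustrative choices, and it should be run while the server is stopped, since VACUUM needs exclusive access.

```python
import os
import sqlite3


def vacuum_if_large(db_path, threshold_bytes=1 << 30):
    """Run VACUUM when the database file is at least threshold_bytes in size.

    Returns True if VACUUM ran, False if the file was below the threshold.
    """
    if os.path.getsize(db_path) < threshold_bytes:
        return False
    # isolation_level=None puts the connection in autocommit mode, because
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return True
```

VACUUM temporarily needs free disk space roughly equal to the size of the database, so check available space first on large files.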
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request with a clear description of changes
License
[Add your chosen license here]
Acknowledgments
- Built on FastMCP framework
- Uses sentence-transformers for embeddings
- Implements the Model Context Protocol (MCP) specification