Cognio

Persistent semantic memory server for AI assistants via MCP, enabling long-term context retention and semantic search across conversations.

CI/CD License: MIT Python 3.11+ FastAPI

Cognio is a Model Context Protocol (MCP) server that provides persistent semantic memory for AI assistants. Unlike ephemeral chat history, Cognio stores context permanently and enables semantic search across conversations.

Built for:

  • Personal knowledge base that grows over time
  • Multi-project context management
  • Research notes and learning journal
  • Conversation history with semantic retrieval

Features

  • Semantic Search: Find memories by meaning using sentence-transformers
  • LEANN Vector Search (Optional): Lazy-built index with on-demand recomputation to reduce startup memory
  • Multilingual Support: Search in 100+ languages seamlessly
  • Persistent Storage: SQLite-based storage that survives across sessions
  • Project Organization: Organize memories by project and tags
  • Auto-Tagging: Automatic tag generation via LLM (GPT-4, Groq, etc)
  • Text Summarization: Extractive and abstractive summarization for long texts
  • MCP Integration: One-click setup for VS Code, Claude, Cursor, and more
  • RESTful API: Standard HTTP API with OpenAPI documentation
  • Export Capabilities: Export to JSON or Markdown format
  • Docker Support: Simple deployment with docker-compose
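As an illustration of what "find memories by meaning" involves, here is a minimal sketch of similarity search over embedding vectors. It assumes cosine similarity as the metric and uses toy 3-dimensional vectors in place of real sentence-transformers embeddings; the function names are hypothetical and are not taken from Cognio's source. The default threshold and limit mirror SIMILARITY_THRESHOLD and DEFAULT_SEARCH_LIMIT from the Configuration section.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, memories, threshold=0.4, limit=5):
    """Return (text, score) pairs scoring at least `threshold`, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in memories]
    hits = [(text, score) for text, score in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:limit]


# Toy vectors standing in for real embeddings (assumption for illustration).
MEMORIES = [
    ("Docker allows running apps in containers", [0.9, 0.1, 0.0]),
    ("FastAPI is a modern Python web framework", [0.1, 0.9, 0.1]),
    ("Kubernetes schedules containers across nodes", [0.8, 0.2, 0.1]),
]

results = semantic_search([1.0, 0.0, 0.0], MEMORIES)
for text, score in results:
    print(f"{score:.2f}  {text}")
```

The FastAPI entry falls below the threshold for this query vector and is filtered out, while the two container-related entries are returned in descending score order.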

Quick Start

1. Start the Server

git clone https://github.com/0xReLogic/Cognio.git
cd Cognio
docker-compose up -d

Server runs at http://localhost:8080

2. Auto-Configure AI Clients

The MCP server automatically configures supported AI clients on first start:

Supported Clients:

  • Claude Desktop
  • Claude Code (CLI)
  • VS Code (GitHub Copilot)
  • Cursor
  • Continue.dev
  • Cline
  • Windsurf
  • Kiro
  • Gemini CLI

Quick Setup:

Run the auto-setup script to configure all clients at once:

cd mcp-server
npm run setup

This generates MCP configs for all 9 supported clients automatically.

Manual Configuration:

See mcp-server/README.md for client-specific MCP configuration examples.

On first run, Cognio auto-generates cognio.md in your workspace, containing a usage guide for AI tools.

3. Test It

# Save a memory
curl -X POST http://localhost:8080/memory/save \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker allows running apps in containers", "project": "LEARNING"}'

# Search memories
curl "http://localhost:8080/memory/search?q=containers"
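
The curl calls above can also be issued from Python. The sketch below uses only the standard library and the two endpoints and fields shown in the curl examples; the helper names are hypothetical, and the shape of the JSON responses is not specified here, so `send` simply decodes whatever the server returns.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # server address from the Quick Start


def build_save_request(text, project=None, base_url=BASE_URL):
    """Build the POST /memory/save request without sending it."""
    payload = {"text": text}
    if project is not None:
        payload["project"] = project
    return urllib.request.Request(
        f"{base_url}/memory/save",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_url(query, base_url=BASE_URL):
    """Build the GET /memory/search URL with a URL-encoded query."""
    return f"{base_url}/memory/search?" + urllib.parse.urlencode({"q": query})


def send(request_or_url, timeout=10):
    """Send a request (or plain URL) and decode the JSON response body."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `send(build_save_request("Docker allows running apps in containers", "LEARNING"))` saves a memory and `send(build_search_url("containers"))` searches for it.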

Or use naturally in your AI client:

"Search my memories for Docker information"
"Remember this: FastAPI is a modern Python web framework"

4. Web UI Dashboard

Access the interactive memory dashboard:

http://localhost:8080/ui

Features:

  • Browse and search all memories
  • Add/edit memories with markdown preview
  • View statistics and insights
  • Organize by project and tags
  • Bulk operations (select, delete)
  • Dark/light theme toggle
  • Works locally and in Docker

The dashboard auto-detects the API server, so it works on localhost, Docker containers, and remote deployments.

Documentation

MCP Tools

When using the MCP server, you have access to 11 specialized tools:

Tool                 Description
save_memory          Save text with optional project/tags (auto-tagging enabled)
search_memory        Semantic search with project filtering
list_memories        List memories with pagination and filters
get_memory_stats     Get storage statistics and insights
archive_memory       Soft delete a memory (recoverable)
delete_memory        Permanently delete a memory by ID
export_memories      Export memories to JSON or Markdown
summarize_text       Summarize long text (extractive or LLM-based)
set_active_project   Set active project context (auto-applies to all operations)
get_active_project   View currently active project
list_projects        List all available projects from database
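
The difference between archive_memory (soft delete, recoverable) and delete_memory (permanent) can be illustrated with a small SQLite sketch. The table layout, the `archived` flag, and the `restore` helper are assumptions made for this example; Cognio's actual schema may differ.

```python
import sqlite3

# In-memory database with a hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    " id INTEGER PRIMARY KEY,"
    " text TEXT NOT NULL,"
    " archived INTEGER NOT NULL DEFAULT 0)"
)


def save(text):
    """Insert a memory and return its row ID."""
    cur = conn.execute("INSERT INTO memories (text) VALUES (?)", (text,))
    return cur.lastrowid


def archive(memory_id):
    """Soft delete: the row stays in the table but is hidden from listings."""
    conn.execute("UPDATE memories SET archived = 1 WHERE id = ?", (memory_id,))


def restore(memory_id):
    """Undo a soft delete (hypothetical helper)."""
    conn.execute("UPDATE memories SET archived = 0 WHERE id = ?", (memory_id,))


def delete(memory_id):
    """Hard delete: the row is removed and cannot be recovered."""
    conn.execute("DELETE FROM memories WHERE id = ?", (memory_id,))


def list_active():
    """Return the text of all non-archived memories in insertion order."""
    rows = conn.execute("SELECT text FROM memories WHERE archived = 0 ORDER BY id")
    return [row[0] for row in rows]
```

An archived row can be brought back by clearing the flag, whereas a deleted row is gone from the table entirely.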

Active Project Workflow:

1. list_projects() → See: Helios-LoadBalancer (45), Cognio-Memory (23), ...
2. set_active_project("Helios-LoadBalancer")
3. save_memory("Cache TTL is 300s") → Auto-saves to Helios-LoadBalancer
4. search_memory("cache settings") → Auto-searches in Helios-LoadBalancer only
5. list_memories() → Lists only Helios-LoadBalancer memories

Project Isolation:
Always specify a project name OR use set_active_project to keep memories organized and prevent mixing contexts between different workspaces.
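
The workflow above can be modeled in a few lines. This is a sketch of one plausible resolution rule, assuming that an explicitly passed project takes precedence over the active project; the class and its in-memory storage are hypothetical and do not reflect how the MCP server is implemented.

```python
class ProjectContext:
    """Toy model of an active-project context (illustration only)."""

    def __init__(self):
        self.active_project = None
        self.memories = []  # list of (project, text) tuples

    def set_active_project(self, name):
        self.active_project = name

    def _resolve(self, project):
        # Assumption: an explicit project wins, otherwise use the active one.
        return project if project is not None else self.active_project

    def save_memory(self, text, project=None):
        target = self._resolve(project)
        self.memories.append((target, text))
        return target

    def list_memories(self, project=None):
        target = self._resolve(project)
        return [
            text
            for proj, text in self.memories
            if target is None or proj == target
        ]


ctx = ProjectContext()
ctx.set_active_project("Helios-LoadBalancer")
ctx.save_memory("Cache TTL is 300s")  # stored under the active project
ctx.save_memory("Embeddings are cached", project="Cognio-Memory")
print(ctx.list_memories())  # only the active project's memories
```

Keeping the resolution rule in one place (`_resolve`) is what lets every operation share the same project scope.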

API Endpoints

Method  Endpoint              Description
GET     /health               Health check
POST    /memory/save          Save new memory
GET     /memory/search        Semantic/Hybrid search
GET     /memory/list          List memories with filters
DELETE  /memory/{id}          Delete memory by ID
POST    /memory/bulk-delete   Bulk delete by project
GET     /memory/stats         Get statistics
GET     /memory/export        Export memories
POST    /memory/summarize     Summarize long text

Interactive docs: http://localhost:8080/docs

Configuration

Environment variables are documented in .env.example. Copy it and edit your local overrides:

cp .env.example .env

The main variables, with example values:

# Database
DB_PATH=./data/memory.db

# Embeddings
EMBED_MODEL=all-MiniLM-L6-v2
EMBED_DEVICE=cpu
EMBEDDING_CACHE_PATH=./data/embedding_cache.pkl

# API
API_HOST=0.0.0.0
API_PORT=8080
# Optional API key for auth
API_KEY=your-secret-key

# Search
DEFAULT_SEARCH_LIMIT=5
SIMILARITY_THRESHOLD=0.4
HYBRID_ENABLED=true
HYBRID_MODE=rerank        # candidate | rerank
HYBRID_ALPHA=0.6          # 0..1, higher = more semantic
HYBRID_RERANK_TOPK=100    # rerank candidate pool size

# LEANN vector search (optional)
LEANN_ENABLED=false
LEANN_INDEX_PATH=./data/leann/memories.leann
LEANN_BACKEND=hnsw
LEANN_LAZY_BUILD=true
LEANN_RECOMPUTE_ON_SEARCH=true
LEANN_WARMUP_ON_START=false

# Summarization
SUMMARIZATION_ENABLED=true
SUMMARIZATION_METHOD=abstractive   # extractive | abstractive
SUMMARIZATION_EMBED_MODEL=all-MiniLM-L6-v2

# Auto-tagging (Optional)
AUTOTAG_ENABLED=true
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-key
GROQ_MODEL=openai/gpt-oss-120b
# OPENAI_API_KEY=your-openai-api-key
# OPENAI_MODEL=gpt-4o-mini

# Performance
MAX_TEXT_LENGTH=10000
BATCH_SIZE=32
SUMMARIZE_THRESHOLD=50

# Logging
LOG_LEVEL=info
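
To make the HYBRID_ALPHA comment ("higher = more semantic") concrete, the sketch below blends a semantic score with a keyword score as a weighted sum and reranks a candidate pool, loosely following the rerank mode and HYBRID_RERANK_TOPK settings. The linear formula, the function names, and the sample scores are assumptions for illustration; the source does not state how Cognio combines the two signals.

```python
def hybrid_score(semantic, keyword, alpha=0.6):
    """Weighted blend of two scores; alpha=1.0 means purely semantic."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rerank(candidates, alpha=0.6, top_k=100, limit=5):
    """Rerank (text, semantic_score, keyword_score) tuples by blended score.

    The pool is first cut to the top_k candidates by semantic score,
    then reordered by the blended score.
    """
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    blended = [(text, hybrid_score(sem, kw, alpha)) for text, sem, kw in pool]
    blended.sort(key=lambda pair: pair[1], reverse=True)
    return blended[:limit]


# Made-up scores: A is semantically closer, B matches the keywords better.
CANDIDATES = [
    ("A: containers isolate processes", 0.80, 0.10),
    ("B: cache settings for the load balancer", 0.70, 0.90),
]

print(rerank(CANDIDATES, alpha=0.6))  # B ranks first once keywords count
print(rerank(CANDIDATES, alpha=1.0))  # A ranks first on semantics alone
```

Under this assumed formula, lowering alpha shifts weight toward exact keyword matches, which is the trade-off the setting is described as controlling.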

Auto-Tagging Models:

  • openai/gpt-oss-120b - Groq, high quality
  • gpt-4o-mini - OpenAI, fast and cheap
  • llama-3.3-70b-versatile - Groq, balanced
  • llama-3.1-8b-instant - Groq, fastest

See .env.example for all available options and recommendations.
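
As a rough illustration of how such variables might be consumed, the sketch below reads a handful of them into a typed settings object, falling back to the example values listed above. The `Settings` class and `load_settings` function are hypothetical and are not the contents of src/config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_path: str
    api_port: int
    default_search_limit: int
    similarity_threshold: float
    hybrid_enabled: bool
    hybrid_alpha: float


def _as_bool(value):
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return Settings(
        db_path=env.get("DB_PATH", "./data/memory.db"),
        api_port=int(env.get("API_PORT", "8080")),
        default_search_limit=int(env.get("DEFAULT_SEARCH_LIMIT", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.4")),
        hybrid_enabled=_as_bool(env.get("HYBRID_ENABLED", "true")),
        hybrid_alpha=float(env.get("HYBRID_ALPHA", "0.6")),
    )


# Override two values and keep the rest at the example defaults.
settings = load_settings({"API_PORT": "9000", "HYBRID_ENABLED": "false"})
print(settings)
```

Passing the mapping explicitly keeps the loader easy to test without touching the real environment.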

Project Structure

cognio/
├── src/                # Core application
│   ├── main.py         # FastAPI app
│   ├── config.py       # Environment config
│   ├── models.py       # Data schemas
│   ├── database.py     # SQLite operations
│   ├── embeddings.py   # Semantic search
│   ├── memory.py       # Memory CRUD
│   ├── autotag.py      # Auto-tagging
│   └── utils.py        # Helpers
│
├── mcp-server/         # MCP integration
│   ├── index.js        # MCP server
│   └── package.json    # Dependencies
│
├── scripts/            # Utilities
│   ├── setup-clients.js  # Auto-config AI clients
│   ├── backup.sh       # Database backup
│   └── migrate.py      # Schema migrations
│
├── tests/              # Test suite
├── docs/               # Documentation
└── examples/           # Usage examples

Development

# Install dependencies
poetry install

# Run tests
pytest

# Start development server
uvicorn src.main:app --reload

Tech Stack

  • Backend: Python 3.11+, FastAPI, Uvicorn
  • Database: SQLite with JSON support
  • Embeddings: sentence-transformers (paraphrase-multilingual-mpnet-base-v2, 768-dim)
  • MCP Server: Node.js, @modelcontextprotocol/sdk
  • Auto-Tagging: LLM API (Groq or OpenAI)
  • Testing: pytest, pytest-asyncio, pytest-cov
  • Deployment: Docker, docker-compose

Performance

Operation              Time   Notes
Save memory            ~20ms  Including embedding
Search (1k memories)   ~15ms  Semantic similarity
Search (10k memories)  ~50ms  Still fast
Model load             ~3s    One-time on startup

License

MIT License - see LICENSE

Built for better AI conversations
