
Cortex

Persistent, private local memory for AI tools. Cortex exposes the same memory over HTTP and MCP, so tools can store and recall context without external services.


<h1 align="center">Cortex</h1>

Private local memory for your AI tools. Install once. Your tools stop starting from scratch.

<a href="https://github.com/AdityaVG13/cortex/releases/latest">Download</a> · <a href="Info/connecting.md">Connect your tools</a> · <a href="CHANGELOG.md">What's new</a> · <a href="Info/roadmap.md">Roadmap</a>

🔒 Private by default: localhost only, data never leaves your machine

🔗 One memory, every tool: HTTP and MCP, same brain, no per-tool silos

📊 Prove it works: token savings, recall quality, and Monte Carlo projections

Quick Start

Get to the first memory moment before learning daemon internals.

1. Install or build Cortex

Use the latest desktop installer, or build the v0.6.0 CLI from source:

git clone https://github.com/AdityaVG13/cortex.git
cd cortex/daemon-rs
cargo build --release

2. Start local memory

Open Cortex Control Center and start Cortex from the app. CLI-only users can run:

cortex serve

3. Check readiness

cortex status --json

A healthy install reports `"status": "ready"`. If `status` is `needs_action` or `error`, follow the returned `nextAction` or `repair` hint before continuing.
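If you script the readiness check, the decision above can be sketched in a few lines of Python. This is a minimal sketch: the field names `status`, `nextAction`, and `repair` come from this README, but their exact shape is not documented here, so the code assumes they are plain strings. The helper name `next_step` is hypothetical.

```python
import json


def next_step(status_json: str) -> str:
    """Summarize the output of `cortex status --json` as one line of text."""
    payload = json.loads(status_json)
    status = payload.get("status") or "unknown"
    if status == "ready":
        return "ready"
    # Assumption: nextAction / repair are plain strings when present.
    hint = payload.get("nextAction") or payload.get("repair") or "run cortex doctor"
    return f"{status}: {hint}"
```

To use it, capture the command's stdout (for example with `subprocess.run(["cortex", "status", "--json"], capture_output=True, text=True)`) and pass the text to `next_step`.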

4. Connect one AI tool

Claude Code:

claude plugin marketplace add AdityaVG13/cortex
claude plugin install cortex@cortex-marketplace

Codex:

codex mcp add cortex -- cortex.exe mcp --agent codex

Restart the AI tool after changing MCP config.

5. Store and recall one memory

From a connected MCP client, call cortex_store, then cortex_recall. From the repo, run the matching smoke script:

Windows:

powershell -ExecutionPolicy Bypass -File scripts\first-run-smoke.ps1

macOS / Linux:

bash scripts/first-run-smoke.sh

The smoke script checks status, stores one disposable local memory, and recalls it. Normal use does not require benchmark adapters, provider keys, or LongMemEval.

More tool-specific setup: Info/connecting.md.

<table>
<tr>
<td align="center" valign="top">
<img width="400" height="1" src="https://raw.githubusercontent.com/AdityaVG13/cortex/master/assets/spacer.png">
<h3>❌ Without Cortex</h3>
Session 1 → explain preferences<br>
Session 2 → explain them again<br>
Session 3 → and again, new tool<br>
Session 14 → still explaining<br>
~15,000 tokens wasted
</td>
<td align="center" valign="top">
<img width="400" height="1" src="https://raw.githubusercontent.com/AdityaVG13/cortex/master/assets/spacer.png">
<h3>✅ With Cortex</h3>
Session 1 → store once<br>
Session 2 → boot, already knows<br>
Session 3 → boot, already knows<br>
Session 14 → boot, still knows<br>
~300 tokens per boot (97% less)
</td>
</tr>
</table>

POST /store

Save decisions, lessons, preferences. Conflict detection is automatic.

</td> <td align="center" width="33%">

GET /recall

Hybrid keyword + semantic search. In-process ONNX embeddings, no external service.

</td> <td align="center" width="33%">

GET /boot

Compiled identity + delta capsule. ~300 tokens served instead of ~15,000 raw.

</td> </tr> </table>
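For custom tools, the three endpoints can be wrapped in a small client. The sketch below only builds the requests and does not send them. The default bind `127.0.0.1:7437`, the token file `~/.cortex/cortex.token`, bearer-token auth, and `GET /boot?agent=NAME` are taken from this README. The `q` query parameter and the `{"text": ...}` body are hypothetical placeholders, as is the exact `Authorization: Bearer` header form; check Info/connecting.md for the real request shapes.

```python
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:7437"  # default bind stated in the README
TOKEN_PATH = Path.home() / ".cortex" / "cortex.token"  # token location stated in the README


def read_token(path: Path = TOKEN_PATH) -> str:
    """Read the local auth token from disk."""
    return path.read_text(encoding="utf-8").strip()


def _request(method, path, token, params=None, body=None):
    """Build (but do not send) an authenticated request."""
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")  # assumed header form
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def boot_request(token, agent):
    return _request("GET", "/boot", token, params={"agent": agent})


def recall_request(token, query):
    # "q" is a hypothetical parameter name, not confirmed by the README.
    return _request("GET", "/recall", token, params={"q": query})


def store_request(token, text):
    # {"text": ...} is a hypothetical body shape, not confirmed by the README.
    return _request("POST", "/store", token, body={"text": text})
```

Each helper returns a `urllib.request.Request`; pass it to `urllib.request.urlopen` once the parameter names match the real API.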

Memory tools are easy to pitch and hard to trust. Cortex starts to matter when the savings stop looking theoretical.

📊 Analytics <img src="assets/grid-control-center-analytics.png" width="100%"> Savings, compression, and activity heatmaps

</td> <td width="50%">

📈 Monte Carlo <img src="assets/grid-cc-monte-carlo.png" width="100%"> 30-day projection with confidence bands

</td> </tr> <tr> <td width="50%">

🤖 Agents <img src="assets/grid-cc-agents.png" width="100%"> Live sessions, inbox, deduped by identity

</td> <td width="50%">

🎛️ Overview <img src="assets/grid-cc-overview.png" width="100%"> Memory counts, health, and navigation

</td> </tr> </table>

Benchmark note: <code>cortex-http-pure</code> is a benchmark adapter only; it is not required for normal Cortex operation. Scored LongMemEval-S validation is deferred until project budget allows, so v0.6.x does not claim a LongMemEval quality lift.

Historical v0.5.0 numbers below were measured against a 20-query ground-truth dataset via the <code>cortex-http-base</code> adapter. New recall-quality claims will use the helper-free <code>cortex-http-pure</code> adapter as the canonical core baseline once funded validation is complete.

<table align="center"> <tr> <th></th> <th align="center">v0.4.1</th> <th align="center">v0.5.0</th> <th align="center">Δ</th> </tr> <tr> <td align="center">Precision</td> <td align="center">55.2%</td> <td align="center">87.5%</td> <td align="center">📈 +32.3%</td> </tr> <tr> <td align="center">MRR</td> <td align="center">69.2%</td> <td align="center">95.0%</td> <td align="center">📈 +25.8%</td> </tr> <tr> <td align="center">Top-1 hit</td> <td align="center">90.0%</td> <td align="center">90.0%</td> <td align="center">—</td> </tr> <tr> <td align="center">Avg query tokens</td> <td align="center">n/a</td> <td align="center">48.4</td> <td align="center">—</td> </tr> </table>

<a href="benchmarking/results/raw-recall-no-helper-dev-20260421-224217.json">Raw v0.5.0 JSON</a>

Note: the <code>cortex-http-base</code> ("raw") adapter retains partial adapter-layer helpers and is deprecated for new quality claims. The helper-free <code>cortex-http-pure</code> adapter is the v0.6.0+ canonical measurement floor, enforced by 5 CI purity gates. See <a href="benchmarking/README.md">benchmarking/README.md</a>.

Reranking ships default-off behind off/shadow/primary modes; public promotion remains gated on LongMemEval/API-backed validation. Query expansion (HyDE) is targeted for v0.7.0.
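For readers unfamiliar with the metrics in the table, here is a sketch of how precision, MRR, and top-1 hit rate are conventionally computed from ranked results. It assumes the textbook definitions and an illustrative cutoff `k`; the project's benchmark harness may define or aggregate them differently, so this does not reproduce the numbers above.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def summarize(runs, k=5):
    """Average the metrics over (ranked_results, relevant_set) pairs."""
    n = len(runs)
    return {
        "precision": sum(precision_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
        "top1": sum(1 for r, rel in runs if r[:1] and r[0] in rel) / n,
    }
```

Each entry in `runs` pairs one query's ranked memory IDs with the set of IDs judged relevant for that query.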

v0.6.0 makes settings, governance, boot audits, and recall-quality measurement first-class. Full details in <a href="CHANGELOG.md">CHANGELOG.md</a>.

Accessibility and settings

Settings panel: Accessibility, Appearance & Motion, Connection, Budgets, and Keyboard & Navigation
Runtime preferences: high contrast, reduced motion, keyboard hints, and compact navigation
Accessibility gates: stronger focus states, ARIA/live regions, contrast checks, and 375px reflow checks

Governance

Retention classes across store, MCP, OpenAPI, export, and import
Local endpoint budgets with stable HTTP 429 / JSON-RPC denial metadata
Budget UI in Control Center, backed by the local budgets.toml
Boot audits plus GET /boot/audit and the read-only cortex_boot_audit MCP tool
Admin rollback with dry-run/apply workflow and audit events

Recall quality

cortex-http-pure adapter as the canonical helper-free measurement floor
Purity gates, CAS-100, and triangle judge tooling for safer quality claims
bge-base-en-v1.5 default embeddings with MiniLM profiles and qwen3-embedding-0.6b opt-in
Cross-encoder reranking behind off/shadow/primary modes; default remains off
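The off/shadow/primary switch follows a common rollout pattern. The sketch below shows one conventional reading of those three modes and is an assumption, not Cortex's implementation: `off` serves the baseline order, `shadow` serves the baseline but also computes the reranked order for comparison, and `primary` serves the reranked order. The function name and the `rerank` callable are hypothetical.

```python
def apply_rerank_mode(mode, baseline, rerank):
    """Return (served_results, shadow_results) for a given rerank mode.

    baseline: results in their original order.
    rerank: callable that takes a list and returns a reordered list.
    """
    if mode == "off":
        return list(baseline), None
    reranked = rerank(list(baseline))
    if mode == "shadow":
        # Serve the baseline; keep the reranked order only for comparison or logging.
        return list(baseline), reranked
    if mode == "primary":
        return reranked, None
    raise ValueError(f"unknown rerank mode: {mode!r}")
```

Under this reading, shadow mode lets you measure what reranking would change before any user sees different results.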

Reliability

Claude plugin MCP is attach-only and no longer starts a second daemon from plugin MCP paths
Control Center supervises the app-managed daemon and honors intentional stops
Handler panics return JSON 500 responses, with local panic breadcrumbs
Storage hygiene compacts FTS, prunes stale embeddings, and migrates canonical vectors to PQ8 int8 blobs

Cortex tracks active agent sessions when clients identify themselves through <code>cortex_boot</code> or <code>GET /boot?agent=NAME</code>.

Connected agents in Control Center

</td> <td width="45%" valign="top">

Multi-agent, one brain

Each boot call registers a session. Control Center shows active sessions, deduplicated by agent identity.
Read-path tools (recall, peek, unfold) reattach to existing sessions. No duplicates.
Session descriptions are preserved across reconnects and daemon restarts.
What one agent stores, every other agent can recall.

Claude Code, Codex, Cursor, and custom scripts can all be connected simultaneously. Each tracks its own session while sharing the same memory.

</td> </tr> </table>
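The session behavior described above (one session per agent identity, with read-path calls reattaching rather than creating duplicates) can be modeled with a registry keyed by agent name. This is an illustrative in-memory sketch only; the class and field names are hypothetical and Cortex's actual storage and session schema are not shown in this README.

```python
from dataclasses import dataclass


@dataclass
class Session:
    agent: str
    description: str = ""
    boots: int = 0


class SessionRegistry:
    """Keeps at most one session per agent identity."""

    def __init__(self):
        self._by_agent = {}

    def boot(self, agent, description=None):
        """Register a boot call, reusing the agent's session if it exists."""
        session = self._by_agent.get(agent)
        if session is None:
            session = Session(agent=agent)
            self._by_agent[agent] = session
        if description:
            session.description = description  # otherwise keep the old description
        session.boots += 1
        return session

    def reattach(self, agent):
        """Read-path lookup: return the existing session without creating one."""
        return self._by_agent.get(agent)

    def active(self):
        return sorted(self._by_agent)
```

Booting the same agent twice yields one entry in `active()`, which mirrors the deduplicated list shown in Control Center.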

Tool	Connection	Setup
Claude Code	MCP (plugin) or desktop app	Plugin: `claude plugin install cortex@cortex-marketplace`
Codex	MCP	`codex mcp add cortex -- cortex.exe mcp --agent codex`
Cursor	MCP	Point MCP server at `cortex mcp --agent cursor`
Factory Droid	MCP	`cortex mcp --agent droid`
Aider	CLI / HTTP	`cortex boot --agent aider`
Custom tools	HTTP	Three endpoints: `/boot`, `/recall`, `/store`
Local LLMs	HTTP / MCP	Same protocol, any runtime

</div>

Full setup guide: <a href="Info/connecting.md">Info/connecting.md</a>

Desktop app (Control Center)

Download from the <a href="https://github.com/AdityaVG13/cortex/releases/latest">latest tagged release page</a>. The Control Center manages daemon lifecycle for you.

Platform	Desktop installer	Daemon archive
Windows	`.exe` (NSIS installer)	`.zip`
macOS	`.dmg`	`.tar.gz`
Linux	`.AppImage` / `.deb`	`.tar.gz`

</div>

Current release: <code>v0.6.0</code>.

From source

git clone https://github.com/AdityaVG13/cortex.git
cd cortex/daemon-rs
cargo build --release

Claude Code plugin

claude plugin marketplace add AdityaVG13/cortex
claude plugin install cortex@cortex-marketplace

The plugin attaches to a running Cortex runtime. If Cortex is not ready, it reports <code>APP_INIT_REQUIRED</code>; open Control Center or start the local runtime, then retry.

Cortex enforces a single-daemon invariant: only one daemon process runs at a time.

Mode	How it works
Desktop app	Control Center owns the daemon. Restart and monitor from the app.
CLI	`cortex serve` starts the daemon. Exits cleanly if one is already running.
Plugin	Attach-only MCP bridge. It connects to the running app/service daemon and does not silently spawn a second daemon.

</div>

Default bind: <code>127.0.0.1:7437</code>. Non-loopback binds require TLS. Auth token at <code>~/.cortex/cortex.token</code>. If using the Control Center, manage the daemon from there. Do not run a second <code>cortex serve</code> alongside it.
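Before starting anything from a script, you can check whether something is already listening on the default port. This is a minimal sketch using a plain TCP connect; it is not how Cortex itself enforces the single-daemon invariant, and an open port only shows that some process is listening there, not that it is Cortex. `cortex status --json` remains the authoritative check. The helper name is hypothetical.

```python
import socket


def daemon_is_listening(host="127.0.0.1", port=7437, timeout=0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

If this returns `True` while Control Center is running, leave the daemon to the app instead of launching `cortex serve`.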

After installing, verify the product path:

cortex status --json

Windows:

powershell -ExecutionPolicy Bypass -File scripts\first-run-smoke.ps1

macOS / Linux:

bash scripts/first-run-smoke.sh

<details> <summary>Development build verification</summary>

# Daemon unit tests
cargo test --manifest-path daemon-rs/Cargo.toml

# Desktop test suite
npm --prefix desktop/cortex-control-center test

# Lifecycle smoke test
npm --prefix desktop/cortex-control-center run verify:lifecycle:dev

# Security audit
npm audit --omit=dev --audit-level=high
cargo audit

</details>

Document	Covers
Connecting	Setup, MCP, HTTP, auth, troubleshooting
Architecture	Codebase map, entry points, data flow, config, tests
MCP Tools	All 29 MCP tool definitions and parameters
Research	Papers, inspirations, adaptation notes
Roadmap	What shipped, what's planned, and why
Security	Threat model, auth rules, vulnerability reporting
Team mode	Shared-server setup for engineering teams
Contributing	Development setup and PR guidelines

</div>

<details> <summary>CLI reference</summary>

Command	Description
`cortex serve`	Start the daemon
`cortex --help`	Full command reference
`cortex doctor`	Run diagnostics
`cortex paths --json`	Show file and port paths
`cortex plugin ensure-daemon`	Ensure daemon health (plugin mode)
`cortex plugin mcp`	MCP stdio bridge to HTTP API
`cortex setup --team`	Initialize team mode and generate API keys
`cortex export`	Export data (json or sql)
`cortex import`	Import from a previous export
`cortex admin rollback --session-id <id>`	Soft-delete a session's memory writes (dry-run default; `--apply` to persist)

</details>
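Because rollback is dry-run by default and only persists with `--apply`, a wrapper script can make that two-step flow explicit. The sketch below only builds the argument list from the command and flags listed in the CLI reference; the function name is hypothetical and the command's output format is not assumed.

```python
def rollback_argv(session_id: str, apply: bool = False) -> list:
    """Build the argv for `cortex admin rollback`; dry-run unless apply=True."""
    if not session_id:
        raise ValueError("session_id is required")
    argv = ["cortex", "admin", "rollback", "--session-id", session_id]
    if apply:
        argv.append("--apply")
    return argv
```

A cautious workflow runs `subprocess.run(rollback_argv(sid), check=True)` first, reviews the dry-run output, and only then repeats the call with `apply=True`.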

Cortex defaults to localhost-only access with bearer-token auth. Full threat model, auth rules, and vulnerability reporting: <a href="Info/security-rules.md">Info/security-rules.md</a>

<details> <summary>💾 How much disk space does Cortex use?</summary> The daemon binary is ~30 MB. The SQLite database grows with usage. A real install with 286 memories and 493 decisions uses ~386 MB after compaction. The ONNX embedding model (~50 MB) downloads on first run. </details>

<details> <summary>🤖 Can multiple agents write to Cortex at the same time?</summary> Yes. SQLite WAL mode handles concurrent reads and serialized writes. Each agent maintains its own session while sharing the same memory. Conflict detection handles contradictions automatically. </details>
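The WAL behavior mentioned above is standard SQLite and can be seen with Python's built-in `sqlite3` module. This is a generic sketch: the table `m` and the file name are illustrative and say nothing about Cortex's actual schema.

```python
import os
import sqlite3
import tempfile


def open_wal(path):
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


# WAL needs a file-backed database; in-memory databases cannot use it.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer, journal_mode = open_wal(db_path)
writer.execute("CREATE TABLE m (t TEXT)")
writer.execute("INSERT INTO m VALUES ('prefers tabs')")
writer.commit()

# A second connection reads committed rows while the writer stays open.
reader, _ = open_wal(db_path)
row_count = reader.execute("SELECT count(*) FROM m").fetchone()[0]
```

In WAL mode readers do not block the writer and the writer does not block readers, while writes themselves are still applied one at a time.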

<details> <summary>🔒 Does Cortex send any data externally?</summary> No. In solo mode, Cortex runs entirely on localhost. No telemetry, no phone-home, no cloud sync. Team mode sends data only to the configured team server over your network. </details>

<details> <summary>🔄 What happens if the daemon crashes mid-session?</summary> The MCP proxy detects daemon death and restarts automatically (bounded to 3 attempts with backoff). SQLite WAL mode ensures no data corruption. Sessions survive transient crashes. </details>
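A bounded restart with backoff, as described above, can be sketched as follows. The limit of 3 attempts comes from this README; the exponential schedule, the base delay, and the `start` callable are assumptions for illustration, not the MCP proxy's actual code.

```python
import time


def restart_with_backoff(start, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call start() until it succeeds or attempts run out.

    Returns the 1-based attempt number that succeeded, or 0 if all failed.
    """
    for attempt in range(max_attempts):
        if start():
            return attempt + 1
        if attempt < max_attempts - 1:
            # Assumed schedule: double the wait after each failure.
            sleep(base_delay * (2 ** attempt))
    return 0
```

Passing `sleep` as a parameter keeps the function testable without real delays.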

<details> <summary>🧹 How do I reset Cortex to a clean state?</summary> Delete <code>~/.cortex/cortex.db</code> and restart the daemon. A new empty database and auth token are generated. Settings and model files are preserved. </details>


<a href="https://ko-fi.com/adityavg13">☕ Support Cortex</a> · <a href="Info/research.md">Research</a> · <a href="Info/connecting.md">Connecting</a> · <a href="Info/security-rules.md">Security</a> · <a href="CONTRIBUTING.md">Contributing</a> · <a href="CODE_OF_CONDUCT.md">Code of Conduct</a> · <a href="CHANGELOG.md">Changelog</a> · <a href="LICENSE">MIT License</a>
