Context-Optimizer-MCP

An MCP server suite that reduces prompt context tokens by up to 98.8%, acting as persistent long-term memory and a codebase scanner to cut API costs.

Cuts AI prompt context by 90–99% and reduces API costs by $70–$140 per 1,000 queries.

A local-first Model Context Protocol (MCP) server suite that gives AI coding assistants persistent long-term memory and high-speed codebase discovery — eliminating context amnesia and token waste across every session.

The Problem

Modern AI coding assistants (Claude, Cursor, Copilot) suffer from two compounding problems:

Context amnesia — Every new session starts from zero. Architectural decisions, past mistakes, and established patterns must be re-explained each time.
Token waste — To answer a simple question, the AI blindly reads thousands of lines of source code, burning tokens on irrelevant logic before finding anything useful.

Context-Optimizer-MCP solves both.

Benchmark Results

Tested against this codebase (21 source files, 58,808 raw tokens) across 15 diverse query types — from narrow configuration lookups to broad architectural questions.

Metric	Value
Minimum context reduction	97.4%
Median context reduction	99.81%
Maximum context reduction	99.9%

Run it yourself:

python benchmark.py
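As a rough sanity check on the table above, context reduction can be read as the share of raw tokens that never reaches the model. The sketch below assumes that definition (the formula actually used by benchmark.py is not shown here) and plugs in the 58,808 raw tokens from the benchmark plus the ~150-token blueprint figure quoted in this README. The function names are illustrative.

```python
def context_reduction(raw_tokens, optimized_tokens):
    """Percentage of raw context tokens that are not sent to the model."""
    return (1 - optimized_tokens / raw_tokens) * 100


def tokens_at_reduction(raw_tokens, reduction_percent):
    """Inverse: how many tokens remain at a given reduction percentage."""
    return round(raw_tokens * (1 - reduction_percent / 100))


RAW_TOKENS = 58_808  # raw size of the benchmarked codebase

print(f"{context_reduction(RAW_TOKENS, 150):.2f}%")  # prints 99.74%
print(tokens_at_reduction(RAW_TOKENS, 97.4))         # prints 1529
```

Under this reading, even the worst-case query in the benchmark (97.4%) sends roughly 1,500 tokens instead of the full 58,808.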

Architecture

┌─────────────────────────────────────────────┐
│         AI Agent (Claude / Cursor)          │
└───────────────┬─────────────────────────────┘
                │ MCP Protocol
    ┌───────────┴────────────┐
    │                        │
┌───▼──────────┐    ┌────────▼────────┐
│ Memory       │    │ Discovery       │
│ MCP Server   │    │ MCP Server      │
│              │    │                 │
│ Stores and   │    │ AST + Regex     │
│ retrieves    │    │ codebase scan   │
│ decisions,   │    │ → endpoints,    │
│ mistakes,    │    │   queries,      │
│ observations │    │   tech debt     │
└──────┬───────┘    └────────┬────────┘
       │                     │
       └──────────┬──────────┘
                  │
          ┌───────▼────────┐          ┌──────────────────┐
          │  SQLite DB     │◄────────►│  ai-memory.yaml  │
          │  mcp_memory.db │  cli.py  │  (Git-tracked)   │
          └───────┬────────┘          └──────────────────┘
                  │
          ┌───────▼─────────┐
          │ FastAPI +       │
          │ React Dashboard │
          └─────────────────┘

Core Components

Memory MCP Server

Persistent SQLite-backed memory for AI agents. Stores decisions, mistakes, and observations with full lifecycle management.

Semantic deduplication — Embeds incoming memories (Gemini text-embedding-004 / OpenAI text-embedding-3-small) and runs cosine similarity in RAM. Similarity ≥ 0.85 triggers a merge instead of a new insert, incrementing the existing memory's confidence score. Falls back to exact-string matching when no API key is present.
Staleness tracking — Classifies memories as fresh (<30 days), warming (30–90 days), or stale (>90 days) based on last validation timestamp. Auto-migrates older databases on startup.
Memory pruning — mem_prune deletes unreinforced one-off entries (confidence == 1.0) older than N days, with dry-run mode on by default.
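A minimal sketch of how the deduplication and staleness rules above could fit together, assuming memories live in a plain Python list rather than the server's SQLite tables. The function names, the dict fields, and the +1.0 confidence increment are illustrative assumptions, not the project's actual code. Only the 0.85 threshold, the exact-string fallback, and the 30/90-day boundaries come from the description above.

```python
import math
from datetime import datetime, timedelta, timezone

SIMILARITY_THRESHOLD = 0.85  # merge threshold quoted in the README


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def upsert_memory(store, text, embedding=None):
    """Merge into the closest existing memory, or append a new one.

    `store` is a hypothetical in-memory list of dicts standing in for the database.
    """
    best, best_score = None, 0.0
    for memory in store:
        if embedding is not None and memory["embedding"] is not None:
            score = cosine_similarity(embedding, memory["embedding"])
        else:
            # Offline fallback: exact-string matching only.
            score = 1.0 if memory["text"] == text else 0.0
        if score > best_score:
            best, best_score = memory, score

    if best is not None and best_score >= SIMILARITY_THRESHOLD:
        best["confidence"] += 1.0  # reinforcement; the step size is an assumption
        return best

    memory = {"text": text, "embedding": embedding, "confidence": 1.0}
    store.append(memory)
    return memory


def classify_staleness(last_validated, now=None):
    """Return 'fresh' (<30 days), 'warming' (30-90 days), or 'stale' (>90 days)."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - last_validated).days
    if age_days < 30:
        return "fresh"
    if age_days <= 90:
        return "warming"
    return "stale"


store = []
upsert_memory(store, "Use SQLite for storage", [1.0, 0.0])
upsert_memory(store, "Store data in SQLite", [0.9, 0.1])           # similarity ~0.99: merged
upsert_memory(store, "Dashboard is built with React", [0.0, 1.0])  # similarity 0.0: inserted
print(len(store), store[0]["confidence"])  # prints: 2 2.0
print(classify_staleness(datetime.now(timezone.utc) - timedelta(days=45)))  # prints: warming
```

The toy two-dimensional vectors stand in for real embeddings. With no embedding supplied, the same function only merges byte-identical text, which mirrors the no-API-key fallback.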

Discovery MCP Server

Scans codebases structurally using AST parsing and regex — extracts API endpoints, database queries, class/function maps, and # TODO debt markers without reading implementation logic line-by-line.

Produces a lightweight "blueprint" of the project that the AI can query in ~150 tokens instead of reading the full source.
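To illustrate the idea, here is a small sketch of a structural scan for a single Python module, using the standard-library `ast` module for class and function names and a regex pass for TODO markers. The `build_blueprint` name and the shape of the returned dict are assumptions made for this example. The real Discovery server also extracts endpoints and database queries, which are not covered here.

```python
import ast
import re

TODO_PATTERN = re.compile(r"#\s*TODO\b[:\s]*(.*)")


def build_blueprint(source):
    """Summarize one Python module: class names, function signatures, TODO markers."""
    tree = ast.parse(source)
    blueprint = {"classes": [], "functions": [], "todos": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            blueprint["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(arg.arg for arg in node.args.args)
            blueprint["functions"].append(f"{node.name}({args})")
    # Comments are not part of the AST, so TODO markers need a separate regex pass.
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            blueprint["todos"].append((line_number, match.group(1).strip()))
    return blueprint


sample = '''
class UserService:
    def get_user(self, user_id):
        # TODO: add caching
        return None

def health():
    return "ok"
'''

print(build_blueprint(sample))
```

Function bodies are never inspected, only names and signatures, which is why the resulting summary stays far smaller than the source it describes.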

Context Engine (FastAPI)

The search backend bridging agents and storage.

Dual-mode semantic search — Uses vector embeddings when API keys are present; falls back to TF-IDF + cosine similarity for fully offline, zero-setup retrieval.
Context compression — Retrieves, ranks, and compresses relevant memories and code structures before passing them to the LLM.
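The offline fallback can be sketched in pure Python. This is an assumed, simplified version of TF-IDF ranking with cosine similarity. The tokenizer, the smoothed IDF formula, and the `search` signature are choices made for this example and may differ from what the Context Engine does.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf_vectors(documents):
    """Return (vectors, idf) with one sparse {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(documents)
    # Smoothed IDF so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, top_k=3):
    """Rank documents against the query; documents with zero overlap are dropped."""
    vectors, idf = tfidf_vectors(documents)
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    total = sum(q_counts.values())
    q_vec = {t: (c / total) * idf[t] for t, c in q_counts.items()}
    scored = [(cosine(q_vec, vec), doc) for vec, doc in zip(vectors, documents)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, doc) for score, doc in scored[:top_k] if score > 0]


memories = [
    "Decision: use SQLite for persistent memory storage",
    "Mistake: forgot to close the database connection",
    "Observation: the React dashboard polls the FastAPI backend",
]
for score, doc in search("sqlite storage decision", memories):
    print(f"{score:.3f}  {doc}")
```

Unlike embeddings, this only matches on shared vocabulary, so a query for "db" would not find the "database" memory. That is the trade-off for running with no network access.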

React Dashboard

Local UI for auditing the AI's memory state.

Trigger codebase scans manually
Search memories with Google-style queries
Verify/refresh warming and stale memory cards with a one-click checkmark (✓)
View AST blueprints of the current project structure

Key Design Decisions

Cross-agent portability. Memory is stored in open SQLite — no vendor lock-in. Switch from Claude to Gemini tomorrow; the new agent inherits the full project history instantly.

Git-friendly memory sync. The binary .db file is not committed directly. cli.py export converts it to a human-readable ai-memory.yaml. Teams commit the YAML, and cli.py import --merge rebuilds the database on each machine using the semantic dedup engine to resolve conflicts rather than overwriting.

Zero mandatory external services. No API key is required to run: semantic search degrades gracefully to offline TF-IDF mode, so the whole system works on an air-gapped machine once the Python requirements are installed.

Installation

Requirements: Python 3.10+, Node.js (only needed if rebuilding the dashboard; precompiled build included)

# 1. Clone and install
git clone https://github.com/your-username/Context-Optimizer-MCP.git
cd Context-Optimizer-MCP
pip install -r requirements.txt

# 2. Configure environment (API keys optional)
cp .env.template .env
# Add GEMINI_API_KEY or OPENAI_API_KEY to enable embedding-based semantic search
# Leave both blank for offline TF-IDF mode

# 3. Rebuild the local memory database from the Git-tracked ai-memory.yaml
python cli.py import

# 4. Start the dashboard
python context_engine/server.py
# Open http://127.0.0.1:8000

Connecting to AI Clients

Cursor

Settings → Cursor Settings → Features → MCP → + Add New MCP Server

Field	Value
Name	`memory-server`
Type	`command`
Command	`python -u "C:/path/to/Context-Optimizer-MCP/mcp_servers/memory_server.py"`

Repeat for discovery-server using discovery_server.py.

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "codebase-memory": {
      "command": "python",
      "args": ["C:/path/to/Context-Optimizer-MCP/mcp_servers/memory_server.py"]
    },
    "codebase-discovery": {
      "command": "python",
      "args": ["C:/path/to/Context-Optimizer-MCP/mcp_servers/discovery_server.py"]
    }
  }
}
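If hand-editing paths is error-prone, the same block can be generated from the clone location. The helper below is a hypothetical convenience script, not part of the repository. It only assumes the `mcp_servers/` layout and the server names shown in the JSON above.

```python
import json
from pathlib import Path


def build_claude_config(repo_root, python="python"):
    """Build the mcpServers block for claude_desktop_config.json."""
    servers_dir = Path(repo_root) / "mcp_servers"
    return {
        "mcpServers": {
            "codebase-memory": {
                "command": python,
                "args": [(servers_dir / "memory_server.py").as_posix()],
            },
            "codebase-discovery": {
                "command": python,
                "args": [(servers_dir / "discovery_server.py").as_posix()],
            },
        }
    }


# Replace the placeholder path with the actual location of your clone.
print(json.dumps(build_claude_config("C:/path/to/Context-Optimizer-MCP"), indent=2))
```

Paths are emitted with forward slashes, which avoids having to escape backslashes in JSON on Windows.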

CLI Reference

# Export SQLite → YAML (for Git)
python cli.py export

# Rebuild SQLite from YAML
python cli.py import

# Merge YAML into existing DB (semantic dedup on conflicts)
python cli.py import --merge

# Preview stale memories eligible for pruning (dry run)
python cli.py prune --days 90 --confidence 1.0

# Execute pruning
python cli.py prune --days 90 --confidence 1.0 --execute
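The dry-run-by-default behavior of the prune command can be sketched against an in-memory SQLite database. The table name, column names, and the choice of a creation timestamp stored as Unix seconds are assumptions for this example. The real schema in mcp_memory.db may differ.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune(conn, days=90, confidence=1.0, execute=False, now=None):
    """Return ids of prunable memories; delete them only when execute=True."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).timestamp()
    rows = conn.execute(
        "SELECT id FROM memories WHERE confidence = ? AND created_at < ? ORDER BY id",
        (confidence, cutoff),
    ).fetchall()
    ids = [row[0] for row in rows]
    if execute and ids:
        conn.executemany("DELETE FROM memories WHERE id = ?", [(i,) for i in ids])
        conn.commit()
    return ids


# Demo data: hypothetical schema with timestamps stored as Unix seconds.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "id INTEGER PRIMARY KEY, text TEXT, confidence REAL, created_at REAL)"
)
conn.executemany(
    "INSERT INTO memories (text, confidence, created_at) VALUES (?, ?, ?)",
    [
        ("old one-off", 1.0, (now - timedelta(days=200)).timestamp()),
        ("old but reinforced", 3.0, (now - timedelta(days=200)).timestamp()),
        ("recent one-off", 1.0, (now - timedelta(days=10)).timestamp()),
    ],
)

print(prune(conn, now=now))                # dry run: lists candidates, deletes nothing
print(prune(conn, execute=True, now=now))  # same candidates, now deleted
```

Only the old, never-reinforced entry qualifies. Reinforced memories survive regardless of age, and recent one-offs get time to be reinforced.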

Tests

python -m unittest tests/test_memory_discovery.py

10 integration tests covering deduplication logic, YAML sync, staleness scoring, pruning, and benchmark validation. All passing.

Tech Stack

Python · FastAPI · SQLite · React · Vite · Model Context Protocol (MCP) · Google Gemini Embeddings · OpenAI Embeddings · TF-IDF / Cosine Similarity

License

MIT
