Context-Optimizer-MCP

An MCP server suite that optimizes prompt context by reducing tokens up to 98.8%, acting as persistent long-term memory and codebase scanner to save API costs.


Cuts AI prompt context by 90–99% and reduces API costs by $70–$140 per 1,000 queries.

A local-first Model Context Protocol (MCP) server suite that gives AI coding assistants persistent long-term memory and high-speed codebase discovery — eliminating context amnesia and token waste across every session.


The Problem

Modern AI coding assistants (Claude, Cursor, Copilot) suffer from two compounding problems:

  • Context amnesia — Every new session starts from zero. Architectural decisions, past mistakes, and established patterns must be re-explained each time.
  • Token waste — To answer a simple question, the AI blindly reads thousands of lines of source code, burning tokens on irrelevant logic before finding anything useful.

Context-Optimizer-MCP solves both.


Benchmark Results

Tested against this codebase (21 source files, 58,808 raw tokens) across 15 diverse query types — from narrow configuration lookups to broad architectural questions.

Metric                       Value
Minimum context reduction    97.4%
Median context reduction     99.81%
Maximum context reduction    99.9%

Run it yourself:

python benchmark.py
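
The percentages above compare the tokens actually sent to the model with the raw token count of the codebase. A minimal sketch of that arithmetic, assuming the reduction is defined as 1 - optimized / raw (the formula used inside benchmark.py is not shown in this README). `context_reduction` is a hypothetical helper, and 150 is the approximate blueprint size quoted elsewhere in this README, used only as an illustration:

```python
def context_reduction(raw_tokens: int, optimized_tokens: int) -> float:
    """Return the percentage of tokens saved relative to the raw context."""
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    return 100.0 * (1.0 - optimized_tokens / raw_tokens)


# 58,808 raw tokens is the figure reported for this codebase; 150 is an
# illustrative size for the optimized context, not a measured result.
print(f"{context_reduction(58_808, 150):.2f}%")  # 99.74%
```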

Architecture

┌─────────────────────────────────────────────┐
│         AI Agent (Claude / Cursor)          │
└───────────────┬─────────────────────────────┘
                │ MCP Protocol
    ┌───────────┴────────────┐
    │                        │
┌───▼──────────┐    ┌────────▼────────┐
│ Memory       │    │ Discovery       │
│ MCP Server   │    │ MCP Server      │
│              │    │                 │
│ Stores and   │    │ AST + Regex     │
│ retrieves    │    │ codebase scan   │
│ decisions,   │    │ → endpoints,    │
│ mistakes,    │    │   queries,      │
│ observations │    │   tech debt     │
└──────┬───────┘    └────────┬────────┘
       │                     │
       └──────────┬──────────┘
                  │
          ┌───────▼────────┐          ┌──────────────────┐
          │  SQLite DB     │◄────────►│  ai-memory.yaml  │
          │  mcp_memory.db │  cli.py  │  (Git-tracked)   │
          └───────┬────────┘          └──────────────────┘
                  │
          ┌───────▼────────┐
          │ FastAPI +       │
          │ React Dashboard │
          └────────────────┘

Core Components

Memory MCP Server

Persistent SQLite-backed memory for AI agents. Stores decisions, mistakes, and observations with full lifecycle management.

  • Semantic deduplication — Embeds incoming memories (Gemini text-embedding-004 / OpenAI text-embedding-3-small) and runs cosine similarity in RAM. Similarity ≥ 0.85 triggers a merge instead of a new insert, incrementing the existing memory's confidence score. Falls back to exact-string matching when no API key is present.
  • Staleness tracking — Classifies memories as fresh (<30 days), warming (30–90 days), or stale (>90 days) based on last validation timestamp. Auto-migrates older databases on startup.
  • Memory pruning: the mem_prune tool deletes unreinforced one-off entries (confidence == 1.0) older than N days, with dry-run mode on by default.
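
The merge-on-similarity behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the server's implementation: `store_memory`, the dictionary layout, and the increment of 1.0 per merge are hypothetical, and the three-dimensional vectors stand in for real embeddings. Only the 0.85 threshold comes from the text.

```python
import math

SIMILARITY_THRESHOLD = 0.85  # merge threshold quoted in the text


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def store_memory(memories: list[dict], text: str, embedding: list[float]) -> dict:
    """Merge into the closest existing memory, or append a new one."""
    best, best_score = None, 0.0
    for memory in memories:
        score = cosine_similarity(memory["embedding"], embedding)
        if score > best_score:
            best, best_score = memory, score
    if best is not None and best_score >= SIMILARITY_THRESHOLD:
        # Assumed increment: reinforce the match instead of inserting a duplicate.
        best["confidence"] += 1.0
        return best
    new_memory = {"text": text, "embedding": embedding, "confidence": 1.0}
    memories.append(new_memory)
    return new_memory


memories: list[dict] = []
store_memory(memories, "Use SQLite for storage", [1.0, 0.0, 0.0])
store_memory(memories, "Storage is SQLite", [0.9, 0.1, 0.0])     # near-duplicate: merged
store_memory(memories, "Dashboard uses React", [0.0, 0.0, 1.0])  # unrelated: inserted
print(len(memories), memories[0]["confidence"])  # 2 2.0
```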

Discovery MCP Server

Scans codebases structurally using AST parsing and regex — extracts API endpoints, database queries, class/function maps, and # TODO debt markers without reading implementation logic line-by-line.

Produces a lightweight "blueprint" of the project that the AI can query in ~150 tokens instead of reading the full source.
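
As a rough illustration of structural scanning, the sketch below uses Python's standard `ast` module to list classes and function signatures, plus a regex pass for TODO markers (comments never appear in the AST). `build_blueprint` and the shape of its result are assumptions made for this example; the Discovery server's real output format is not documented here.

```python
import ast
import re

TODO_PATTERN = re.compile(r"#\s*TODO\b(.*)")


def build_blueprint(source: str) -> dict:
    """Collect class names, function signatures, and TODO markers."""
    tree = ast.parse(source)
    blueprint = {"classes": [], "functions": [], "todos": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            blueprint["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(arg.arg for arg in node.args.args)
            blueprint["functions"].append(f"{node.name}({args})")
    # Comments are not part of the AST, so TODO markers need a regex pass.
    # Limitation: a string literal containing "# TODO" would also match.
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            blueprint["todos"].append((line_number, match.group(1).strip(" :")))
    return blueprint


SAMPLE = '''
class UserService:
    def get_user(self, user_id):
        # TODO: add caching
        return {"id": user_id}

def health():
    return "ok"
'''

print(build_blueprint(SAMPLE))
```

The function bodies are never inspected beyond their signatures, which is what keeps a summary like this far smaller than the source it describes.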

Context Engine (FastAPI)

The search backend bridging agents and storage.

  • Dual-mode semantic search — Uses vector embeddings when API keys are present; falls back to TF-IDF + cosine similarity for fully offline, zero-setup retrieval.
  • Context compression — Retrieves, ranks, and compresses relevant memories and code structures before passing them to the LLM.
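
The offline fallback can be pictured as a small TF-IDF ranker. The sketch below is a dependency-free illustration, assuming a smoothed IDF and whitespace-style tokenization; `tokenize`, `tfidf_vectors`, and `search` are hypothetical names, and the engine's own weighting scheme may differ.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf_vectors(documents: list[str]) -> tuple[list[dict], dict]:
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    total = len(documents)
    # Smoothed IDF keeps weights positive even for terms in every document.
    idf = {term: math.log((1 + total) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a: dict, b: dict) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    vectors, idf = tfidf_vectors(documents)
    # Query terms unseen in the corpus get weight 0 and cannot match anything.
    query_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, documents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


MEMORIES = [
    "Decision: store agent memory in SQLite",
    "Mistake: forgot to close the database cursor",
    "Observation: dashboard is built with React and Vite",
]
print(search("agent memory storage", MEMORIES, top_k=1))
```

Because matching is purely lexical, "storage" does not match "store" here; that gap is the kind of case the embedding-based mode is meant to cover.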

React Dashboard

Local UI for auditing the AI's memory state.

  • Trigger codebase scans manually
  • Search memories with Google-style queries
  • Verify/refresh warming and stale memory cards with a one-click checkmark (✓)
  • View AST blueprints of the current project structure
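
The fresh, warming, and stale labels shown on memory cards follow the age thresholds given under Memory MCP Server. A minimal sketch of that classification and of what a one-click verification might do, assuming the boundaries are inclusive at 30 and 90 days and that verifying simply resets the validation timestamp; `classify_staleness` and the `card` dictionary are hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def classify_staleness(last_validated: datetime, now: Optional[datetime] = None) -> str:
    """Map the age of the last validation onto fresh / warming / stale."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - last_validated).days
    if age_days < 30:
        return "fresh"
    if age_days <= 90:
        return "warming"
    return "stale"


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
card = {"text": "Use SQLite for storage", "last_validated": now - timedelta(days=120)}
print(classify_staleness(card["last_validated"], now))  # stale

# Assumed effect of the checkmark: reset the validation timestamp to now.
card["last_validated"] = now
print(classify_staleness(card["last_validated"], now))  # fresh
```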

Key Design Decisions

Cross-agent portability. Memory is stored in open SQLite — no vendor lock-in. Switch from Claude to Gemini tomorrow; the new agent inherits the full project history instantly.

Git-friendly memory sync. The binary .db file is not committed directly. cli.py export converts it to a human-readable ai-memory.yaml. Teams commit the YAML, and cli.py import --merge rebuilds the database on each machine using the semantic dedup engine to resolve conflicts rather than overwriting.
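
To make the export step concrete, here is a sketch of turning SQLite rows into a diff-friendly YAML file using only the standard library. The `memories` table, its columns, and `export_yaml` are assumptions for illustration; the real schema and the layout of ai-memory.yaml are not specified in this README.

```python
import json
import sqlite3


def export_yaml(connection: sqlite3.Connection) -> str:
    """Serialize every memory row as a YAML list of mappings."""
    rows = connection.execute(
        "SELECT kind, content, confidence FROM memories ORDER BY id"
    ).fetchall()
    lines = ["memories:"]
    for kind, content, confidence in rows:
        # For ordinary text, a JSON string is also a valid YAML double-quoted
        # scalar, so json.dumps gives safe quoting without a YAML dependency.
        lines.append(f"  - kind: {json.dumps(kind)}")
        lines.append(f"    content: {json.dumps(content)}")
        lines.append(f"    confidence: {confidence}")
    return "\n".join(lines) + "\n"


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, kind TEXT, content TEXT, confidence REAL)"
)
connection.executemany(
    "INSERT INTO memories (kind, content, confidence) VALUES (?, ?, ?)",
    [("decision", "Use SQLite for storage", 2.0), ("mistake", "Forgot to close cursor", 1.0)],
)
print(export_yaml(connection))
```

Ordering rows by id keeps the exported file stable between runs, so a Git diff shows only the memories that actually changed.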

No mandatory external services. No API key is required to run. Semantic search degrades gracefully to an offline TF-IDF mode, so once the Python requirements are installed the whole system works on an air-gapped machine.
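
The graceful degradation can be thought of as a simple mode switch on the environment. The sketch below assumes the two variable names from .env.template and an arbitrary precedence (Gemini first); `select_search_mode` and its return values are hypothetical:

```python
import os
from typing import Mapping, Optional


def select_search_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick an embedding provider if a key is set, else the offline fallback."""
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY", "").strip():
        return "gemini"
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai"
    return "tfidf"  # no key configured: fully offline mode


print(select_search_mode({}))                            # tfidf
print(select_search_mode({"OPENAI_API_KEY": "sk-test"}))  # openai
```

Treating a blank value the same as a missing one matches the instruction to leave the keys blank for offline mode.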


Installation

Requirements: Python 3.10+, Node.js (only needed if rebuilding the dashboard; precompiled build included)

# 1. Clone and install
git clone https://github.com/your-username/Context-Optimizer-MCP.git
cd Context-Optimizer-MCP
pip install -r requirements.txt

# 2. Configure environment (API keys optional)
cp .env.template .env
# Add GEMINI_API_KEY or OPENAI_API_KEY to enable semantic search
# Leave blank for offline TF-IDF mode

# 3. Rebuild the SQLite memory database from the Git-tracked ai-memory.yaml
python cli.py import

# 4. Start the dashboard
python context_engine/server.py
# Open http://127.0.0.1:8000

Connecting to AI Clients

Cursor

Settings → Cursor Settings → Features → MCP → + Add New MCP Server

Field      Value
Name       memory-server
Type       command
Command    python -u "C:/path/to/Context-Optimizer-MCP/mcp_servers/memory_server.py"

Repeat for discovery-server, pointing the command at mcp_servers/discovery_server.py instead.

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "codebase-memory": {
      "command": "python",
      "args": ["C:/path/to/Context-Optimizer-MCP/mcp_servers/memory_server.py"]
    },
    "codebase-discovery": {
      "command": "python",
      "args": ["C:/path/to/Context-Optimizer-MCP/mcp_servers/discovery_server.py"]
    }
  }
}
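
Hand-editing absolute paths is error-prone, so a short script can generate the block above for a given checkout. This is a convenience sketch, not part of the project: `build_claude_config` is a hypothetical helper, and the server names and script paths are copied from the configuration shown above.

```python
import json
from pathlib import Path


def build_claude_config(repo_root: str, python_cmd: str = "python") -> dict:
    """Build the mcpServers block for both servers under repo_root."""
    servers_dir = Path(repo_root) / "mcp_servers"
    return {
        "mcpServers": {
            "codebase-memory": {
                "command": python_cmd,
                "args": [(servers_dir / "memory_server.py").as_posix()],
            },
            "codebase-discovery": {
                "command": python_cmd,
                "args": [(servers_dir / "discovery_server.py").as_posix()],
            },
        }
    }


print(json.dumps(build_claude_config("C:/path/to/Context-Optimizer-MCP"), indent=2))
```

Passing `sys.executable` as `python_cmd` is one way to pin the interpreter of the virtual environment where the requirements were installed.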

CLI Reference

# Export SQLite → YAML (for Git)
python cli.py export

# Rebuild SQLite from YAML
python cli.py import

# Merge YAML into existing DB (semantic dedup on conflicts)
python cli.py import --merge

# Preview stale memories eligible for pruning (dry run)
python cli.py prune --days 90 --confidence 1.0

# Execute pruning
python cli.py prune --days 90 --confidence 1.0 --execute
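
The dry-run-by-default behaviour of the prune command can be sketched against an in-memory database. Everything schema-related here is an assumption: the `memories` table, storing `last_validated` as a Unix timestamp, and measuring age from the last validation rather than from creation. `prune` is a hypothetical function mirroring the CLI flags.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86_400


def prune(connection, days, confidence=1.0, execute=False, now=None):
    """Return ids of prunable memories; delete them only when execute=True."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    ids = [
        row[0]
        for row in connection.execute(
            "SELECT id FROM memories WHERE confidence = ? AND last_validated < ? ORDER BY id",
            (confidence, cutoff),
        )
    ]
    if execute and ids:
        connection.executemany("DELETE FROM memories WHERE id = ?", [(i,) for i in ids])
        connection.commit()
    return ids


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, confidence REAL, last_validated REAL)"
)
NOW = 1_700_000_000.0
connection.executemany(
    "INSERT INTO memories (id, confidence, last_validated) VALUES (?, ?, ?)",
    [
        (1, 1.0, NOW - 120 * SECONDS_PER_DAY),  # old one-off entry: prunable
        (2, 3.0, NOW - 120 * SECONDS_PER_DAY),  # old but reinforced: kept
        (3, 1.0, NOW - 10 * SECONDS_PER_DAY),   # recent one-off entry: kept
    ],
)
print(prune(connection, days=90, now=NOW))                # dry run, nothing deleted
print(prune(connection, days=90, now=NOW, execute=True))  # same ids, now deleted
```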

Tests

python -m unittest tests/test_memory_discovery.py

10 integration tests covering deduplication logic, YAML sync, staleness scoring, pruning, and benchmark validation. All passing.


Tech Stack

Python · FastAPI · SQLite · React · Vite · Model Context Protocol (MCP) · Google Gemini Embeddings · OpenAI Embeddings · TF-IDF / Cosine Similarity


License

MIT
