context-saver
MCP proxy that reduces context usage through semantic tool routing, enabling on-demand discovery and routing of relevant tools.
The Problem
MCP tools consume massive amounts of context tokens before conversations even start:
| Server | Tools | Tokens |
|---|---|---|
| Notion | 14 | ~16,500 |
| Google Drive | 99 | ~18,000 |
| Chrome DevTools | 29 | ~5,800 |
| Total | 142 | ~40,300 |
That's 40k tokens gone before you ask a single question.
The Solution
context-saver sits between Claude Code and your MCP servers, using vector embeddings to surface only relevant tools on-demand.
Claude Code ──► context-saver ──► Backend MCP Servers
                      │
                      ▼
                   LanceDB
              (tool embeddings)
Results:
| Mode | Initial Tokens | Tools Available |
|---|---|---|
| Before | ~40,000 | All 142 |
| Standard | ~8,000 | All 142 |
| Lite | ~500 | All 142 (on-demand) |
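As a quick check on the table above, the relative savings can be worked out from the approximate token counts. This is only arithmetic on the README's rounded figures; the function name `savingsPercent` is made up for this sketch and is not part of context-saver.

```typescript
// Approximate token counts copied from the results table above.
const baselineTokens = 40_000;
const initialTokens = { standard: 8_000, lite: 500 };

// Share of the baseline that is no longer loaded up front, in percent.
function savingsPercent(baseline: number, initial: number): number {
  return ((baseline - initial) * 100) / baseline;
}

const standardSavings = savingsPercent(baselineTokens, initialTokens.standard); // 80
const liteSavings = savingsPercent(baselineTokens, initialTokens.lite); // 98.75

console.log(`standard: ${standardSavings}%, lite: ${liteSavings}%`);
```

In other words, standard mode avoids roughly 80% of the up-front cost and lite mode close to 99%, assuming the token estimates hold for your own set of servers.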
Quick Start
1. Install
npm install -g context-saver
2. Create Config
Create ~/.context-saver/config.json:
{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small"
  },
  "discovery": {
    "liteMode": true
  },
  "backends": {
    "filesystem": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
    }
  }
}
3. Set API Key
export OPENAI_API_KEY="sk-..."
4. Add to Claude Code
Add to your Claude Code MCP settings (~/.claude/settings.json):
{
  "mcpServers": {
    "context-saver": {
      "command": "npx",
      "args": ["context-saver"]
    }
  }
}
5. Use It
In Claude Code, use discover_tools to find what you need:
> discover_tools("update notion pages")
Found 3 relevant tools:
1. notion-update-page (notion)
   Update a Notion page's content
   Parameters: page_id*, content*
   Relevance: 94%
2. notion-fetch (notion)
   Fetch a Notion page by ID
   Parameters: page_id*
   Relevance: 87%
...
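The `Parameters:` lines in the sample output above appear to mark required parameters with a trailing `*`. A minimal sketch of how such a line could be derived from a tool's JSON schema follows; the `ToolSchema` shape and the `formatParameters` helper are assumptions for illustration, not context-saver's actual code.

```typescript
// Hypothetical, simplified view of a tool's input schema.
interface ToolSchema {
  properties: Record<string, unknown>;
  required?: string[];
}

// Build a "Parameters: a*, b" summary, with * marking required names
// (this reading of the asterisk is an assumption based on the sample output).
function formatParameters(schema: ToolSchema): string {
  const required = new Set(schema.required ?? []);
  const names = Object.keys(schema.properties).map((name) =>
    required.has(name) ? `${name}*` : name,
  );
  return `Parameters: ${names.join(", ")}`;
}

const summary = formatParameters({
  properties: { page_id: {}, content: {} },
  required: ["page_id", "content"],
});
console.log(summary); // prints "Parameters: page_id*, content*"
```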
Configuration
Full Example
{
  "version": "1.0",
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "dimensions": 1536,
    "apiKey": "${OPENAI_API_KEY}"
  },
  "storage": {
    "path": "~/.context-saver/lancedb",
    "reindexOnStart": false
  },
  "discovery": {
    "defaultTopK": 5,
    "minSimilarity": 0.3,
    "liteMode": true
  },
  "backends": {
    "notion": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-server-notion"],
      "env": {
        "NOTION_API_KEY": "${NOTION_API_KEY}"
      }
    },
    "google-drive": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-server-google-drive"]
    }
  }
}
Options
embedding
| Option | Default | Description |
|---|---|---|
| provider | "openai" | Embedding provider (see below) |
| model | varies | Model name |
| dimensions | varies | Embedding dimensions |
| apiKey | env var | API key (supports env var syntax) |
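The "env var syntax" mentioned for `apiKey` is the `${NAME}` form used in the full example. One way such placeholders could be expanded is sketched below. The `expandEnvVars` helper is hypothetical, and failing on a missing variable is an assumption; context-saver may handle that case differently.

```typescript
// Environment passed in explicitly so the sketch does not depend on Node types.
type Env = Record<string, string | undefined>;

// Replace every ${NAME} placeholder with the matching value from `env`.
// Throws when a referenced variable is missing (an assumed policy).
function expandEnvVars(value: string, env: Env): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}

const apiKey = expandEnvVars("${OPENAI_API_KEY}", { OPENAI_API_KEY: "sk-example" });
console.log(apiKey); // prints "sk-example"
```

In a real Node.js process the second argument would typically be `process.env`.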
Supported Providers:
| Provider | Model | Dimensions | API Key |
|---|---|---|---|
| openai | text-embedding-3-small | 1536 | OPENAI_API_KEY |
| gemini | text-embedding-004 | 768 | GOOGLE_API_KEY |
| cohere | embed-english-v3.0 | 1024 | COHERE_API_KEY |
| ollama | nomic-embed-text | 768 | None (local) |
| local | Xenova/all-MiniLM-L6-v2 | 384 | None (local) |
Local embeddings (no API key needed):
{
  "embedding": {
    "provider": "local",
    "model": "Xenova/all-MiniLM-L6-v2",
    "dimensions": 384
  }
}
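Since `model` and `dimensions` default to values that vary by provider, the provider table above can be read as a lookup of defaults. The sketch below encodes that table and fills in missing fields. The names `PROVIDER_DEFAULTS` and `resolveEmbedding` are invented for illustration, and the real loader may resolve defaults differently.

```typescript
type Provider = "openai" | "gemini" | "cohere" | "ollama" | "local";

interface ProviderDefaults {
  model: string;
  dimensions: number;
  apiKeyEnv: string | null; // null: no API key needed (runs locally)
}

// Values copied from the "Supported Providers" table.
const PROVIDER_DEFAULTS: Record<Provider, ProviderDefaults> = {
  openai: { model: "text-embedding-3-small", dimensions: 1536, apiKeyEnv: "OPENAI_API_KEY" },
  gemini: { model: "text-embedding-004", dimensions: 768, apiKeyEnv: "GOOGLE_API_KEY" },
  cohere: { model: "embed-english-v3.0", dimensions: 1024, apiKeyEnv: "COHERE_API_KEY" },
  ollama: { model: "nomic-embed-text", dimensions: 768, apiKeyEnv: null },
  local: { model: "Xenova/all-MiniLM-L6-v2", dimensions: 384, apiKeyEnv: null },
};

interface EmbeddingConfig {
  provider?: Provider;
  model?: string;
  dimensions?: number;
}

// Fill in whatever the user left out, falling back to the provider's defaults.
function resolveEmbedding(config: EmbeddingConfig) {
  const provider = config.provider ?? "openai";
  const defaults = PROVIDER_DEFAULTS[provider];
  return {
    provider,
    model: config.model ?? defaults.model,
    dimensions: config.dimensions ?? defaults.dimensions,
    apiKeyEnv: defaults.apiKeyEnv,
  };
}

console.log(resolveEmbedding({ provider: "local" }));
```

Keeping `dimensions` consistent with the chosen model matters because the stored vectors and the query vectors must have the same length.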
discovery
| Option | Default | Description |
|---|---|---|
| defaultTopK | 5 | Default number of tools returned |
| minSimilarity | 0.3 | Minimum similarity threshold (0-1) |
| liteMode | false | Maximum savings: only expose discover_tools initially |
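To make the interaction between `defaultTopK` and `minSimilarity` concrete, here is a small sketch of how scored results might be filtered and truncated. `ScoredTool` and `selectTools` are illustrative names, and treating the threshold as inclusive is an assumption.

```typescript
interface ScoredTool {
  name: string;
  similarity: number; // 0-1, higher means more relevant
}

interface DiscoveryOptions {
  defaultTopK: number;
  minSimilarity: number;
}

// Drop results below the threshold, sort best-first, keep at most `limit`
// results (or defaultTopK when the caller gives no limit).
function selectTools(
  scored: ScoredTool[],
  options: DiscoveryOptions,
  limit?: number,
): ScoredTool[] {
  const topK = limit ?? options.defaultTopK;
  return scored
    .filter((tool) => tool.similarity >= options.minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
}

const selected = selectTools(
  [
    { name: "read_file", similarity: 0.12 },
    { name: "notion-fetch", similarity: 0.87 },
    { name: "notion-update-page", similarity: 0.94 },
  ],
  { defaultTopK: 5, minSimilarity: 0.3 },
);
console.log(selected.map((tool) => tool.name)); // notion-update-page, notion-fetch
```

Raising `minSimilarity` trades recall for fewer irrelevant tools in the response; lowering `defaultTopK` caps how much context each discovery call can consume.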
storage
| Option | Default | Description |
|---|---|---|
| path | ~/.context-saver/lancedb | LanceDB storage location |
| reindexOnStart | false | Force reindex on every startup |
backends
Each backend can be:
STDIO (local process):
{
  "type": "stdio",
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path"],
  "env": { "KEY": "value" }
}
Remote (HTTP - coming soon):
{
  "type": "remote",
  "url": "https://mcp.example.com",
  "headers": { "Authorization": "Bearer ..." }
}
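The two backend shapes above map naturally onto a discriminated union keyed on `type`. The types and the `describeBackend` helper below are a sketch inferred from the JSON examples, not the project's actual config types (those live under `src/config/`).

```typescript
interface StdioBackend {
  type: "stdio";
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface RemoteBackend {
  type: "remote";
  url: string;
  headers?: Record<string, string>;
}

type BackendConfig = StdioBackend | RemoteBackend;

// Narrowing on `type` gives type-safe access to the fields of each variant.
function describeBackend(name: string, backend: BackendConfig): string {
  switch (backend.type) {
    case "stdio": {
      const commandLine = [backend.command, ...(backend.args ?? [])].join(" ");
      return `${name}: spawn "${commandLine}"`;
    }
    case "remote":
      return `${name}: connect to ${backend.url}`;
  }
}

const description = describeBackend("filesystem", {
  type: "stdio",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"],
});
console.log(description);
```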
Built-in Tools
context-saver exposes six meta-tools:
discover_tools
Semantic search for relevant tools.
discover_tools({ query: "search google drive", limit: 5 })
list_all_tools
List all available tools grouped by server.
list_all_tools()
tool_info
Get detailed information about a specific tool including full parameter schema.
tool_info({ tool_name: "notion-update-page" })
similar_tools
Find tools similar to one you already know.
similar_tools({ tool_name: "read_file", limit: 5 })
tools_by_category
List tools filtered by category.
tools_by_category({ category: "filesystem" })
Categories: filesystem, documents, spreadsheets, presentations, images, calendar, messaging, database, browser, version-control
server_stats
Get statistics about context-saver including connected backends, indexed tools, and usage stats.
server_stats()
Lite Mode
For maximum token savings, enable liteMode:
{
  "discovery": {
    "liteMode": true
  }
}
In lite mode:
- Only discover_tools and list_all_tools are exposed initially (~500 tokens)
- All backend tools are still available and routed correctly
- Use discover_tools to find what you need
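The lite mode behavior described above can be pictured as a choice of which tool names are advertised at startup, while calls to any backend tool keep working. The sketch below is illustrative only: `initialToolList` is a made-up helper, and what standard mode advertises (here, all meta-tools plus backend tools) is an assumption.

```typescript
// The six meta-tools listed under "Built-in Tools".
const META_TOOLS = [
  "discover_tools",
  "list_all_tools",
  "tool_info",
  "similar_tools",
  "tools_by_category",
  "server_stats",
];

// The two tools the README says are exposed initially in lite mode.
const LITE_MODE_TOOLS = ["discover_tools", "list_all_tools"];

// Decide which tool names to advertise to the client at startup.
function initialToolList(liteMode: boolean, backendTools: string[]): string[] {
  if (liteMode) {
    return [...LITE_MODE_TOOLS];
  }
  return [...META_TOOLS, ...backendTools]; // assumed standard-mode behavior
}

const backendTools = ["notion-update-page", "notion-fetch", "read_file"];
console.log(initialToolList(true, backendTools)); // only the two lite-mode tools
console.log(initialToolList(false, backendTools).length); // 9
```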
How It Works
- Startup: Connects to all backend MCP servers and indexes their tools
- Indexing: Creates embeddings for each tool using the configured embedding provider (OpenAI by default)
- Storage: Stores embeddings in LanceDB for fast vector search
- Discovery: When you call discover_tools, performs a cosine similarity search
- Routing: Tool calls are routed to the correct backend server
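The discovery and routing steps above can be sketched in a few lines. This is an in-memory stand-in under stated assumptions: the real project stores vectors in LanceDB and gets them from an embedding provider, whereas here the three-dimensional vectors are hand-written and the names `IndexedTool`, `search`, `buildRoutes`, and `routeCall` are hypothetical.

```typescript
interface IndexedTool {
  name: string;
  server: string; // backend that owns the tool
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Discovery: score every indexed tool against the query vector, best first.
function search(query: number[], index: IndexedTool[], topK: number) {
  return index
    .map((tool) => ({ tool, score: cosineSimilarity(query, tool.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Routing: map each tool name to the backend server that provides it.
function buildRoutes(index: IndexedTool[]): Map<string, string> {
  return new Map(index.map((tool): [string, string] => [tool.name, tool.server]));
}

function routeCall(routes: Map<string, string>, toolName: string): string {
  const server = routes.get(toolName);
  if (server === undefined) {
    throw new Error(`Unknown tool: ${toolName}`);
  }
  return server;
}

// Toy index with made-up embeddings.
const toolIndex: IndexedTool[] = [
  { name: "notion-update-page", server: "notion", embedding: [1, 0, 0] },
  { name: "notion-fetch", server: "notion", embedding: [0.8, 0, 0.6] },
  { name: "read_file", server: "filesystem", embedding: [0, 1, 0] },
];

const results = search([0.9, 0.1, 0], toolIndex, 2);
const routes = buildRoutes(toolIndex);
console.log(results.map((r) => r.tool.name)); // notion-update-page, notion-fetch
console.log(routeCall(routes, "read_file")); // prints "filesystem"
```

A linear scan like this is fine for a few hundred tools; a vector store earns its keep through persistence between runs and indexed search as the tool count grows.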
Development
git clone https://github.com/msuther898/context-saver.git
cd context-saver
npm install
npm run build
npm start
Project Structure
src/
├── index.ts # Entry point
├── server.ts # MCP server + handlers
├── client-pool.ts # Backend connections
├── config/ # Config types + loader
├── discovery/
│ ├── indexer.ts # Tool indexing with synonyms
│ └── search.ts # Vector search + re-ranking
├── embeddings/
│ ├── index.ts # Provider factory
│ ├── openai.ts # OpenAI embeddings
│ ├── gemini.ts # Google Gemini embeddings
│ ├── cohere.ts # Cohere embeddings
│ ├── ollama.ts # Ollama local embeddings
│ └── local.ts # Transformers.js embeddings
└── storage/
└── lancedb.ts # LanceDB vector storage
Roadmap
- [x] Ollama embeddings support
- [x] Local embeddings (transformers.js)
- [x] Gemini embeddings support
- [x] Cohere embeddings support
- [x] Usage tracking and popularity boosting
- [x] Re-ranking with multiple signals
- [x] Category-based tool filtering
- [ ] Remote HTTP backend support
- [ ] Tool result caching
- [ ] Persistent usage stats
License
MIT
Credits
Built by @msuther898 with Claude.