Memory Cache Server
A Model Context Protocol (MCP) server that optimizes token usage by caching data during language model interactions. It works with any language model and any MCP client.
Tools
store_data
Store data in the cache with an optional TTL
retrieve_data
Retrieve data from the cache
clear_cache
Clear specific or all cache entries
get_cache_stats
Get cache statistics
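Clients invoke these tools through standard MCP tools/call requests. As a sketch, a store_data call might look like the following; the argument names (key, value, ttl) are illustrative assumptions, so check the server's tool schema for the exact shape:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store_data",
    "arguments": {
      "key": "report-summary",
      "value": "First 500 lines of analysis...",
      "ttl": 3600
    }
  }
}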
Installation
Installing via Smithery
To install Memory Cache Server for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @tosin2013/mcp-memory-cache-server --client claude
Installing Manually
- Clone the repository:
git clone https://github.com/tosin2013/mcp-memory-cache-server.git
cd mcp-memory-cache-server
- Install dependencies:
npm install
- Build the project:
npm run build
- Add to your MCP client settings:
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/mcp-memory-cache-server/build/index.js"]
    }
  }
}
- The server will automatically start when you use your MCP client
Verifying It Works
When the server is running properly, you'll see:
- A message in the terminal: "Memory Cache MCP server running on stdio"
- Improved performance when accessing the same data multiple times
- No action required from you - the caching happens automatically
You can verify the server is running by:
- Opening your MCP client
- Looking for any error messages in the terminal where you started the server
- Performing operations that would benefit from caching (like reading the same file multiple times)
Configuration
The server can be configured through config.json or environment variables:
{
  "maxEntries": 1000,      // Maximum number of items in cache
  "maxMemory": 104857600,  // Maximum memory usage in bytes (100MB)
  "defaultTTL": 3600,      // Default time-to-live in seconds (1 hour)
  "checkInterval": 60000,  // Cleanup interval in milliseconds (1 minute)
  "statsInterval": 30000   // Stats update interval in milliseconds (30 seconds)
}
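Note that the // comments above are annotations for this README only; strict JSON does not allow comments, so a real config.json should contain just the key-value pairs.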
Configuration Settings Explained
maxEntries (default: 1000)
- Maximum number of items that can be stored in cache
- Prevents cache from growing indefinitely
- When exceeded, the least recently used items are removed first
maxMemory (default: 100MB)
- Maximum memory usage in bytes
- Prevents excessive memory consumption
- When exceeded, least recently used items are removed
defaultTTL (default: 1 hour)
- How long items stay in cache by default
- Items are automatically removed after this time
- Prevents stale data from consuming memory
checkInterval (default: 1 minute)
- How often the server checks for expired items
- Lower values remove expired items sooner, keeping memory usage closer to the configured limits
- Higher values reduce CPU usage
statsInterval (default: 30 seconds)
- How often cache statistics are updated
- Affects accuracy of hit/miss rates
- Helps monitor cache effectiveness
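For a concrete picture of how these settings interact, here is a minimal TypeScript sketch of a TTL-plus-LRU cache. It is an illustration of the general technique under assumed names, not the server's actual implementation:

interface Entry {
  value: unknown;
  expiresAt: number;   // epoch ms after which the entry is stale
  lastAccess: number;  // epoch ms of last read, used for LRU eviction
}

const cache = new Map<string, Entry>();
const defaultTTL = 3600;      // seconds, as in config.json
const checkInterval = 60000;  // ms between cleanup passes
const maxEntries = 1000;

function set(key: string, value: unknown, ttlSeconds = defaultTTL): void {
  // When the entry cap is reached, evict the least recently used item first.
  if (cache.size >= maxEntries) {
    let lruKey: string | undefined;
    let lruTime = Infinity;
    for (const [k, e] of cache) {
      if (e.lastAccess < lruTime) {
        lruTime = e.lastAccess;
        lruKey = k;
      }
    }
    if (lruKey !== undefined) cache.delete(lruKey);
  }
  const now = Date.now();
  cache.set(key, { value, expiresAt: now + ttlSeconds * 1000, lastAccess: now });
}

function get(key: string): unknown {
  const entry = cache.get(key);
  if (!entry || entry.expiresAt <= Date.now()) return undefined;
  entry.lastAccess = Date.now();
  return entry.value;
}

// Periodic sweep: checkInterval controls how quickly expired items are freed.
setInterval(() => {
  const now = Date.now();
  for (const [k, e] of cache) {
    if (e.expiresAt <= now) cache.delete(k);
  }
}, checkInterval);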
How It Reduces Token Consumption
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special; caching happens automatically whenever you interact with a language model through your MCP client.
Here are some examples of what gets cached:
1. File Content Caching
When reading a file multiple times:
- First time: Full file content is read and cached
- Subsequent times: Content is retrieved from cache instead of re-reading the file
- Result: Fewer tokens used for repeated file operations
2. Computation Results
When performing calculations or analysis:
- First time: Full computation is performed and results are cached
- Subsequent times: Results are retrieved from cache if the input is the same
- Result: Fewer tokens used for repeated computations
3. Frequently Accessed Data
When the same data is needed multiple times:
- First time: Data is processed and cached
- Subsequent times: Data is retrieved from cache until TTL expires
- Result: Fewer tokens used for accessing the same information
Automatic Cache Management
The server automatically manages the caching process by:
- Storing data when first encountered
- Serving cached data when available
- Removing old/unused data based on settings
- Tracking effectiveness through statistics
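From a client's perspective, this amounts to a classic get-or-compute pattern. A sketch, assuming a generic callTool helper and the illustrative argument names from earlier (neither is part of the documented API):

// Sketch only: `callTool` and the argument names are assumptions.
async function getOrCompute(
  callTool: (name: string, args: object) => Promise<unknown>,
  key: string,
  compute: () => Promise<unknown>
): Promise<unknown> {
  // Serve cached data when available.
  const cached = await callTool("retrieve_data", { key });
  if (cached !== null && cached !== undefined) {
    return cached;
  }
  // Otherwise compute once, store it, and let later calls hit the cache.
  const value = await compute();
  await callTool("store_data", { key, value, ttl: 3600 });
  return value;
}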
Optimization Tips
1. Set Appropriate TTLs
- Shorter for frequently changing data
- Longer for static content
2. Adjust Memory Limits
- Higher for more caching (more token savings)
- Lower if memory usage is a concern
3. Monitor Cache Stats
- High hit rate = good token savings
- Low hit rate = adjust TTL or limits
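You can read the hit rate from the get_cache_stats tool. The request below is a standard MCP tools/call; the response fields shown are hypothetical, so inspect the server's actual output:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": { "name": "get_cache_stats", "arguments": {} }
}

A hypothetical response such as { "hits": 120, "misses": 30 } would indicate an 80% hit rate: four out of five lookups were served from cache instead of being re-sent to the model.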
Environment Variable Configuration
You can override config.json settings using environment variables in your MCP settings:
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/build/index.js"],
      "env": {
        "MAX_ENTRIES": "5000",
        "MAX_MEMORY": "209715200",   // 200MB
        "DEFAULT_TTL": "7200",       // 2 hours
        "CHECK_INTERVAL": "120000",  // 2 minutes
        "STATS_INTERVAL": "60000"    // 1 minute
      }
    }
  }
}
You can also specify a custom config file location:
{
  "env": {
    "CONFIG_PATH": "/path/to/your/config.json"
  }
}
The server will:
- Look for config.json in its directory
- Apply any environment variable overrides
- Use default values if neither is specified
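Putting that resolution order into code, a TypeScript sketch of the precedence (defaults, then config file, then environment overrides) might look like this; it mirrors the description above rather than the server's actual source:

// Sketch of the configuration precedence described above; not the actual source.
import * as fs from "fs";
import * as path from "path";

const defaults = {
  maxEntries: 1000,
  maxMemory: 104857600,
  defaultTTL: 3600,
  checkInterval: 60000,
  statsInterval: 30000,
};

function loadConfig() {
  // 1. CONFIG_PATH wins; otherwise look for config.json next to the server.
  const file = process.env.CONFIG_PATH ?? path.join(__dirname, "config.json");
  const fromFile = fs.existsSync(file)
    ? JSON.parse(fs.readFileSync(file, "utf8"))
    : {};

  // 2. File values override defaults; environment variables override both.
  const config = { ...defaults, ...fromFile };
  if (process.env.MAX_ENTRIES) config.maxEntries = Number(process.env.MAX_ENTRIES);
  if (process.env.MAX_MEMORY) config.maxMemory = Number(process.env.MAX_MEMORY);
  if (process.env.DEFAULT_TTL) config.defaultTTL = Number(process.env.DEFAULT_TTL);
  return config;
}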
Testing the Cache in Practice
To see the cache in action, try these scenarios:
File Reading Test
- Read and analyze a large file
- Ask the same question about the file again
- The second response should be faster as the file content is cached
Data Analysis Test
- Perform analysis on some data
- Request the same analysis again
- The second analysis should use cached results
Project Navigation Test
- Explore a project's structure
- Query the same files/directories again
- Directory listings and file contents will be served from cache
The cache is working when you notice:
- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed