# SlimContext MCP Server
A Model Context Protocol (MCP) server that wraps the SlimContext library, providing AI chat history compression tools for MCP-compatible clients.
## Overview

SlimContext MCP Server exposes two compression strategies as MCP tools:

- `trim_messages` - token-based compression that removes the oldest messages when a token threshold is exceeded
- `summarize_messages` - AI-powered compression that uses OpenAI to condense older messages into a concise summary
## Installation

```bash
npm install -g slimcontext-mcp-server
# or
pnpm add -g slimcontext-mcp-server
```
## Development

```bash
# Clone and set up
git clone <repository>
cd slimcontext-mcp-server
pnpm install

# Build
pnpm build

# Run in development
pnpm dev

# Type checking
pnpm typecheck
```
## Configuration

### MCP Client Setup

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "slimcontext": {
      "command": "npx",
      "args": ["-y", "slimcontext-mcp-server"]
    }
  }
}
```
### Environment Variables

- `OPENAI_API_KEY`: OpenAI API key used for summarization (optional; can also be passed as a tool parameter)
## Tools

### trim_messages

Compresses chat history using a token-based trimming strategy.

**Parameters:**

- `messages` (required): Array of chat messages
- `maxModelTokens` (optional): Maximum model context window in tokens (default: 8192)
- `thresholdPercent` (optional): Fraction of the window (0-1) at which compression triggers (default: 0.7)
- `minRecentMessages` (optional): Minimum number of recent messages to preserve (default: 2)
**Example:**

```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I help you today?" },
    { "role": "user", "content": "Tell me about AI." }
  ],
  "maxModelTokens": 4000,
  "thresholdPercent": 0.8,
  "minRecentMessages": 2
}
```
**Response:**

```json
{
  "success": true,
  "original_message_count": 4,
  "compressed_message_count": 3,
  "messages_removed": 1,
  "compression_ratio": 0.75,
  "compressed_messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "assistant", "content": "Hi there! How can I help you today?" },
    { "role": "user", "content": "Tell me about AI." }
  ]
}
```
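For orientation, here is a minimal TypeScript sketch of calling this tool from an MCP client. It assumes the `@modelcontextprotocol/sdk` package's current import paths and that the server returns its JSON response as a text content block (standard MCP tool behavior); the client name is illustrative.

```typescript
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

// Spawn the server over stdio, mirroring the MCP client configuration above.
const transport = new StdioClientTransport({
  command: 'npx',
  args: ['-y', 'slimcontext-mcp-server'],
});

const client = new Client({ name: 'example-client', version: '1.0.0' });
await client.connect(transport);

const result = await client.callTool({
  name: 'trim_messages',
  arguments: {
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Hello!' },
      { role: 'assistant', content: 'Hi there! How can I help you today?' },
      { role: 'user', content: 'Tell me about AI.' },
    ],
    maxModelTokens: 4000,
    thresholdPercent: 0.8,
  },
});

// Assumption: the response JSON arrives as the first text content block.
const [block] = result.content as Array<{ type: string; text: string }>;
console.log(JSON.parse(block.text).compressed_messages);

await client.close();
```

`summarize_messages` is invoked the same way, with its own parameter set.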
### summarize_messages

Compresses chat history using an AI-powered summarization strategy.

**Parameters:**

- `messages` (required): Array of chat messages
- `maxModelTokens` (optional): Maximum model context window in tokens (default: 8192)
- `thresholdPercent` (optional): Fraction of the window (0-1) at which compression triggers (default: 0.7)
- `minRecentMessages` (optional): Minimum number of recent messages to preserve (default: 4)
- `openaiApiKey` (optional): OpenAI API key (can also be set via the `OPENAI_API_KEY` environment variable)
- `openaiModel` (optional): OpenAI model used for summarization (default: `gpt-4o-mini`)
- `customPrompt` (optional): Custom summarization prompt
**Example:**

```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "I want to build a web scraper." },
    {
      "role": "assistant",
      "content": "I can help you build a web scraper! What programming language would you prefer?"
    },
    { "role": "user", "content": "Python please." },
    {
      "role": "assistant",
      "content": "Great choice! For Python web scraping, I recommend using requests and BeautifulSoup..."
    },
    { "role": "user", "content": "Can you show me a simple example?" }
  ],
  "maxModelTokens": 4000,
  "thresholdPercent": 0.6,
  "minRecentMessages": 2,
  "openaiModel": "gpt-4o-mini"
}
```
**Response:**

```json
{
  "success": true,
  "original_message_count": 6,
  "compressed_message_count": 4,
  "messages_removed": 2,
  "summary_generated": true,
  "compression_ratio": 0.67,
  "compressed_messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    {
      "role": "system",
      "content": "The user expressed interest in building a web scraper and requested help with Python. The assistant recommended using requests and BeautifulSoup libraries for Python web scraping."
    },
    {
      "role": "assistant",
      "content": "Great choice! For Python web scraping, I recommend using requests and BeautifulSoup..."
    },
    { "role": "user", "content": "Can you show me a simple example?" }
  ]
}
```
## Message Format

Both tools expect messages in the SlimContext format:

```typescript
interface SlimContextMessage {
  role: 'system' | 'user' | 'assistant' | 'tool' | 'human';
  content: string;
}
```
## Error Handling

All tools return structured error responses:

```json
{
  "success": false,
  "error": "Error message description",
  "error_type": "SlimContextError" | "OpenAIError" | "UnknownError"
}
```
Common error scenarios:
- Missing OpenAI API key for summarization
- Invalid message format
- OpenAI API rate limits or errors
- Invalid parameter values
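When `success` is `false`, a client can branch on `error_type`. A minimal sketch; the type mirrors the error shape above, and the handler itself is hypothetical:

```typescript
type ToolError = {
  success: false;
  error: string;
  error_type: 'SlimContextError' | 'OpenAIError' | 'UnknownError';
};

// Hypothetical handler: route each error type to an appropriate reaction.
function handleToolError(response: ToolError): void {
  switch (response.error_type) {
    case 'OpenAIError':
      // e.g. missing API key or rate limits: retry with backoff or surface to the user
      console.error('OpenAI error:', response.error);
      break;
    case 'SlimContextError':
      // e.g. invalid message format or parameter values: fix the request
      console.error('SlimContext error:', response.error);
      break;
    default:
      console.error('Unknown error:', response.error);
  }
}
```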
## Token Estimation

SlimContext estimates tokens with a simple per-message heuristic: `Math.ceil(content.length / 4) + 2`. This is a reasonable approximation for most use cases; for more accurate counting, implement a custom token estimator in your client application.
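Written out in TypeScript, the heuristic looks like this (a standalone restatement of the formula above, not the library's exported API):

```typescript
// ~4 characters per token, plus a small per-message overhead.
function estimateTokens(content: string): number {
  return Math.ceil(content.length / 4) + 2;
}

console.log(estimateTokens('a'.repeat(100))); // 27 tokens for 100 characters
```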
## Compression Strategies

### Trimming Strategy

- Preserves all system messages
- Preserves the most recent N messages
- Removes the oldest non-system messages until the history is under the token threshold (see the sketch below)
- Fast and deterministic
- No external API dependencies
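A simplified sketch of that loop, reusing the interface from the Message Format section and the token heuristic above. This is illustrative only, not SlimContext's actual implementation:

```typescript
// Interface from the Message Format section above.
interface SlimContextMessage {
  role: 'system' | 'user' | 'assistant' | 'tool' | 'human';
  content: string;
}

// Same heuristic as the Token Estimation section, summed over the history.
const estimateTotal = (msgs: SlimContextMessage[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4) + 2, 0);

// Drop the oldest non-system, non-protected message until the
// estimated token count falls under the threshold.
function trimSketch(
  messages: SlimContextMessage[],
  maxModelTokens = 8192,
  thresholdPercent = 0.7,
  minRecentMessages = 2,
): SlimContextMessage[] {
  const budget = maxModelTokens * thresholdPercent;
  const result = [...messages];

  while (estimateTotal(result) > budget) {
    const cutoff = result.length - minRecentMessages;
    const idx = result.findIndex((m, i) => m.role !== 'system' && i < cutoff);
    if (idx === -1) break; // only system or protected recent messages remain
    result.splice(idx, 1);
  }

  return result;
}
```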
### Summarization Strategy

- Preserves all system messages
- Preserves the most recent N messages
- Summarizes the middle portion of the conversation using AI (see the sketch below)
- Creates contextually rich summaries
- Requires OpenAI API access
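The summarization step can be pictured as a single chat-completion call. A hedged sketch using the official `openai` package; the prompt text is illustrative, and SlimContext's internal wiring may differ:

```typescript
import OpenAI from 'openai';

// Illustrative only: summarize the middle portion of a conversation.
async function summarizeMiddle(
  middle: { role: string; content: string }[],
  model = 'gpt-4o-mini',
): Promise<string> {
  const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
  const transcript = middle.map((m) => `${m.role}: ${m.content}`).join('\n');

  const completion = await openai.chat.completions.create({
    model,
    messages: [
      {
        role: 'system',
        content:
          'Summarize the following conversation excerpt concisely, preserving key facts, decisions, and user goals.',
      },
      { role: 'user', content: transcript },
    ],
  });

  return completion.choices[0].message.content ?? '';
}
```

As the `summarize_messages` response above shows, the generated summary re-enters the history as a system message placed ahead of the preserved recent messages.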
## License

MIT
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Submit a pull request
## Related

- SlimContext - the underlying compression library
- Model Context Protocol - the protocol specification
- MCP SDK - the TypeScript SDK for MCP