MCP Local LLM Server

Processes large files, error logs, and codebase searches locally via Ollama to reduce token consumption in Cursor by up to 80%.

MCP Local LLM Server - Context Compressor

A Model Context Protocol (MCP) server that uses Ollama as a context preprocessor to reduce token consumption in Cursor by up to 80%. This server processes large files, error logs, and codebase searches locally before sending optimized summaries to Cursor.

Features

🤖 Ollama Integration: Uses Ollama for local LLM processing
📉 Token Reduction: Reduces token usage by processing context locally
🔧 Multiple Tools: Context compression, code analysis, log processing, and semantic search
📝 MCP Prompts: Dynamic instruction injection for Cursor
📦 MCP Resources: Read-only data resources (config, models, tools, prompts, stats)
⚙️ Configurable: Customizable model, temperature, and token limits
🚀 Easy Setup: Simple installation and configuration
📡 MCP Compatible: Works with any MCP-compatible client
🔒 Privacy: Processes sensitive data locally, only sends summaries to Cursor
✅ Full MCP Support: Implements all core MCP capabilities (Tools, Prompts, Resources)

Prerequisites

Node.js (version 18 or higher)
Ollama installed and running
- Download from: https://ollama.ai/
- Install and start Ollama service
- Pull at least one model: ollama pull llama3

Installation

1. Clone or download this repository
2. Install dependencies:
```
npm install
```

Configuration

The server can be configured using environment variables in the MCP client configuration file.

Provider Selection

Set LLM_PROVIDER to choose which LLM provider to use:

# Select provider: 'ollama', 'openai', 'anthropic', or 'gemini'
export LLM_PROVIDER=ollama

Provider-Specific Configuration

Ollama (default):

export LLM_PROVIDER=ollama
export OLLAMA_URL=http://localhost:11434
export MODEL_NAME=llama3

OpenAI:

export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-your-api-key
export MODEL_NAME=gpt-3.5-turbo

Anthropic:

export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-your-api-key
export MODEL_NAME=claude-3-haiku-20240307

Gemini:

export LLM_PROVIDER=gemini
export GEMINI_API_KEY=your-api-key
export MODEL_NAME=gemini-1.5-flash

Common Settings

# Maximum tokens in response (default: 256)
export MAX_TOKENS=256

# Temperature for response generation (default: 0.7)
export TEMPERATURE=0.7

See MCP_CONFIGURATION.md for detailed configuration examples.
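
As a rough illustration of how these variables fit together, the sketch below reads them from an environment object and applies the documented defaults. It is not the server's actual code: `loadConfig` is a hypothetical name, and the fallbacks for `OLLAMA_URL` and `MODEL_NAME` are assumptions taken from the example values above.

```javascript
// Illustrative sketch only: the server's real configuration code may differ.
const SUPPORTED_PROVIDERS = ['ollama', 'openai', 'anthropic', 'gemini'];

function loadConfig(env = process.env) {
  const provider = env.LLM_PROVIDER || 'ollama'; // Ollama is the documented default
  if (!SUPPORTED_PROVIDERS.includes(provider)) {
    throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }

  // Documented defaults: MAX_TOKENS=256, TEMPERATURE=0.7
  const maxTokens = Number.parseInt(env.MAX_TOKENS ?? '256', 10);
  const temperature = Number.parseFloat(env.TEMPERATURE ?? '0.7');
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new Error('MAX_TOKENS must be a positive integer');
  }
  if (Number.isNaN(temperature) || temperature < 0) {
    throw new Error('TEMPERATURE must be a non-negative number');
  }

  return {
    provider,
    // Assumed fallbacks, copied from the Ollama example values above.
    ollamaUrl: env.OLLAMA_URL || 'http://localhost:11434',
    modelName: env.MODEL_NAME || 'llama3',
    maxTokens,
    temperature,
  };
}
```

Calling `loadConfig({ MAX_TOKENS: '512' })` would keep every other default and raise only the response limit.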

Usage

Starting the Server

# Start the MCP server
npm start

# Or for development with auto-restart
npm run dev

Available Tools

The server provides multiple tools for context compression and code analysis:

1. `analyze_huge_file`

Analyzes large files locally and returns a structured summary covering architecture, global variables, entry points, and main logic. Because the file is processed locally, only the summary is sent to Cursor, which reduces token usage.

Parameters:

path (required): Path to the file to analyze

Example:

{
  "name": "analyze_huge_file",
  "arguments": {
    "path": "/path/to/large-file.js"
  }
}

Returns: JSON with architecture, global_variables, entry_points, main_logic, and original_size

2. `digest_error_logs`

Processes error logs locally to identify patterns, remove repetitive timestamps, and group similar errors. Returns a structured summary with probable cause and statistics.

Parameters:

log_file_path (optional): Path to the log file
terminal_output (optional): Direct terminal output content

Example:

{
  "name": "digest_error_logs",
  "arguments": {
    "log_file_path": "/path/to/error.log"
  }
}

Returns: JSON with probable_cause, occurrences, period, error_types, and recommendation
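
The pre-processing described above (dropping repeated timestamps and grouping similar errors) can be approximated without a model. The sketch below is not the tool's implementation: `groupLogLines` and the timestamp pattern are assumptions chosen for illustration, and real logs may need a different pattern.

```javascript
// Hypothetical helper, not part of the server's code.
// Matches a leading ISO-8601-style timestamp, optionally wrapped in brackets.
const TIMESTAMP_PREFIX =
  /^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?\]?\s*/;

function groupLogLines(logText) {
  const counts = new Map();
  for (const rawLine of logText.split(/\r?\n/)) {
    // Strip the timestamp so otherwise identical messages compare equal.
    const message = rawLine.replace(TIMESTAMP_PREFIX, '').trim();
    if (message === '') continue;
    counts.set(message, (counts.get(message) || 0) + 1);
  }
  // Most frequent messages first.
  return [...counts.entries()]
    .map(([message, occurrences]) => ({ message, occurrences }))
    .sort((a, b) => b.occurrences - a.occurrences);
}
```

Grouping like this shrinks a noisy log to one entry per distinct message, which is the kind of reduction that keeps the summary small.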

3. `codebase_discovery`

Performs semantic search in the codebase to find files and specific lines where related logic is implemented. Uses local processing to reduce token usage.

Parameters:

query (required): Semantic query about the code (e.g., "where is payment processed?")
root_path (optional): Root directory path to search in (default: current directory)

Example:

{
  "name": "codebase_discovery",
  "arguments": {
    "query": "where is payment processed?",
    "root_path": "/path/to/project"
  }
}

Returns: JSON with files (array of file references with line numbers), total_occurrences, and summary

4. `ask_llm`

Sends a question or prompt to the AI model running via Ollama and returns its response.

Parameters:

question (required): The question or prompt to send to the AI model

Example:

{
  "name": "ask_llm",
  "arguments": {
    "question": "What is the capital of France?"
  }
}

5. `check_llm_status`

Checks whether Ollama is running and accessible.

Example:

{
  "name": "check_llm_status",
  "arguments": {}
}
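
For a quick check outside the MCP server, Ollama's HTTP API can be queried directly. The sketch below assumes Node.js 18+ (for the global `fetch`) and uses Ollama's `/api/tags` endpoint, which lists locally pulled models. `checkOllama` is a hypothetical helper, and whether the tool itself uses this endpoint is not stated here.

```javascript
// Hypothetical helper: probes an Ollama instance and lists its models.
// fetchImpl is injectable so the function can be exercised without a network.
async function checkOllama(baseUrl = 'http://localhost:11434', fetchImpl = fetch) {
  try {
    const response = await fetchImpl(new URL('/api/tags', baseUrl));
    if (!response.ok) {
      return { running: false, models: [], error: `HTTP ${response.status}` };
    }
    const body = await response.json();
    const models = (body.models || []).map((model) => model.name);
    return { running: true, models };
  } catch (err) {
    // Typically a connection error when Ollama is not running.
    return { running: false, models: [], error: err.message };
  }
}
```

Usage: `checkOllama().then((status) => console.log(status));`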

6. `think_through`

Adds an extra thinking layer by analyzing tasks, considering multiple approaches, and providing structured reasoning before execution.

Parameters:

task (required): The task, question, or problem to think through
context (optional): Additional context about the situation
focus_areas (optional): Specific areas to focus on (e.g., ["security", "performance"])
output_format (optional): Format of output - "plan", "analysis", "considerations", or "structured" (default)

Example:

{
  "name": "think_through",
  "arguments": {
    "task": "Refactor authentication to use JWT",
    "context": "Current: session-based, Node.js/Express",
    "focus_areas": ["security", "maintainability"],
    "output_format": "structured"
  }
}
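
A caller may want to validate these arguments before sending them. The sketch below applies the rules listed above (required `task`, optional fields, `"structured"` as the default format). `normalizeThinkThroughArgs` is a hypothetical client-side helper, and the server's own validation may be stricter or looser.

```javascript
// Allowed values for output_format, as listed in the parameter description.
const OUTPUT_FORMATS = ['plan', 'analysis', 'considerations', 'structured'];

// Hypothetical helper: validates think_through arguments and fills defaults.
function normalizeThinkThroughArgs(args) {
  if (typeof args.task !== 'string' || args.task.trim() === '') {
    throw new TypeError('"task" is required and must be a non-empty string');
  }
  const outputFormat = args.output_format ?? 'structured';
  if (!OUTPUT_FORMATS.includes(outputFormat)) {
    throw new RangeError(`"output_format" must be one of: ${OUTPUT_FORMATS.join(', ')}`);
  }
  const focusAreas = args.focus_areas ?? [];
  if (!Array.isArray(focusAreas) || focusAreas.some((area) => typeof area !== 'string')) {
    throw new TypeError('"focus_areas" must be an array of strings');
  }

  const normalized = { task: args.task, focus_areas: focusAreas, output_format: outputFormat };
  if (args.context !== undefined) {
    normalized.context = String(args.context);
  }
  return normalized;
}
```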

Available Prompts

The server provides MCP prompts that inject dynamic instructions into Cursor:

mcp_tool_usage_rules: Mandatory rules for using MCP tools instead of direct actions
token_economy_guidelines: Guidelines for maximizing token savings
thinking_layer_instructions: Instructions for using the thinking layer
context_compression_rules: Rules for using context compression tools
chat_end_summary_rule: Automatically stores chat summaries using the memory_store tool (can be disabled via DISABLE_CHAT_SUMMARY_RULE)

Note: The chat_end_summary_rule prompt is automatically available to all projects using this MCP server. To disable it, set the environment variable DISABLE_CHAT_SUMMARY_RULE=true in your MCP configuration.

Available Resources

The server exposes read-only resources via MCP:

mcp://local-llm/config: Current server configuration
mcp://local-llm/models: List of available Ollama models
mcp://local-llm/tools: List of all available tools
mcp://local-llm/prompts: List of all available prompts
mcp://local-llm/usage_stats: Usage statistics and token savings info

Example:

{
  "method": "resources/read",
  "params": {
    "uri": "mcp://local-llm/config"
  }
}
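
The example above shows only `method` and `params`. A complete JSON-RPC 2.0 request also carries `jsonrpc` and `id` fields, as in the commands under Testing the Server. The sketch below builds such a request for the resource URIs listed above. `buildResourceRead` is a hypothetical helper, not something the server exports.

```javascript
// Resource URIs as listed in this README.
const RESOURCE_URIS = [
  'mcp://local-llm/config',
  'mcp://local-llm/models',
  'mcp://local-llm/tools',
  'mcp://local-llm/prompts',
  'mcp://local-llm/usage_stats',
];

// Hypothetical helper: builds a full JSON-RPC 2.0 resources/read request.
function buildResourceRead(id, uri) {
  if (!RESOURCE_URIS.includes(uri)) {
    throw new Error(`Unknown resource URI: ${uri}`);
  }
  return { jsonrpc: '2.0', id, method: 'resources/read', params: { uri } };
}
```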

MCP Client Integration

To use this server with an MCP client (like Cursor), add it to your client configuration.

Basic Configuration (Ollama)

{
  "mcpServers": {
    "local-llm": {
      "command": "node",
      "args": ["path/to/your/mcp-local-llm/src/index.js"],
      "env": {
        "LLM_PROVIDER": "ollama",
        "OLLAMA_URL": "http://localhost:11434",
        "MODEL_NAME": "llama3"
      }
    }
  }
}

Using Different Providers

The server supports multiple LLM providers. Set LLM_PROVIDER to switch:

OpenAI:

{
  "env": {
    "LLM_PROVIDER": "openai",
    "OPENAI_API_KEY": "sk-your-key",
    "MODEL_NAME": "gpt-3.5-turbo"
  }
}

Anthropic:

{
  "env": {
    "LLM_PROVIDER": "anthropic",
    "ANTHROPIC_API_KEY": "sk-ant-your-key",
    "MODEL_NAME": "claude-3-haiku-20240307"
  }
}

Gemini:

{
  "env": {
    "LLM_PROVIDER": "gemini",
    "GEMINI_API_KEY": "your-key",
    "MODEL_NAME": "gemini-1.5-flash"
  }
}

See MCP_CONFIGURATION.md for the complete configuration guide.
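
The provider snippets above show only the `env` block. As a sketch of how one combines with the `mcpServers` entry from the basic configuration, the hypothetical `buildClientConfig` helper below assembles a full entry for any of the four providers. The helper name and its option names are assumptions for illustration.

```javascript
// Environment variable that holds the API key for each hosted provider.
const API_KEY_VARS = {
  openai: 'OPENAI_API_KEY',
  anthropic: 'ANTHROPIC_API_KEY',
  gemini: 'GEMINI_API_KEY',
};

// Hypothetical helper: builds a complete mcpServers entry for one provider.
function buildClientConfig({ provider, modelName, indexPath, apiKey, ollamaUrl = 'http://localhost:11434' }) {
  const env = { LLM_PROVIDER: provider };
  if (provider === 'ollama') {
    env.OLLAMA_URL = ollamaUrl;
  } else if (Object.hasOwn(API_KEY_VARS, provider)) {
    if (!apiKey) {
      throw new Error(`${API_KEY_VARS[provider]} is required for ${provider}`);
    }
    env[API_KEY_VARS[provider]] = apiKey;
  } else {
    throw new Error(`Unknown provider: ${provider}`);
  }
  env.MODEL_NAME = modelName;

  return {
    mcpServers: {
      'local-llm': { command: 'node', args: [indexPath], env },
    },
  };
}
```

Passing the result through `JSON.stringify(config, null, 2)` yields text that can be pasted into the client configuration file.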

Troubleshooting

Common Issues

Connection Refused Error
- Make sure Ollama is running: start it with ollama serve, or check that the service is already running
- Verify that Ollama is accessible at http://localhost:11434
- Check that Ollama is installed: ollama --version

No Models Available
- Pull a model: ollama pull llama3
- Check available models: ollama list
- Recommended models: llama3, deepseek-coder, codellama, mistral

Timeout Errors
- Large files may take time to process (max 15 seconds)
- Consider using smaller models for faster responses
- Check Ollama resource allocation

Tool Errors
- Tools return generic error messages (identity hiding)
- Check server logs for detailed error information
- Verify that file paths are correct and accessible

Testing the Server

You can test the server manually by sending MCP requests:

# Test checking Ollama status
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "check_llm_status", "arguments": {}}}' | node src/index.js

# Test asking a question
echo '{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "ask_llm", "arguments": {"question": "Hello, how are you?"}}}' | node src/index.js

# Test analyzing a file
echo '{"jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": {"name": "analyze_huge_file", "arguments": {"path": "src/index.js"}}}' | node src/index.js
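
The MCP specification defines an initialization handshake (an `initialize` request followed by a `notifications/initialized` notification) before other requests. Whether this server requires it for manual tests is not stated here. The sketch below builds a newline-delimited session that includes the handshake. `buildSession` and the `clientInfo` values are hypothetical, and the protocol version shown is an assumption that may need adjusting.

```javascript
// Hypothetical helper: builds a newline-delimited JSON-RPC session
// consisting of the MCP handshake followed by one tools/call request.
function buildSession(toolName, toolArguments = {}) {
  const messages = [
    {
      jsonrpc: '2.0',
      id: 1,
      method: 'initialize',
      params: {
        protocolVersion: '2024-11-05', // assumed; use the version your server supports
        capabilities: {},
        clientInfo: { name: 'manual-test', version: '0.0.0' },
      },
    },
    // Notifications carry no id because no response is expected.
    { jsonrpc: '2.0', method: 'notifications/initialized' },
    {
      jsonrpc: '2.0',
      id: 2,
      method: 'tools/call',
      params: { name: toolName, arguments: toolArguments },
    },
  ];
  return messages.map((message) => JSON.stringify(message)).join('\n') + '\n';
}

// Example: process.stdout.write(buildSession('check_llm_status'));
```

Saved to a script, the output can be piped into the server the same way as the echo commands above.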

Development

Project Structure

mcp-local-llm/
├── src/
│   ├── index.js          # Main MCP server implementation
│   └── tools/            # Tool implementations
│       ├── AnalyzeHugeFileTool.js
│       ├── DigestErrorLogsTool.js
│       ├── CodebaseDiscoveryTool.js
│       └── ... (other tools)
├── package.json          # Dependencies and scripts
└── README.md            # This file

Adding New Tools

To add new tools:

1. Create a new tool class extending BaseTool in src/tools/
2. Implement getToolDefinition() and handle() methods
3. Add the tool to src/tools/index.js exports and ALL_TOOLS array

The tool will be automatically registered with the MCP server
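
A minimal sketch of what such a tool might look like follows. The real `BaseTool` interface is not shown in this README, so the stand-in class, the `word_count` tool, and the shapes returned by `getToolDefinition()` and `handle()` are all assumptions. The result follows the general MCP tool-result layout (a `content` array of text items), which the project's base class may wrap differently.

```javascript
// Stand-in for the project's BaseTool so this sketch runs on its own.
// In the repository, import the real class from src/tools/ instead.
class BaseTool {}

// Hypothetical example tool: counts the words in a piece of text.
class WordCountTool extends BaseTool {
  getToolDefinition() {
    return {
      name: 'word_count',
      description: 'Counts the words in a piece of text',
      inputSchema: {
        type: 'object',
        properties: {
          text: { type: 'string', description: 'Text to count words in' },
        },
        required: ['text'],
      },
    };
  }

  async handle(args) {
    // Split on runs of whitespace and drop empty fragments.
    const words = String(args.text).trim().split(/\s+/).filter(Boolean);
    return {
      content: [{ type: 'text', text: JSON.stringify({ words: words.length }) }],
    };
  }
}
```

After exporting the class from src/tools/index.js and adding it to the ALL_TOOLS array, the tool would be picked up as described above.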

License

MIT License - feel free to use and modify as needed.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.
