SecureLLM MCP Server

Enables AI assistants to interact with NixOS development tools, manage builds, and optimize workflows through natural language.

Enterprise-Grade Model Context Protocol Server for Intelligent Development Workflows


Overview

SecureLLM MCP is a production-ready Model Context Protocol (MCP) server that gives AI assistants direct access to development tooling for NixOS and systems programming workflows. It combines semantic caching, reasoning systems, and a broad tool set with rate limiting, observability, and sandboxed execution.

Key Capabilities

  • Semantic Intelligence: 50-70% cost reduction through embedding-based query caching
  • Hybrid Reasoning: Context inference, multi-step planning, and causal impact analysis
  • Production-Ready: Circuit breakers, retry logic, structured logging, and Prometheus metrics
  • NixOS First-Class: Deep integration with Nix ecosystem - package debugging, flake management, build optimization
  • Emergency Framework: Laptop thermal protection during intensive builds
  • Knowledge Management: Persistent learning with SQLite + FTS5 full-text search
  • Security-Focused: SOPS secrets management, OAuth integration, sandboxed execution

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         MCP CLIENT (Claude, Cline)                  │
└────────────────────────────┬────────────────────────────────────────┘
                             │ stdio/HTTP
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    SecureLLM MCP Server Core                         │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐        │
│  │  Semantic      │  │  Smart Rate    │  │  Knowledge     │        │
│  │  Cache         │  │  Limiter       │  │  Database      │        │
│  │  (Embeddings)  │  │  (Circuit      │  │  (SQLite +     │        │
│  │                │  │   Breaker)     │  │   FTS5)        │        │
│  └────────────────┘  └────────────────┘  └────────────────┘        │
└─────────────────────────────────────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
┌──────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  Reasoning   │  │  Development     │  │  Infrastructure  │
│  Systems     │  │  Tools           │  │  Management      │
│              │  │                  │  │                  │
│ • Context    │  │ • Nix Package    │  │ • SSH Remote     │
│   Inference  │  │   Debugger       │  │   Execution      │
│ • Multi-Step │  │ • Build Analyzer │  │ • System Health  │
│   Planner    │  │ • Flake Ops      │  │   Monitoring     │
│ • Causal     │  │ • Web Search     │  │ • Emergency      │
│   Analysis   │  │ • Browser Auto   │  │   Framework      │
│ • Adaptive   │  │ • Research Agent │  │ • Backup Manager │
│   Learning   │  │ • Code Analysis  │  │ • Log Analysis   │
└──────────────┘  └──────────────────┘  └──────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Observability & Security                          │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐   │
│  │ Prometheus │  │ Structured │  │ OAuth/     │  │ Sandboxed  │   │
│  │ Metrics    │  │ Logging    │  │ GitHub     │  │ Execution  │   │
│  └────────────┘  └────────────┘  └────────────┘  └────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

Features

🧠 Intelligent Caching Layer

Semantic Cache - Industry-first embedding-based caching for MCP servers:

  • Semantic Similarity Detection: Understands that "check system temperature" and "verify thermal status" are equivalent queries
  • Cost Optimization: 50-70% reduction in tool execution costs
  • Automatic Expiration: TTL-based cache invalidation with periodic cleanup
  • Performance Metrics: Real-time hit/miss rates, token savings, similarity scores

// Queries like these hit the same cache:
"What's the current CPU temperature?"
"Check thermal status of the system"
"Show me processor heat levels"

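The lookup described above can be sketched as a cosine-similarity search over stored embeddings. This is a minimal illustration under stated assumptions, not the server's actual implementation: the CacheEntry shape and the lookup function are hypothetical, the 0.85 default mirrors the sample SEMANTIC_CACHE_THRESHOLD value from the configuration section, and the embeddings are assumed to come from an external embedding server.

```typescript
// Sketch only: names and structure are illustrative assumptions.
interface CacheEntry {
  embedding: number[];
  response: string;
  expiresAt: number; // epoch milliseconds, derived from the TTL
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Returns the most similar unexpired entry at or above the threshold, if any.
function lookup(
  entries: CacheEntry[],
  queryEmbedding: number[],
  threshold = 0.85,
  now = Date.now(),
): CacheEntry | undefined {
  let best: CacheEntry | undefined;
  let bestScore = threshold;
  for (const entry of entries) {
    if (entry.expiresAt <= now) continue; // expired by TTL
    const score = cosineSimilarity(entry.embedding, queryEmbedding);
    if (score >= bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best;
}
```

A linear scan like this is only reasonable for small caches; a larger cache would likely need an approximate nearest-neighbor index.
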
🎯 Smart Rate Limiting

Production-grade request management with circuit breaker pattern:

  • Per-Provider Queuing: FIFO request queues with configurable limits
  • Circuit Breaker: Automatic failure detection and recovery
  • Exponential Backoff: Intelligent retry with jitter
  • Metrics Collection: Request latency percentiles (p50, p95, p99), error categorization, queue depths
  • Prometheus Export: HTTP metrics endpoint for observability
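
As a rough sketch of how the circuit breaker and backoff bullets fit together, the following shows a three-state breaker and a "full jitter" delay calculation. The class name, method names, and default values are assumptions made for illustration; the project's own rate limiter may be structured differently.

```typescript
// Sketch only: thresholds and timeouts below are hypothetical defaults.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;

  constructor(failureThreshold = 5, resetTimeoutMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  canRequest(now = Date.now()): boolean {
    if (this.state === "open" && now - this.openedAt >= this.resetTimeoutMs) {
      // Cooldown elapsed: let trial traffic through until a result is recorded.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    // A failed trial request, or too many failures, opens the breaker.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}

// Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs = 250,
  capMs = 10000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Passing the clock value and random source as parameters keeps both pieces deterministic under test.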

🗄️ Knowledge Management System

Persistent learning infrastructure with advanced search:

  • SQLite + FTS5: Full-text search with Porter stemming and Unicode support
  • Session Management: Contextual conversation tracking across interactions
  • Structured Storage: Typed entries (insights, decisions, code, references)
  • Priority System: High/medium/low classification for relevance ranking
  • Project Watcher: Automatic file system monitoring and knowledge extraction

🔧 NixOS Development Tools

Comprehensive tooling for NixOS ecosystem:

  • Package Debugger: Diagnose and fix Nix package build failures
  • Flake Operations: Build, update, and manage Nix flakes
  • Build Analyzer: Performance profiling and optimization recommendations
  • Hash Calculator: Automatic SHA256 calculation for fetchurl/fetchFromGitHub
  • Configuration Generator: Smart Nix expression generation

🛡️ Emergency Framework

Laptop protection during intensive operations:

  • Thermal Monitoring: Real-time CPU/GPU temperature tracking
  • Rebuild Safety Checks: Pre-build thermal validation
  • Automatic Throttling: Force cooldown when temperature exceeds thresholds
  • Forensic Analysis: Post-build thermal profiling with detailed reports
  • War Room Mode: Live monitoring during critical operations
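
A pre-build thermal validation of the kind listed above could be as simple as comparing the hottest sensor reading against two thresholds. The sketch below is an assumption-heavy illustration: the verdict names, the function name, and the 75 °C / 90 °C thresholds are invented for the example and are not taken from the project.

```typescript
// Sketch only: thresholds are hypothetical and chosen purely for illustration.
type ThermalVerdict = "safe" | "warn" | "block";

interface ThermalThresholds {
  warnC: number;
  blockC: number;
}

const EXAMPLE_THRESHOLDS: ThermalThresholds = { warnC: 75, blockC: 90 };

function assessRebuildSafety(
  readingsC: number[],
  thresholds: ThermalThresholds = EXAMPLE_THRESHOLDS,
): ThermalVerdict {
  if (readingsC.length === 0) return "block"; // no sensor data: fail closed
  const hottest = Math.max(...readingsC);
  if (hottest >= thresholds.blockC) return "block";
  if (hottest >= thresholds.warnC) return "warn";
  return "safe";
}
```

Failing closed when no readings are available is a design choice in this sketch; a real tool might instead report the missing sensors and let the user decide.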

🔍 Hybrid Reasoning (Beta)

Next-generation AI capabilities currently in development:

  • Context Inference Engine: Automatic entity extraction from user input and project state
  • Proactive Action Engine: Execute preparatory checks before asking questions
  • Multi-Step Planner: Decompose complex tasks into dependency-ordered steps
  • Causal Reasoning: Predict change impacts through dependency graph analysis
  • Adaptive Learning: Continuous improvement from interaction feedback
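
To make "dependency-ordered steps" concrete, here is a small sketch that orders plan steps with Kahn's topological sort. The PlanStep shape and the orderSteps function are hypothetical; the planner in this project may represent plans differently.

```typescript
// Sketch only: PlanStep and orderSteps are illustrative names.
interface PlanStep {
  id: string;
  dependsOn: string[];
}

// Returns step ids so that every step appears after all of its dependencies.
function orderSteps(steps: PlanStep[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const step of steps) {
    indegree.set(step.id, step.dependsOn.length);
    dependents.set(step.id, []);
  }
  for (const step of steps) {
    for (const dep of step.dependsOn) {
      const list = dependents.get(dep);
      if (!list) throw new Error(`unknown dependency: ${dep}`);
      list.push(step.id);
    }
  }
  const ready = steps.filter((s) => s.dependsOn.length === 0).map((s) => s.id);
  const ordered: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    ordered.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (ordered.length !== steps.length) throw new Error("dependency cycle detected");
  return ordered;
}
```

For example, a plan with the steps update-flake, build (after update-flake), test (after build), and deploy (after build and test) comes back in exactly that order, while a pair of steps that depend on each other raises an error.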

Installation

Prerequisites

  • Node.js: 22.0+ (native ESM support)
  • NixOS: Recommended for full feature set
  • SQLite: 3.35+ (for FTS5 support)
  • Optional: llama.cpp server for semantic caching embeddings

Quick Start

# Clone repository
git clone https://github.com/marcosfpina/securellm-mcp.git
cd securellm-mcp

# Install dependencies
npm install

# Build
npm run build

# Run server
node build/src/index.js

Environment Configuration

Create a .env file in the project root:

# Core Configuration
PROJECT_ROOT=/path/to/your/project
ENABLE_KNOWLEDGE=true
KNOWLEDGE_DB_PATH=~/.local/share/securellm/knowledge.db

# Semantic Cache (Optional)
ENABLE_SEMANTIC_CACHE=true
SEMANTIC_CACHE_THRESHOLD=0.85
SEMANTIC_CACHE_TTL=3600
LLAMA_CPP_URL=http://localhost:8080

# API Keys (loaded via SOPS in production)
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here

# Observability
METRICS_PORT=9090
LOG_LEVEL=info
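
For readers wiring these variables into their own tooling, the sketch below shows one way such settings might be parsed and validated. It is not the server's configuration loader: the SketchConfig shape and helper names are hypothetical, and the fallback values simply mirror the sample values above rather than documented defaults.

```typescript
// Sketch only: field names and fallbacks are assumptions, not the server's schema.
interface SketchConfig {
  projectRoot: string;
  enableKnowledge: boolean;
  semanticCacheEnabled: boolean;
  semanticCacheThreshold: number; // cosine similarity in [0, 1]
  semanticCacheTtlSeconds: number;
  metricsPort: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

function readBool(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  return raw === undefined ? fallback : raw.trim().toLowerCase() === "true";
}

function readNumber(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value)) throw new Error(`${name} must be numeric, got "${raw}"`);
  return value;
}

function loadSketchConfig(env: Env): SketchConfig {
  const threshold = readNumber(env, "SEMANTIC_CACHE_THRESHOLD", 0.85);
  if (threshold < 0 || threshold > 1) {
    throw new Error("SEMANTIC_CACHE_THRESHOLD must be between 0 and 1");
  }
  return {
    projectRoot: env["PROJECT_ROOT"] ?? ".",
    enableKnowledge: readBool(env, "ENABLE_KNOWLEDGE", false),
    semanticCacheEnabled: readBool(env, "ENABLE_SEMANTIC_CACHE", false),
    semanticCacheThreshold: threshold,
    semanticCacheTtlSeconds: readNumber(env, "SEMANTIC_CACHE_TTL", 3600),
    metricsPort: readNumber(env, "METRICS_PORT", 9090),
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

In a Node.js process this would be called as loadSketchConfig(process.env) after the .env file has been loaded.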

MCP Client Integration

Claude Desktop

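Add the server to ~/.config/Claude/claude_desktop_config.json (JSON does not allow comments, so the path is given here rather than inside the file):

{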
  "mcpServers": {
    "securellm": {
      "command": "node",
      "args": ["/path/to/securellm-mcp/build/src/index.js"],
      "env": {
        "PROJECT_ROOT": "/your/project/path"
      }
    }
  }
}

Cline (VSCodium/VSCode)

Add the server to ~/.config/VSCodium/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json:

{
  "mcpServers": {
    "securellm": {
      "command": "node",
      "args": ["/path/to/securellm-mcp/build/src/index.js"],
      "env": {
        "PROJECT_ROOT": "${workspaceFolder}"
      }
    }
  }
}

Usage Examples

Package Debugging

// Diagnose why a Nix package won't build
await mcp.call("package_diagnose", {
  package_path: "./pkgs/custom-app/default.nix",
  package_type: "js",
  build_test: true
});

// Download package from GitHub with automatic hash calculation
await mcp.call("package_download", {
  package_name: "awesome-tool",
  package_type: "tar",
  source: {
    type: "github_release",
    github: {
      repo: "owner/awesome-tool",
      tag: "v1.2.3",
      asset_pattern: "*.tar.gz"
    }
  }
});

Emergency Framework

// Check if it's safe to rebuild
await mcp.call("rebuild_safety_check");

// Monitor thermals during build
await mcp.call("thermal_warroom", {
  duration: 120  // Monitor for 2 minutes
});

// Get forensic analysis after thermal event
await mcp.call("thermal_forensics", {
  duration: 180,
  skip_rebuild: false
});

Knowledge Management

// Create development session
const session = await mcp.call("create_session", {
  summary: "Implementing new authentication module"
});

// Save insights during development
await mcp.call("save_knowledge", {
  session_id: session.id,
  entry_type: "decision",
  content: "Using JWT tokens instead of sessions for API auth",
  tags: ["auth", "api", "jwt"],
  priority: "high"
});

// Search past decisions
const results = await mcp.call("search_knowledge", {
  query: "authentication jwt",
  entry_type: "decision",
  limit: 5
});

System Health Monitoring

// Comprehensive health check
await mcp.call("system_health_check", {
  detailed: true
});

// Analyze system logs
await mcp.call("system_log_analyzer", {
  service: "sshd",
  since: "1 hour ago",
  level: "error"
});

// Service management
await mcp.call("system_service_manager", {
  action: "restart",
  service: "nginx"
});

Research & Analysis

// Deep research on technical topics
await mcp.call("research_agent", {
  topic: "Rust async runtime comparison",
  depth: "comprehensive",
  sources: ["github", "reddit", "documentation"]
});

// Analyze codebase complexity
await mcp.call("analyze_complexity", {
  directory: "./src",
  include_patterns: ["**/*.ts"],
  metrics: ["cyclomatic", "cognitive", "maintainability"]
});

// Find potentially dead code
await mcp.call("find_dead_code", {
  directory: "./src",
  extensions: [".ts", ".js"]
});

Resources

The server exposes several MCP resources for querying system state:

  • config://current - Current SecureLLM configuration
  • logs://audit - Recent audit log entries
  • metrics://usage - Provider usage statistics
  • metrics://prometheus - Prometheus-format metrics
  • metrics://semantic-cache - Cache performance stats
  • docs://api - API documentation

// Query cache performance
const stats = await mcp.read("metrics://semantic-cache");
console.log(`Hit rate: ${stats.hitRate}%`);
console.log(`Tokens saved: ${stats.tokensSaved}`);

Performance

Benchmarks

  • Semantic Cache Lookup: < 10ms (in-memory embedding comparison)
  • Knowledge DB Search: < 50ms (FTS5 indexed queries)
  • Rate Limiter Overhead: < 5ms per request
  • Circuit Breaker Decision: < 1ms

Scalability

  • Memory Footprint: ~512MB base + 256MB per active reasoning session
  • Database Size: ~100MB per 10,000 knowledge entries
  • Concurrent Requests: 100+ simultaneous tool calls (per-provider queuing)
  • Cache Storage: ~1KB per cached response
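
The sizing figures above lend themselves to a quick back-of-the-envelope estimate. The helper names below are made up for this example, and the constants are just the approximate numbers from the list, so the results are rough planning estimates rather than measurements.

```typescript
// Sketch only: constants restate the approximate figures listed above.
const SKETCH_BASE_MB = 512;
const SKETCH_PER_SESSION_MB = 256;
const SKETCH_DB_MB_PER_10K_ENTRIES = 100;

// Estimated resident memory for a given number of active reasoning sessions.
function estimateMemoryMb(activeReasoningSessions: number): number {
  return SKETCH_BASE_MB + SKETCH_PER_SESSION_MB * activeReasoningSessions;
}

// Estimated knowledge database size for a given number of entries.
function estimateDbSizeMb(entries: number): number {
  return (entries / 10000) * SKETCH_DB_MB_PER_10K_ENTRIES;
}
```

With four active reasoning sessions the memory estimate is 512 + 4 × 256 = 1536 MB, and 25,000 knowledge entries work out to roughly 250 MB of database storage.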

Security

Secrets Management

  • SOPS Integration: Encrypted secrets stored in secrets.yaml
  • Environment Variables: Runtime API key injection
  • No Hardcoded Credentials: All sensitive data externalized

Sandboxed Execution

  • Tool Whitelisting: Configurable allowed commands
  • Path Restrictions: Sandboxed file system access
  • Network Isolation: Optional network policy enforcement

Audit Trail

  • Structured Logging: All actions logged with context
  • Knowledge DB Audit: Complete interaction history
  • Metrics Retention: 30-day historical performance data

Development

Project Structure

securellm-mcp/
├── src/
│   ├── index.ts                    # MCP server entry point
│   ├── knowledge/
│   │   └── database.ts             # SQLite + FTS5 implementation
│   ├── middleware/
│   │   ├── semantic-cache.ts       # Embedding-based caching
│   │   ├── rate-limiter.ts         # Smart rate limiting
│   │   ├── circuit-breaker.ts      # Failure detection
│   │   ├── retry-strategy.ts       # Exponential backoff
│   │   └── metrics-collector.ts    # Performance tracking
│   ├── reasoning/
│   │   ├── context-manager.ts      # Context inference
│   │   ├── multi-step-planner.ts   # Task decomposition
│   │   └── proactive-executor.ts   # Pre-action execution
│   ├── tools/
│   │   ├── package-diagnose.ts     # Nix package debugging
│   │   ├── emergency/              # Thermal protection
│   │   ├── laptop-defense/         # System safety
│   │   ├── system/                 # Health monitoring
│   │   ├── ssh/                    # Remote execution
│   │   ├── browser/                # Web automation
│   │   └── nix/                    # Nix ecosystem tools
│   ├── types/
│   │   ├── knowledge.ts            # Knowledge DB schemas
│   │   ├── semantic-cache.ts       # Cache type definitions
│   │   └── middleware/             # Middleware types
│   └── utils/
│       ├── logger.ts               # Pino structured logging
│       ├── project-detection.ts    # Auto project root detection
│       └── host-detection.ts       # NixOS hostname resolution
├── docs/                           # Architecture documentation
├── tests/                          # Integration tests
└── build/                          # Compiled output

Building from Source

# Development mode with watch
npm run watch

# Production build
npm run build

# Run tests
npm test

# Type checking
npx tsc --noEmit

Contributing

  1. Architecture Changes: Review docs/HYBRID-REASONING-ARCHITECTURE.md
  2. Code Style: Follow existing TypeScript patterns, use Zod for validation
  3. Testing: Add integration tests for new tools
  4. Documentation: Update README and inline JSDoc comments

Roadmap

Phase 1: Core Infrastructure ✅

  • [x] MCP server implementation
  • [x] Knowledge database (SQLite + FTS5)
  • [x] Smart rate limiter with circuit breaker
  • [x] Semantic cache with embeddings
  • [x] Nix package debugging tools
  • [x] Emergency framework
  • [x] Prometheus metrics

Phase 2: Reasoning Systems 🚧

  • [x] Context inference engine
  • [x] Proactive action executor
  • [x] Multi-step task planner
  • [ ] Causal dependency analyzer
  • [ ] Adaptive learning system

Phase 3: Advanced Tools 🚧

  • [x] SSH remote execution suite
  • [ ] Browser automation tools
  • [ ] Sensitive data handling
  • [ ] File organization system
  • [ ] Advanced code analysis

Phase 4: Enterprise Features

  • [ ] Multi-user support
  • [ ] Role-based access control
  • [ ] Distributed caching
  • [ ] Horizontal scaling
  • [ ] SaaS deployment

Monitoring & Observability

Prometheus Metrics

Metrics are exposed on an HTTP endpoint, on the port set by METRICS_PORT:

# Start metrics server
export METRICS_PORT=9090
node build/src/index.js

# Query metrics
curl http://localhost:9090/metrics

Available metrics:

  • mcp_rate_limiter_requests_total{provider="deepseek"}
  • mcp_rate_limiter_request_duration_seconds{provider="openai"}
  • mcp_circuit_breaker_state{provider="anthropic"}
  • mcp_semantic_cache_hits_total
  • mcp_semantic_cache_tokens_saved_total
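
To consume these metrics from a script rather than a Prometheus server, the text returned by the endpoint can be parsed line by line. The sketch below handles only the simple `name{labels} value` form of the Prometheus text exposition format; the Sample type and parseSamples function are illustrative names, escaped characters in label values are left as-is, and timestamps are ignored.

```typescript
// Sketch only: a deliberately small parser for simple metric lines.
interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

function parseSamples(text: string): Sample[] {
  const samples: Sample[] = [];
  const linePattern = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip HELP/TYPE comments
    const match = linePattern.exec(line);
    if (!match) continue;
    const labels: Record<string, string> = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
    let pair: RegExpExecArray | null;
    while ((pair = labelPattern.exec(match[2] ?? "")) !== null) {
      labels[pair[1] as string] = pair[2] as string;
    }
    samples.push({ name: match[1] as string, labels, value: Number(match[3]) });
  }
  return samples;
}
```

Feeding it the body of the curl response above yields one Sample per metric line, which can then be filtered by name and provider label.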

Structured Logging

Pino-based JSON logging:

{
  "level": "info",
  "time": 1704196800000,
  "msg": "Semantic cache hit",
  "similarity": 0.92,
  "toolName": "thermal_check",
  "tokensSaved": 150
}

Troubleshooting

Common Issues

1. Semantic cache not working

# Verify llama.cpp server is running
curl http://localhost:8080/health

# Check cache database exists
ls -lh ~/.local/share/securellm/semantic_cache.db

# Enable debug logging
export LOG_LEVEL=debug

2. Rate limiter throttling requests

# Check current queue status
# (use rate_limiter_status tool via MCP)

# Adjust rate limits in config
# See src/config/rate-limits.ts

3. Knowledge DB corruption

# Backup and rebuild
cp ~/.local/share/securellm/knowledge.db{,.backup}
rm ~/.local/share/securellm/knowledge.db
# Restart server (will recreate schema)

License

MIT License - See LICENSE file


Acknowledgments

Inspired by:

  • NixOS community's declarative infrastructure philosophy
  • The MCP ecosystem's vision for AI-native tooling
  • Production systems engineering best practices

Contact

Author: marcosfpina
Project: github.com/marcosfpina/securellm-mcp
Issues: GitHub Issues


Built for developers who demand production-grade tooling.
