SIFT Kernel

An MCP server that enforces forensic methodology and prevents hallucinations by requiring evidence links and deterministic verification. It exposes 129 forensic operations across 14 categories, with auto-correlation and a reasoning engine.

Forensic Evidence Operating System — An MCP server that makes evidence spoliation architecturally impossible, hallucinations structurally unregisterable, and forensic methodology computationally enforced.

Built for the FIND EVIL! Hackathon by SANS Institute.

Results

Metric	Value
Forensic operations	129 across 14 categories (+ 21 reporting/kernel tools)
Reasoning engine (FARE)	DSmT/PCR5 evidence fusion + Active Inference (EFE) tool selection + Rough-set confidence tiers
Knowledge base	90 evidence-to-hypothesis rules across 7 artifact categories, 12 attack scenarios
Auto-correlation	Temporal proximity + MITRE kill-chain sequencing + shared entity detection
Formal security properties	6 (property-tested, 10,000 scenarios each)
Context overload solved	32 tools exposed (14 category dispatchers + 18 kernel) instead of 129 flat operations
Hallucination rate	0% by construction (findings require evidence links + deterministic verification)
Methodology enforcement	Signal-driven reactive FSM (7 states) + capability DAG
Self-correction	External-feedback architecture (arXiv:2601.00828) — FARE conflict detection, entropy tracking, bias monitoring
Inference constraint	Level 3-4 per Hilgert et al. 2025 (arXiv:2506.00274) — server handles parsing + anomaly detection
Report formats	Markdown, JSON, interactive HTML (entropy curve SVG, dark/light toggle, correlation timeline)
Transports	stdio + HTTP/SSE (fail-closed bearer auth)
Tested against	Real 16GB SRL-2018 E01 disk image (APT compromise)

SUBMISSION COMPLIANCE

Judges: Every turn-in requirement maps to a specific file. Nothing is hidden.

#	Requirement	Location	Status
1	Code repository (public, open source)	This repository	Done
2	Open source license (MIT or Apache 2.0)	LICENSE	Done (MIT)
3	README with setup instructions	README.md — see "Quick Start" below	Done
4	Live deployment URL or local run instructions	README.md — see "Quick Start" below	Done (local)
5	Text description of features/functionality	README.md — see "What It Does" + "How It Works" below	Done
6	Demo video (< 5 min, live terminal, audio, self-correction)	YouTube + docs/DEMO.md	Done
7	Architecture diagram	docs/architecture.md	Done
8	Evidence dataset documentation	docs/DATASET.md	Done
9	Accuracy report	docs/ACCURACY-REPORT.md	Done
10	Agent execution logs	docs/EXECUTION-LOGS.md + sift-output/ledger.db	Done

Additional Documentation

Document	Description
CLAUDE.md	Agent skill file — teaches any LLM the investigation loop
docs/ZERO-TRUST-ALIGNMENT.md	How architecture maps to Anthropic's Zero Trust for AI Agents (2026)
docs/USER-GUIDE.md	Practitioner guide for forensic analysts
docs/DESIGN-DECISIONS.md	Architectural choices and rationale
docs/BYPASS-TESTING.md	Security bypass attempts and why they fail

What It Does

SIFT Kernel sits between any AI agent (Claude, GPT, Llama, etc.) and the SANS SIFT Workstation's 200+ forensic tools. It provides:

129 forensic operations across 14 categories + 21 reporting tools, exposed via the Model Context Protocol (MCP)
Category dispatcher architecture — agent sees 32 tools (14 forensic categories + 18 kernel), each category groups related operations
Capability-based security — no shell access exists, read-only evidence mounts
Hash-chained evidence ledger — every tool call cryptographically recorded
Confidence scoring — findings require evidence links (hallucinations structurally blocked)
FARE reasoning engine: Dezert-Smarandache (DSmT) evidence fusion with the PCR5 rule, Active Inference (EFE) tool selection, Rough-set confidence tiers, convergence detection, cognitive bias monitoring
Auto-correlation — deterministic attack chain detection via temporal proximity + MITRE kill-chain sequencing + shared entity linking
Self-correction architecture — coverage gap detection, FARE conflict interrupts (K>0.3), entropy-plateau falsification injection, evidence provenance verification
Forensic knowledge enrichment — every tool response includes expert caveats, corroboration suggestions, and interpretation guidance at the response level (not system prompt)
Determinism tracking — measures how closely the agent follows server methodology recommendations (0.0-1.0 score)

Architecture

┌─────────────────────────────────────────────────────────────┐
│  ANY MCP CLIENT (Claude Code, Cursor, Zed, etc.)            │
└────────────────────────────┬────────────────────────────────┘
                             │ MCP (stdio or HTTP/SSE)
┌────────────────────────────▼────────────────────────────────┐
│  SIFT KERNEL                                                 │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ Category Dispatchers (14 forensic + 18 kernel = 32)   │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ Meta-Cognitive Toolkit (self-correction engine)         │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ Evidence Ledger (hash-chained, append-only)            │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ Output Intelligence (anomaly detection)                │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ Capability Kernel (DAG prerequisites, ro mounts)       │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ Tool Executors (binary allowlist, shell:false)         │ │
│  └────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────┘
                             │ subprocess (shell:false)
┌────────────────────────────▼────────────────────────────────┐
│  SIFT WORKSTATION (200+ forensic binaries)                   │
└─────────────────────────────────────────────────────────────┘

Quick Start

Prerequisites

Node.js 18+
SIFT Workstation (for actual forensic tool execution)

Install

git clone https://github.com/Sathvik-1007/sift-kernel.git
cd sift-kernel
npm install

Run

# As MCP server (stdio transport — for Claude Code, OpenCode, etc.)
npx tsx src/index.ts --output ./investigation

# With evidence pre-loaded (optional — can also load dynamically via mount_evidence tool)
npx tsx src/index.ts --evidence /path/to/image.E01 --output ./investigation

# HTTP transport with bearer auth (for remote/multi-client access)
npx tsx src/index.ts --transport http --port 3000 --token YOUR_SECRET

# Fresh investigation (wipes prior state)
npx tsx src/index.ts --fresh --output ./investigation
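
The Results table describes the HTTP transport's bearer auth as fail-closed. A minimal sketch of what such a check could look like follows. The function name `isAuthorized` and its parameters are assumptions made for illustration and are not taken from the SIFT Kernel source; only the `node:crypto` calls are real APIs.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Fail closed: with no configured token, or a missing or malformed header,
// the request is rejected rather than allowed through.
function isAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined,
): boolean {
  if (!expectedToken) return false;
  if (!authorizationHeader || !authorizationHeader.startsWith("Bearer ")) return false;
  const presented = authorizationHeader.slice("Bearer ".length);
  // Hash both sides so the buffers have equal length for timingSafeEqual.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expectedToken).digest();
  return timingSafeEqual(a, b);
}
```

The point of the design is that a misconfigured server (no token set) denies every request instead of silently running unauthenticated.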

Connect from MCP Client

Add to your MCP client configuration:

{
  "mcpServers": {
    "sift-kernel": {
      "command": "npx",
      "args": ["tsx", "/path/to/sift-kernel/src/index.ts", "--output", "/tmp/sift-output", "--fresh"],
      "env": {}
    }
  }
}

How It Works

The "Even Dumb LLMs Work" Loop

while (not done):
    result = call("suggest_next_action")
    call(result.tool_name, result.params)

Any model that can parse JSON and call a function produces a valid, methodology-correct, auditable forensic investigation. The intelligence is in the server.
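
The loop above can be sketched in TypeScript. This is an illustrative sketch, not the project's client code: `CallTool`, `runInvestigation`, the `done` flag, and the `maxSteps` guard are assumptions, and the `tool` and `operation` fields follow the sample suggest_next_action response shown under "Rich Investigation Directives".

```typescript
// Hypothetical shape of a suggest_next_action result. The `tool` and
// `operation` fields mirror the README's sample response; `done` is an
// assumed stop signal, not a documented field.
interface Suggestion {
  tool: string;
  operation: string;
  done?: boolean;
}

// Hypothetical transport: any function that forwards a tool call to the server.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Follow server suggestions until it signals completion or maxSteps is reached.
async function runInvestigation(call: CallTool, maxSteps = 100): Promise<string[]> {
  const executed: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const next = (await call("suggest_next_action", {})) as Suggestion;
    if (next.done) break;
    await call(next.tool, { operation: next.operation });
    executed.push(`${next.tool}.${next.operation}`);
  }
  return executed;
}
```

The client holds no forensic logic of its own; it only relays what the server recommends, which is the property the loop is meant to demonstrate.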

Progressive Disclosure

At session start, the agent sees 32 tools (14 category dispatchers + 18 kernel). Each category groups related operations:

filesystem(operation="list_directory", path="/Users")
registry(operation="get_persistence_keys")
event_logs(operation="parse_event_log", path="/Windows/System32/winevt/Logs/Security.evtx")

The capability kernel still blocks out-of-order calls — methodology is enforced regardless of what the agent tries.
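
One way a category dispatcher with prerequisite checks could work is sketched below. The capability names (`evidence_mounted`, `registry_loaded`), the `load_hives` operation, the `requires`/`grants` fields, and the `dispatch` function are all hypothetical, chosen only to illustrate how out-of-order calls can be blocked.

```typescript
// Hypothetical operation table: each operation lists the capabilities it
// needs before it may run and the capabilities it grants afterwards.
interface OperationSpec {
  requires: string[];
  grants: string[];
}

const operations: Record<string, OperationSpec> = {
  "acquisition.mount_evidence": { requires: [], grants: ["evidence_mounted"] },
  "registry.load_hives": { requires: ["evidence_mounted"], grants: ["registry_loaded"] },
  "registry.get_persistence_keys": { requires: ["registry_loaded"], grants: [] },
};

type DispatchResult = { ok: true } | { ok: false; reason: string };

// Run `category.operation` only if every prerequisite capability is held.
function dispatch(held: Set<string>, category: string, operation: string): DispatchResult {
  const spec = operations[`${category}.${operation}`];
  if (!spec) return { ok: false, reason: "unknown operation" };
  const missing = spec.requires.filter((cap) => !held.has(cap));
  if (missing.length > 0) {
    return { ok: false, reason: `missing capabilities: ${missing.join(", ")}` };
  }
  spec.grants.forEach((cap) => held.add(cap));
  return { ok: true };
}
```

Because the check lives in the dispatcher, an agent that skips ahead gets a refusal naming the missing prerequisite rather than a tool result.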

FARE: Forensic Abductive Reasoning Engine

The server doesn't just run tools — it reasons about the investigation state using published mathematical frameworks:

Component	Theory	What it does
DSmT/PCR5 Fusion	Dezert-Smarandache (2006)	Combines contradictory tool outputs without Zadeh's paradox
Active Inference (EFE)	Friston (2015)	Selects the tool that maximally reduces hypothesis uncertainty
Rough-Set Confidence	Pawlak (1982)	Maps findings to CONFIRMED/SUPPORTED/INFERRED tiers with mathematical grounding
Convergence Detection	Shannon (1948)	Tracks investigation entropy; detects learning, plateau, and divergence
Cognitive Bias Monitoring	Tversky & Kahneman (1974)	Detects confirmation bias, anchoring, and tunnel vision in the investigation
Auto-Correlation	Carrier (2006) + ATT&CK	Temporal + kill-chain + entity linking across findings

Every tool response includes the FARE state: entropy, dominant hypothesis, active signals, conflict coefficient, and bias warnings.
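
To make the fusion step concrete, here is a minimal two-source PCR5 sketch. It is not FARE's implementation: it assumes mutually exclusive hypotheses encoded as bitmasks, and the names `Mass` and `pcr5` are illustrative. The returned `conflict` value corresponds to the conflict coefficient K mentioned in the feature list.

```typescript
// A mass function maps focal elements to belief mass. Focal elements are
// bitmasks over the hypotheses, e.g. A = 0b001, B = 0b010, A or B = 0b011.
type Mass = Map<number, number>;

// Fuse two sources with the PCR5 rule. Returns the fused masses and the
// total conflict K (mass landing on pairs with an empty intersection).
function pcr5(m1: Mass, m2: Mass): { fused: Mass; conflict: number } {
  const fused: Mass = new Map();
  const add = (set: number, v: number) => fused.set(set, (fused.get(set) ?? 0) + v);
  let conflict = 0;
  for (const [x, mx] of m1) {
    for (const [y, my] of m2) {
      const product = mx * my;
      if (product === 0) continue;
      const intersection = x & y;
      if (intersection !== 0) {
        add(intersection, product); // agreeing mass goes to the intersection
      } else {
        conflict += product;
        // Conflicting mass returns to X and Y in proportion to mx and my.
        add(x, (mx * product) / (mx + my));
        add(y, (my * product) / (mx + my));
      }
    }
  }
  return { fused, conflict };
}

// Zadeh's example: two sources that almost fully disagree.
const A = 0b001, B = 0b010, C = 0b100;
const { fused, conflict } = pcr5(
  new Map([[A, 0.9], [C, 0.1]]),
  new Map([[B, 0.9], [C, 0.1]]),
);
```

In this example PCR5 assigns 0.486 each to A and B and 0.028 to C, whereas Dempster's rule would place all mass on C. The conflict K is 0.99, well above the K>0.3 interrupt threshold listed under "What It Does".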

Rich Investigation Directives

suggest_next_action returns a full forensic cognitive frame — not just "run this tool":

{
  "tool": "registry",
  "operation": "get_persistence_keys",
  "directive": {
    "whatEvilLooksLike": "Entries pointing to temp dirs, random names, encoded PowerShell",
    "whatNormalLooksLike": "Chrome Update, Windows Defender, Office updaters",
    "hypothesisTested": "Has the attacker established registry-based persistence?",
    "confirmationCriteria": "Run key value points to a path containing suspicious executables",
    "ifConfirmed": "Register T1547.001 finding, advance to scheduled tasks",
    "ifAbsent": "Persistence is NOT via registry — check scheduled tasks"
  },
  "efe_score": 0.34,
  "information_gain": "EFE=0.34 (risk=0.20, ambiguity=0.14)"
}

The methodology engine drives the investigation — any MCP-capable LLM follows the structured directives without needing forensic domain knowledge.

Confidence Scoring

Level	Criteria
HYPOTHESIZED	0 evidence links — investigation marker only
INFERRED	1 evidence source
SUPPORTED	2+ sources, same category
CONFIRMED	2+ sources, different categories

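Findings cannot appear in the final report without evidence links. A finding with zero links stays HYPOTHESIZED and serves only as an investigation marker, so an unsupported (hallucinated) claim has no path into the report.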
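
The tier table can be expressed as a small pure function. This is a sketch under assumptions: the `EvidenceLink` shape and the names `scoreConfidence` and `reportable` are hypothetical, and "source" is taken to mean a distinct ledger entry.

```typescript
type ConfidenceLevel = "HYPOTHESIZED" | "INFERRED" | "SUPPORTED" | "CONFIRMED";

// Hypothetical evidence link: which ledger entry backs the finding and which
// artifact category (registry, event_logs, memory, ...) it came from.
interface EvidenceLink {
  ledgerEntryId: string;
  category: string;
}

// Apply the table above: count distinct sources, then distinct categories.
function scoreConfidence(links: EvidenceLink[]): ConfidenceLevel {
  const sources = new Set(links.map((l) => l.ledgerEntryId));
  const categories = new Set(links.map((l) => l.category));
  if (sources.size === 0) return "HYPOTHESIZED";
  if (sources.size === 1) return "INFERRED";
  return categories.size >= 2 ? "CONFIRMED" : "SUPPORTED";
}

// Only findings with at least one evidence link are eligible for the report.
function reportable(links: EvidenceLink[]): boolean {
  return scoreConfidence(links) !== "HYPOTHESIZED";
}
```

Because the tier is derived from the links alone, the agent cannot raise a finding's confidence by assertion; it has to attach more evidence.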

Evidence Ledger

Every tool call produces a hash-chained ledger entry:

Tool name + parameters
Output hash (SHA-256)
Previous entry hash (tamper detection)
Capabilities held at time of execution
Duration

Run verify_chain to cryptographically validate the entire audit trail.
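
A hash chain of this kind can be sketched in a few lines. The example below is an in-memory illustration, not the project's SQLite-backed ledger: the `LedgerEntry` shape and the `append` and `verifyChain` functions are assumptions, and the capabilities and duration fields are left out for brevity.

```typescript
import { createHash } from "node:crypto";

interface LedgerEntry {
  tool: string;
  params: string; // serialized parameters
  outputHash: string; // SHA-256 of the tool output
  prevHash: string; // hash of the previous entry ("" for the first)
  hash: string; // hash over all fields above
}

const sha256 = (data: string): string => createHash("sha256").update(data).digest("hex");

const entryHash = (e: Omit<LedgerEntry, "hash">): string =>
  sha256(JSON.stringify([e.tool, e.params, e.outputHash, e.prevHash]));

// Append an entry whose hash commits to the previous entry's hash.
function append(ledger: LedgerEntry[], tool: string, params: string, output: string): void {
  const last = ledger[ledger.length - 1];
  const body = { tool, params, outputHash: sha256(output), prevHash: last ? last.hash : "" };
  ledger.push({ ...body, hash: entryHash(body) });
}

// Recompute every hash and link; any edited or reordered entry fails.
function verifyChain(ledger: LedgerEntry[]): boolean {
  let prevHash = "";
  for (const e of ledger) {
    if (e.prevHash !== prevHash || e.hash !== entryHash(e)) return false;
    prevHash = e.hash;
  }
  return true;
}
```

Each entry's hash covers the previous entry's hash, so altering any record invalidates every later link unless the whole tail of the chain is rewritten.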

Workflows (15)

#	Workflow	Tools	Description
1	Acquisition	5	Mount, verify, partition discovery
2	Filesystem	9	Directory listing, file extraction, deleted recovery
3	Timeline	6	Super timeline, filtering, anomaly detection
4	Registry	9	Hives, persistence, user activity, USB
5	Event Logs	8	EVTX, logon correlation, PowerShell
6	Execution	7	Prefetch, Amcache, ShimCache, SRUM
7	Persistence	9	YARA, scheduled tasks, WMI, BITS
8	Memory	11	Processes, injection, rootkits, network
9	Network	8	PCAP, beaconing, DNS, HTTP
10	Browser	6	History, downloads, cache, extensions
11	User Activity	8	LNK, jumplists, shellbags, recycle bin
12	Anti-Forensics	7	Timestomping, log clearing, wiping
13	Correlation	7	Attack narrative, lateral movement, MITRE
14	Linux	8	Auth, syslog, bash, cron, systemd
15	Reporting	21	Coverage, confidence, provenance, export

Academic Foundations & Novel Contributions

This project is, for several of the theoretical frameworks below, their first computational realization in DFIR:

Framework	Citation	Our novel application
Hypothesis-based forensic investigation	Carrier 2006 (Purdue/CERIAS PhD)	First computational implementation — his model was purely theoretical
DSmT/PCR5 evidence fusion	Smarandache & Dezert 2006	Applied to forensic tool outputs (prior: only VBIED/radar)
Active Inference for action selection	Friston 2015 (Free Energy Principle)	First application to DFIR tool orchestration
Rough-set decision approximations	Pawlak 1982	Maps to forensic confidence tiers + stop criterion
External self-correction	Huang et al. 2025 (arXiv:2601.00828)	Server provides external correction signals (not LLM self-correcting)
MCP inference constraint levels	Hilgert et al. 2025 (arXiv:2506.00274)	Implemented Level 3-4 constraint with metadata reporting
Nondeterminism mitigation	Gruber & Hilgert 2026 (arXiv:2604.05589)	Determinism score quantifying methodology adherence
Knowledge-execution gap	DFIR-Metric 2025 (arXiv:2505.19973)	Rich directives bridge the 70% knowledge → 20% execution gap
Multi-agent forensic specialisation	CyberSleuth 2025 (arXiv:2508.20643)	Category dispatchers = lightweight specialist decomposition

Security Properties

Checked with property-based tests (npm test; 10,000 generated scenarios per property):

P1: No shell execution capability exists (no code path)
P2: No finding without evidence links enters the report
P3: Capability graph is a valid DAG (no cycles)
P4: Hash chain valid for any operation sequence
P5: Every tool call produces exactly one ledger entry
P6: All file access contained within evidence mount prefix
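
As an illustration of P6, a path containment check might look like the sketch below. `isWithinMount` is a hypothetical helper, not the project's code, and it only reasons about path strings; a real implementation would also need to handle symlinks.

```typescript
import * as path from "node:path";

// True only if `requested`, resolved against the mount, stays inside it.
function isWithinMount(mountPrefix: string, requested: string): boolean {
  const root = path.resolve(mountPrefix);
  // Treat the requested path as relative to the mount, even if it starts with "/".
  const target = path.resolve(root, "." + path.sep + requested);
  const rel = path.relative(root, target);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}
```

A property-based test would then generate arbitrary path strings and assert that anything the kernel agrees to open satisfies this predicate.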

Development

# Type check
npx tsc --noEmit

# Run tests
npx vitest run

# Watch mode
npx vitest

Tech Stack

Package	Purpose
@modelcontextprotocol/sdk	MCP server (stdio + HTTP/SSE transport)
zod	Runtime validation + branded types
better-sqlite3	Evidence ledger (WAL mode, hash-chained)
neverthrow	Type-safe Result<T, E>; no exceptions in domain layer
nanoid	Unique IDs for ledger entries + findings
exponential-backoff	Retry transient I/O failures with jitter
vitest + fast-check	Testing + property-based formal verification

License

MIT
