disk-forensics-mcp-server

A high-performance MCP Server for Disk Forensics that enables AI agents to analyze disk images through the Model Context Protocol.

MCP Disk Forensics

A high-performance MCP Server for Disk Forensics that enables AI agents to analyze disk images through the Model Context Protocol. Built with pytsk3 integration and intelligent caching for maximum speed.

Features

  • High Performance: Global handler caching with a 279x to 235,877x speedup for repeated operations
  • Multi-Format Support: RAW, E01, VMDK, VHD/VHDX, AD1 formats
  • Deep File Inspection: Access to all filesystem structures (NTFS, FAT, ext, etc.)
  • Advanced Filtering: Search by extension, timestamp, and deleted files
  • Security First: Path validation, size limits, input sanitization
  • Memory Efficient: LRU cache with 500,000 entries for large images
  • Cycle-Safe Traversal: Depth-limited, inode-guarded directory walking (handles NTFS junctions/symlinks)
  • UTC Timestamps: Timezone-aware (UTC) timestamps for correct cross-timezone timeline analysis
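
The bounded LRU cache mentioned above (and the batched eviction noted under Security Features) can be illustrated with a minimal sketch. The class name BatchedLRUCache, the batch_fraction parameter, and the hit/miss counters are hypothetical and only show the general technique; the server's actual cache may be organized differently.

```python
from collections import OrderedDict


class BatchedLRUCache:
    """Hypothetical LRU cache that evicts a batch of old entries when full."""

    def __init__(self, capacity=500_000, batch_fraction=0.1):
        self.capacity = capacity
        # Number of least-recently-used entries dropped per eviction pass.
        self.batch_size = max(1, int(capacity * batch_fraction))
        self._data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return default

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            # Evict from the least-recently-used end in one batch.
            for _ in range(min(self.batch_size, len(self._data))):
                self._data.popitem(last=False)

    def __len__(self):
        return len(self._data)


cache = BatchedLRUCache(capacity=4, batch_fraction=0.5)
for index, name in enumerate("abcd"):
    cache.put(name, index)
cache.get("a")      # "a" becomes the most recently used entry
cache.put("e", 4)   # exceeds capacity, so the two oldest ("b", "c") are dropped
print(len(cache))
```

Evicting in batches trades a little cache occupancy for fewer eviction passes, which matters when a traversal inserts hundreds of thousands of entries in a row.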

Performance Benchmarks

Tested on a 3.7GB AD1 image with 1,587 directories and 19,882 files:

Operation              Cold     Warm      Speedup
List root              7.2s     0.000s    235,877x
Full traversal         29.0s    0.104s    279x
Repeated access        7.2s     0.000s    235,877x
Deep path (5 levels)   7.1s     0.000s    173,631x
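
The speedup column is the cold time divided by the warm time. The short calculation below reproduces the full-traversal figure and shows what the warm times that round to 0.000s imply. The function names are illustrative only.

```python
def speedup(cold_seconds, warm_seconds):
    """Speedup factor: how many times faster the warm run is."""
    return cold_seconds / warm_seconds


def implied_warm_seconds(cold_seconds, factor):
    """Warm time implied by a cold time and a reported speedup factor."""
    return cold_seconds / factor


# Full traversal: 29.0s cold, 0.104s warm.
print(f"full traversal: {speedup(29.0, 0.104):.0f}x")

# List root: 7.2s cold at 235,877x implies a warm time of tens of microseconds,
# which is why the table shows it as 0.000s.
print(f"list root warm: {implied_warm_seconds(7.2, 235_877) * 1e6:.1f} microseconds")
```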

Requirements

  • Python 3.10+
  • pytsk3 and forensic libraries installed
  • MCP-compatible client (Claude Desktop, VSCode, Cline, etc.)

Installation

1. Install Dependencies

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install python3-dev libtsk-dev

macOS:

brew install sleuthkit

Windows: Install Python 3.10+ from python.org

2. Install MCP Server

# Clone repository
git clone https://github.com/jus1-c/disk-forensics-mcp-server.git
cd disk-forensics-mcp-server

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or: venv\Scripts\activate  # Windows

# Install package
pip install -e ".[forensics]"

Configuration

Claude Desktop

Edit claude_desktop_config.json:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "disk-forensics": {
      "command": "disk-forensics-mcp-server",
      "disabled": false,
      "autoApprove": []
    }
  }
}
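
If the config file already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that with the standard library; merge_server_entry and update_config_file are hypothetical helper names, not part of this project.

```python
import json
from pathlib import Path


def merge_server_entry(config, name="disk-forensics"):
    """Return a copy of config with the disk-forensics entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {
        "command": "disk-forensics-mcp-server",
        "disabled": False,
        "autoApprove": [],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_path):
    """Read the JSON config (if present), merge the entry, and write it back."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    merged = merge_server_entry(config)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged


# Existing servers are preserved; only the disk-forensics key is added.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
print(json.dumps(merge_server_entry(existing), indent=2))
```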

VSCode (with Cline extension)

Add to your settings:

{
  "mcpServers": {
    "disk-forensics": {
      "command": "disk-forensics-mcp-server",
      "disabled": false,
      "autoApprove": []
    }
  }
}

OpenCode

Add to ~/.opencode/opencode.json:

{
  "mcp": {
    "disk-forensics": {
      "type": "local",
      "command": ["disk-forensics-mcp-server"],
      "enabled": true,
      "timeout": 150000
    }
  }
}

Available Tools

1. analyze_disk_image

Analyze a disk image and return information about its format, size, and structure.

Parameters:

  • image_path: Absolute path to disk image (required)

Example:

{
  "image_path": "/home/user/evidence/image.raw"
}
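
The JSON objects shown in these examples are the tool arguments. On the wire, an MCP client wraps them in a JSON-RPC 2.0 tools/call request. The sketch below builds such a request by hand for illustration; build_tool_call is a hypothetical helper, and in practice the MCP client constructs this message for you.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    "analyze_disk_image",
    {"image_path": "/home/user/evidence/image.raw"},
)
print(json.dumps(request, indent=2))
```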

2. list_partitions

List partitions in a disk image. Supports MBR and GPT partition tables.

Parameters:

  • image_path: Absolute path to disk image

Example:

{
  "image_path": "/home/user/evidence/image.raw"
}

3. list_files

List files in a directory with caching support.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • path: Directory path to list (default: "/")

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "path": "/Windows/System32"
}
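
partition_offset is expressed in bytes, whereas partition tables usually record a starting sector. The value 1048576 used throughout these examples corresponds to sector 2048 with 512-byte sectors. The conversion below is a sketch that assumes you have a start sector and know the sector size; sector_to_byte_offset is a hypothetical helper, and whether list_partitions reports sectors or bytes is not specified here.

```python
def sector_to_byte_offset(start_sector, sector_size=512):
    """Convert a partition's starting sector to a byte offset."""
    if start_sector < 0 or sector_size <= 0:
        raise ValueError("start_sector must be >= 0 and sector_size must be > 0")
    return start_sector * sector_size


# A partition starting at sector 2048 on a disk with 512-byte sectors.
print(sector_to_byte_offset(2048))

# The same byte offset on a 4096-byte-sector disk is sector 256.
print(sector_to_byte_offset(256, sector_size=4096))
```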

4. get_file_metadata

Get metadata for a specific file.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • file_path: Path to the file within the partition

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "file_path": "/Windows/System32/config/SAM"
}

5. read_file_content

Read content of a specific file.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • file_path: Path to the file within the partition
  • max_size: Maximum bytes to read (default: 1MB)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "file_path": "/Windows/System32/config/SAM",
  "max_size": 1048576
}

6. extract_file

Extract a file from the image to a destination path.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • file_path: Path to the file within the partition
  • output_path: Path to save the extracted file

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "file_path": "/Windows/System32/config/SAM",
  "output_path": "/home/user/extracted/SAM"
}

7. extract_directory

Extract a directory from the image to a destination path while preserving relative paths.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • directory_path: Path to the directory within the partition
  • output_path: Path to save the extracted directory

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "directory_path": "/Windows/System32/config",
  "output_path": "/home/user/extracted/config"
}

8. get_directory_tree

Get complete directory tree structure.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • path: Starting path (default: "/")
  • max_depth: Maximum recursion depth (default: 10)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "path": "/Windows",
  "max_depth": 3
}
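
The depth limit combines with the inode guard described under Features to keep traversal finite even when a directory links back to one of its ancestors. The sketch below shows the idea on an in-memory tree. The data layout, the walk function, and the exact meaning of max_depth (here: the number of directory levels listed, counting the starting directory as level 1) are assumptions for illustration, not the server's implementation.

```python
def walk(children, root_inode, max_depth=10):
    """List paths under root_inode, bounded by depth and guarded by inode.

    children maps a directory inode to a list of (name, inode, is_dir) tuples.
    """
    results = []
    visited = {root_inode}           # directory inodes already descended into
    stack = [(root_inode, "", 0)]    # (inode, path prefix, depth of this directory)
    while stack:
        inode, prefix, depth = stack.pop()
        for name, child_inode, is_dir in children.get(inode, []):
            path = f"{prefix}/{name}"
            results.append(path)
            if not is_dir:
                continue
            if depth + 1 >= max_depth:
                continue             # depth limit reached: list it, do not descend
            if child_inode in visited:
                continue             # junction or symlink back to a seen directory
            visited.add(child_inode)
            stack.append((child_inode, path, depth + 1))
    return results


# Inode 10 contains an entry "loop" that points back to the root (inode 5).
tree = {
    5: [("Windows", 10, True), ("a.txt", 11, False)],
    10: [("System32", 20, True), ("loop", 5, True)],
    20: [("cmd.exe", 21, False)],
}
for path in sorted(walk(tree, 5, max_depth=3)):
    print(path)
```

Without the visited set, the "loop" entry would be followed until the depth limit; without the depth limit, a chain of distinct inodes could still recurse very deeply. Using both keeps the walk cheap and bounded.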

9. search_by_extension

Search files by extension.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • extension: File extension to search for (e.g., "exe", "txt")
  • path: Starting path (default: "/")

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "extension": "exe",
  "path": "/Windows"
}

10. search_by_timestamp

Search files by date range.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • start_time: Start of the range (ISO 8601 format, e.g. "2024-01-01T00:00:00")
  • end_time: End of the range (ISO 8601 format, e.g. "2024-12-31T23:59:59")
  • timestamp_type: Type to check - "created", "modified", "accessed", or "any"
  • path: Starting path (default: "/")

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "start_time": "2024-01-01T00:00:00",
  "end_time": "2024-12-31T23:59:59",
  "timestamp_type": "modified",
  "path": "/Windows"
}
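
The range check itself can be sketched as follows. This is an illustration under stated assumptions: timestamps without an explicit offset are treated as UTC (in line with the UTC timestamps listed under Features), both range ends are inclusive, and the helper names parse_utc and in_range are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(value):
    """Parse an ISO 8601 string; naive values are assumed to be UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def in_range(file_times, start_time, end_time, timestamp_type="any"):
    """Return True if the selected timestamp(s) fall within [start, end].

    file_times maps "created", "modified", "accessed" to aware datetimes or None.
    """
    start, end = parse_utc(start_time), parse_utc(end_time)
    if timestamp_type == "any":
        candidates = list(file_times.values())
    else:
        candidates = [file_times.get(timestamp_type)]
    return any(t is not None and start <= t <= end for t in candidates)


times = {
    "created": datetime(2023, 5, 1, tzinfo=timezone.utc),
    "modified": datetime(2024, 6, 15, 12, 0, tzinfo=timezone.utc),
    "accessed": None,
}
print(in_range(times, "2024-01-01T00:00:00", "2024-12-31T23:59:59", "modified"))
print(in_range(times, "2024-01-01T00:00:00", "2024-12-31T23:59:59", "created"))
```

Comparing only timezone-aware values avoids the off-by-hours errors that appear when an examiner's local timezone differs from the one the evidence was created in.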

11. scan_deleted_files

Scan for deleted files.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Offset of the partition in bytes
  • path: Starting path (default: "/")
  • max_results: Maximum results to return (default: 100)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "path": "/",
  "max_results": 100
}

12. calculate_hash

Calculate hash (MD5, SHA1, or SHA256) of a disk image.

Parameters:

  • image_path: Absolute path to disk image
  • algorithm: Hash algorithm - "md5", "sha1", or "sha256" (default: "sha256")

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "algorithm": "sha256"
}
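
Hashing a multi-gigabyte image has to be done in chunks so the whole file never sits in memory. The sketch below shows the standard hashlib pattern for the three listed algorithms; hash_stream and hash_image are hypothetical names, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib
import io

SUPPORTED_ALGORITHMS = ("md5", "sha1", "sha256")


def hash_stream(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream incrementally and return the hex digest."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()


def hash_image(image_path, algorithm="sha256"):
    """Hash a file on disk without loading it fully into memory."""
    with open(image_path, "rb") as handle:
        return hash_stream(handle, algorithm)


# In-memory demonstration with a deliberately tiny chunk size.
print(hash_stream(io.BytesIO(b"abc"), "sha256", chunk_size=2))
```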

13. hash_files

Hash files inside a partition: either a single file or a whole directory tree (recursive, streamed). Each hash can optionally be matched against a known hashset: in known_bad mode, files whose hash is in the set are flagged; in known_good mode, files whose hash is NOT in the allowlist are flagged. Bulk results are paginated and include the true total_files count. To hash the entire image container, use calculate_hash instead.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Partition offset in bytes
  • file_path: Single file to hash (mutually exclusive with directory_path)
  • directory_path: Directory to hash recursively (mutually exclusive with file_path)
  • algorithm: "md5", "sha1", or "sha256" (default: "sha256")
  • match_hashset: Optional list of known hashes (hex) to match against
  • hashset_file: Optional host path to a newline-delimited hash list (merged with match_hashset)
  • match_mode: "known_bad" (flag matches) or "known_good" (flag non-matches) (default: "known_bad")
  • max_files: Max results per page (default: 1000)
  • offset: Result offset for pagination (default: 0)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "directory_path": "/Windows/System32",
  "match_hashset": ["e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"],
  "match_mode": "known_bad"
}
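
The two match modes invert the same membership test. A minimal sketch, assuming hashes are compared case-insensitively as hex strings (is_flagged is a hypothetical helper). Note that the hash in the example above is the SHA-256 of empty input, so in known_bad mode it would flag zero-byte files.

```python
import hashlib


def is_flagged(digest, hashset, match_mode="known_bad"):
    """Decide whether a file hash should be flagged for the given mode."""
    normalized = {entry.strip().lower() for entry in hashset if entry.strip()}
    in_set = digest.strip().lower() in normalized
    if match_mode == "known_bad":
        return in_set          # flag files that ARE in the blocklist
    if match_mode == "known_good":
        return not in_set      # flag files that are NOT in the allowlist
    raise ValueError(f"unknown match_mode: {match_mode}")


hashset = ["e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"]
empty_digest = hashlib.sha256(b"").hexdigest()
other_digest = hashlib.sha256(b"MZ").hexdigest()

print(is_flagged(empty_digest, hashset, "known_bad"))
print(is_flagged(other_digest, hashset, "known_bad"))
print(is_flagged(other_digest, hashset, "known_good"))
```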

14. search_content

Search the byte content of files across a partition for a text, regex, or hex pattern (grep-in-image). Scans every file fully by default with streaming reads so large files never exhaust memory, and never misses a match that straddles a read boundary. Returns the true total match count plus a paginated slice.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Partition offset in bytes
  • pattern: Text, regex, or hex pattern to search for
  • path: Directory to search under, recursive (default: "/")
  • is_regex: Treat pattern as a regular expression (default: false)
  • is_hex: Treat pattern as hex bytes, e.g. "4d5a90" (default: false)
  • case_sensitive: Case-sensitive match for text (default: false)
  • extensions: Only scan files with these extensions
  • min_size / max_file_size: Size filters in bytes (0 = scan everything)
  • max_matches: Max matches per page (default: 500)
  • offset: Result offset for pagination (default: 0)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "pattern": "password=",
  "extensions": ["txt", "ini", "config"]
}
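
A streaming search stays boundary-safe by carrying the last len(pattern) - 1 bytes of each chunk over into the next window. The sketch below demonstrates this for a literal byte pattern. It illustrates the general technique rather than the server's code, and find_all is a hypothetical name; a regex search would need a different overlap strategy.

```python
import io


def find_all(stream, pattern, chunk_size=64 * 1024):
    """Return absolute offsets of every occurrence of pattern in a binary stream."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    tail = b""        # bytes carried over from the previous window
    consumed = 0      # total bytes read from the stream so far
    offsets = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        base = consumed - len(tail)   # absolute offset of window[0]
        position = window.find(pattern)
        while position != -1:
            offsets.append(base + position)
            position = window.find(pattern, position + 1)
        consumed += len(chunk)
        # The tail is shorter than the pattern, so no match is reported twice.
        tail = window[-overlap:] if overlap else b""
    return offsets


data = b"aaaXYZbbbXYZ"
# With 4-byte chunks the first "XYZ" straddles a read boundary.
print(find_all(io.BytesIO(data), b"XYZ", chunk_size=4))

# A hex pattern such as "4d5a90" can be converted with bytes.fromhex first.
print(find_all(io.BytesIO(b"\x00\x4d\x5a\x90\x00"), bytes.fromhex("4d5a90"), chunk_size=2))
```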

15. recover_deleted_file

Recover the content of a deleted file by its inode / meta_addr (as reported by scan_deleted_files) and write it to a host path. Opens the file by metadata address even when its name is unlinked. Recovery is best-effort: it succeeds when the original clusters are intact and may be partial if they were reused.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Partition offset in bytes
  • inode: Inode / meta_addr of the deleted file (from scan_deleted_files)
  • output_path: Host path to write recovered content to
  • max_size: Optional cap on bytes to recover (default: full file)

Example:

{
  "image_path": "/home/user/evidence/image.raw",
  "partition_offset": 1048576,
  "inode": 12345,
  "output_path": "/home/user/recovered/file.bin"
}

16. extract_artifacts

Locate and extract well-known Windows forensic artifacts (registry hives, event logs, execution evidence, browser profiles) from a partition to a host directory. Output mirrors the source layout so it can be fed directly into downstream tooling. Paths resolve case-insensitively and per-user artifacts expand across all user profiles.

Parameters:

  • image_path: Absolute path to disk image
  • partition_offset: Partition offset in bytes
  • output_path: Host directory to write extracted artifacts into
  • preset: "registry", "eventlogs", "execution", "browser", or "all" (default: "all")
  • max_bytes: Optional cap on total bytes written

Example:

{
  "image_path": "/home/user/evidence/windows.raw",
  "partition_offset": 1048576,
  "output_path": "/home/user/artifacts",
  "preset": "registry"
}

17. cache_stats

Inspect the server's caches: the process-wide handler cache, each handler's file/metadata LRU counters (hit rate), and the AD1 index cache. Pass image_path for one handler's detailed stats, or clear=true to free cached handlers (and the AD1 index when clearing all).

Parameters:

  • image_path: Optional; report this image's per-handler stats (omit to report all handlers)
  • clear: Clear handler caches and the AD1 index cache after reporting (default: false)

Example:

{
  "clear": false
}

Usage Examples

Basic Analysis

Please analyze this disk image and show me the partition layout.
Image: /home/user/evidence/image.raw

File Extraction

Extract the SAM file from this Windows image.
Image: /home/user/evidence/windows.raw
Partition offset: 1048576

Timeline Analysis

Find all files modified between January 1, 2024 and March 1, 2024.
Image: /home/user/evidence/image.raw
Partition offset: 1048576

Malware Hunting

Search for all executable files in the Windows directory.
Image: /home/user/evidence/suspicious.raw
Partition offset: 1048576
Extension: exe
Path: /Windows

Deleted File Recovery

Scan for deleted files in the root directory.
Image: /home/user/evidence/image.raw
Partition offset: 1048576

Security Features

  • Read Limits: Configurable max file size per read (default: 1MB) with streaming chunked I/O
  • Cache Limits: LRU cache with 500,000 entries and batched eviction
  • Cycle Protection: Depth-limited, inode-guarded traversal prevents junction/symlink loops
  • Timeout Protection: Request timeout configuration (default: 150s)
  • Graceful Shutdown: Proper resource cleanup on exit
  • stderr Logging: Diagnostics never pollute the stdout JSON-RPC channel
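
On a stdio transport, anything written to stdout is parsed as JSON-RPC, so diagnostics have to go to stderr. The sketch below shows one way to set that up with the standard logging module, honoring the DISK_FORENSICS_LOG_LEVEL variable listed in the changelog. The function name, logger name, and log format are assumptions.

```python
import logging
import os
import sys


def configure_logging(env=os.environ):
    """Send diagnostics to stderr at the level named by DISK_FORENSICS_LOG_LEVEL."""
    level_name = env.get("DISK_FORENSICS_LOG_LEVEL", "WARNING").upper()
    level = getattr(logging, level_name, logging.WARNING)
    if not isinstance(level, int):
        level = logging.WARNING  # fall back if the name is not a log level

    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )

    logger = logging.getLogger("disk_forensics_mcp_server")  # assumed logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep records away from any stdout root handler
    return logger


logger = configure_logging({"DISK_FORENSICS_LOG_LEVEL": "info"})
logger.info("handler cache initialised")  # written to stderr, not stdout
```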

Note: extraction tools (extract_file, extract_directory) write to caller-supplied host paths. They are intended for trusted, local forensic workstations and do not sandbox the output path. Run the server under an account with appropriately scoped filesystem permissions.
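
Because the output path is not sandboxed by the server, a caller that wants a guard can check the path before invoking an extraction tool. The sketch below is a caller-side check only, not something the server does; is_within is a hypothetical helper and the paths are examples.

```python
from pathlib import Path


def is_within(base_dir, output_path):
    """Return True if output_path resolves to base_dir or a location below it."""
    base = Path(base_dir).resolve()
    target = Path(output_path).resolve()
    return target.is_relative_to(base)  # available since Python 3.9


base = "/home/user/extracted"
print(is_within(base, "/home/user/extracted/SAM"))
print(is_within(base, "/home/user/extracted/../.ssh/authorized_keys"))
```

Resolving both paths first means ".." segments and symlinks are normalized before the comparison, so a path that merely starts with the base directory string does not pass.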

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│   MCP Client    │────▶│  MCP Server      │────▶│   pytsk3    │
│ (Claude/VSCode) │     │  (Python/MCP)    │     │  (The Sleuth│
└─────────────────┘     └──────────────────┘     │   Kit)      │
                               │                 └─────────────┘
                               ▼                          │
                        ┌─────────────────┐               │
                        │  Global Cache   │               │
                        │  (500K entries) │               │
                        └─────────────────┘               │
                               │                          ▼
                               ▼                  ┌──────────────┐
                        ┌──────────────┐          │  Disk Image  │
                        │ Partition FS │          │  (RAW/E01/   │
                        │   Handles    │          │  VMDK/AD1)   │
                        └──────────────┘          └──────────────┘

Project Structure

disk-forensics-mcp-server/
├── disk_forensics_mcp_server/
│   ├── __init__.py          # Single source of __version__
│   ├── __main__.py          # Entry point
│   ├── handlers/            # Image format handlers
│   │   ├── base_handler.py  # Abstract base + shared pytsk3 logic, safe walk()
│   │   ├── raw_handler.py
│   │   ├── e01_handler.py
│   │   ├── vmdk_handler.py
│   │   ├── vhd_handler.py
│   │   └── ad1_handler.py
│   ├── models/              # Pydantic schemas
│   │   └── schemas.py
│   ├── server/              # MCP server implementation
│   │   └── mcp_server.py
│   ├── tools/               # Forensics tools
│   │   ├── disk_tools/
│   │   │   ├── analyze_image.py
│   │   │   └── list_partitions.py
│   │   ├── filesystem_tools/
│   │   │   ├── list_files.py
│   │   │   ├── read_file.py            # read_file_content tool
│   │   │   ├── extract_file.py
│   │   │   ├── extract_directory.py
│   │   │   ├── get_directory_tree.py
│   │   │   ├── get_file_metadata.py
│   │   │   ├── search_by_extension.py
│   │   │   ├── search_by_timestamp.py
│   │   │   ├── search_content.py       # grep-in-image (streaming, boundary-safe)
│   │   │   ├── scan_deleted_files.py
│   │   │   └── recover_deleted_file.py # recover content by inode
│   │   ├── hash_tools/
│   │   │   ├── calculate_hash.py       # hash the whole image container
│   │   │   └── hash_files.py           # hash files inside a partition + hashset match
│   │   ├── artifact_tools/
│   │   │   └── extract_artifacts.py    # Windows artifact extraction
│   │   └── system_tools/
│   │       └── cache_stats.py          # inspect/clear server caches
│   └── utils/               # Utilities
│       ├── image_detector.py  # Format detection + global handler cache
│       └── logging.py         # stderr logging (never print() on stdio transport)
├── tests/
├── scripts/
├── pyproject.toml
└── README.md

Development

Setup Development Environment

pip install -e ".[forensics,dev]"

Code Quality

black disk_forensics_mcp_server
isort disk_forensics_mcp_server
flake8 disk_forensics_mcp_server
mypy disk_forensics_mcp_server

Troubleshooting

pytsk3 not found

# Install forensics dependencies
pip install ".[forensics]"

# On Ubuntu/Debian
sudo apt-get install python3-dev libtsk-dev

# On macOS
brew install sleuthkit

Permission denied when accessing disk images

# Run with appropriate permissions
sudo disk-forensics-mcp-server

# Or copy image to user directory
cp /path/to/image.raw ~/evidence/

Slow performance on first access

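This is expected: the first access reads from disk, and subsequent accesses are served from the cache. On the 3.7GB AD1 benchmark image:

  • Cold: ~7s for a directory listing, ~29s for a full traversal
  • Warm: well under 1ms for a cached listing, ~0.1s for a full traversal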

Changelog

Unreleased - Stability & forensic correctness

  • Diagnostics now log to stderr (never stdout), keeping the stdio JSON-RPC channel clean
  • All filesystem timestamps emitted as UTC, timezone-aware for correct timeline analysis
  • Recursive search/scan/tree operations are depth-bounded and cycle-guarded (NTFS junction/reparse loops no longer cause infinite recursion)
  • Removed unused parallel-processing scaffolding
  • is_deleted now flagged during normal directory listings
  • Log level configurable via DISK_FORENSICS_LOG_LEVEL (default WARNING)
  • New tools: extract_artifacts (Windows artifact extraction), search_content (grep-in-image, streaming and boundary-safe), recover_deleted_file (recover content by inode), hash_files (per-file hashing with hashset matching), and cache_stats (inspect/clear server caches) — 17 tools total

v0.2.0 - Major Performance Improvements

  • Global handler caching with 279x - 235,000x speedup
  • Increased cache capacity to 500,000 entries with LRU eviction
  • Cache statistics monitoring
  • Graceful shutdown with signal handling

v0.1.0 - Initial Release

  • Support for RAW, E01, VMDK, VHD/VHDX, AD1 formats
  • Disk analysis: analyze_disk_image, list_partitions, calculate_hash
  • Filesystem tools: list_files, get_file_metadata, read_file_content, extract_file
  • Directory tree traversal and search tools
  • Deleted file scanning
  • Basic caching with 187x speedup

License

MIT License - see LICENSE file for details.

Support

For issues and feature requests, please use the GitHub issue tracker.
