disk-forensics-mcp-server
MCP Disk Forensics
A high-performance MCP Server for Disk Forensics that enables AI agents to analyze disk images through the Model Context Protocol. Built with pytsk3 integration and intelligent caching for maximum speed.
Features
- High Performance: Global handler caching with 279-235,000x speedup for repeated operations
- Multi-Format Support: RAW, E01, VMDK, VHD/VHDX, AD1 formats
- Deep File Inspection: Access to all filesystem structures (NTFS, FAT, ext, etc.)
- Advanced Filtering: Search by extension, timestamp, and deleted files
- Security First: Path validation, size limits, input sanitization
- Memory Efficient: LRU cache with 500,000 entries for large images
- Cycle-Safe Traversal: Depth-limited, inode-guarded directory walking (handles NTFS junctions/symlinks)
- UTC Timestamps: Timezone-aware (UTC) timestamps for correct cross-timezone timeline analysis
Performance Benchmarks
Tested on a 3.7GB AD1 image with 1,587 directories and 19,882 files:
| Operation | Cold | Warm | Speedup |
|---|---|---|---|
| List root | 7.2s | 0.000s | 235,877x |
| Full traversal | 29.0s | 0.104s | 279x |
| Repeated access | 7.2s | 0.000s | 235,877x |
| Deep path (5 levels) | 7.1s | 0.000s | 173,631x |
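The warm numbers above come from serving repeated requests out of an in-memory cache instead of re-reading the image. As a rough illustration only, the following is a minimal sketch of an LRU cache with hit/miss counters and batched eviction. The class name `LRUCache`, the `evict_batch` parameter, and the `loader` callback are hypothetical and are not taken from the server's code.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache with hit/miss counters and batched eviction (illustrative)."""

    def __init__(self, max_entries=500_000, evict_batch=1_000):
        self.max_entries = max_entries
        self.evict_batch = evict_batch  # hypothetical knob, not a documented setting
        self._data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        """Return the cached value for key, calling loader(key) on a miss."""
        if key in self._data:
            self.hits += 1
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = loader(key)
        self._data[key] = value
        if len(self._data) > self.max_entries:
            # Drop the oldest entries in one batch, always keeping the newest one.
            for _ in range(min(self.evict_batch, len(self._data) - 1)):
                self._data.popitem(last=False)
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Usage: the loader stands in for an expensive directory listing.
cache = LRUCache(max_entries=3, evict_batch=2)
cache.get("/", lambda path: ["Windows", "Users"])   # miss, runs the loader
cache.get("/", lambda path: ["Windows", "Users"])   # hit, loader is skipped
```

Evicting in batches trades a little extra memory headroom for fewer eviction passes when the cache is running at capacity.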
Requirements
- Python 3.10+
- pytsk3 and forensic libraries installed
- MCP-compatible client (Claude Desktop, VSCode, Cline, etc.)
Installation
1. Install Dependencies
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install python3-dev libtsk-dev
macOS:
brew install sleuthkit
Windows: Install Python 3.10+ from python.org
2. Install MCP Server
# Clone repository
git clone https://github.com/jus1-c/disk-forensics-mcp-server.git
cd disk-forensics-mcp-server
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or: venv\Scripts\activate # Windows
# Install package
pip install -e ".[forensics]"
Configuration
Claude Desktop
Edit claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json
{
"mcpServers": {
"disk-forensics": {
"command": "disk-forensics-mcp-server",
"disabled": false,
"autoApprove": []
}
}
}
VSCode (with Cline extension)
Add to your settings:
{
"mcpServers": {
"disk-forensics": {
"command": "disk-forensics-mcp-server",
"disabled": false,
"autoApprove": []
}
}
}
OpenCode
Add to ~/.opencode/opencode.json:
{
"mcp": {
"disk-forensics": {
"type": "local",
"command": ["disk-forensics-mcp-server"],
"enabled": true,
"timeout": 150000
}
}
}
Available Tools
1. analyze_disk_image
Analyze a disk image and return information about its format, size, and structure.
Parameters:
- image_path: Absolute path to disk image (required)
Example:
{
"image_path": "/home/user/evidence/image.raw"
}
2. list_partitions
List partitions in a disk image. Supports MBR and GPT partition tables.
Parameters:
- image_path: Absolute path to disk image
Example:
{
"image_path": "/home/user/evidence/image.raw"
}
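The `partition_offset` value used by the filesystem tools is a byte offset, so a partition that starts at sector 2048 on a disk with 512-byte sectors sits at 2048 × 512 = 1048576. The sketch below shows that arithmetic by decoding a classic MBR partition table with the standard library. It is illustrative only: it assumes 512-byte sectors, ignores GPT and extended partitions, and the function name `mbr_partition_offsets` is hypothetical. The server itself relies on pytsk3 for this.

```python
import struct

SECTOR_SIZE = 512  # assumption: some disks use 4096-byte sectors instead


def mbr_partition_offsets(first_sector: bytes):
    """Return byte offsets and sizes for the primary partitions in an MBR."""
    if len(first_sector) < 512 or first_sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature, not an MBR sector")
    partitions = []
    for slot in range(4):
        # The table starts at byte 446 and holds four 16-byte entries.
        entry = first_sector[446 + 16 * slot : 446 + 16 * (slot + 1)]
        part_type = entry[4]
        start_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if part_type == 0 or sector_count == 0:
            continue  # unused slot
        partitions.append(
            {
                "slot": slot,
                "type": part_type,
                "offset": start_lba * SECTOR_SIZE,
                "size": sector_count * SECTOR_SIZE,
            }
        )
    return partitions


# Usage: a synthetic boot sector with one NTFS-type (0x07) partition at LBA 2048.
sector = bytearray(512)
sector[446 + 4] = 0x07
struct.pack_into("<II", sector, 446 + 8, 2048, 204800)
sector[510:512] = b"\x55\xaa"
print(mbr_partition_offsets(bytes(sector))[0]["offset"])
```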
3. list_files
List files in a directory with caching support.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- path: Directory path to list (default: "/")
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"path": "/Windows/System32"
}
4. get_file_metadata
Get metadata for a specific file.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- file_path: Path to the file within the partition
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"file_path": "/Windows/System32/config/SAM"
}
5. read_file_content
Read content of a specific file.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- file_path: Path to the file within the partition
- max_size: Maximum bytes to read (default: 1MB)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"file_path": "/Windows/System32/config/SAM",
"max_size": 1048576
}
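A `max_size` cap of this kind is usually enforced by reading in bounded chunks and stopping once the budget is spent. The following is a minimal sketch of that pattern over any binary file-like object. The function name `read_capped`, the `chunk_size` default, and the `truncated` flag are assumptions for illustration, not the tool's actual return shape.

```python
import io


def read_capped(stream, max_size=1_048_576, chunk_size=65_536):
    """Read at most max_size bytes in chunks; report whether data was left over."""
    chunks = []
    remaining = max_size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # end of file reached before the cap
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    # Only probe for extra bytes when the cap was actually reached.
    truncated = remaining == 0 and bool(stream.read(1))
    return data, truncated


# Usage: a 3 MB in-memory "file" is cut off at the 1 MB default.
data, truncated = read_capped(io.BytesIO(b"\x00" * 3_000_000))
print(len(data), truncated)
```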
6. extract_file
Extract a file from the image to a destination path.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- file_path: Path to the file within the partition
- output_path: Path to save the extracted file
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"file_path": "/Windows/System32/config/SAM",
"output_path": "/home/user/extracted/SAM"
}
7. extract_directory
Extract a directory from the image to a destination path while preserving relative paths.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- directory_path: Path to the directory within the partition
- output_path: Path to save the extracted directory
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"directory_path": "/Windows/System32/config",
"output_path": "/home/user/extracted/config"
}
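Preserving relative paths means each file lands under `output_path` at the same position it had under `directory_path`. The sketch below shows one way to compute that mapping and to reject names that would climb out of the output directory. The helper name `destination_for` is hypothetical, and the server's own mapping logic may differ.

```python
from pathlib import Path, PurePosixPath


def destination_for(image_file, source_dir, output_dir):
    """Map a path inside the image to a host path under output_dir."""
    # relative_to() raises ValueError if image_file is not under source_dir.
    relative = PurePosixPath(image_file).relative_to(PurePosixPath(source_dir))
    if ".." in relative.parts:
        raise ValueError(f"refusing to escape output directory: {image_file}")
    return Path(output_dir).joinpath(*relative.parts)


# Usage: a nested file keeps its position relative to the extracted directory.
print(
    destination_for(
        "/Windows/System32/config/RegBack/SAM",
        "/Windows/System32/config",
        "/home/user/extracted/config",
    )
)
```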
8. get_directory_tree
Get complete directory tree structure.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- path: Starting path (default: "/")
- max_depth: Maximum recursion depth (default: 10)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"path": "/Windows",
"max_depth": 3
}
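A depth limit combined with an inode guard is what keeps a tree walk from looping on NTFS junctions or symlinks that point back at an ancestor. The following sketch demonstrates the idea on a toy in-memory directory table rather than a real image. The data layout (`children` mapping an inode to `(name, inode, is_dir)` tuples) and the function name `safe_walk` are assumptions for illustration and do not mirror the server's `walk()` implementation.

```python
def safe_walk(children, root_inode, max_depth=10):
    """List paths under root_inode without following cycles or exceeding max_depth."""
    paths = []
    # Stack items: (directory inode, directory path, depth, ancestor inodes).
    stack = [(root_inode, "", 1, frozenset({root_inode}))]
    while stack:
        inode, path, depth, ancestors = stack.pop()
        for name, child_inode, is_dir in children.get(inode, []):
            child_path = f"{path}/{name}"
            paths.append(child_path)
            if not is_dir:
                continue
            if depth >= max_depth:
                continue  # depth limit reached: list the directory, do not enter it
            if child_inode in ancestors:
                continue  # junction or link back to an ancestor: skip to avoid a loop
            stack.append((child_inode, child_path, depth + 1, ancestors | {child_inode}))
    return paths


# Usage: "/Windows/loop" points back at the root directory (inode 5).
tree = {
    5: [("Windows", 10, True), ("a.txt", 11, False)],
    10: [("System32", 20, True), ("loop", 5, True)],
    20: [("cmd.exe", 21, False)],
}
print(sorted(safe_walk(tree, 5)))
```

Tracking only ancestor inodes, rather than every inode seen so far, still breaks cycles while allowing the same directory to be listed when it is reachable through two unrelated paths.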
9. search_by_extension
Search files by extension.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- extension: File extension to search for (e.g., "exe", "txt")
- path: Starting path (default: "/")
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"extension": "exe",
"path": "/Windows"
}
10. search_by_timestamp
Search files by date range.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- start_time: Start time (ISO format)
- end_time: End time (ISO format)
- timestamp_type: Type to check - "created", "modified", "accessed", or "any"
- path: Starting path (default: "/")
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"start_time": "2024-01-01T00:00:00",
"end_time": "2024-12-31T23:59:59",
"timestamp_type": "modified",
"path": "/Windows"
}
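Because timestamps are reported as timezone-aware UTC values, a range check has to compare like with like. The sketch below parses ISO strings into UTC datetimes and tests a filesystem timestamp (seconds since the Unix epoch) against the range. Treating an ISO string without an offset as UTC is an assumption made here, and the helper names `parse_utc` and `in_range` are hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(value):
    """Parse an ISO 8601 string into a timezone-aware UTC datetime."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp with no offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def in_range(epoch_seconds, start_time, end_time):
    """Return True if the epoch timestamp falls inside [start_time, end_time]."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return parse_utc(start_time) <= stamp <= parse_utc(end_time)


# Usage: 1704067200 is 2024-01-01T00:00:00 UTC, the first instant of the range.
print(in_range(1704067200, "2024-01-01T00:00:00", "2024-12-31T23:59:59"))
```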
11. scan_deleted_files
Scan for deleted files.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Offset of the partition in bytes
- path: Starting path (default: "/")
- max_results: Maximum results to return (default: 100)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"path": "/",
"max_results": 100
}
12. calculate_hash
Calculate hash (MD5, SHA1, or SHA256) of a disk image.
Parameters:
- image_path: Absolute path to disk image
- algorithm: Hash algorithm - "md5", "sha1", or "sha256" (default: "sha256")
Example:
{
"image_path": "/home/user/evidence/image.raw",
"algorithm": "sha256"
}
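Hashing a multi-gigabyte image is normally done by feeding the file to the digest in fixed-size chunks so that memory use stays flat. The following is a minimal sketch of that approach with the standard `hashlib` module. The function name `hash_stream` and the 1 MiB chunk size are assumptions and do not describe the tool's internals.

```python
import hashlib
import io

SUPPORTED = {"md5", "sha1", "sha256"}


def hash_stream(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream chunk by chunk and return the hex digest."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


# Usage: an in-memory stream stands in for open(image_path, "rb").
print(hash_stream(io.BytesIO(b"evidence"), "sha256"))
```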
13. hash_files
Hash files inside a partition: a single file or a whole directory tree
(recursive, streamed). Optionally match each hash against a known hashset —
known_bad flags files in the set, known_good flags files NOT in the
allowlist. Bulk results paginate with a true total_files count. Use
calculate_hash instead to hash the entire image container.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Partition offset in bytes
- file_path: Single file to hash (mutually exclusive with directory_path)
- directory_path: Directory to hash recursively (mutually exclusive with file_path)
- algorithm: "md5", "sha1", or "sha256" (default: "sha256")
- match_hashset: Optional list of known hashes (hex) to match against
- hashset_file: Optional host path to a newline-delimited hash list (merged with match_hashset)
- match_mode: "known_bad" (flag matches) or "known_good" (flag non-matches) (default: "known_bad")
- max_files: Max results per page (default: 1000)
- offset: Result offset for pagination (default: 0)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"directory_path": "/Windows/System32",
"match_hashset": ["e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"],
"match_mode": "known_bad"
}
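The two match modes are mirror images: `known_bad` flags a file whose hash is in the set, while `known_good` flags a file whose hash is missing from it. The sketch below illustrates that logic along with offset-based pagination that still reports the full count. The function names and the shape of the returned dictionary are assumptions. Note that the hash in the example request is the SHA-256 of empty input, so it would match zero-byte files.

```python
def flag_files(file_hashes, hashset, match_mode="known_bad"):
    """Return sorted paths flagged under the given match mode."""
    if match_mode not in ("known_bad", "known_good"):
        raise ValueError(f"unknown match_mode: {match_mode}")
    known = {h.strip().lower() for h in hashset if h.strip()}
    flagged = []
    for path, digest in file_hashes.items():
        in_set = digest.lower() in known
        # known_bad flags members of the set; known_good flags everything else.
        if in_set == (match_mode == "known_bad"):
            flagged.append(path)
    return sorted(flagged)


def paginate(items, offset=0, max_files=1000):
    """Return one page of items plus the total count across all pages."""
    return {"total_files": len(items), "files": items[offset : offset + max_files]}


# Usage: the empty-file hash is treated as known bad.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
hashes = {"/empty.txt": EMPTY_SHA256, "/cmd.exe": "ab" * 32}
print(paginate(flag_files(hashes, [EMPTY_SHA256], "known_bad")))
```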
14. search_content
Search the byte content of files across a partition for a text, regex, or hex pattern (grep-in-image). Scans every file fully by default with streaming reads so large files never exhaust memory, and never misses a match that straddles a read boundary. Returns the true total match count plus a paginated slice.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Partition offset in bytes
- pattern: Text, regex, or hex pattern to search for
- path: Directory to search under, recursive (default: "/")
- is_regex: Treat pattern as a regular expression (default: false)
- is_hex: Treat pattern as hex bytes, e.g. "4d5a90" (default: false)
- case_sensitive: Case-sensitive match for text (default: false)
- extensions: Only scan files with these extensions
- min_size/max_file_size: Size filters in bytes (0 = scan everything)
- max_matches: Max matches per page (default: 500)
- offset: Result offset for pagination (default: 0)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"pattern": "password=",
"extensions": ["txt", "ini", "config"]
}
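A streaming search stays boundary-safe by carrying the last `len(pattern) - 1` bytes of each chunk into the next one, so a match split across two reads is still seen exactly once. The following sketch shows the technique for literal byte patterns only. Regex and case-insensitive matching are out of scope here, and the function name `find_all` is hypothetical rather than part of the server.

```python
import io


def find_all(stream, pattern: bytes, chunk_size=65_536):
    """Return absolute offsets of every occurrence of pattern in the stream."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    offsets = []
    tail = b""  # bytes carried over from the previous window
    base = 0    # absolute offset of the first byte in the current window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        start = 0
        while True:
            index = window.find(pattern, start)
            if index < 0:
                break
            offsets.append(base + index)
            start = index + 1
        # Keep just enough bytes to complete a match that straddles the boundary.
        keep = min(overlap, len(window))
        tail = window[len(window) - keep :]
        base += len(window) - keep
    return offsets


# Usage: a hex pattern such as "4d5a90" is converted to bytes before searching.
blob = io.BytesIO(b"\x00" * 10 + bytes.fromhex("4d5a90") + b"\x00" * 10)
print(find_all(blob, bytes.fromhex("4d5a90"), chunk_size=4))
```

Because the carried tail is always shorter than the pattern, it can never contain a complete match on its own, which is why no offset is reported twice.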
15. recover_deleted_file
Recover the content of a deleted file by its inode / meta_addr (as reported by
scan_deleted_files) and write it to a host path. Opens the file by metadata
address even when its name is unlinked. Recovery is best-effort: it succeeds when
the original clusters are intact and may be partial if they were reused.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Partition offset in bytes
- inode: Inode / meta_addr of the deleted file (from scan_deleted_files)
- output_path: Host path to write recovered content to
- max_size: Optional cap on bytes to recover (default: full file)
Example:
{
"image_path": "/home/user/evidence/image.raw",
"partition_offset": 1048576,
"inode": 12345,
"output_path": "/home/user/recovered/file.bin"
}
16. extract_artifacts
Locate and extract well-known Windows forensic artifacts (registry hives, event logs, execution evidence, browser profiles) from a partition to a host directory. Output mirrors the source layout so it can be fed directly into downstream tooling. Paths resolve case-insensitively and per-user artifacts expand across all user profiles.
Parameters:
- image_path: Absolute path to disk image
- partition_offset: Partition offset in bytes
- output_path: Host directory to write extracted artifacts into
- preset: "registry", "eventlogs", "execution", "browser", or "all" (default: "all")
- max_bytes: Optional cap on total bytes written
Example:
{
"image_path": "/home/user/evidence/windows.raw",
"partition_offset": 1048576,
"output_path": "/home/user/artifacts",
"preset": "registry"
}
17. cache_stats
Inspect the server's caches: the process-wide handler cache, each handler's
file/metadata LRU counters (hit rate), and the AD1 index cache. Pass image_path
for one handler's detailed stats, or clear=true to free cached handlers (and
the AD1 index when clearing all).
Parameters:
- image_path: Optional; report this image's per-handler stats (omit for all)
- clear: Clear handler caches and the AD1 index cache after reporting (default: false)
Example:
{
"clear": false
}
Usage Examples
Basic Analysis
Please analyze this disk image and show me the partition layout.
Image: /home/user/evidence/image.raw
File Extraction
Extract the SAM file from this Windows image.
Image: /home/user/evidence/windows.raw
Partition offset: 1048576
Timeline Analysis
Find all files modified between January 1, 2024 and March 1, 2024.
Image: /home/user/evidence/image.raw
Partition offset: 1048576
Malware Hunting
Search for all executable files in the Windows directory.
Image: /home/user/evidence/suspicious.raw
Extension: exe
Path: /Windows
Deleted File Recovery
Scan for deleted files in the root directory.
Image: /home/user/evidence/image.raw
Partition offset: 1048576
Security Features
- Read Limits: Configurable max file size per read (default: 1MB) with streaming chunked I/O
- Cache Limits: LRU cache with 500,000 entries and batched eviction
- Cycle Protection: Depth-limited, inode-guarded traversal prevents junction/symlink loops
- Timeout Protection: Request timeout configuration (default: 150s)
- Graceful Shutdown: Proper resource cleanup on exit
- stderr Logging: Diagnostics never pollute the stdout JSON-RPC channel
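On a stdio transport, stdout carries the JSON-RPC messages, so any stray `print()` would corrupt the stream. That is why diagnostics go to stderr. The sketch below shows one way to set that up with the standard `logging` module, reading the level from the `DISK_FORENSICS_LOG_LEVEL` environment variable mentioned in the changelog. The function name `configure_logging`, the logger name, and the log format are assumptions, not the project's actual code.

```python
import logging
import os
import sys


def configure_logging(name="disk_forensics"):
    """Create a logger that writes to stderr at the level set in the environment."""
    level_name = os.environ.get("DISK_FORENSICS_LOG_LEVEL", "WARNING").upper()
    level = getattr(logging, level_name, logging.WARNING)
    if not isinstance(level, int):
        level = logging.WARNING  # the name resolved to something that is not a level
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stderr)  # never stdout on a stdio transport
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


# Usage: warnings appear on stderr and stdout stays untouched.
log = configure_logging()
log.warning("image opened read-only")
```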
Note: extraction tools (extract_file, extract_directory) write to caller-supplied host paths. They are intended for trusted, local forensic workstations and do not sandbox the output path. Run the server under an account with appropriately scoped filesystem permissions.
Architecture
┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│   MCP Client    │────▶│    MCP Server    │────▶│   pytsk3    │
│ (Claude/VSCode) │     │   (Python/MCP)   │     │ (The Sleuth │
└─────────────────┘     └──────────────────┘     │    Kit)     │
                                 │               └─────────────┘
                                 ▼                      │
                        ┌─────────────────┐             │
                        │  Global Cache   │             │
                        │ (500K entries)  │             │
                        └─────────────────┘             │
                                 │                      ▼
                                 ▼               ┌──────────────┐
                          ┌──────────────┐       │  Disk Image  │
                          │ Partition FS │       │  (RAW/E01/   │
                          │   Handles    │       │   VMDK/AD1)  │
                          └──────────────┘       └──────────────┘
Project Structure
disk-forensics-mcp-server/
├── disk_forensics_mcp_server/
│   ├── __init__.py  # Single source of __version__
│   ├── __main__.py  # Entry point
│   ├── handlers/  # Image format handlers
│   │   ├── base_handler.py  # Abstract base + shared pytsk3 logic, safe walk()
│   │   ├── raw_handler.py
│   │   ├── e01_handler.py
│   │   ├── vmdk_handler.py
│   │   ├── vhd_handler.py
│   │   └── ad1_handler.py
│   ├── models/  # Pydantic schemas
│   │   └── schemas.py
│   ├── server/  # MCP server implementation
│   │   └── mcp_server.py
│   ├── tools/  # Forensics tools
│   │   ├── disk_tools/
│   │   │   ├── analyze_image.py
│   │   │   └── list_partitions.py
│   │   ├── filesystem_tools/
│   │   │   ├── list_files.py
│   │   │   ├── read_file.py  # read_file_content tool
│   │   │   ├── extract_file.py
│   │   │   ├── extract_directory.py
│   │   │   ├── get_directory_tree.py
│   │   │   ├── get_file_metadata.py
│   │   │   ├── search_by_extension.py
│   │   │   ├── search_by_timestamp.py
│   │   │   ├── search_content.py  # grep-in-image (streaming, boundary-safe)
│   │   │   ├── scan_deleted_files.py
│   │   │   └── recover_deleted_file.py  # recover content by inode
│   │   ├── hash_tools/
│   │   │   ├── calculate_hash.py  # hash the whole image container
│   │   │   └── hash_files.py  # hash files inside a partition + hashset match
│   │   ├── artifact_tools/
│   │   │   └── extract_artifacts.py  # Windows artifact extraction
│   │   └── system_tools/
│   │       └── cache_stats.py  # inspect/clear server caches
│   └── utils/  # Utilities
│       ├── image_detector.py  # Format detection + global handler cache
│       └── logging.py  # stderr logging (never print() on stdio transport)
├── tests/
├── scripts/
├── pyproject.toml
└── README.md
Development
Setup Development Environment
pip install -e ".[forensics,dev]"
Code Quality
black disk_forensics_mcp_server
isort disk_forensics_mcp_server
flake8 disk_forensics_mcp_server
mypy disk_forensics_mcp_server
Troubleshooting
pytsk3 not found
# Install forensics dependencies
pip install ".[forensics]"
# On Ubuntu/Debian
sudo apt-get install python3-dev libtsk-dev
# On macOS
brew install sleuthkit
Permission denied when accessing disk images
# Run with appropriate permissions
sudo disk-forensics-mcp-server
# Or copy image to user directory
cp /path/to/image.raw ~/evidence/
Slow performance on first access
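This is expected: the first access reads from disk, and subsequent accesses are served from the cache.
- Cold: ~7s to list a directory, ~29s for a full traversal (figures from the benchmark image above)
- Warm: well under 1 ms for cached listings, ~0.1s for a full traversal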
Changelog
Unreleased - Stability & forensic correctness
- Diagnostics now log to stderr (never stdout), keeping the stdio JSON-RPC channel clean
- All filesystem timestamps emitted as UTC, timezone-aware for correct timeline analysis
- Recursive search/scan/tree operations are depth-bounded and cycle-guarded (NTFS junction/reparse loops no longer cause infinite recursion)
- Removed unused parallel-processing scaffolding
- is_deleted now flagged during normal directory listings
- Log level configurable via DISK_FORENSICS_LOG_LEVEL (default WARNING)
- New tools: extract_artifacts (Windows artifact extraction), search_content (grep-in-image, streaming and boundary-safe), recover_deleted_file (recover content by inode), hash_files (per-file hashing with hashset matching), and cache_stats (inspect/clear server caches); 17 tools total
v0.2.0 - Major Performance Improvements
- Global handler caching with 279x - 235,000x speedup
- Increased cache capacity to 500,000 entries with LRU eviction
- Cache statistics monitoring
- Graceful shutdown with signal handling
v0.1.0 - Initial Release
- Support for RAW, E01, VMDK, VHD/VHDX, AD1 formats
- Disk analysis: analyze_disk_image, list_partitions, calculate_hash
- Filesystem tools: list_files, get_file_metadata, read_file_content, extract_file
- Directory tree traversal and search tools
- Deleted file scanning
- Basic caching with 187x speedup
License
MIT License - see LICENSE file for details.
Acknowledgments
- The Sleuth Kit - Forensic toolkit
- pytsk3 - Python bindings for TSK
- pyad1 - AD1 (AccessData Format) parser - Used as reference for AD1 handler implementation
- Model Context Protocol - MCP specification
- python-mcp - Python MCP SDK
Support
For issues and feature requests, please use the GitHub issue tracker.