Hammerspace Storage Management MCP Server
Enables natural language management of Hammerspace storage clusters with automated file ingestion, tagging, tier management, and vector embedding generation. Supports real-time file monitoring, multi-format document processing, and Kubernetes-based ingestion workflows with Milvus integration.
README
MCP 1.5 - Advanced Hammerspace Storage Management with AI
A production-ready Model Context Protocol (MCP) server for Hammerspace storage management with natural language interface, automatic file ingestion, and comprehensive monitoring capabilities.
๐ Quick Start
Prerequisites
- Python 3.8 or higher
- Linux system (Ubuntu 20.04+ recommended)
- Hammerspace storage cluster (HSTK CLI installed)
- Anthropic API key (for natural language UI)
- NVIDIA API key (optional, for AI features)
- Kubernetes cluster (for automatic file ingestion)
- Milvus vector database (for document embeddings)
Installation
-
Clone the repository:
git clone https://github.com/mbloomhammerspace/mcp-1.5.git cd mcp-1.5 -
Create virtual environment:
python3 -m venv venv source venv/bin/activate pip install -r requirements.txt -
Configure environment variables:
cp .env.example .env nano .envSet the following:
# Hammerspace Configuration HS_ANVIL=10.200.120.90 # Anthropic API (for Web UI) ANTHROPIC_API_KEY=your_anthropic_key_here # NVIDIA API (optional) NVIDIA_API_KEY=your_nvidia_key_here -
Start all services with one command:
./start_all_services.sh startAccess Web UI at:
http://localhost:5000
๐ Features
๐ค Natural Language Interface
- Web-based UI powered by Claude AI at
http://localhost:5000 - Direct MCP integration for all Hammerspace operations
- Real-time debug log viewer at
/debug - File ingest monitor dashboard at
/monitor - Action-oriented responses - executes commands, reports results
๐ Advanced File Ingestion System
- Multi-format support - BMP, DOCX, HTML, JPEG, JSON, MD, PDF, PNG, PPTX, SH, TIFF, TXT, MP3
- Real-time file monitoring - Polling-based detection on NFS 4.2 mounts
- Folder-based processing - New folders trigger batch processing of all supported files
- Kubernetes job deployment - Automatic file ingestion jobs with ConfigMaps
- Milvus vector database integration - Automatic embedding generation and storage
- Intelligent collection naming - Collections named after folders or sequential numbering
- Tag-based tier management - Automatic tier0 promotion for embedding files
- Event streaming - Real-time monitoring of ingestion pipeline
๐ Comprehensive Monitoring UI
- Real-time event streaming - Watch files being ingested live
- Event filtering - Filter by event type, file pattern, or timestamp
- Toast notifications - Get notified when new files arrive
- Status dashboard - Monitor service health, file counts, CPU usage
- Interactive UI - Beautiful, modern interface for tracking ingestion pipelines
- Multi-service monitoring - Track all MCP servers and services
๐ง Persistent Service Management
- Unified startup script - Start all services with one command
- Screen session persistence - Services survive SSH disconnections
- Auto-restart capability - Services automatically restart if they crash
- Systemd integration - Optional auto-start on boot
- Comprehensive logging - Detailed logs for all services
- Health monitoring - Real-time service status and port monitoring
API Endpoints
/api/monitor/status- Get monitor service status/api/monitor/events- Query ingestion events with filters/api/monitor/events/stream- Server-Sent Events for real-time updates/api/chat- Natural language chat interface/api/tools- List available MCP tools/api/logs/stream- Stream debug logs in real-time
Core MCP Tools
All tools use real HSTK CLI commands (no mock data):
tag_directory_recursive- Tag files and directories recursivelycheck_tagged_files_alignment- Check if tagged files are aligned to their objectivesapply_objective_to_path- Apply Hammerspace objectives (e.g., "Place-on-tier0", "placeonvolumes")remove_objective_from_path- Remove objectives from pathslist_objectives_for_path- List all objectives applied to a pathingest_new_files- Find new files by ctime/mtime, tag them, and place on Tier 1refresh_mounts- Refresh NFS mounts to resolve stale file handles
Advanced Features
- Automatic file monitoring - Polling service that auto-tags new files with MD5 hash and MIME type
- Automatic stale file handle recovery - Detects and automatically refreshes mounts
- Share-relative path support - Use
/modelstore/dirinstead of/mnt/se-lab/modelstore/dir - Multi-tier support - Tier 0 (high-performance) and Tier 1 (default storage)
- Tag-based workflows - Tag files and manage them as collections
- Intelligent batching - Groups file events (15 sec) or processes immediately if traffic is light
๐ Advanced File Ingestion Workflow
How It Works
- File Detection: The file monitor polls NFS mounts every 5 seconds for new files
- Tagging: New files are automatically tagged with:
user.ingestid=<md5hash>- MD5 hash for deduplicationuser.mimeid=<mimetype>- MIME type for classificationuser.embedding- Tag for files requiring embedding
- Multi-format Processing: All supported file types trigger Kubernetes ingestion jobs
- Collection Management: Files are organized into Milvus collections:
- Folder-based: Collections named after folder (e.g.,
cold_0011) - Sequential: Collections named
intel_1,intel_2, etc.
- Folder-based: Collections named after folder (e.g.,
- Tier Management: Files tagged for embedding are automatically promoted to tier0
- Embedding Generation: Files are processed into vector embeddings
- Storage: Embeddings stored in Milvus for semantic search
- Tier Demotion: After embedding, files are moved back to default tier
Folder Processing
When a new folder is detected:
- All supported files in the folder are processed as a batch
- A single Milvus collection is created named after the folder
- Collection names are sanitized (e.g.,
cold-0011โcold_0011) - All files in the folder are uploaded to the same collection
- Files are tagged with
embeddingand promoted to tier0 for processing - After embedding, tier0 objective is removed
Configuration
The file monitor supports:
- Polling intervals: 5 seconds (fast) during business hours, 30 seconds during retroactive hours
- Retroactive tagging: Configurable time windows (default: disabled for testing)
- Path monitoring: Monitors
/mnt/anvil/hub/by default - File filtering: Processes all supported file types (BMP, DOCX, HTML, JPEG, JSON, MD, PDF, PNG, PPTX, SH, TIFF, TXT, MP3)
- Recursive folder tagging: Tags entire folder hierarchies for 40x performance improvement
Kubernetes Integration
- Job Templates: Uses
k8s-templates/ingest.yamlfor PDF processing - ConfigMaps: File lists passed to jobs via Kubernetes ConfigMaps
- PVC Integration: Uses
hammerspace-hub-pvcfor persistent storage - Path Mapping: Container paths mapped from
/mnt/anvil/hub/to/data/
๐ฏ Usage
Example Commands (Web UI)
Try these in the natural language interface:
Tag all files in /modelstore/nvidia-test-thurs as modelsetid=my-demo
Promote all modelsetid=my-demo tagged files to tier0
Check alignment status of files tagged with modelsetid=my-demo
Remove the Place-on-tier0 objective from /modelstore/nvidia-test-thurs
List all objectives for /modelstore/incoming-models
Refresh Hammerspace mounts
๐ File Ingestion Commands
Start the file monitor daemon
Check file monitor status
View recent file ingestion events
Stop the file monitor
Direct MCP Server Usage
Add to your MCP client configuration (e.g., Cursor IDE):
{
"mcpServers": {
"hammerspace": {
"command": "python",
"args": ["/home/mike/mcp-1.5/src/aiq_hstk_mcp_server.py"],
"env": {
"PYTHONPATH": "/home/mike/mcp-1.5"
}
}
}
}
Available MCP Tools
Tag Management
-
tag_directory_recursive- Recursively tag all files in a directory{ "path": "/mnt/se-lab/modelstore/my-models", "tag_name": "user.modelsetid", "tag_value": "my-tag-value" } -
check_tagged_files_alignment- Find and check alignment of tagged files{ "tag_name": "user.modelsetid", "tag_value": "my-tag-value", "share_path": "/mnt/se-lab/modelstore/" }
Tier Management
-
apply_objective_to_path- Apply objectives like tier promotion{ "objective_name": "Place-on-tier0", "path": "/mnt/se-lab/modelstore/my-models" } -
remove_objective_from_path- Remove objectives{ "objective_name": "Place-on-tier0", "path": "/mnt/se-lab/modelstore/my-models" }
File Ingestion
ingest_new_files- Find new files, tag them, and place on Tier 1{ "path": "/mnt/se-lab/modelstore/", "tag_name": "user.modelsetid", "tag_value": "new-batch", "age_minutes": 60 }
System Utilities
refresh_mounts- Refresh NFS mounts to resolve stale file handles{ "mount_type": "all" }
File Monitoring & Ingestion
-
start_inotify_monitor- Start automated file monitoring service{}Monitors all Hammerspace shares, automatically tags new files with:
user.ingestid=<md5hash>- MD5 hash for deduplicationuser.mimeid=<mimetype>- MIME type for classification
Events are batched every 15 seconds or processed immediately if traffic is light.
-
get_file_monitor_status- Get monitor status{}Returns: running state, watched paths, pending events, files tagged, CPU usage
-
get_file_ingest_events- Query file ingest/tagging events{ "limit": 100, "event_type": "NEW_FILE", "file_pattern": ".pdf", "since_timestamp": "2025-10-09T20:00:00" }Returns recent file ingestion events with full metadata including:
- Event type (NEW_FILE or RETROACTIVE_TAG)
- File path and name
- MD5 hash (ingestid)
- MIME type (mimeid)
- File size and timestamp
Use Cases:
- Track which files have been ingested
- Build automated workflows based on file events
- Monitor ingestion pipelines
- Audit file processing
-
stop_inotify_monitor- Stop monitoring service{} -
list_objectives_for_path- List all objectives for a path{ "path": "/mnt/se-lab/modelstore/my-models" }
๐ง Architecture
Components
-
MCP Server (
src/aiq_hstk_mcp_server.py)- Implements MCP protocol
- Executes Hammerspace HSTK CLI commands
- Handles automatic retry on stale file handles
- Uses real HSTK operations (no mock data)
-
Web UI (
web_ui/app.py)- Flask-based natural language interface
- Anthropic Claude integration for NL understanding
- MCP client for tool execution
- Real-time debug log streaming
-
๐ File Monitor (
src/file_monitor.py)- Polling-based file detection for NFS 4.2 mounts
- Automatic tagging with MD5 and MIME type
- Folder-based batch processing
- Kubernetes job deployment for PDF ingestion
- Milvus collection management
-
๐ File Monitor Daemon (
src/file_monitor_daemon.py)- Standalone daemon for running file monitor
- Background service for continuous monitoring
- Logging and error handling
-
Mount Refresh Script (
refresh_mounts.sh)- Unmounts and remounts Hammerspace NFS shares
- Resolves stale file handle errors
- Supports selective mount refresh
Data Flow
User Input (NL) โ Claude AI โ MCP Tool Selection โ HSTK CLI โ Hammerspace
โ
MCP Server
โ
Real Operations
๐ File Ingestion Data Flow
New File/Folder โ File Monitor โ Tagging โ Kubernetes Job โ PDF Processing โ Milvus Collection
โ
Event Logging โ Web UI Dashboard โ Real-time Monitoring
๐งช Testing
Web UI Tests
cd tests
./test_web_ui_curl.sh
python test_web_ui.py
Single Command Test
cd tests
./test_single_question.sh "Tag all files in /modelstore/test as modelsetid=test123"
๐ File Ingestion Tests
# Start file monitor daemon
python3 src/file_monitor_daemon.py
# Check monitor status
curl http://localhost:5000/api/monitor/status
# View ingestion events
curl http://localhost:5000/api/monitor/events
# Test with new files
cp test.pdf /mnt/anvil/hub/
๐ ๏ธ Service Management
Unified Service Management
# Start all services with one command
./start_all_services.sh start
# Check service status
./start_all_services.sh status
# Stop all services
./start_all_services.sh stop
# Restart all services
./start_all_services.sh restart
Individual Service Management
# View logs for specific services
./start_all_services.sh logs web-ui
./start_all_services.sh logs file-monitor
./start_all_services.sh logs hammerspace-mcp
./start_all_services.sh logs milvus-mcp
# Attach to service screen sessions
./start_all_services.sh attach web-ui
./start_all_services.sh attach file-monitor
# Clean up old logs
./start_all_services.sh cleanup
Service URLs
Once started, services are available at:
- Web UI: http://localhost:5000
- Web UI (LAN): http://10.0.0.236:5000
- Hammerspace MCP: stdio-based (no HTTP endpoint)
- Milvus MCP: http://localhost:9902/sse (if Milvus is running)
- Kubernetes MCP: http://localhost:9903/sse (not implemented)
View Logs
# Web UI logs
tail -f logs/web_ui.log
# MCP Server logs
tail -f logs/aiq_hstk_mcp.log
# File monitor logs
tail -f logs/retroactive_fully_disabled.log
# Real-time streaming (in browser)
# Visit http://localhost:5000/debug
๐ Troubleshooting
Stale File Handle Errors
The MCP server automatically detects and retries operations when stale file handles occur. If issues persist:
# Manual mount refresh
./refresh_mounts.sh
# Or via Web UI
"Refresh Hammerspace mounts"
Tag Operations Not Working
- Ensure you're targeting a mounted directory
- Verify the HSTK CLI is accessible:
/home/mike/hs-mcp-1.0/.venv/bin/hs --version - Check mounts:
mount | grep hammerspace
๐ File Ingestion Issues
-
Files not detected: Check if file monitor is running
ps aux | grep file_monitor_daemon -
Kubernetes jobs failing: Check pod status
kubectl get pods -l app=pdf-ingest kubectl logs -l app=pdf-ingest -
Empty Milvus collections: Check ingest service logs
kubectl logs -l app=ingestor-server -
NFS mount issues: Verify mount points
mount | grep anvil ls -la /mnt/anvil/hub/
Common Tag Formats
Tags in Hammerspace use the format: namespace.key=value
Examples:
user.modelsetid=my-demouser.project=gtc-2025user.tier=critical
๐ Documentation
๐ Quick Start
- Quick Reference Guide - Your go-to guide for common operations
- Service Management - Complete service management guide
๐๏ธ Architecture & API
- Architecture Documentation - Comprehensive system architecture
- API Documentation - Complete API reference for all MCP servers
๐ Integration Guides
- Integration Guide - Connect to Cursor, Windsurf, NVIDIA Playground
๐ Feature Guides
- Complete Feature Documentation
- Tag Search Guide
- Tier Management Guide
- Tag to Objectives Workflow
- Guides Index
๐งช Testing Guides
๐ Production Deployment
Security Considerations
- Run the Web UI behind a reverse proxy (nginx, Apache)
- Use HTTPS for production deployments
- Restrict access with authentication
- Review and limit MCP tool permissions
Performance Tuning
- Adjust
max_files_to_checkfor large filesystems (default: 500) - Use specific
share_pathparameters to limit search scope - Monitor log files for performance issues
๐ Support
For support and questions:
- Create an issue in the GitHub repository
- Check the documentation in the
docs/folder - Review the troubleshooting section above
๐ Recent Updates
๐ October 23, 2025 - Advanced Service Management + Multi-Format Support
- โ
NEW: Unified service management script (
start_all_services.sh) for all MCP services - โ NEW: Screen session persistence - services survive SSH disconnections
- โ NEW: Auto-restart capability for crashed services
- โ NEW: Systemd integration for auto-start on boot
- โ NEW: Multi-format file support (BMP, DOCX, HTML, JPEG, JSON, MD, PDF, PNG, PPTX, SH, TIFF, TXT, MP3)
- โ NEW: Tag-based tier management with automatic tier0 promotion/demotion
- โ NEW: Recursive folder tagging for 40x performance improvement
- โ NEW: Enhanced event filtering and monitoring UI
- โ NEW: Comprehensive health monitoring and port conflict resolution
- โ FIXED: Unicode handling in logs and CLI output
- โ FIXED: NFS timing issues with retry mechanisms
- โ FIXED: Hammerspace CLI tag operations with fallback methods
- โ FIXED: Individual file vs folder-level tagging optimization
๐ October 16, 2025 - Automatic File Ingestion + Kubernetes + MCP
- โ NEW: Complete automatic file ingestion system with Kubernetes integration
- โ NEW: Polling-based file monitoring for NFS 4.2 mounts (inotify not supported)
- โ NEW: Folder-based batch processing - new folders trigger collection creation
- โ NEW: Milvus vector database integration for PDF embeddings
- โ NEW: Kubernetes job deployment with ConfigMaps for file processing
- โ NEW: Intelligent collection naming (folder-based and sequential)
- โ NEW: Real-time event streaming and monitoring dashboard
- โ NEW: Standalone file monitor daemon for continuous operation
- โ FIXED: NFS compatibility issues with polling-only approach
- โ FIXED: Kubernetes PVC integration and path mapping
- โ FIXED: Milvus collection naming conventions (underscores vs hyphens)
- โ FIXED: File path handling in container environments
- โ FIXED: Job completion tracking and error handling
- โ FIXED: Retroactive tagging time window configuration
- โ FIXED: Threading and async operation coordination
October 9, 2025 - File Ingestion Event System & Monitoring UI
- โ
NEW: Agentic event consumption via
get_file_ingest_eventsMCP tool - โ
NEW: Real-time monitoring dashboard at
/monitorwith live event streaming - โ NEW: API endpoints for event querying and monitoring
- โ FIXED: Duplicate file processing - files now tagged exactly once
- โ FIXED: In-memory tracking prevents repeated tagging of same files
- โ
Deduplication verification tools:
check_duplicates.shandfind-dup.sh - โ Full event metadata: timestamp, file path, MD5 hash, MIME type, file size
- โ Event filtering: by type, file pattern, or timestamp
- โ Toast notifications for new file arrivals
- โ Automated file monitoring service with MD5/MIME tagging
Previous Updates
- โ Added automatic mount refresh on stale file handle errors
- โ Suppressed Anthropic API deprecation warnings in logs
- โ Added copy buttons to all chat messages
- โ Fixed PARTIALLY ALIGNED status detection
- โ Web-based natural language console with Claude AI
MCP 1.5 - Production-ready Hammerspace storage management with natural language interface and automatic file ingestion.
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
E2B
Using MCP to run code via e2b.