DSR Processor MCP Server
Enables conversational processing of Daily Status Reports (DSR) by extracting, validating, and storing data from JSON, DOCX, and XLSX files in IBM Cloud Object Storage. It provides a human-in-the-loop workflow for batch file processing and schema validation directly within MCP-compatible clients like Claude Desktop.
DSR Processor Agentic Agent
A complete implementation of DSR (Daily Status Report) processing built on an IBM watsonx Orchestrate (WxO) ADK agent with Agentic Workflow tools, plus an MCP (Model Context Protocol) server for Claude Desktop integration.
Overview
This agent provides conversational, human-in-the-loop DSR processing capabilities for the C4I SOT system. It handles bulk file processing from Cloud Object Storage (COS), data extraction from multiple formats, schema validation, and automated storage of processed results.
Two deployment options:
- WxO ADK Agent - Deploy to IBM watsonx Orchestrate for enterprise workflows
- MCP Server - Use with Claude Desktop or other MCP clients for local AI assistance
Schema management uses the WxO Knowledge Base: the DSR schema is stored there as a .json.txt file, so it can be updated without redeploying tools.
Key Features
- Conversational Interface: Natural language interaction for DSR processing tasks
- LLM-Powered Extraction: Intelligent extraction from complex DOCX files using watsonx.ai, OpenAI, Groq, or Anthropic
- Multi-Format Support: Processes JSON, DOCX, and XLSX DSR files
- Batch Processing: Handle multiple files with single commands
- Schema Validation: Validates against C4I SOT DSR unified schema v1.2 from Knowledge Base
- Knowledge Base Integration: Schema stored in WxO Knowledge Base for easy updates
- Human Review: Optional review steps for quality control
- Error Handling: Graceful error handling with helpful suggestions
- COS Integration: Direct integration with IBM Cloud Object Storage
Architecture
Agentic Workflow Tools
The agent uses 5 custom tools that work together:
- list_cos_files - List and filter files in COS bucket
- download_cos_file - Download files from COS to processing area
- extract_dsr_data - Extract structured data from DSR files
- validate_schema - Validate data against DSR schema
- save_to_cos - Save processed data back to COS
Agent Flow
User Request
↓
Agent (LLM) interprets intent
↓
Agent selects appropriate tool(s)
↓
Tool executes and returns result
↓
Agent processes result and responds
↓
[Repeat for multi-step workflows]
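The flow above can be sketched as a registry of tool functions and a loop that runs whichever steps the agent selected. This is an illustrative sketch only: the stub tool bodies, the sample object keys, the `run_plan` helper, and the hard-coded plan (standing in for the LLM's choices) are all assumptions, not the project's code.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-ins for two of the five tools; the real ones talk to COS.
def list_cos_files(prefix: str = "") -> List[str]:
    files = ["dsr/valor-0814.json", "dsr/valor-0815.docx"]  # sample keys
    return [name for name in files if name.startswith(prefix)]

def download_cos_file(key: str) -> str:
    # Pretend the object was downloaded and return its local path.
    return "/tmp/dsr-downloads/" + key.rsplit("/", 1)[-1]

TOOLS: Dict[str, Callable[..., Any]] = {
    "list_cos_files": list_cos_files,
    "download_cos_file": download_cos_file,
}

def run_plan(plan: List[Tuple[str, Dict[str, Any]]]) -> List[Any]:
    """Run each (tool name, arguments) step in order and collect the results."""
    results: List[Any] = []
    for name, kwargs in plan:
        tool = TOOLS.get(name)
        if tool is None:
            # Report the problem instead of aborting the whole conversation turn.
            results.append({"error": f"unknown tool: {name}"})
            continue
        results.append(tool(**kwargs))
    return results

# In the real agent the LLM chooses these steps; here they are fixed.
results = run_plan([
    ("list_cos_files", {"prefix": "dsr/"}),
    ("download_cos_file", {"key": "dsr/valor-0814.json"}),
])
print(results)
```

Keeping tools in a name-to-function map means a multi-step request is just a longer plan, and an unrecognized tool name becomes a result the agent can explain to the user.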
Project Structure
dsr-agent-agentic/
├── README.md # This file
├── requirements.txt # Python dependencies (ADK)
├── requirements-mcp.txt # Python dependencies (MCP)
├── dsr-processor-agentic.yaml # Agent specification (ADK)
├── mcp_server.py # MCP server implementation
├── tools/ # Agentic Workflow tools
│ ├── list_cos_files.py # COS file listing
│ ├── download_cos_file.py # COS file download
│ ├── extract_dsr_data.py # Data extraction
│ ├── validate_schema.py # Schema validation
│ └── save_to_cos.py # COS file upload
├── docs/ # Documentation
│ ├── DEPLOYMENT-GUIDE.md # ADK deployment guide
│ ├── MCP-SERVER-GUIDE.md # MCP server guide
│ └── TESTING-GUIDE.md # Comprehensive testing
└── examples/ # Usage examples
└── EXAMPLE-CONVERSATIONS.md # Conversation examples
Quick Start
Option 1: MCP Server (Claude Desktop)
Prerequisites:
- Python 3.8+
- Claude Desktop or other MCP client
- IBM Cloud Object Storage credentials
Setup:
# Install dependencies
cd dsr-agent-agentic
pip install -r requirements-mcp.txt
# Configure environment variables
cp .env.example .env
# Edit .env with your COS credentials
# Add to Claude Desktop config
# See docs/MCP-SERVER-GUIDE.md for details
Usage: Open Claude Desktop and use natural language:
- "List all DSR files in Cloud Object Storage"
- "Download and process the USS VALOR DSR file"
- "Validate the latest DSR and save it"
See MCP-SERVER-GUIDE.md for complete setup instructions.
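As a rough illustration of the "Add to Claude Desktop config" step, the snippet below prints a config fragment in the `mcpServers` layout that Claude Desktop reads. The server name `dsr-processor`, the script path, and the environment values are placeholders assumed for this sketch; MCP-SERVER-GUIDE.md remains the authoritative source for the actual entry.

```python
import json

def build_mcp_entry(server_path: str, env: dict) -> dict:
    """Return a config fragment in the `mcpServers` layout."""
    return {
        "mcpServers": {
            "dsr-processor": {       # hypothetical server name
                "command": "python",
                "args": [server_path],
                "env": env,          # COS settings passed to the server process
            }
        }
    }

entry = build_mcp_entry(
    "/path/to/dsr-agent-agentic/mcp_server.py",   # placeholder path
    {"COS_BUCKET_NAME": "<your-bucket-name>"},    # placeholder value
)
print(json.dumps(entry, indent=2))
```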
Option 2: WxO ADK Agent
Prerequisites:
- IBM Cloud account with watsonx Orchestrate TZ Essentials (trial)
- Cloud Object Storage instance with bucket created
- COS credentials (API key, instance CRN, endpoint, bucket name)
Note: This implementation uses the gpt-oss-120b-groq model (GPT-OSS 120B - OpenAI via Groq) which is available by default in WxO TZ Essentials. No additional watsonx.ai project setup is required.
Deployment:
- Deploy Tools to WxO:
  - Follow DEPLOYMENT-GUIDE.md for detailed steps
  - Deploy each of the 5 tools to WxO UI
  - Configure environment variables for COS access
- Deploy Agent:
  - Create agent in WxO AI assistant builder
  - Use configuration from dsr-processor-agentic.yaml
  - Connect all 5 tools to the agent
  - Test and publish
- Test:
  - Follow TESTING-GUIDE.md
  - Try example conversations from EXAMPLE-CONVERSATIONS.md
Basic Usage
User: "List all DSR files in COS"
Agent: [Lists files with details]
User: "Process the USS VALOR file from August 14"
Agent: [Downloads → Extracts → Validates → Saves]
User: "Process all JSON files from last week"
Agent: [Batch processes multiple files]
Features in Detail
LLM-Powered Extraction (NEW!)
Intelligent extraction from complex DOCX files using Large Language Models:
- Configurable Providers: watsonx.ai, OpenAI, Groq, or Anthropic
- Schema-Aware: Automatically maps extracted data to C4I SOT DSR schema
- Complex Structure Handling: Extracts multi-section documents (Event Info, Issue Tracker, OJT Hours, History of Effort)
- Data Normalization: Automatically formats dates, hull numbers, ship names
- Fallback Support: Falls back to basic extraction if LLM unavailable
See LLM-EXTRACTION-GUIDE.md for complete documentation.
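The fallback behaviour described above might look roughly like the following. This is a minimal sketch, not the contents of extract_dsr_data.py: the function names, the "Key: value" heuristic, the `method` marker, and the sample hull number are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Optional

def basic_extract(text: str) -> Dict[str, str]:
    """Heuristic fallback: collect simple 'Key: value' lines."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z ]+):\s*(.+)", line)
        if match:
            key = match.group(1).strip().lower().replace(" ", "_")
            fields[key] = match.group(2).strip()
    return fields

def extract(
    text: str,
    llm_extract: Optional[Callable[[str], Dict[str, str]]] = None,
) -> Dict[str, str]:
    """Try the LLM extractor first; use heuristics if it is absent or fails."""
    if llm_extract is not None:
        try:
            result = dict(llm_extract(text))
            result["method"] = "llm"
            return result
        except Exception:
            pass  # provider unreachable or response unusable: fall through
    result = basic_extract(text)
    result["method"] = "basic"
    return result

def failing_llm(text: str) -> Dict[str, str]:
    # Simulates an unavailable provider.
    raise ConnectionError("provider unreachable")

sample = "Ship Name: USS VALOR\nHull Number: DDG-999"  # made-up sample text
print(extract(sample, failing_llm))
```

Recording which path produced the data (the `method` key here) is one way to let a later review step treat heuristic results with more caution.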
Multi-Format Processing
- JSON: Direct parsing of structured DSR data
- DOCX: LLM-powered intelligent extraction (when enabled) or heuristic extraction
- XLSX: Cyber findings extraction from Excel spreadsheets
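One plausible way to route the three formats is a lookup keyed on file extension. The sketch below assumes hypothetical extractor names; only the JSON branch is implemented, and the DOCX and XLSX branches are placeholders for whatever python-docx and openpyxl logic the real tool uses.

```python
import json
from pathlib import Path
from typing import Callable, Dict

def extract_json(path: Path) -> dict:
    """JSON files are already structured, so parsing is enough."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)

def extract_docx(path: Path) -> dict:
    raise NotImplementedError("placeholder for DOCX extraction")

def extract_xlsx(path: Path) -> dict:
    raise NotImplementedError("placeholder for XLSX extraction")

EXTRACTORS: Dict[str, Callable[[Path], dict]] = {
    ".json": extract_json,
    ".docx": extract_docx,
    ".xlsx": extract_xlsx,
}

def pick_extractor(filename: str) -> Callable[[Path], dict]:
    """Choose an extractor from the file extension, ignoring case."""
    suffix = Path(filename).suffix.lower()
    try:
        return EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"unsupported DSR format: {suffix or '(none)'}") from None

print(pick_extractor("VALOR_0814.JSON").__name__)
```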
Schema Validation
Validates against C4I SOT DSR unified schema v1.2 from Knowledge Base:
- Knowledge Base Integration: Schema retrieved automatically from WxO Knowledge Base
- Easy Updates: Update schema without redeploying tools
- Required fields verification
- Data type checking
- Pattern matching (dates, GUIDs, hull numbers)
- Enum validation (status, enclave, issue types)
- Detailed error messages with fix suggestions
- Fallback to minimal schema if Knowledge Base unavailable
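To make the kinds of checks listed above concrete, here is a small hand-rolled validator covering required fields, types, patterns, and enums. It is a stand-in for what a JSON Schema validator such as jsonschema does, not the validate_schema tool itself, and the field names, patterns, and status values are hypothetical rather than taken from the v1.2 schema.

```python
import re
from typing import Any, Dict, List

# Hypothetical rules; the real ones come from the schema in the Knowledge Base.
RULES: Dict[str, Dict[str, Any]] = {
    "report_date": {"type": str, "pattern": r"^\d{4}-\d{2}-\d{2}$"},
    "hull_number": {"type": str, "pattern": r"^[A-Z]{2,4}-\d{1,4}$"},
    "status": {"type": str, "enum": {"open", "closed"}},
}

def validate(record: Dict[str, Any]) -> List[str]:
    """Return a list of readable problems; an empty list means valid."""
    errors: List[str] = []
    for field, rule in RULES.items():
        if field not in record:
            errors.append(f"{field}: required field is missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            errors.append(
                f"{field}: expected {rule['type'].__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if "pattern" in rule and not re.match(rule["pattern"], value):
            errors.append(f"{field}: {value!r} does not match {rule['pattern']}")
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{field}: {value!r} is not one of {sorted(rule['enum'])}")
    return errors

for problem in validate({"report_date": "14 Aug", "hull_number": "DDG-999"}):
    print(problem)
```

Collecting every problem instead of stopping at the first one is what makes "detailed error messages with fix suggestions" possible in a single conversational reply.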
Batch Processing
Process multiple files with single commands:
- Filter by date range
- Filter by ship name
- Filter by file format
- Automatic error recovery
- Progress reporting
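The filters and per-file error recovery above could be combined along these lines. The `CosObject` record, its fields, the sample keys and dates, and both helper functions are assumptions for this sketch; the real tools may derive ship and date from object keys or metadata differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional, Tuple

@dataclass
class CosObject:
    """Hypothetical description of one DSR file in the bucket."""
    key: str
    ship: str
    report_date: date

def filter_objects(
    objects: List[CosObject],
    ship: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    suffix: Optional[str] = None,
) -> List[CosObject]:
    """Keep objects matching every filter that was supplied."""
    selected = []
    for obj in objects:
        if ship and obj.ship.lower() != ship.lower():
            continue
        if start and obj.report_date < start:
            continue
        if end and obj.report_date > end:
            continue
        if suffix and not obj.key.lower().endswith(suffix.lower()):
            continue
        selected.append(obj)
    return selected

def process_batch(
    objects: List[CosObject], handler: Callable[[CosObject], None]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run handler on each object; one failure does not stop the batch."""
    done: List[str] = []
    failed: List[Tuple[str, str]] = []
    for index, obj in enumerate(objects, start=1):
        try:
            handler(obj)
            done.append(obj.key)
        except Exception as exc:
            failed.append((obj.key, str(exc)))
        print(f"[{index}/{len(objects)}] {obj.key}")  # progress line
    return done, failed

objects = [  # made-up sample data
    CosObject("dsr/valor-0814.json", "USS VALOR", date(2025, 8, 14)),
    CosObject("dsr/valor-0815.docx", "USS VALOR", date(2025, 8, 15)),
    CosObject("dsr/other-0814.json", "USS EXAMPLE", date(2025, 8, 14)),
]
selected = filter_objects(objects, ship="uss valor", suffix=".json")
print([obj.key for obj in selected])
```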
Human Review
Optional review workflows:
- Review before saving
- Review only on warnings
- Approve/reject/modify options
- Summary views for quick decisions
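The review options above amount to a small policy: when to pause for a human, and what each decision does to the record. A minimal sketch follows, in which the mode names, the decision strings, and the function names are assumptions rather than the agent's actual configuration.

```python
from enum import Enum
from typing import List, Optional

class ReviewMode(Enum):
    NEVER = "never"
    ON_WARNINGS = "on_warnings"
    ALWAYS = "always"

def needs_review(mode: ReviewMode, warnings: List[str]) -> bool:
    """Decide whether to pause for a human before saving."""
    if mode is ReviewMode.ALWAYS:
        return True
    if mode is ReviewMode.ON_WARNINGS:
        return bool(warnings)
    return False

def apply_decision(
    record: dict, decision: str, changes: Optional[dict] = None
) -> Optional[dict]:
    """Return the record to save, or None if the reviewer rejected it."""
    if decision == "approve":
        return dict(record)
    if decision == "modify":
        return {**record, **(changes or {})}  # reviewer edits win
    if decision == "reject":
        return None
    raise ValueError(f"unknown decision: {decision}")

print(needs_review(ReviewMode.ON_WARNINGS, ["status missing"]))
print(apply_decision({"status": "open"}, "modify", {"status": "closed"}))
```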
Environment Variables
Required for all tools:
# Cloud Object Storage
COS_API_KEY_ID=<your-cos-api-key>
COS_INSTANCE_CRN=<your-cos-instance-crn>
COS_ENDPOINT=https://s3.us-south.cloud-object-storage.appdomain.cloud
COS_BUCKET_NAME=dsr-files-in-cloud-object-storage-cos-standard-7q2
DOWNLOAD_DIR=/tmp/dsr-downloads
# Schema Configuration
DSR_SCHEMA_KB_NAME=C4I_SOT_DSR_unified.schema.v1_2.iso.json.txt
# LLM Configuration (Optional - for intelligent extraction)
LLM_ENABLED=true
LLM_PROVIDER=watsonx # Options: watsonx, openai, groq, anthropic
WATSONX_API_KEY=<your-watsonx-api-key>
WATSONX_PROJECT_ID=<your-watsonx-project-id>
Notes:
- The schema is retrieved from WxO Knowledge Base using DSR_SCHEMA_KB_NAME
- LLM extraction is optional but recommended for complex DOCX files
- See LLM-EXTRACTION-GUIDE.md for LLM configuration details
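A tool could read these variables at startup and fail early with a clear message when one is missing. The sketch below is an assumption about how that might be done, not the project's code: which variables count as required, the default values, and the `load_settings` name are all illustrative.

```python
import os
from typing import Dict, Mapping

# Assumed to be mandatory for COS access.
REQUIRED = ("COS_API_KEY_ID", "COS_INSTANCE_CRN", "COS_ENDPOINT", "COS_BUCKET_NAME")
# Assumed defaults for optional settings.
DEFAULTS = {"DOWNLOAD_DIR": "/tmp/dsr-downloads", "LLM_ENABLED": "false"}

def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Collect settings, naming every missing required variable at once."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings: Dict[str, object] = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    # Normalize the flag so callers get a real boolean.
    settings["LLM_ENABLED"] = str(settings["LLM_ENABLED"]).strip().lower() in (
        "1", "true", "yes",
    )
    return settings

demo_env = {  # placeholder values for demonstration
    "COS_API_KEY_ID": "<your-cos-api-key>",
    "COS_INSTANCE_CRN": "<your-cos-instance-crn>",
    "COS_ENDPOINT": "https://s3.us-south.cloud-object-storage.appdomain.cloud",
    "COS_BUCKET_NAME": "<your-bucket-name>",
    "LLM_ENABLED": "true",
}
print(load_settings(demo_env))
```

Listing all missing names in one error also speeds up the "Verify environment variables" step when diagnosing COS connection problems.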
Dependencies
See requirements.txt for complete list:
- ibm-cos-sdk - IBM Cloud Object Storage SDK
- jsonschema - JSON schema validation
- python-docx - DOCX file processing
- openpyxl - XLSX file processing
Documentation
- LLM-EXTRACTION-GUIDE.md - LLM-powered extraction setup and configuration
- MCP-SERVER-GUIDE.md - MCP server setup and usage
- DEPLOYMENT-GUIDE.md - WxO ADK agent deployment
- TESTING-GUIDE.md - Comprehensive testing guide
- EXAMPLE-CONVERSATIONS.md - Usage examples
Comparison with Skill Flows
This Agentic Workflow implementation differs from Skill Flows:
| Feature | Agentic Workflow | Skill Flows |
|---|---|---|
| Interaction | Conversational | Structured forms |
| Flexibility | High - natural language | Low - predefined paths |
| Human Review | Built-in, conversational | Requires explicit steps |
| Error Handling | Contextual suggestions | Fixed error messages |
| Batch Processing | Natural language commands | Requires loops/iteration |
| Learning Curve | Lower (natural language) | Higher (flow design) |
Use Cases
Daily Operations
- Process new DSR files as they arrive
- Validate and archive processed data
- Generate daily summaries
Batch Processing
- Process historical data
- Reprocess files after schema updates
- Bulk validation of existing files
Quality Control
- Review files before saving
- Identify validation issues
- Suggest corrections
Reporting
- List processed files
- Show processing statistics
- Identify problematic files
Troubleshooting
Common Issues
COS Connection Errors:
- Verify environment variables
- Check API key permissions
- Confirm endpoint URL
Validation Failures:
- Review error messages
- Check schema requirements
- Verify data formats
Knowledge Base Access:
- Verify schema file uploaded to Knowledge Base
- Check file name: C4I_SOT_DSR_unified.schema.v1_2.iso.json.txt
- Ensure Knowledge Base connected to agent
- Verify DSR_SCHEMA_KB_NAME environment variable
- Test with action: schema_info parameter
Tool Not Found:
- Ensure tools are published
- Check agent configuration
- Refresh agent
See DEPLOYMENT-GUIDE.md for detailed troubleshooting.
Support
For issues or questions:
- Check documentation in docs/ folder
- Review example conversations
- Consult IBM watsonx Orchestrate documentation
- Contact C4I SOT team
Version History
- v1.2.0 (2026-03-11) - LLM-Powered Extraction
  - Added LLM-powered intelligent extraction for complex DOCX files
  - Multi-provider support: watsonx.ai, OpenAI, Groq, Anthropic
  - New llm_client.py module for configurable LLM providers
  - Enhanced extract_dsr_data.py with LLM extraction capability
  - Comprehensive LLM-EXTRACTION-GUIDE.md documentation
  - Updated requirements.txt with LLM client dependencies
  - Added LLM configuration to .env.example
  - Automatic schema-aware data mapping and normalization
  - Fallback to basic extraction when LLM unavailable
- v1.1.0 (2026-03-10) - MCP Server Implementation
  - Added MCP (Model Context Protocol) server for Claude Desktop integration
  - New mcp_server.py for standalone MCP deployment
  - New requirements-mcp.txt for MCP dependencies
  - Comprehensive MCP-SERVER-GUIDE.md documentation
  - Updated README with dual deployment options (MCP + WxO)
  - Updated DEPLOYMENT-GUIDE.md to reference MCP option
- v1.0.1 (2026-03-10) - Knowledge Base Integration
  - Updated to use WxO Knowledge Base for schema storage
  - Schema file in .json.txt format for Knowledge Base compatibility
  - Updated validate_schema tool to use get_knowledge() function
  - Added Knowledge Base upload instructions
  - Enhanced troubleshooting for Knowledge Base access
  - Deprecated DSR_SCHEMA_PATH in favor of DSR_SCHEMA_KB_NAME
- v1.0.0 (2026-03-10) - Initial release
  - 5 Agentic Workflow tools
  - Complete agent specification
  - Comprehensive documentation
  - Example conversations
License
MIT License - See LICENSE file for details
Authors
C4I SOT Team
Acknowledgments
- IBM watsonx Orchestrate team
- ADK Agent framework
- C4I SOT DSR schema contributors
For detailed deployment instructions, see DEPLOYMENT-GUIDE.md