Dataiku DSS MCP Server
A comprehensive MCP server for Dataiku DSS integration, providing Claude Code with direct access to manage recipes, datasets, and scenarios.
README
Dataiku DSS MCP Server
A comprehensive Model Context Protocol (MCP) server for Dataiku DSS integration. This project provides Claude Code with direct access to Dataiku DSS for managing recipes, datasets, and scenarios.
š Quick Start
Prerequisites
- Node.js 18.0.0+
- Dataiku DSS instance with API access
- Valid DSS API key
Installation
# Install globally
npm install -g @zhangzichao2008/mcp-dataiku
# Or use with npx
npx @zhangzichao2008/mcp-dataiku
Configuration
- Copy environment template:
cp .env.sample .env
- Configure your DSS connection in
.env:
DSS_HOST=https://your-dss-instance.com:10000
DSS_API_KEY=your-api-key-here
DSS_INSECURE_TLS=true # Only if using self-signed certificates
Claude Code Integration
Register the MCP server with Claude Code:
claude mcp add dataiku-dss \
-e DSS_HOST=https://your-dss-instance.com:10000 \
-e DSS_API_KEY=your-api-key-here \
-e DSS_INSECURE_TLS=true \
-- npx @zhangzichao2008/mcp-dataiku
š MCP Tool Catalog
Core Recipe Management Tools
| Tool | Description | Key Parameters |
|---|---|---|
create_recipe |
Create new recipe | project_key, recipe_type, recipe_name, inputs, outputs, code |
update_recipe |
Update existing recipe | project_key, recipe_name, **kwargs |
delete_recipe |
Delete recipe | project_key, recipe_name |
run_recipe |
Execute recipe | project_key, recipe_name, build_mode |
Core Dataset Management Tools
| Tool | Description | Key Parameters |
|---|---|---|
create_dataset |
Create new dataset | project_key, dataset_name, dataset_type, params |
update_dataset |
Update dataset settings | project_key, dataset_name, **kwargs |
delete_dataset |
Delete dataset | project_key, dataset_name, drop_data |
build_dataset |
Build dataset | project_key, dataset_name, mode, partition |
inspect_dataset_schema |
Get dataset schema | project_key, dataset_name |
check_dataset_metrics |
Get dataset metrics | project_key, dataset_name |
Core Scenario Management Tools
| Tool | Description | Key Parameters |
|---|---|---|
create_scenario |
Create new scenario | project_key, scenario_name, scenario_type, definition |
update_scenario |
Update scenario settings | project_key, scenario_id, **kwargs |
delete_scenario |
Delete scenario | project_key, scenario_id |
run_scenario |
Execute scenario | project_key, scenario_id |
š§ Advanced Tools
| Tool | Description | Key Parameters |
|---|---|---|
get_scenario_logs |
Get detailed run logs and error messages | project_key, scenario_id, run_id |
get_recipe_code |
Extract actual Python/SQL code from recipes | project_key, recipe_name |
get_project_flow |
Get complete data flow/pipeline structure | project_key |
get_dataset_sample |
Get sample data from datasets | project_key, dataset_name, rows, columns |
get_recent_runs |
Get recent run history across scenarios/recipes | project_key, limit, status_filter |
list_projects |
List all available Dataiku projects | - |
Additional Dataset Tools
| Tool | Description | Key Parameters |
|---|---|---|
list_datasets |
List all datasets in a project | project_key, dataset_type (optional) |
get_dataset_info |
Get detailed information about a dataset | project_key, dataset_name |
clear_dataset |
Clear data from a dataset | project_key, dataset_name, partition (optional) |
Additional Recipe Tools
| Tool | Description | Key Parameters |
|---|---|---|
list_recipes |
List all recipes in a project | project_key, recipe_type (optional) |
get_recipe_info |
Get detailed information about a recipe | project_key, recipe_name |
validate_recipe_syntax |
Validate Python/SQL syntax of a recipe | project_key, recipe_name, code (optional) |
test_recipe_dry_run |
Test recipe logic without actual execution | project_key, recipe_name, sample_rows |
Additional Scenario Tools
| Tool | Description | Key Parameters |
|---|---|---|
list_scenarios |
List all scenarios in a project | project_key, scenario_type, active_only |
get_scenario_info |
Get detailed information about a scenario | project_key, scenario_id |
add_scenario_trigger |
Add a trigger to a scenario | project_key, scenario_id, trigger_type, trigger params |
remove_scenario_trigger |
Remove a trigger from a scenario | project_key, scenario_id, trigger_idx |
get_scenario_run_history |
Get run history for a scenario | project_key, scenario_id, limit |
get_scenario_steps |
Get step configuration including Python code | project_key, scenario_id |
clone_scenario |
Clone an existing scenario with modifications | project_key, source_scenario_id, new_scenario_name, modifications |
Advanced Tools
| Tool | Description | Key Parameters |
|---|---|---|
search_project_objects |
Search for datasets, recipes, scenarios by name/pattern | project_key, search_term, object_types |
get_code_environments |
List available Python/R environments | project_key (optional) |
get_project_variables |
Get project-level variables and configuration | project_key |
get_connections |
List available data connections | project_key (optional) |
get_job_details |
Get detailed job execution information | project_key, job_id |
cancel_running_jobs |
Cancel running jobs/scenarios | project_key, job_ids |
batch_update_objects |
Update multiple objects with similar changes | project_key, object_type, pattern, updates |
get_project_flow |
Get complete data flow/pipeline structure | project_key |
export_project_config |
Export project configuration as JSON/YAML | project_key, format |
duplicate_project_structure |
Copy project structure to new project | source_project_key, target_project_key, include_data |
Total: 46 Tools
š§ Usage Examples
Core Operations
Creating a Python Recipe
{
"project_key": "ANALYTICS_PROJECT",
"recipe_type": "python",
"recipe_name": "data_cleaner",
"inputs": ["raw_data"],
"outputs": [{"name": "clean_data", "new": true, "connection": "filesystem_managed"}],
"code": "import pandas as pd\ndf = dataiku.Dataset(\"raw_data\").get_dataframe()\ndf_clean = df.dropna()\ndataiku.Dataset(\"clean_data\").write_with_schema(df_clean)"
}
Building a Dataset
{
"project_key": "BI",
"dataset_name": "user_analytics",
"mode": "RECURSIVE_BUILD"
}
Getting Dataset Sample
{
"project_key": "FINANCE_PROJECT",
"dataset_name": "transactions",
"rows": 500,
"columns": ["customer_id", "amount"]
}
Getting Scenario Logs
{
"project_key": "ANALYTICS_PROJECT",
"scenario_id": "data_processing"
}
Exploring Project Structure
{
"project_key": "SALES_ANALYTICS"
}
šļø Architecture
mcp-dataiku/
āāā src/
ā āāā dataiku-client.ts # Dataiku API client
ā āāā mcp-server.ts # MCP server implementation
āāā package.json
āāā tsconfig.json
āāā .env.sample
āāā README.md
š Security
- API Key Protection: Store API keys in environment variables, never in code
- SSL Configuration: Support for self-signed certificates with
DSS_INSECURE_TLS=true - Permission Validation: All operations respect DSS user permissions
- Error Handling: Sensitive information is not exposed in error messages
š Monitoring
The MCP server provides logging for monitoring:
# Check logs for debugging
tail -f dataiku_mcp.log
š¤ Contributing
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature - Commit changes:
git commit -m 'Add amazing feature' - Push to branch:
git push origin feature/amazing-feature - Open a Pull Request
Development Setup
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build for production
npm run build
# Run basic validation tests (no actual API calls)
npm test
# Run comprehensive tests (requires Dataiku DSS)
node test-comprehensive.js
# Clean build artifacts
npm run clean
# Publish new version (patch version)
npm run publish:patch
š License
This project is licensed under the MIT License - see the LICENSE file for details.
š Links
š Support
If you encounter any issues or have questions, please open an issue on GitHub.
Ready to enhance your Dataiku workflows with AI assistance! š
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.