Dataiku DSS MCP Server

A comprehensive Model Context Protocol (MCP) server for Dataiku DSS integration. This project provides Claude Code with direct access to Dataiku DSS for managing recipes, datasets, and scenarios.

🚀 Quick Start

Prerequisites

  • Node.js 18.0.0+
  • Dataiku DSS instance with API access
  • Valid DSS API key

Installation

# Install globally
npm install -g @zhangzichao2008/mcp-dataiku

# Or use with npx
npx @zhangzichao2008/mcp-dataiku

Configuration

  1. Copy the environment template:
cp .env.sample .env
  2. Configure your DSS connection in .env:
DSS_HOST=https://your-dss-instance.com:10000
DSS_API_KEY=your-api-key-here
DSS_INSECURE_TLS=true  # Only if using self-signed certificates
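
The three settings above could be read and validated at startup along these lines. This is a minimal sketch, not the server's actual code: `loadDssConfig` and the `DssConfig` shape are hypothetical names, and only the variable names `DSS_HOST`, `DSS_API_KEY`, and `DSS_INSECURE_TLS` come from this README.

```typescript
// Hypothetical config shape; only the variable names come from the README.
interface DssConfig {
  host: string;
  apiKey: string;
  insecureTls: boolean;
}

// Reads the settings from a plain key/value map (for example process.env).
function loadDssConfig(env: Record<string, string | undefined>): DssConfig {
  const host = env["DSS_HOST"]?.trim();
  const apiKey = env["DSS_API_KEY"]?.trim();
  if (!host) throw new Error("DSS_HOST is not set");
  if (!apiKey) throw new Error("DSS_API_KEY is not set");

  // Reject values that are not http(s) URLs so a typo fails early.
  if (!/^https?:\/\/[^\s\/]+/i.test(host)) {
    throw new Error("DSS_HOST must be an http(s) URL");
  }

  const tlsFlag = env["DSS_INSECURE_TLS"]?.trim().toLowerCase();
  return {
    host: host.replace(/\/+$/, ""), // drop trailing slashes
    apiKey,
    // Assumption: only the literal string "true" enables insecure TLS.
    insecureTls: tlsFlag === "true",
  };
}
```

In a Node.js process the function would typically be called as `loadDssConfig(process.env)`, so a missing or malformed setting surfaces before any request is sent to DSS.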

Claude Code Integration

Register the MCP server with Claude Code:

claude mcp add dataiku-dss \
    -e DSS_HOST=https://your-dss-instance.com:10000 \
    -e DSS_API_KEY=your-api-key-here \
    -e DSS_INSECURE_TLS=true \
    -- npx @zhangzichao2008/mcp-dataiku

📚 MCP Tool Catalog

Core Recipe Management Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| create_recipe | Create new recipe | project_key, recipe_type, recipe_name, inputs, outputs, code |
| update_recipe | Update existing recipe | project_key, recipe_name, **kwargs |
| delete_recipe | Delete recipe | project_key, recipe_name |
| run_recipe | Execute recipe | project_key, recipe_name, build_mode |

Core Dataset Management Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| create_dataset | Create new dataset | project_key, dataset_name, dataset_type, params |
| update_dataset | Update dataset settings | project_key, dataset_name, **kwargs |
| delete_dataset | Delete dataset | project_key, dataset_name, drop_data |
| build_dataset | Build dataset | project_key, dataset_name, mode, partition |
| inspect_dataset_schema | Get dataset schema | project_key, dataset_name |
| check_dataset_metrics | Get dataset metrics | project_key, dataset_name |

Core Scenario Management Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| create_scenario | Create new scenario | project_key, scenario_name, scenario_type, definition |
| update_scenario | Update scenario settings | project_key, scenario_id, **kwargs |
| delete_scenario | Delete scenario | project_key, scenario_id |
| run_scenario | Execute scenario | project_key, scenario_id |

🔧 Advanced Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| get_scenario_logs | Get detailed run logs and error messages | project_key, scenario_id, run_id |
| get_recipe_code | Extract actual Python/SQL code from recipes | project_key, recipe_name |
| get_project_flow | Get complete data flow/pipeline structure | project_key |
| get_dataset_sample | Get sample data from datasets | project_key, dataset_name, rows, columns |
| get_recent_runs | Get recent run history across scenarios/recipes | project_key, limit, status_filter |
| list_projects | List all available Dataiku projects | - |

Additional Dataset Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| list_datasets | List all datasets in a project | project_key, dataset_type (optional) |
| get_dataset_info | Get detailed information about a dataset | project_key, dataset_name |
| clear_dataset | Clear data from a dataset | project_key, dataset_name, partition (optional) |

Additional Recipe Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| list_recipes | List all recipes in a project | project_key, recipe_type (optional) |
| get_recipe_info | Get detailed information about a recipe | project_key, recipe_name |
| validate_recipe_syntax | Validate Python/SQL syntax of a recipe | project_key, recipe_name, code (optional) |
| test_recipe_dry_run | Test recipe logic without actual execution | project_key, recipe_name, sample_rows |

Additional Scenario Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| list_scenarios | List all scenarios in a project | project_key, scenario_type, active_only |
| get_scenario_info | Get detailed information about a scenario | project_key, scenario_id |
| add_scenario_trigger | Add a trigger to a scenario | project_key, scenario_id, trigger_type, trigger params |
| remove_scenario_trigger | Remove a trigger from a scenario | project_key, scenario_id, trigger_idx |
| get_scenario_run_history | Get run history for a scenario | project_key, scenario_id, limit |
| get_scenario_steps | Get step configuration including Python code | project_key, scenario_id |
| clone_scenario | Clone an existing scenario with modifications | project_key, source_scenario_id, new_scenario_name, modifications |

Advanced Tools

| Tool | Description | Key Parameters |
| --- | --- | --- |
| search_project_objects | Search for datasets, recipes, scenarios by name/pattern | project_key, search_term, object_types |
| get_code_environments | List available Python/R environments | project_key (optional) |
| get_project_variables | Get project-level variables and configuration | project_key |
| get_connections | List available data connections | project_key (optional) |
| get_job_details | Get detailed job execution information | project_key, job_id |
| cancel_running_jobs | Cancel running jobs/scenarios | project_key, job_ids |
| batch_update_objects | Update multiple objects with similar changes | project_key, object_type, pattern, updates |
| export_project_config | Export project configuration as JSON/YAML | project_key, format |
| duplicate_project_structure | Copy project structure to new project | source_project_key, target_project_key, include_data |

Total: 46 Tools

🔧 Usage Examples

Core Operations

Creating a Python Recipe

{
  "project_key": "ANALYTICS_PROJECT",
  "recipe_type": "python",
  "recipe_name": "data_cleaner",
  "inputs": ["raw_data"],
  "outputs": [{"name": "clean_data", "new": true, "connection": "filesystem_managed"}],
  "code": "import dataiku\nimport pandas as pd\ndf = dataiku.Dataset(\"raw_data\").get_dataframe()\ndf_clean = df.dropna()\ndataiku.Dataset(\"clean_data\").write_with_schema(df_clean)"
}
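
Writing the `code` field by hand means escaping every newline and double quote. A TypeScript caller might instead assemble similar arguments from plain lines and let `JSON.stringify` do the escaping, as sketched below. `CreateRecipeArgs`, `RecipeOutput`, and `buildPythonRecipeArgs` are hypothetical names, and the field types are inferred from the example payload rather than taken from the server's schema. The recipe body starts with `import dataiku` because it calls `dataiku.Dataset`.

```typescript
// Hypothetical types, inferred from the example payload above.
interface RecipeOutput {
  name: string;
  new?: boolean;
  connection?: string;
}

interface CreateRecipeArgs {
  project_key: string;
  recipe_type: string;
  recipe_name: string;
  inputs: string[];
  outputs: RecipeOutput[];
  code: string;
}

// Builds create_recipe arguments for a one-input, one-output Python recipe.
function buildPythonRecipeArgs(
  projectKey: string,
  recipeName: string,
  input: string,
  output: string,
  codeLines: string[]
): CreateRecipeArgs {
  return {
    project_key: projectKey,
    recipe_type: "python",
    recipe_name: recipeName,
    inputs: [input],
    outputs: [{ name: output, new: true, connection: "filesystem_managed" }],
    code: codeLines.join("\n"),
  };
}

const recipeArgs = buildPythonRecipeArgs(
  "ANALYTICS_PROJECT",
  "data_cleaner",
  "raw_data",
  "clean_data",
  [
    "import dataiku",
    'df = dataiku.Dataset("raw_data").get_dataframe()',
    "df_clean = df.dropna()",
    'dataiku.Dataset("clean_data").write_with_schema(df_clean)',
  ]
);

// JSON.stringify escapes the newlines and double quotes inside `code`.
const recipeJson = JSON.stringify(recipeArgs, null, 2);
console.log(recipeJson);
```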

Building a Dataset

{
  "project_key": "BI",
  "dataset_name": "user_analytics",
  "mode": "RECURSIVE_BUILD"
}
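
`RECURSIVE_BUILD` is one of the job types defined by the DSS public API. A caller might check the `mode` value before sending it, as in the sketch below. The list mirrors the job types documented for DSS; whether this server accepts all of them, or passes `mode` through to DSS unchanged, is an assumption, and `isBuildMode` is a hypothetical helper.

```typescript
// DSS job types. Assumption: the server's `mode` parameter accepts these
// values and forwards them to DSS unchanged.
const BUILD_MODES = [
  "NON_RECURSIVE_FORCED_BUILD",
  "RECURSIVE_BUILD",
  "RECURSIVE_FORCED_BUILD",
  "RECURSIVE_MISSING_ONLY_BUILD",
] as const;

type BuildMode = (typeof BUILD_MODES)[number];

// Type guard: narrows an arbitrary string to a known build mode.
function isBuildMode(value: string): value is BuildMode {
  return (BUILD_MODES as readonly string[]).indexOf(value) !== -1;
}

const requestedMode = "RECURSIVE_BUILD";
if (!isBuildMode(requestedMode)) {
  throw new Error("Unknown build mode: " + requestedMode);
}
```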

Getting Dataset Sample

{
  "project_key": "FINANCE_PROJECT",
  "dataset_name": "transactions",
  "rows": 500,
  "columns": ["customer_id", "amount"]
}

Getting Scenario Logs

{
  "project_key": "ANALYTICS_PROJECT",
  "scenario_id": "data_processing"
}

Exploring Project Structure

{
  "project_key": "SALES_ANALYTICS"
}
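
Each JSON object in the examples above is the `arguments` payload of an MCP tool call. Clients such as Claude Code build the surrounding message themselves, but for reference, the protocol wraps the payload in a JSON-RPC 2.0 `tools/call` request. The sketch below shows that envelope for the dataset build example; `toolCallRequest` is a hypothetical helper, and the tool name comes from the catalog.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in the JSON-RPC envelope.
function toolCallRequest(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const buildRequest = toolCallRequest(1, "build_dataset", {
  project_key: "BI",
  dataset_name: "user_analytics",
  mode: "RECURSIVE_BUILD",
});

console.log(JSON.stringify(buildRequest, null, 2));
```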

🏗️ Architecture

mcp-dataiku/
├── src/
│   ├── dataiku-client.ts   # Dataiku API client
│   └── mcp-server.ts       # MCP server implementation
├── package.json
├── tsconfig.json
├── .env.sample
└── README.md

🔒 Security

  • API Key Protection: Store API keys in environment variables, never in code
  • TLS Configuration: Set DSS_INSECURE_TLS=true only when the DSS instance uses a self-signed certificate
  • Permission Validation: All operations respect DSS user permissions
  • Error Handling: Sensitive information is not exposed in error messages
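
One way to keep secrets out of error output, in the spirit of the last point, is to scrub the API key from a message before it is logged or returned to the client. This is an illustrative sketch only; `redactSecret` is a hypothetical helper, not a function exported by this package.

```typescript
// Replaces every occurrence of `secret` in `message` with a placeholder.
function redactSecret(message: string, secret: string): string {
  if (secret.length === 0) return message; // nothing to redact
  // split/join replaces all occurrences without regex escaping concerns.
  return message.split(secret).join("[REDACTED]");
}

const rawError = "Request failed: 401 Unauthorized (key=abc123)";
console.log(redactSecret(rawError, "abc123"));
```

Using `split` and `join` rather than a regular expression avoids having to escape special characters that may appear in a key.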

📈 Monitoring

The MCP server provides logging for monitoring:

# Check logs for debugging
tail -f dataiku_mcp.log

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push to branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Development Setup

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build for production
npm run build

# Run basic validation tests (no actual API calls)
npm test

# Run comprehensive tests (requires Dataiku DSS)
node test-comprehensive.js

# Clean build artifacts
npm run clean

# Publish new version (patch version)
npm run publish:patch

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📞 Support

If you encounter any issues or have questions, please open an issue on GitHub.


Ready to enhance your Dataiku workflows with AI assistance! 🚀
