Datris MCP Server

MCP server with 32 tools for ETL ingestion, AI-generated data quality rules, AI transformations, vector search, and natural-language SQL. Works across Postgres, MongoDB, Kafka, S3/MinIO, HashiCorp Vault, and five vector stores (Qdrant, Weaviate, Milvus, Chroma, pgvector).

Datris — The First AI Agent-Native Data Platform

datris.ai · Try Hosted Free · Documentation · MCP Registry · PyPI

Ingest, validate, transform, store, and retrieve your data — whether you're an AI agent talking through MCP or a developer writing config. One platform for both.

Why Datris?

  • Agent-native: built-in MCP server with 30+ tools. Claude, Cursor, OpenClaw, and any other MCP-compatible agent can operate pipelines through natural conversation.
  • Taps: AI-generated Python scripts that fetch data from external sources (APIs, web scraping, databases) and push it into pipelines. Describe what you want and Datris generates the script. Includes AI diagnosis, cron scheduling, and credentials via Vault.
  • AI at every stage: AI data quality, AI transformations, AI schema generation, AI profiling, AI error explanation, natural-language queries, and RAG.
  • No vendor lock-in: 100% open-source infrastructure (MinIO, PostgreSQL, MongoDB, Kafka, Vault). Runs anywhere Docker does.
  • Configuration-driven: define pipelines through JSON. No code required.

Quick Start

git clone https://github.com/datris/datris-platform-oss.git
cd datris-platform-oss
cp .env.example .env       # Add your ANTHROPIC_API_KEY and/or OPENAI_API_KEY
docker compose up -d

UI: http://localhost:4200 · API: http://localhost:8080
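
While the containers are starting, it can help to confirm that both ports accept connections before pointing a client at them. The following is a minimal sketch using only the Python standard library. The helper names `is_port_open` and `wait_for_services` are illustrative and not part of Datris, and a successful TCP connect only shows that something is listening, not that the service has finished initializing.

```python
import socket
import time

# Ports taken from the Quick Start above: UI on 4200, API on 8080.
SERVICES = {"ui": ("localhost", 4200), "api": ("localhost", 8080)}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services=SERVICES, attempts: int = 30, delay: float = 2.0) -> dict:
    """Poll each service until it accepts connections or attempts run out."""
    status = {name: False for name in services}
    for _ in range(attempts):
        for name, (host, port) in services.items():
            if not status[name]:
                status[name] = is_port_open(host, port)
        if all(status.values()):
            break
        time.sleep(delay)
    return status
```

Calling `wait_for_services()` after `docker compose up -d` returns a dict mapping each service name to whether its port became reachable within the polling window.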

Connect an AI Agent

Add to your MCP client config (Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "datris": {
      "command": "uvx",
      "args": ["datris-mcp-server"],
      "env": {
        "PIPELINE_URL": "http://localhost:8080"
      }
    }
  }
}
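
If the client already has other servers configured, the entry above has to be merged into the existing file rather than pasted over it. A minimal Python sketch of that merge follows. The function names `add_datris_server` and `update_config_file` are illustrative assumptions, and the location of the config file depends on the MCP client, so no path is hard-coded here.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
DATRIS_ENTRY = {
    "command": "uvx",
    "args": ["datris-mcp-server"],
    "env": {"PIPELINE_URL": "http://localhost:8080"},
}


def add_datris_server(config: dict, pipeline_url: str = "http://localhost:8080") -> dict:
    """Return a copy of an MCP client config with the Datris entry added.

    Existing servers are preserved; an existing "datris" entry is replaced.
    """
    merged = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    entry = json.loads(json.dumps(DATRIS_ENTRY))
    entry["env"]["PIPELINE_URL"] = pipeline_url
    merged.setdefault("mcpServers", {})["datris"] = entry
    return merged


def update_config_file(path: Path, pipeline_url: str = "http://localhost:8080") -> None:
    """Read a client config file (if present), merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_datris_server(config, pipeline_url), indent=2))
```

Working on a copy keeps the caller's dict untouched, which makes it easy to inspect the merged result before writing anything to disk.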

CLI

brew tap datris/tap
brew install datris
datris ingest data.csv --dest postgres
datris ingest sales.csv --ai-validate "prices > 0" --ai-transform "convert dates to YYYY/MM/DD"
datris query "SELECT * FROM sales"
datris search "quarterly revenue" --store pgvector
datris tap create "Fetch S&P 500 daily prices from yfinance" --pipeline stocks
datris taps
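
The `--ai-validate` and `--ai-transform` flags above take plain-English rules, from which Datris generates scripts. The generated scripts are not shown in this README, so the following is only a hand-written Python sketch of what logic equivalent to "prices > 0" and "convert dates to YYYY/MM/DD" could look like. The column names `price` and `date`, the sample data, and the assumption that input dates arrive as ISO `YYYY-MM-DD` are all hypothetical.

```python
import csv
import io
from datetime import datetime


def validate_prices(rows, column="price"):
    """Return (row_number, value) pairs for rows whose price is not > 0."""
    failures = []
    for i, row in enumerate(rows, start=1):
        try:
            ok = float(row[column]) > 0
        except (KeyError, ValueError, TypeError):
            ok = False  # missing or non-numeric values count as failures
        if not ok:
            failures.append((i, row.get(column)))
    return failures


def convert_dates(rows, column="date", source_format="%Y-%m-%d"):
    """Return new rows with the date column rewritten as YYYY/MM/DD."""
    converted = []
    for row in rows:
        parsed = datetime.strptime(row[column], source_format)
        converted.append({**row, column: parsed.strftime("%Y/%m/%d")})
    return converted


# Hypothetical sample data standing in for sales.csv.
SAMPLE = "date,price\n2024-01-15,19.99\n2024-02-01,-5\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

print(validate_prices(rows))           # rows that break the rule
print(convert_dates(rows)[0]["date"])  # first date in the new format
```

Returning the failing rows instead of raising on the first one mirrors the idea of a validation report, though how Datris actually reports failures is not described here.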

What It Does

Source (File Upload / MinIO Event / Database Pull / Kafka)
  → Preprocessor (optional REST endpoint)
  → Data Quality (AI rules, header validation, schema validation)
  → Transformation (AI transformation, destination schema)
  → Destinations (in parallel):
      PostgreSQL, MongoDB, MinIO (Parquet/ORC), Kafka, ActiveMQ,
      REST Endpoint, Qdrant, Weaviate, Milvus, Chroma, pgvector
  → Notifications (ActiveMQ topic)
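
As a mental model of the flow above, the stages can be pictured as a chain of functions with a fan-out at the destination step. This Python sketch is purely conceptual and is not Datris's implementation. Every function name, the in-memory `sinks`, and the row format are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(rows):
    """Optional preprocessing step; here it only strips whitespace."""
    return [{k: v.strip() for k, v in row.items()} for row in rows]


def check_quality(rows, required_columns):
    """Reject the batch if any row lacks a required column."""
    for i, row in enumerate(rows, start=1):
        missing = [c for c in required_columns if c not in row]
        if missing:
            raise ValueError(f"row {i} is missing columns: {missing}")
    return rows


def transform(rows):
    """Example transformation: upper-case the 'symbol' column."""
    return [{**row, "symbol": row["symbol"].upper()} for row in rows]


def run_pipeline(rows, destinations, notify):
    """Run the stages in order, then write to all destinations in parallel."""
    rows = transform(check_quality(preprocess(rows), ["symbol", "price"]))
    with ThreadPoolExecutor() as pool:
        # Each destination is a callable that receives the full batch.
        futures = [pool.submit(dest, rows) for dest in destinations]
        for future in futures:
            future.result()  # re-raise any destination error
    notify(f"processed {len(rows)} rows")
    return rows


# In-memory stand-ins for real destinations and the notification topic.
sinks = {"postgres": [], "kafka": []}
messages = []
result = run_pipeline(
    [{"symbol": " aapl ", "price": "189.5"}],
    destinations=[sinks["postgres"].extend, sinks["kafka"].extend],
    notify=messages.append,
)
```

Quality checks run before transformation so that a bad batch is rejected before anything is written, and the notification is sent only after every destination write has completed.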

AI-Powered Features

Feature                   Description
MCP Server                30+ tools for AI agents: pipeline CRUD, upload, query, search, profiling
AI Data Quality           Plain-English validation rules; AI generates and runs a validation script
AI Transformation         Plain-English transformations; AI generates and runs a transformation script
AI Schema Generation      Upload a file, get a complete pipeline config
AI Data Profiling         Upload a file, get statistics and suggested validation rules
AI Error Explanation      Job failures explained in plain English
Natural Language Query    Ask questions in English, get SQL results
RAG Pipeline              Chunk, embed, and search across 5 vector databases
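
The RAG Pipeline feature listed above names three steps: chunk, embed, and search. The sketch below illustrates them in plain Python with a toy bag-of-words "embedding" and cosine similarity. It is a stand-in for illustration only, since a real deployment would use a learned embedding model and one of the supported vector databases. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lower-cased word counts (not a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, index, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hypothetical document text used only to exercise the three steps.
document = (
    "Quarterly revenue grew twelve percent year over year. "
    "Headcount stayed flat while support tickets declined."
)
index = [(c, embed(c)) for c in chunk(document)]
print(search("quarterly revenue", index))
```

Word-count vectors only match on shared vocabulary, which is exactly the limitation that learned embeddings address, so this sketch shows the shape of the pipeline rather than its retrieval quality.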

Supported Formats

CSV, JSON, XML, Excel, PDF, Word, PowerPoint, HTML, email, EPUB, plain text, .zip/.tar/.gz archives

AI Providers

Anthropic Claude (Opus 4.6, Sonnet 4.6, Haiku) · OpenAI (GPT-5, GPT-4.1, o3) · Ollama (local models)

Architecture

Service            Purpose
MinIO              S3-compatible object store for file staging and data output
MongoDB            Configuration store, job status tracking, metadata
ActiveMQ           File notification queue, pipeline event notifications
HashiCorp Vault    Secrets management (database credentials, API keys)
Apache Kafka       Optional streaming source and destination
Apache Spark       Local Spark for writing Parquet/ORC to MinIO

Documentation

Full documentation is available at docs.datris.ai or locally in docs/.

License

AGPL-3.0
