ClickHouse MCP Agent

ClickHouse MCP Agent

Enables querying ClickHouse databases using natural language with AI models, supporting multiple providers and access restrictions via per-call allow-lists.

Category
Visit Server

README

ClickHouse MCP Agent

version

AI agent for ClickHouse database analysis via MCP (Model Context Protocol).

A single MCP server (mcp-clickhouse) driven by a single agent instance. Access restriction is performed via explicit allow-lists you pass per call (allowed_tables, allowed_databases), rather than managing multiple keys or fan-out across multiple agents.

Features

  • Query ClickHouse databases using natural language with AI models
  • Structured output: analysis, confidence, sql_used
  • Easy connection management (predefined or custom)
  • Conversational context with message-history pruning/summarization
  • No CLI or external .env required — configure at runtime
  • Access restriction via per-call allow-lists (allowed_tables, allowed_databases)
  • Streamable results via run_stream()
  • Persistent MCP server mode via async with ClickHouseAgent()
  • Typed exception hierarchy for reliable error handling

Supported Providers

  • OpenAI
  • Anthropic
  • Google Gemini
  • Groq
  • Mistral
  • Cohere

Local Development (Docker)

The fastest way to get started — no cloud ClickHouse account needed:

# Start ClickHouse with seeded demo data (orders + products)
docker compose up -d

# Install the package with dev dependencies
pip install -e ".[dev]"

# Run the examples (set your API key first)
GOOGLE_API_KEY=... python examples/example_minimal.py
GOOGLE_API_KEY=... python examples/example_stream.py

The docker/init.sql file seeds demo.orders (25 orders, May 2026) and demo.products (10 products across Electronics, Sports, Home, Books) automatically on first start.

Quickstart

import asyncio
from agent.clickhouse_agent import ClickHouseAgent
from agent.config import config

config.set_ai_model("openai:gpt-4o-mini")
config.set_model_api_key("openai", "your_api_key_here")
config.set_clickhouse(host="localhost", port="8123", user="default", password="", secure="false")

async def main():
    agent = ClickHouseAgent()
    result = await agent.run(
        allowed_tables=["orders", "products"],
        allowed_databases=["demo"],
        query="give me some insights on the recent data",
    )
    print("Analysis:", result.analysis)
    print("Confidence:", result.confidence)
    print("SQL used:", result.sql_used)

asyncio.run(main())

Persistent server (multiple queries)

Use the context manager to keep the MCP subprocess alive across calls — avoids subprocess startup overhead on every query:

async def main():
    async with ClickHouseAgent() as agent:
        r1 = await agent.run(query="how many orders were placed last week?")
        r2 = await agent.run(query="which products are selling fastest?", message_history=r1.messages)

Switching providers

All providers use the same interface — just swap the model string and key:

# Anthropic
config.set_ai_model("anthropic:claude-haiku-4-5-20251001")
config.set_model_api_key("anthropic", "your_key")

# Google
config.set_ai_model("google-gla:gemini-2.0-flash")
config.set_model_api_key("google", "your_key")

# Groq
config.set_ai_model("groq:llama-3.3-70b-versatile")
config.set_model_api_key("groq", "your_key")

Message History & Summarization

Pass message_history between calls for multi-turn conversations. When token usage exceeds summarize_config.token_limit, older messages are automatically summarized into a compact form by a separate summarizer agent.

summarize_config.set_token_limit(10000)
summarize_config.set_ai_model("gemini-2.5-flash")

Output

Each call to ClickHouseAgent.run() returns a RunResult:

Field Description
analysis Natural-language result text from the model
confidence Confidence level (1–10)
sql_used List of SQL strings executed during the run
messages Full (possibly pruned/summarized) message history
new_messages Only messages created in the latest turn
last_message The last message in the conversation
usage Token/usage statistics for the run

Error Handling

All errors raise from a typed hierarchy so you can catch at the right level:

from agent.exceptions import ClickHouseMCPError, MCPConnectionError, AgentExecutionError

try:
    result = await agent.run(query="...")
except MCPConnectionError:
    # MCP subprocess failed to start or connection dropped
    ...
except AgentExecutionError:
    # Agent logic failed during the run
    ...
except ClickHouseMCPError:
    # Any library error
    ...

Requirements

  • Python 3.10+
  • AI API key for your provider (OpenAI, Anthropic, Google/Gemini, Groq, Mistral, Cohere)

All dependencies are managed via pyproject.toml.

Roadmap

✅ Done

  • MCP integration via pydantic_ai.mcp.MCPServerStdio
  • SQL generation/execution via MCP tools
  • Schema inspection (databases/tables/columns)
  • Config-driven connections (playground/local/custom)
  • Access restriction via per-call allow-lists (allowed_tables, allowed_databases)
  • Runtime provider/model selection and API key management
  • Structured outputs (ClickHouseOutput) and RunResult with sql_used
  • Message history pruning/summarization
  • Streaming results via run_stream()
  • Persistent MCP server via async with ClickHouseAgent()
  • Typed exception hierarchy
  • Local development via Docker (docker compose up -d)
  • ruff linting, Python 3.13 support, CI hardened

⚙️ 0.11 — Agent API expansion

  • Async batch queries (parallel queries in one call)
  • ClickHouseAgent.reset() for lifecycle control without re-instantiation
  • structlog optional dep for structured observability
  • pydantic-ai API audit and model ref updates (Claude 4 / GPT family)

🔒 0.12 — Stable

  • API locked — no breaking changes without a major version
  • All known bugs resolved
  • py.typed check added to CI

🔭 Post-1.0 — Future

  • Database-agnostic abstraction (Elasticsearch, MongoDB, Postgres)
  • FastAPI standalone deployment option

Contributing

Open an issue or pull request for features or fixes.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured