iceberg-mcp-server-hive

Provides read-only SQL access to Apache Iceberg tables via HiveServer2, enabling querying, schema discovery, and database listing on Cloudera Data Platform.

Cloudera Iceberg MCP Server (via Hive)

Fork of cloudera/iceberg-mcp-server that uses Apache Hive (HiveServer2) instead of Impala to access Iceberg tables on CDP. execute_query is read-only; the Iceberg branch tools can create, modify, and drop branches.

MCP Tools

| Tool | Description |
| --- | --- |
| execute_query(query) | Run read-only SQL (SELECT, SHOW, DESCRIBE, WITH, EXPLAIN) |
| get_schema(database?) | List tables in the configured or given database |
| list_databases() | List all visible Hive databases |
| list_iceberg_snapshots(database, table) | Snapshot history (db.table.HISTORY, with TBLPROPERTIES fallback) |
| list_iceberg_refs(database, table) | Branches and tags (db.table.REFS) |
| create_iceberg_branch(...) | Create branch from current state, snapshot ID, or timestamp |
| drop_iceberg_branch(...) | Drop a branch |
| fast_forward_iceberg_branch(...) | Fast-forward branch hierarchy |
| query_iceberg_branch(...) | Read from db.table.branch_<name> |
| execute_iceberg_branch_dml(...) | INSERT / UPDATE / DELETE on a branch |
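
The read-only restriction on execute_query can be illustrated with a first-keyword check. This is a minimal sketch, not the server's actual implementation: the names `READ_ONLY_KEYWORDS`, `strip_leading_comments`, and `is_read_only` are hypothetical, and the keyword set is taken from the tool description above.

```python
import re

# Statement types listed for execute_query in the tool table (assumed complete).
READ_ONLY_KEYWORDS = {"SELECT", "SHOW", "DESCRIBE", "WITH", "EXPLAIN"}


def strip_leading_comments(sql: str) -> str:
    """Remove leading whitespace, `-- line` comments, and `/* block */` comments."""
    pattern = re.compile(r"\s*(--[^\n]*(\n|$)|/\*.*?\*/)", re.DOTALL)
    while True:
        match = pattern.match(sql)
        if not match:
            return sql.lstrip()
        sql = sql[match.end():]


def is_read_only(sql: str) -> bool:
    """Return True if a single statement starts with an allowed keyword."""
    body = strip_leading_comments(sql).rstrip().rstrip(";").rstrip()
    if ";" in body:
        # Conservative: reject anything that looks like multiple statements,
        # even if the semicolon sits inside a string literal.
        return False
    first_word = re.match(r"[A-Za-z]+", body)
    return bool(first_word) and first_word.group(0).upper() in READ_ONLY_KEYWORDS
```

A check like this is only a coarse filter. A statement that starts with WITH or EXPLAIN can still lead into a write in some SQL dialects, so database-side permissions for the connecting user remain the stronger safeguard.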

Iceberg branching and tagging are supported in Hive on CDP but not in Impala. See branching and tagging.

Audit / write branch workflow

  1. list_iceberg_snapshots — pick a snapshot ID or timestamp
  2. list_iceberg_refs — inspect existing branches/tags
  3. create_iceberg_branch — fork an audit branch (FOR SYSTEM_VERSION or current head)
  4. query_iceberg_branch — read branch state
  5. execute_iceberg_branch_dml — write changes on the branch only
  6. fast_forward_iceberg_branch — advance a branch when ready
  7. drop_iceberg_branch — cleanup
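
The workflow above can be sketched as the Hive statements each step might issue. The helper names are hypothetical, and the statement forms (CREATE BRANCH ... FOR SYSTEM_VERSION AS OF, EXECUTE FAST-FORWARD, DROP BRANCH) are assumptions based on Hive's Iceberg branch syntax; check them against the Hive version on your cluster before relying on them.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name: str) -> str:
    """Allow only plain identifiers so names can be interpolated into SQL."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def history_query(db: str, table: str) -> str:
    # Step 1: snapshot history metadata table (db.table.HISTORY).
    return f"SELECT * FROM {_ident(db)}.{_ident(table)}.HISTORY"


def refs_query(db: str, table: str) -> str:
    # Step 2: existing branches and tags (db.table.REFS).
    return f"SELECT * FROM {_ident(db)}.{_ident(table)}.REFS"


def create_branch(db: str, table: str, branch: str, snapshot_id=None) -> str:
    # Step 3: fork from the current head, or from a snapshot ID if one is given.
    stmt = f"ALTER TABLE {_ident(db)}.{_ident(table)} CREATE BRANCH {_ident(branch)}"
    if snapshot_id is not None:
        stmt += f" FOR SYSTEM_VERSION AS OF {int(snapshot_id)}"
    return stmt


def branch_ref(db: str, table: str, branch: str) -> str:
    # Steps 4 and 5: reads and writes target db.table.branch_<name>.
    return f"{_ident(db)}.{_ident(table)}.branch_{_ident(branch)}"


def fast_forward(db: str, table: str, target: str, source: str) -> str:
    # Step 6: advance `target` to the head of `source` (assumed argument order).
    return (
        f"ALTER TABLE {_ident(db)}.{_ident(table)} "
        f"EXECUTE FAST-FORWARD '{_ident(target)}' '{_ident(source)}'"
    )


def drop_branch(db: str, table: str, branch: str) -> str:
    # Step 7: cleanup.
    return f"ALTER TABLE {_ident(db)}.{_ident(table)} DROP BRANCH {_ident(branch)}"
```

For example, `branch_ref("mydb", "mytable", "audit")` yields `mydb.mytable.branch_audit`, which can be used in a SELECT for step 4 or as the target of a DML statement for step 5.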

Branch refs use a lowercase branch_ prefix, for example mydb.mytable.branch_audit.

Configuration

The server connects to HiveServer2 with impyla, using HTTP transport for CDP/Knox.

Example JDBC URL from CDP Data Warehouse:

jdbc:hive2://hs2-cdw-aw-se-hive.dw-se-sandbox-aws.a465-9q4k.cloudera.site/default;transportMode=http;httpPath=cliservice;ssl=true

This URL maps to the following MCP environment variables:

| JDBC / CDP | Env var |
| --- | --- |
| Host in URL | HIVE_HOST |
| Path after host (/default) | HIVE_DATABASE |
| httpPath=cliservice | HIVE_HTTP_PATH |
| transportMode=http | HIVE_USE_HTTP_TRANSPORT=true |
| ssl=true | HIVE_USE_SSL=true |
| Port (443 implied) | HIVE_PORT=443 |
| LDAP user/password | HIVE_USER, HIVE_PASSWORD |
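
The mapping from a JDBC URL to these variables can be done mechanically. The following is a minimal sketch; `jdbc_to_env` is a hypothetical helper, not part of this server, and it falls back to port 443 when the URL has no explicit port, following the "443 implied" note above.

```python
from urllib.parse import urlsplit


def jdbc_to_env(jdbc_url: str) -> dict[str, str]:
    """Translate a jdbc:hive2:// URL into HIVE_* environment variables."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix + "hive2://"):
        raise ValueError("expected a jdbc:hive2:// URL")

    # Session parameters follow the first ';' as key=value pairs.
    location, _, param_text = jdbc_url[len(prefix):].partition(";")
    params = dict(
        pair.split("=", 1) for pair in param_text.split(";") if "=" in pair
    )

    parts = urlsplit(location)  # hive2://host[:port]/database
    use_http = params.get("transportMode", "").lower() == "http"
    env = {
        "HIVE_HOST": parts.hostname or "",
        "HIVE_DATABASE": parts.path.lstrip("/") or "default",
        "HIVE_USE_HTTP_TRANSPORT": str(use_http).lower(),
        "HIVE_USE_SSL": params.get("ssl", "false").lower(),
        # Assumption: no explicit port means the HTTPS default, 443.
        "HIVE_PORT": str(parts.port or 443),
    }
    if "httpPath" in params:
        env["HIVE_HTTP_PATH"] = params["httpPath"]
    return env
```

HIVE_USER and HIVE_PASSWORD are not part of the URL and still have to be supplied separately.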

| Variable | Default | Description |
| --- | --- | --- |
| HIVE_HOST | | HiveServer2 or Knox gateway host |
| HIVE_PORT | 443 | HS2 port (443 for Knox HTTP) |
| HIVE_USER | | LDAP / service user |
| HIVE_PASSWORD | | Password |
| HIVE_DATABASE | default | Default database for SHOW TABLES |
| HIVE_AUTH_MECHANISM | LDAP | impyla auth mechanism |
| HIVE_USE_HTTP_TRANSPORT | true | HTTP transport (typical on CDP) |
| HIVE_HTTP_PATH | cliservice | Knox / HS2 HTTP path |
| HIVE_USE_SSL | true | TLS |
| MCP_TRANSPORT | stdio | stdio, http, or sse |
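
As an illustration of how these variables could feed an impyla connection, the sketch below builds keyword arguments for `impala.dbapi.connect` using the defaults from the table. The helper names `connection_kwargs` and `connect_from_env` are hypothetical, and the server's own connection code may differ.

```python
import os


def _flag(value: str) -> bool:
    """Interpret common truthy strings from environment variables."""
    return value.strip().lower() in {"1", "true", "yes"}


def connection_kwargs(env=os.environ) -> dict:
    """Build keyword arguments for impala.dbapi.connect from HIVE_* variables."""
    return {
        "host": env["HIVE_HOST"],  # required, no default
        "port": int(env.get("HIVE_PORT", "443")),
        "user": env.get("HIVE_USER"),
        "password": env.get("HIVE_PASSWORD"),
        "database": env.get("HIVE_DATABASE", "default"),
        "auth_mechanism": env.get("HIVE_AUTH_MECHANISM", "LDAP"),
        "use_http_transport": _flag(env.get("HIVE_USE_HTTP_TRANSPORT", "true")),
        "http_path": env.get("HIVE_HTTP_PATH", "cliservice"),
        "use_ssl": _flag(env.get("HIVE_USE_SSL", "true")),
    }


def connect_from_env():
    """Open a connection using the current process environment."""
    # Imported lazily so connection_kwargs works without impyla installed.
    from impala.dbapi import connect

    return connect(**connection_kwargs())
```

Keeping the argument building separate from the connect call makes the configuration easy to inspect or unit-test without a reachable HiveServer2.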

Claude Desktop / Agent Studio

{
  "mcpServers": {
    "iceberg-mcp-server-hive": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/<your-org>/iceberg-mcp-server-hive@main",
        "run-server"
      ],
      "env": {
        "HIVE_HOST": "hs2-your-cluster.example.cloudera.site",
        "HIVE_PORT": "443",
        "HIVE_USER": "username",
        "HIVE_PASSWORD": "password",
        "HIVE_DATABASE": "default"
      }
    }
  }
}

Local development

git clone https://github.com/<your-org>/iceberg-mcp-server-hive.git
cd iceberg-mcp-server-hive
uv sync --dev
export HIVE_HOST=... HIVE_USER=... HIVE_PASSWORD=...
uv run run-server

Differences from upstream (Impala)

  • Environment variables use HIVE_* instead of IMPALA_*
  • get_schema returns {database, tables} and accepts an optional database name
  • Added list_databases tool
  • execute_query returns {columns, rows} for SELECT results
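
The {columns, rows} shape can be derived from a standard DB-API cursor. This is a small sketch under assumptions: `shape_result` is a hypothetical helper, and representing each row as a list is a guess at the format rather than something the README specifies.

```python
def shape_result(description, rows):
    """Convert DB-API cursor output into a {columns, rows} dictionary."""
    # Each cursor.description entry is a sequence whose first item is the column name.
    columns = [col[0] for col in description]
    # Assumption: rows are returned as plain lists so they serialize to JSON.
    return {"columns": columns, "rows": [list(row) for row in rows]}
```

With a live cursor this would be called as `shape_result(cursor.description, cursor.fetchall())` after executing a SELECT.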

Examples

See ./examples for LangChain and OpenAI SDK notebooks (update env vars from IMPALA_* to HIVE_*).

Copyright (c) 2025 - Cloudera, Inc. All rights reserved.
