openmetadata-mcp-server

OpenMetadata MCP — 170 tools for metadata, lineage, search, data quality, and OM 1.12+ Data Contracts. Includes lineage-impact and quality-rollup aggregations.

OpenMetadata MCP Server

An OpenMetadata MCP server that ships CRUD tools across the core entity types, plus read tools for the OM 1.12+ entities (Data Contracts, Metrics, Search Index, API Collections, and API Endpoints) that the embedded MCP does not fully cover yet.

170 tools, 4 workflow Prompts (lineage impact / DQ investigation / glossary bootstrap / owner reassign), 7 MCP Resources, and aggregations such as lineage-impact (downstream blast radius with an owner notification list), quality-rollup (DQ status across a scope), and get-domain-summary (a domain plus 6 child entity types in one call).

What it does that others don't

  • OM 1.12+ entity coverage: Data Contracts, Metrics, Search Index, API Collections, API Endpoints (10 read tools). Not in the embedded MCP yet.
  • Aggregation tools: lineage-impact answers "what breaks if I change/drop X?" by walking lineage, counting consumers, breaking them down by entity type, and resolving the owner union for change-management notifications, all in one call. get-domain-summary returns a domain plus 6 child entity types via /search/query with track_total_hits in one call instead of 7 sequential calls. get-table-summary similarly folds table, lineage, sample data, and DQ into one response.
  • Semantic search: semantic-search over the OM 1.12+ vector index (POST /search/vector/query). Useful when keyword search misses synonyms.
  • MCP Prompts (4): lineage-impact-analysis, data-quality-investigation, glossary-term-bootstrap, owner-change-propagation. Workflow templates the model invokes directly.
  • MCP Resources (7): om://table/{fqn}, om://glossary-term/{fqn}, om://lineage/{type}/{fqn}, om://search/{query}, om://dashboard/{fqn}, om://pipeline/{fqn}, om://schema/{fqn}.
  • Token-efficient by design: extractFields projection on 28 read tools (drops changeDescription/version/updatedBy/href noise, ~80% size reduction), OM_TOOLS/OM_DISABLE filtering across 9 categories, and a search-tools meta-tool.
  • Apps SDK card: lineage-impact renders as a blast-radius card on ChatGPT clients (downstream/upstream counts + type breakdown + top consumers + owners-to-notify) via _meta["openai/outputTemplate"]. Claude clients receive the same JSON content.
  • stdio + Streamable HTTP: defaults to stdio. Set MCP_TRANSPORT=http for ChatGPT Apps SDK or remote clients (Bearer auth via MCP_HTTP_TOKEN).
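
The lineage-impact aggregation described above can be pictured as a bounded breadth-first walk over downstream edges, followed by a roll-up by entity type and owner. The following is a minimal sketch under stated assumptions, not the server's actual implementation: the `LineageNode` and `Edge` shapes, the `blastRadius` function, and the default depth are all hypothetical, and the real OpenMetadata lineage payload is richer.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface LineageNode {
  fqn: string;
  type: string; // e.g. "table", "dashboard", "pipeline"
  owners: string[];
}

interface Edge {
  from: string;
  to: string;
}

interface ImpactReport {
  downstreamCount: number;
  byType: Record<string, number>;
  ownersToNotify: string[];
}

// Breadth-first walk over downstream edges, bounded by maxDepth.
function blastRadius(
  root: string,
  nodes: LineageNode[],
  edges: Edge[],
  maxDepth = 3,
): ImpactReport {
  const byFqn = new Map<string, LineageNode>();
  for (const n of nodes) byFqn.set(n.fqn, n);

  const children = new Map<string, string[]>();
  for (const e of edges) {
    const list = children.get(e.from) ?? [];
    list.push(e.to);
    children.set(e.from, list);
  }

  const seen = new Set<string>([root]);
  let frontier = [root];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const fqn of frontier) {
      for (const child of children.get(fqn) ?? []) {
        if (!seen.has(child)) {
          seen.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  seen.delete(root); // the root itself is not a consumer

  // Roll up: count consumers per entity type and collect the owner union.
  const byType: Record<string, number> = {};
  const owners = new Set<string>();
  Array.from(seen).forEach((fqn) => {
    const node = byFqn.get(fqn);
    if (!node) return;
    byType[node.type] = (byType[node.type] ?? 0) + 1;
    node.owners.forEach((o) => owners.add(o));
  });

  return {
    downstreamCount: seen.size,
    byType,
    ownersToNotify: Array.from(owners).sort(),
  };
}
```

Tracking visited nodes in a set keeps the walk safe on cyclic lineage graphs, and the depth bound keeps the response size predictable.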

Try this — 5 prompts

Connect the server to Claude Desktop or Claude Code, then paste any of these:

  1. Lineage impact: "The payments.transactions table is being deprecated. List every dashboard, pipeline, and ML model that depends on it (upstream + downstream, depth 3)."
  2. Data quality investigation: "Show all failing test cases from the last 7 days. Group by table, then by test type, with pass/fail counts."
  3. Glossary bootstrap: "Create a payments glossary with these 8 terms: chargeback, refund, settlement, KYC, AML, transaction, customer-id, payment-method. Link related terms."
  4. Owner reassign: "User taehee is leaving. List every entity (table/dashboard/pipeline/ML model) where they are owner. Then reassign all of them to team data-platform."
  5. Domain summary: "Summarize the analytics domain: total tables/dashboards/pipelines/ML models, top 5 by recent updates, and the data products it owns."

When to use this vs OpenMetadata's embedded MCP

OpenMetadata 1.12+ ships an embedded MCP server. It and this server are complementary:

| | OM 1.12 embedded MCP | @us-all/openmetadata-mcp (this) |
|---|---|---|
| Tool count | ~10 (search, glossary basics, lineage, DQ, RCA, semantic search) | 170 (full CRUD across all entity types) |
| OM 1.12+ entity types (Data Contracts/Metrics/Search Index/API) | partial | ✅ 10 read tools |
| Aggregation tools | | lineage-impact, get-domain-summary, get-table-summary |
| MCP Prompts | | ✅ 4 |
| MCP Resources | | ✅ 7 |
| Auth | OAuth2 / PAT, OM Authorization Engine (RBAC) | JWT bot token + write gate |
| Deployment | Embedded in OM server (marketplace install) | Standalone npm / Docker / npx |
| OM version | 1.12+ only | 1.x compatible |
| Best for | RBAC-aware AI agents, SSO orgs | Bulk CRUD, automation, sample-data, older OM clusters |

Use the embedded MCP for RBAC-aware governance with SSO. Use this server for bulk metadata operations, full entity CRUD parity, automation, and OM clusters older than 1.12.

Install

Claude Desktop

{
  "mcpServers": {
    "openmetadata": {
      "command": "npx",
      "args": ["-y", "@us-all/openmetadata-mcp"],
      "env": {
        "OPENMETADATA_HOST": "http://your-host:8585",
        "OPENMETADATA_TOKEN": "<jwt-bot-token>"
      }
    }
  }
}

Claude Code

claude mcp add openmetadata -s user \
  -e OPENMETADATA_HOST=http://your-host:8585 \
  -e OPENMETADATA_TOKEN=<jwt-bot-token> \
  -- npx -y @us-all/openmetadata-mcp

Docker

docker run --rm -i \
  -e OPENMETADATA_HOST=http://your-host:8585 \
  -e OPENMETADATA_TOKEN=<jwt-bot-token> \
  ghcr.io/us-all/openmetadata-mcp-server

Build from source

git clone https://github.com/us-all/openmetadata-mcp-server.git
cd openmetadata-mcp-server && pnpm install && pnpm build
node dist/index.js

Get a token

  1. Open OpenMetadata UI → Settings → Bots
  2. Create a new bot or use an existing one (ingestion-bot works)
  3. Copy the JWT token

Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| OPENMETADATA_HOST | | | OpenMetadata server URL (e.g. http://localhost:8585) |
| OPENMETADATA_TOKEN | | | JWT or Bot token |
| OPENMETADATA_ALLOW_WRITE | | false | Set true to enable mutations (create/update/delete) |
| OM_TOOLS | | | Comma-separated allowlist of categories. Biggest token saver. |
| OM_DISABLE | | | Comma-separated denylist. Ignored when OM_TOOLS is set. |
| MCP_TRANSPORT | | stdio | Set to http to enable the Streamable HTTP transport |
| MCP_HTTP_TOKEN | conditional | | Bearer token. Required when MCP_TRANSPORT=http |
| MCP_HTTP_PORT | | 3000 | HTTP listen port |
| MCP_HTTP_HOST | | 127.0.0.1 | HTTP bind host (DNS rebinding protection auto-enabled for localhost) |
| MCP_HTTP_SKIP_AUTH | | false | Skip Bearer auth, e.g. behind a reverse proxy that handles it |

Categories (9): search, core, discovery, governance, quality, services, admin, events, meta (always-on).
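
The interaction between OM_TOOLS, OM_DISABLE, the always-on meta category, and the write gate can be summarized in a few lines. This is a sketch of the documented precedence rules only; `resolveCategories` and `writesAllowed` are hypothetical names, and treating only the exact string "true" as enabling writes is an assumption rather than confirmed server behavior.

```typescript
type Env = Record<string, string | undefined>;

// The nine categories listed above; "meta" is always on.
const ALL_CATEGORIES = [
  "search", "core", "discovery", "governance", "quality",
  "services", "admin", "events", "meta",
] as const;
type Category = (typeof ALL_CATEGORIES)[number];

// Split a comma-separated variable into trimmed, lower-cased entries.
function parseList(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((s) => s.trim().toLowerCase())
    .filter((s) => s.length > 0);
}

// Hypothetical resolver: the OM_TOOLS allowlist wins, and OM_DISABLE
// only applies when no allowlist is set.
function resolveCategories(env: Env): Category[] {
  const allow = parseList(env.OM_TOOLS);
  const deny = parseList(env.OM_DISABLE);
  return ALL_CATEGORIES.filter((c) => {
    if (c === "meta") return true;
    if (allow.length > 0) return allow.indexOf(c) !== -1;
    return deny.indexOf(c) === -1;
  });
}

// Hypothetical write gate (assumption: only the exact string "true" enables it).
function writesAllowed(env: Env): boolean {
  return env.OPENMETADATA_ALLOW_WRITE === "true";
}
```

Passing the environment in as a plain object, rather than reading a global, keeps the precedence logic easy to test in isolation.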

When MCP_TRANSPORT=http: POST /mcp (Bearer-auth JSON-RPC) + GET /health (public liveness).
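
For remote clients, a call to the HTTP transport is a JSON-RPC 2.0 message posted to /mcp with the Bearer token. The sketch below only builds the request so it can be inspected before sending; `buildMcpCall` is a hypothetical helper, the host and port are the documented defaults, and the Accept header follows what the MCP Streamable HTTP transport expects of clients.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface HttpCall {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: prepares (but does not send) a POST /mcp request.
function buildMcpCall(baseUrl: string, bearer: string, rpc: JsonRpcRequest): HttpCall {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${bearer}`,
      "Content-Type": "application/json",
      // Streamable HTTP clients accept both plain JSON and SSE replies.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(rpc),
  };
}

// Example: ask the server which tools are enabled.
const listTools = buildMcpCall("http://127.0.0.1:3000", "<MCP_HTTP_TOKEN>", {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
});
```

The resulting object can be handed to any HTTP client. Depending on how the server manages sessions, an `initialize` request may need to come first; an MCP client library handles that handshake for you.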

Token efficiency

| Scenario | Tools | Schema tokens | vs default |
|---|---|---|---|
| default (all categories) | 156 | 24,000 | |
| typical (OM_TOOLS=search,core,governance,quality,discovery) | 120 | 19,500 | −19% |
| narrow (OM_TOOLS=search,core) | 26 | 4,600 | −81% |
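
The "vs default" figures are plain arithmetic on the schema-token counts: one minus the ratio of scenario tokens to default tokens, rounded to a whole percent. A short sketch (the function name is illustrative) reproduces them:

```typescript
// Percentage reduction in schema tokens relative to the default tool set.
function reductionPct(defaultTokens: number, scenarioTokens: number): number {
  return Math.round((1 - scenarioTokens / defaultTokens) * 100);
}

const typicalSaving = reductionPct(24000, 19500); // 18.75 rounds to 19
const narrowSaving = reductionPct(24000, 4600);   // 80.83 rounds to 81
```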

extractFields adds another ~80–90% reduction on individual responses (e.g. get-table 8KB → 200B with extractFields: "name,columns.*.name,columns.*.dataType"). Auto-applied across 28 read tools.

// without
get-table { "id": "..." }

// with
get-table { "id": "...", "extractFields": "name,description,columns.*.name,columns.*.dataType" }
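
To make the projection concrete, here is a minimal sketch of how a comma-separated list of dot paths with a `*` wildcard for array elements could be applied to a response. It illustrates the idea only and is not the toolkit's implementation; `projectFields` and `pick` are hypothetical names, and the exact path syntax the server accepts may differ.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Keep only the parts of `value` reachable through one of the remaining paths.
function pick(value: Json, paths: string[][]): Json {
  // An exhausted path means "keep this value whole".
  if (paths.some((p) => p.length === 0)) return value;
  if (Array.isArray(value)) {
    // "*" steps into every array element; other segments do not match arrays.
    const rest = paths.filter((p) => p[0] === "*").map((p) => p.slice(1));
    return rest.length > 0 ? value.map((item) => pick(item, rest)) : [];
  }
  if (value === null || typeof value !== "object") return value;
  const out: { [key: string]: Json } = {};
  for (const key of Object.keys(value)) {
    const rest = paths.filter((p) => p[0] === key).map((p) => p.slice(1));
    if (rest.length > 0) out[key] = pick(value[key], rest);
  }
  return out;
}

// Hypothetical entry point: "name,columns.*.name" -> [["name"], ["columns","*","name"]].
function projectFields(value: Json, spec: string): Json {
  const paths = spec
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p) => p.split("."));
  return pick(value, paths);
}
```

Everything not named in the spec (ids, versions, change descriptions, hrefs) is dropped, which is where the size reduction comes from.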

MCP Prompts (4)

Workflow templates available via MCP prompts/list:

  • lineage-impact-analysis — given an entity, walk upstream + downstream lineage and rank by impact.
  • data-quality-investigation — diff DQ test results across two windows; cluster failure modes.
  • glossary-term-bootstrap — bulk-create a glossary with N related terms, link automatically.
  • owner-change-propagation — find all entities owned by user X, propose batch reassignment.

MCP Resources

URI-based read-only access:

om://table/{fqn} (table + columns + owners + tags + joins), om://glossary-term/{fqn}, om://lineage/{type}/{fqn} (depth 3), om://search/{query} (top 10 keyword hits), om://dashboard/{fqn}, om://pipeline/{fqn} (with tasks), om://schema/{fqn}.
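
The URI templates above all follow the same pattern: a resource kind, an extra entity-type segment for lineage, and then the FQN or query. A sketch of how a client or server might split such a URI is shown below; `parseOmUri` is a hypothetical helper, and the assumption that the trailing value may be percent-encoded is mine, not something the README states.

```typescript
type ResourceKind =
  | "table" | "glossary-term" | "lineage" | "search"
  | "dashboard" | "pipeline" | "schema";

interface ParsedResource {
  kind: ResourceKind;
  entityType?: string; // only for om://lineage/{type}/{fqn}
  value: string;       // FQN or search query
}

const KINDS: ResourceKind[] = [
  "table", "glossary-term", "lineage", "search", "dashboard", "pipeline", "schema",
];

// Hypothetical parser; returns null for anything that does not match a template.
function parseOmUri(uri: string): ParsedResource | null {
  const prefix = "om://";
  if (uri.slice(0, prefix.length) !== prefix) return null;
  const rest = uri.slice(prefix.length);
  const slash = rest.indexOf("/");
  if (slash < 0) return null;

  const kind = rest.slice(0, slash) as ResourceKind;
  if (KINDS.indexOf(kind) === -1) return null;

  let tail = rest.slice(slash + 1);
  let entityType: string | undefined;
  if (kind === "lineage") {
    const next = tail.indexOf("/");
    if (next < 0) return null;
    entityType = tail.slice(0, next);
    tail = tail.slice(next + 1);
  }
  if (tail.length === 0) return null;

  // Assumption: the FQN or query may arrive percent-encoded.
  const value = decodeURIComponent(tail);
  return entityType === undefined ? { kind, value } : { kind, entityType, value };
}
```

Because FQNs contain dots rather than slashes, the remainder after the kind (and type, for lineage) can be treated as a single value.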

Tools (170)

9 categories. Use search-tools to discover at runtime; full list collapsed below.

| Category | Tools |
|---|---|
| Tables / Databases / Schemas / Lineage | 22 |
| Services (database/dashboard/messaging/pipeline/ml/storage) | 16 |
| Glossaries / Terms | 12 |
| Domains / Data Products | 12 |
| Classifications / Tags | 10 |
| Discovery (dashboards / pipelines / charts / topics / containers / ml-models) | 36 |
| Governance (roles / policies / users / teams / bots) | 13 |
| Quality (test suites / cases / sample data) | 13 |
| Stored Procedures / Queries | 11 |
| OM 1.12+ entities (Data Contract / Metric / Search Index / API Collection / API Endpoint) | 10 |
| Search (search-metadata, suggest-metadata, semantic-search) | 3 |
| Aggregations (lineage-impact, quality-rollup, get-domain-summary, get-table-summary) | 4 |
| Quality (run-test-suite, write-gated) | 1 |
| Meta (search-tools) | 1 |

<details> <summary>Full tool list</summary>

Search (3)

search-metadata, suggest-metadata, semantic-search

Tables (6)

list-tables, get-table, get-table-by-name, create-table, update-table, delete-table

Databases (6)

list-databases, get-database, get-database-by-name, create-database, update-database, delete-database

Database Schemas (6)

list-schemas, get-schema, get-schema-by-name, create-schema, update-schema, delete-schema

Lineage (4)

get-lineage, get-lineage-by-name, add-lineage, delete-lineage

Services (16)

6 database-service tools + 2 each for dashboard/messaging/pipeline/ml-model/storage services.

Glossaries (12)

6 glossary CRUD + 6 glossary-term CRUD.

Dashboards / Pipelines / Topics / Charts / Containers / ML Models (36)

6 CRUD each, follows list / get / get-by-name / create / update / delete.

Classifications & Tags (10)

4 classification + 6 tag CRUD.

Domains & Data Products (12)

6 domain + 6 data-product CRUD.

Users & Teams (9)

3 user reads + 6 team CRUD.

Access Control (4)

list-roles, get-role, list-policies, get-policy

Data Quality (7)

list-test-suites, get-test-suite, get-test-suite-by-name, list-test-cases, get-test-case, get-test-case-by-name, list-test-case-results

Stored Procedures (6)

6 CRUD.

Queries (5)

list-queries, get-query, create-query, update-query, delete-query

Events (3)

list-events, get-event-subscription, get-event-subscription-by-name

Bots (3)

list-bots, get-bot, get-bot-by-name

Sample Data (6, read-only)

get-table-sample-data, get-table-sample-data-by-name, get-topic-sample-data, get-topic-sample-data-by-name, get-container-sample-data, get-container-sample-data-by-name

OM 1.12+ entities (10)

list-data-contracts, get-data-contract-by-name, list-metrics, get-metric-by-name, list-search-indexes, get-search-index-by-name, list-api-collections, get-api-collection-by-name, list-api-endpoints, get-api-endpoint-by-name

Aggregations

lineage-impact, quality-rollup, get-domain-summary, get-table-summary

Quality (write-gated)

run-test-suite — triggers the test-suite's associated ingestion pipeline. Async; results land via the normal pipeline flow.

Meta

search-tools — query other tools by keyword; always enabled.

</details>

Architecture

Claude → MCP stdio → src/index.ts → src/tools/*.ts → OpenMetadataClient (fetch) → OpenMetadata REST

Built on @us-all/mcp-toolkit:

  • extractFields: token-efficient response projections
  • aggregate(fetchers, caveats): fan-out helper used by lineage-impact / get-domain-summary / get-table-summary
  • createWrapToolHandler: OPENMETADATA_TOKEN redaction + OpenMetadataError extraction
  • search-tools meta-tool
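
As an illustration of the handler-wrapping idea, the sketch below shows one way a wrapper could scrub the token from both successful output and error messages before they reach the model. It is not the toolkit's `createWrapToolHandler`: `wrapHandler` and `redact` are hypothetical, the handler is synchronous for brevity (real tool handlers are typically async), and the result shape mirrors the MCP tool-result convention of a `content` array with an optional `isError` flag.

```typescript
interface ToolResult {
  isError?: boolean;
  content: { type: "text"; text: string }[];
}

// Replace every occurrence of the secret with a fixed placeholder.
function redact(text: string, secret: string | undefined): string {
  if (!secret) return text;
  return text.split(secret).join("[REDACTED]");
}

// Hypothetical wrapper: serializes the handler's return value, converts
// thrown errors into an error result, and redacts the secret in both cases.
function wrapHandler<A>(
  handler: (args: A) => unknown,
  secret: string | undefined,
): (args: A) => ToolResult {
  return (args: A): ToolResult => {
    try {
      const text = JSON.stringify(handler(args)) ?? "null";
      return { content: [{ type: "text", text: redact(text, secret) }] };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { isError: true, content: [{ type: "text", text: redact(message, secret) }] };
    }
  };
}
```

Centralizing this in one wrapper means individual tools never need to think about leaking credentials through error text.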

Targets OM 1.x. Validated against a real OM backend, including the OM 1.12+ entities.

Tech stack

Node.js 18+ • TypeScript strict ESM • pnpm • @modelcontextprotocol/sdk • zod • dotenv • vitest.

JSON-Patch updates handled automatically (PATCH application/json-patch+json content-type).
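
For readers unfamiliar with JSON Patch (RFC 6902), an update is an array of operations addressed by JSON Pointer paths and sent with the content type noted above. The sketch below builds such a request without sending it; `pointer` and `buildPatchRequest` are hypothetical helpers, and the fields being patched are examples rather than a statement of what the update tools send.

```typescript
interface PatchOp {
  op: "add" | "replace" | "remove";
  path: string;
  value?: unknown;
}

// Build a JSON Pointer, escaping each segment per RFC 6901 ("~" -> "~0", "/" -> "~1").
function pointer(...segments: (string | number)[]): string {
  return segments
    .map((s) => "/" + String(s).replace(/~/g, "~0").replace(/\//g, "~1"))
    .join("");
}

// Hypothetical helper: prepares a PATCH request carrying a JSON Patch document.
function buildPatchRequest(url: string, token: string, ops: PatchOp[]) {
  return {
    url,
    method: "PATCH" as const,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json-patch+json",
    },
    body: JSON.stringify(ops),
  };
}

// Example operations: set a table description and one column description.
const exampleOps: PatchOp[] = [
  { op: "add", path: pointer("description"), value: "Orders placed in the web shop" },
  { op: "replace", path: pointer("columns", 0, "description"), value: "Primary key" },
];
```

Escaping `~` before `/` matters: doing it in the other order would double-escape the `~` introduced by `~1`.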

License

MIT
