openmetadata-mcp-server
OpenMetadata MCP — 170 tools for metadata, lineage, search, data quality, and OM 1.12+ Data Contracts. Includes lineage-impact and quality-rollup aggregations.
OpenMetadata MCP Server
The OpenMetadata MCP that ships full CRUD across every entity type — including OM 1.12+ Data Contracts, Metrics, Search Index, API Collections, and API Endpoints that the embedded MCP doesn't cover yet.
170 tools, 4 workflow Prompts (lineage impact / DQ investigation / glossary bootstrap / owner reassign), 7 MCP Resources, and aggregations like `lineage-impact` (downstream blast-radius w/ owner notification list), `quality-rollup` (DQ status across a scope), and `get-domain-summary` (domain + 6 child entity types in one call).
What it does that others don't
- OM 1.12+ entity coverage — Data Contracts, Metrics, Search Index, API Collections, API Endpoints (10 read tools). Not in the embedded MCP yet.
- Aggregation tools — `lineage-impact` answers "what breaks if I change/drop X?" by walking lineage + counting consumers + breaking down by entity type + resolving the owner union for change-mgmt notifications, in one call. `get-domain-summary` returns domain + 6 child entity types via `/search/query` with `track_total_hits` in one call instead of 7 sequential calls. `get-table-summary` folds table + lineage + sample-data + DQ similarly.
- Semantic search — `semantic-search` over the OM 1.12+ vector index (`POST /search/vector/query`). Useful when keyword search misses synonyms.
- MCP Prompts (4) — `lineage-impact-analysis`, `data-quality-investigation`, `glossary-term-bootstrap`, `owner-change-propagation`. Workflow templates the model invokes directly.
- MCP Resources (7) — `om://table/{fqn}`, `om://glossary-term/{fqn}`, `om://lineage/{type}/{fqn}`, `om://search/{query}`, `om://dashboard/{fqn}`, `om://pipeline/{fqn}`, `om://schema/{fqn}`.
- Token-efficient by design — `extractFields` projection on 28 read tools (drops `changeDescription`/`version`/`updatedBy`/`href` noise — ~80% size reduction), `OM_TOOLS`/`OM_DISABLE` filtering over 9 categories, `search-tools` meta-tool.
- Apps SDK card — `lineage-impact` renders as a blast-radius card on ChatGPT clients (downstream/upstream counts + type breakdown + top consumers + owners-to-notify) via `_meta["openai/outputTemplate"]`. Claude clients receive the same JSON content.
- stdio + Streamable HTTP — defaults to stdio. Set `MCP_TRANSPORT=http` for ChatGPT Apps SDK or remote clients (Bearer auth via `MCP_HTTP_TOKEN`).
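The blast-radius idea behind `lineage-impact` can be sketched as a depth-limited breadth-first walk. This is a minimal illustration over in-memory data, not the server's implementation: the `EntityInfo` and `LineageEdge` shapes, the `lineageImpact` function, and the sample FQNs and owners are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real server reads lineage from the OpenMetadata REST API.
type EntityInfo = { type: string; owners: string[] };
type LineageEdge = { from: string; to: string }; // from = upstream FQN, to = downstream FQN

interface ImpactReport {
  downstreamCount: number;
  byType: Record<string, number>;
  ownersToNotify: string[];
}

function lineageImpact(
  root: string,
  entities: Map<string, EntityInfo>,
  edges: LineageEdge[],
  maxDepth = 3,
): ImpactReport {
  // Adjacency list: upstream FQN -> downstream FQNs.
  const children = new Map<string, string[]>();
  for (const edge of edges) {
    const list = children.get(edge.from) ?? [];
    list.push(edge.to);
    children.set(edge.from, list);
  }

  const seen = new Set<string>([root]);
  const byType: Record<string, number> = {};
  const owners = new Set<string>();
  let frontier: string[] = [root];

  // Breadth-first walk, one lineage hop per iteration, capped at maxDepth.
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const fqn of frontier) {
      for (const child of children.get(fqn) ?? []) {
        if (seen.has(child)) continue; // count each consumer once
        seen.add(child);
        next.push(child);
        const info = entities.get(child);
        const type = info?.type ?? "unknown";
        byType[type] = (byType[type] ?? 0) + 1;
        for (const owner of info?.owners ?? []) owners.add(owner);
      }
    }
    frontier = next;
  }

  return {
    downstreamCount: seen.size - 1, // exclude the root itself
    byType,
    ownersToNotify: Array.from(owners).sort(),
  };
}

// Made-up sample graph.
const sampleEntities = new Map<string, EntityInfo>([
  ["payments.transactions", { type: "table", owners: ["alice"] }],
  ["etl.daily_rollup", { type: "pipeline", owners: ["bob"] }],
  ["bi.revenue", { type: "dashboard", owners: ["carol", "bob"] }],
  ["bi.chargebacks", { type: "dashboard", owners: ["alice"] }],
]);
const sampleEdges: LineageEdge[] = [
  { from: "payments.transactions", to: "etl.daily_rollup" },
  { from: "etl.daily_rollup", to: "bi.revenue" },
  { from: "payments.transactions", to: "bi.chargebacks" },
];

const impact = lineageImpact("payments.transactions", sampleEntities, sampleEdges);
// impact.downstreamCount is 3; impact.byType is { pipeline: 1, dashboard: 2 }
```

The owner set is a union, so a person who owns several affected assets appears once in the notification list.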
Try this — 5 prompts
Connect the server to Claude Desktop or Claude Code, then paste any of these:
- Lineage impact — "The `payments.transactions` table is being deprecated. List every dashboard, pipeline, and ML model that depends on it (upstream + downstream, depth 3)."
- Data quality investigation — "Show all failing test cases from the last 7 days. Group by table, then by test type, with pass/fail counts."
- Glossary bootstrap — "Create a `payments` glossary with these 8 terms: chargeback, refund, settlement, KYC, AML, transaction, customer-id, payment-method. Link related terms."
- Owner reassign — "User `taehee` is leaving. List every entity (table/dashboard/pipeline/ML model) where they are owner. Then reassign all of them to team `data-platform`."
- Domain summary — "Summarize the `analytics` domain: total tables/dashboards/pipelines/ML models, top 5 by recent updates, and the data products it owns."
When to use this vs OpenMetadata's embedded MCP
OpenMetadata 1.12+ ships an embedded MCP. The two servers are complementary:
| | OM 1.12 embedded MCP | @us-all/openmetadata-mcp (this) |
|---|---|---|
| Tool count | ~10 (search, glossary basics, lineage, DQ, RCA, semantic search) | 170 (full CRUD across all entity types) |
| OM 1.12+ entity types (Data Contracts/Metrics/Search Index/API) | partial | ✅ 10 read tools |
| Aggregation tools | ❌ | ✅ lineage-impact, get-domain-summary, get-table-summary |
| MCP Prompts | ❌ | ✅ 4 |
| MCP Resources | ❌ | ✅ 7 |
| Auth | OAuth2 / PAT, OM Authorization Engine (RBAC) | JWT bot token + write gate |
| Deployment | Embedded in OM server (marketplace install) | Standalone npm / Docker / npx |
| OM version | 1.12+ only | 1.x compatible |
| Best for | RBAC-aware AI agents, SSO orgs | Bulk CRUD, automation, sample-data, older OM clusters |
Use the embedded MCP for RBAC-aware governance with SSO. Use this server for bulk metadata operations, full entity CRUD parity, automation, and OM clusters older than 1.12.
Install
Claude Desktop
{
  "mcpServers": {
    "openmetadata": {
      "command": "npx",
      "args": ["-y", "@us-all/openmetadata-mcp"],
      "env": {
        "OPENMETADATA_HOST": "http://your-host:8585",
        "OPENMETADATA_TOKEN": "<jwt-bot-token>"
      }
    }
  }
}
Claude Code
claude mcp add openmetadata -s user \
  -e OPENMETADATA_HOST=http://your-host:8585 \
  -e OPENMETADATA_TOKEN=<jwt-bot-token> \
  -- npx -y @us-all/openmetadata-mcp
Docker
docker run --rm -i \
  -e OPENMETADATA_HOST=http://your-host:8585 \
  -e OPENMETADATA_TOKEN=<jwt-bot-token> \
  ghcr.io/us-all/openmetadata-mcp-server
Build from source
git clone https://github.com/us-all/openmetadata-mcp-server.git
cd openmetadata-mcp-server && pnpm install && pnpm build
node dist/index.js
Get a token
- Open OpenMetadata UI → Settings → Bots
- Create a new bot or use an existing one (`ingestion-bot` works)
- Copy the JWT token
Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENMETADATA_HOST` | ✅ | — | OpenMetadata server URL (e.g. `http://localhost:8585`) |
| `OPENMETADATA_TOKEN` | ✅ | — | JWT or Bot token |
| `OPENMETADATA_ALLOW_WRITE` | ❌ | `false` | Set `true` to enable mutations (create/update/delete) |
| `OM_TOOLS` | ❌ | — | Comma-sep allowlist of categories. Biggest token saver. |
| `OM_DISABLE` | ❌ | — | Comma-sep denylist. Ignored when `OM_TOOLS` is set. |
| `MCP_TRANSPORT` | ❌ | `stdio` | `http` to enable Streamable HTTP transport |
| `MCP_HTTP_TOKEN` | conditional | — | Bearer token. Required when `MCP_TRANSPORT=http` |
| `MCP_HTTP_PORT` | ❌ | `3000` | HTTP listen port |
| `MCP_HTTP_HOST` | ❌ | `127.0.0.1` | HTTP bind host (DNS rebinding protection auto-enabled for localhost) |
| `MCP_HTTP_SKIP_AUTH` | ❌ | `false` | Skip Bearer auth — e.g. behind a reverse proxy that handles it |
Categories (9): search, core, discovery, governance, quality, services, admin, events, meta (always-on).
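The precedence rules from the table (allowlist wins, denylist is ignored when `OM_TOOLS` is set, `meta` is always on) can be sketched as a small resolver. This is an illustration only: `resolveCategories` and `parseList` are hypothetical names, and treating an empty `OM_TOOLS` value as "unset" is an assumption, not documented behavior.

```typescript
// The nine category names come from the list above; the functions are hypothetical.
const ALL_CATEGORIES = [
  "search", "core", "discovery", "governance", "quality",
  "services", "admin", "events", "meta",
] as const;
type Category = (typeof ALL_CATEGORIES)[number];

// Split a comma-separated env value into trimmed, lowercase, non-empty names.
function parseList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
}

function resolveCategories(env: { OM_TOOLS?: string; OM_DISABLE?: string }): Category[] {
  const allow = parseList(env.OM_TOOLS);
  const deny = parseList(env.OM_DISABLE);
  return ALL_CATEGORIES.filter((category) => {
    if (category === "meta") return true; // always-on category
    if (allow.length > 0) return allow.includes(category); // allowlist set: denylist ignored
    return !deny.includes(category);
  });
}

const narrow = resolveCategories({ OM_TOOLS: "search,core" });
// narrow is ["search", "core", "meta"]
```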
When `MCP_TRANSPORT=http`, the server exposes `POST /mcp` (Bearer-authenticated JSON-RPC) and `GET /health` (public liveness check).
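As a rough sketch of what a request to the HTTP transport looks like, the helper below builds (but does not send) a JSON-RPC `tools/list` call. `buildToolsListRequest` is a hypothetical helper; the host, port, and token values are placeholders. The `Accept` header follows the MCP Streamable HTTP convention of accepting both JSON and event-stream responses. A real client would normally complete the `initialize` handshake first, which the MCP SDK client handles for you.

```typescript
interface McpHttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the URL, headers, and JSON-RPC body for POST /mcp.
function buildToolsListRequest(baseUrl: string, bearerToken: string, id = 1): McpHttpRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/mcp", // tolerate a trailing slash on the base URL
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${bearerToken}`, // must match MCP_HTTP_TOKEN on the server
        "Content-Type": "application/json",
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({ jsonrpc: "2.0", id, method: "tools/list", params: {} }),
    },
  };
}

const listRequest = buildToolsListRequest("http://127.0.0.1:3000", "example-token");
// To send it: const response = await fetch(listRequest.url, listRequest.init);
```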
Token efficiency
| Scenario | Tools | Schema tokens | vs default |
|---|---|---|---|
| default (all categories) | 156 | 24,000 | — |
| typical (`OM_TOOLS=search,core,governance,quality,discovery`) | 120 | 19,500 | −19% |
| narrow (`OM_TOOLS=search,core`) | 26 | 4,600 | −81% |
extractFields adds another ~80–90% reduction on individual responses (e.g. get-table 8KB → 200B with extractFields: "name,columns.*.name,columns.*.dataType"). Auto-applied across 28 read tools.
// without
get-table { "id": "..." }
// with
get-table { "id": "...", "extractFields": "name,description,columns.*.name,columns.*.dataType" }
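To make the projection syntax concrete, here is a minimal sketch of how a comma-separated field list with `*` array wildcards could be applied to a response. It is not the `@us-all/mcp-toolkit` implementation: the `extractFields` function below, its helpers, and the sample table are written for illustration, and the real path semantics may differ.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keep only the part of `value` addressed by `path`; "*" maps over array items.
function projectPath(value: Json, path: string[]): Json | undefined {
  if (path.length === 0) return value;
  const head = path[0] as string;
  const rest = path.slice(1);
  if (head === "*") {
    if (!Array.isArray(value)) return undefined;
    return value.map((item) => projectPath(item, rest) ?? null); // null keeps indices aligned
  }
  if (!isObject(value) || !(head in value)) return undefined;
  const child = projectPath(value[head] as Json, rest);
  return child === undefined ? undefined : { [head]: child };
}

// Deep-merge two projections so "columns.*.name" and "columns.*.dataType" combine per item.
function merge(a: Json | undefined, b: Json): Json {
  if (a === undefined || a === null) return b;
  if (b === null) return a;
  if (Array.isArray(a) && Array.isArray(b)) return b.map((item, i) => merge(a[i], item));
  if (isObject(a) && isObject(b)) {
    const out: { [key: string]: Json } = { ...a };
    for (const key of Object.keys(b)) {
      const next = b[key];
      if (next !== undefined) out[key] = merge(out[key], next);
    }
    return out;
  }
  return b;
}

function extractFields(value: Json, fields: string): Json {
  let result: Json | undefined;
  for (const field of fields.split(",")) {
    const path = field.trim().split(".").filter((part) => part.length > 0);
    if (path.length === 0) continue;
    const projected = projectPath(value, path);
    if (projected !== undefined) result = merge(result, projected);
  }
  return result ?? {};
}

// Made-up response, trimmed to a few fields.
const sampleTable: Json = {
  id: "abc",
  name: "transactions",
  version: 0.3,
  updatedBy: "admin",
  columns: [
    { name: "id", dataType: "BIGINT", description: "Primary key" },
    { name: "amount", dataType: "DECIMAL", description: "Charge amount" },
  ],
};

const slim = extractFields(sampleTable, "name,columns.*.name,columns.*.dataType");
// slim: { name: "transactions", columns: [{ name: "id", dataType: "BIGINT" }, { name: "amount", dataType: "DECIMAL" }] }
```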
MCP Prompts (4)
Workflow templates available via MCP prompts/list:
- `lineage-impact-analysis` — given an entity, walk upstream + downstream lineage and rank by impact.
- `data-quality-investigation` — diff DQ test results across two windows; cluster failure modes.
- `glossary-term-bootstrap` — bulk-create a glossary with N related terms, link automatically.
- `owner-change-propagation` — find all entities owned by user X, propose batch reassignment.
MCP Resources
URI-based read-only access:
om://table/{fqn} (table + columns + owners + tags + joins), om://glossary-term/{fqn}, om://lineage/{type}/{fqn} (depth 3), om://search/{query} (top 10 keyword hits), om://dashboard/{fqn}, om://pipeline/{fqn} (with tasks), om://schema/{fqn}.
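The URI templates above can be read as `om://<kind>/<rest>`. The sketch below shows one way a client might split such a URI into its parts. `parseResourceUri` and the `ResourceRef` type are hypothetical, and percent-decoding the FQN or query is an assumption about how values are encoded.

```typescript
type ResourceRef =
  | { kind: "table" | "glossary-term" | "dashboard" | "pipeline" | "schema"; fqn: string }
  | { kind: "lineage"; entityType: string; fqn: string }
  | { kind: "search"; query: string };

const FQN_KINDS = ["table", "glossary-term", "dashboard", "pipeline", "schema"] as const;

// Decode percent-escapes, falling back to the raw text if they are malformed.
function safeDecode(text: string): string {
  try {
    return decodeURIComponent(text);
  } catch {
    return text;
  }
}

function parseResourceUri(uri: string): ResourceRef | null {
  const prefix = "om://";
  if (uri.slice(0, prefix.length) !== prefix) return null;
  const rest = uri.slice(prefix.length);
  const slash = rest.indexOf("/");
  if (slash <= 0) return null;
  const kind = rest.slice(0, slash);
  const tail = rest.slice(slash + 1);
  if (tail.length === 0) return null;

  if (kind === "search") return { kind: "search", query: safeDecode(tail) };
  if (kind === "lineage") {
    // om://lineage/{type}/{fqn}: the first segment is the entity type.
    const split = tail.indexOf("/");
    if (split <= 0 || split === tail.length - 1) return null;
    return {
      kind: "lineage",
      entityType: tail.slice(0, split),
      fqn: safeDecode(tail.slice(split + 1)),
    };
  }
  for (const known of FQN_KINDS) {
    if (kind === known) return { kind: known, fqn: safeDecode(tail) };
  }
  return null; // unknown resource kind
}

const parsed = parseResourceUri("om://lineage/table/svc.db.public.transactions");
// parsed: { kind: "lineage", entityType: "table", fqn: "svc.db.public.transactions" }
```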
Tools (170)
9 categories. Use search-tools to discover at runtime; full list collapsed below.
| Category | Tools |
|---|---|
| Tables / Databases / Schemas / Lineage | 22 |
| Services (database/dashboard/messaging/pipeline/ml/storage) | 16 |
| Glossaries / Terms | 12 |
| Domains / Data Products | 12 |
| Classifications / Tags | 10 |
| Discovery (dashboards / pipelines / charts / topics / containers / ml-models) | 36 |
| Governance (roles / policies / users / teams / bots) | 13 |
| Quality (test suites / cases / sample data) | 13 |
| Stored Procedures / Queries | 11 |
| OM 1.12+ entities (Data Contract / Metric / Search Index / API Collection / API Endpoint) | 10 |
| Search (`search-metadata`, `suggest-metadata`, `semantic-search`) | 3 |
| Aggregations (`lineage-impact`, `quality-rollup`, `get-domain-summary`, `get-table-summary`) | 4 |
| Quality (`run-test-suite`, write-gated) | 1 |
| Meta (`search-tools`) | 1 |
<details> <summary>Full tool list</summary>
Search (3)
search-metadata, suggest-metadata, semantic-search
Tables (6)
list-tables, get-table, get-table-by-name, create-table, update-table, delete-table
Databases (6)
list-databases, get-database, get-database-by-name, create-database, update-database, delete-database
Database Schemas (6)
list-schemas, get-schema, get-schema-by-name, create-schema, update-schema, delete-schema
Lineage (4)
get-lineage, get-lineage-by-name, add-lineage, delete-lineage
Services (16)
6 database-service tools + 2 each for dashboard/messaging/pipeline/ml-model/storage services.
Glossaries (12)
6 glossary CRUD + 6 glossary-term CRUD.
Dashboards / Pipelines / Topics / Charts / Containers / ML Models (36)
6 CRUD each, follows list / get / get-by-name / create / update / delete.
Classifications & Tags (10)
4 classification + 6 tag CRUD.
Domains & Data Products (12)
6 domain + 6 data-product CRUD.
Users & Teams (9)
3 user reads + 6 team CRUD.
Access Control (4)
list-roles, get-role, list-policies, get-policy
Data Quality (7)
list-test-suites, get-test-suite, get-test-suite-by-name, list-test-cases, get-test-case, get-test-case-by-name, list-test-case-results
Stored Procedures (6)
6 CRUD.
Queries (5)
list-queries, get-query, create-query, update-query, delete-query
Events (3)
list-events, get-event-subscription, get-event-subscription-by-name
Bots (3)
list-bots, get-bot, get-bot-by-name
Sample Data (6, read-only)
get-table-sample-data, get-table-sample-data-by-name, get-topic-sample-data, get-topic-sample-data-by-name, get-container-sample-data, get-container-sample-data-by-name
OM 1.12+ entities (10)
list-data-contracts, get-data-contract-by-name, list-metrics, get-metric-by-name, list-search-indexes, get-search-index-by-name, list-api-collections, get-api-collection-by-name, list-api-endpoints, get-api-endpoint-by-name
Aggregations
lineage-impact, quality-rollup, get-domain-summary, get-table-summary
Quality (write-gated)
run-test-suite — triggers the test-suite's associated ingestion pipeline. Async; results land via the normal pipeline flow.
Meta
search-tools — query other tools by keyword; always enabled.
</details>
Architecture
Claude → MCP stdio → src/index.ts → src/tools/*.ts → OpenMetadataClient (fetch) → OpenMetadata REST
Built on @us-all/mcp-toolkit:
- `extractFields` — token-efficient response projections
- `aggregate(fetchers, caveats)` — fan-out helper used by `lineage-impact` / `get-domain-summary` / `get-table-summary`
- `createWrapToolHandler` — `OPENMETADATA_TOKEN` redaction + `OpenMetadataError` extraction
- `search-tools` meta-tool
Targets OM 1.x. Validated against a real OM backend, including the OM 1.12+ entities.
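The token redaction mentioned for `createWrapToolHandler` can be pictured as scrubbing the secret from any text before it is returned to the model. The snippet is a simplified sketch of that idea, not the toolkit's code; `redactSecret` and the mask string are hypothetical.

```typescript
// Replace every occurrence of the secret in a message with a mask.
// split/join avoids having to escape regex metacharacters in the token.
function redactSecret(message: string, secret: string | undefined, mask = "[REDACTED]"): string {
  if (!secret) return message; // nothing configured, nothing to hide
  return message.split(secret).join(mask);
}

const safeMessage = redactSecret(
  "401 Unauthorized for Authorization: Bearer abc.def.ghi",
  "abc.def.ghi",
);
// safeMessage: "401 Unauthorized for Authorization: Bearer [REDACTED]"
```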
Tech stack
Node.js 18+ • TypeScript strict ESM • pnpm • @modelcontextprotocol/sdk • zod • dotenv • vitest.
JSON-Patch updates handled automatically (PATCH application/json-patch+json content-type).
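For readers unfamiliar with JSON Patch (RFC 6902), the sketch below shows the kind of operation list such an update carries. It diffs only top-level fields and is an illustration rather than the server's actual diffing logic; `diffTopLevel` and `pointer` are hypothetical names.

```typescript
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

// RFC 6901 JSON Pointer escaping: "~" becomes "~0", then "/" becomes "~1".
function pointer(segment: string): string {
  return "/" + segment.replace(/~/g, "~0").replace(/\//g, "~1");
}

// Compare two flat snapshots of an entity and emit add/replace/remove operations.
// Values are compared via JSON.stringify, so key order inside nested objects matters.
function diffTopLevel(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): PatchOp[] {
  const ops: PatchOp[] = [];
  for (const key of Object.keys(after)) {
    if (!(key in before)) {
      ops.push({ op: "add", path: pointer(key), value: after[key] });
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      ops.push({ op: "replace", path: pointer(key), value: after[key] });
    }
  }
  for (const key of Object.keys(before)) {
    if (!(key in after)) ops.push({ op: "remove", path: pointer(key) });
  }
  return ops;
}

const patch = diffTopLevel(
  { description: "Payments", displayName: "Transactions" },
  { description: "Card payment transactions", displayName: "Transactions" },
);
// patch: [{ op: "replace", path: "/description", value: "Card payment transactions" }]

// The operation list is sent as the PATCH body with the content type named above.
const patchHeaders = { "Content-Type": "application/json-patch+json" };
const patchBody = JSON.stringify(patch);
```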
License