sql-steward
A governed SQL gateway that exposes typed tools to AI agents, compiling safe read-only queries from a semantic layer while blocking PII before execution, supporting SQL Server, Postgres, and SQLite.
README
sql-steward
Part of the Governed Agent Stack: free, on-prem building blocks for an AI agent you can point at a real database and audit.
A governed SQL gateway for AI agents, exposed over the Model Context Protocol. The agent never gets a connection string and never writes SQL. It calls typed tools; sql-steward compiles every query from a semantic layer you control, refuses blocked PII before the query runs, and returns rows. Same tools across SQL Server, Postgres and SQLite.
Most SQL MCP servers hand the model a run_sql tool and try to catch the bad queries on the way out. sql-steward removes the tool. There is no path from a prompt to raw SQL at your database, because the only thing the agent can do is name an entity or a metric and pick from allow-lists you wrote.
Three guarantees
- Read-only by construction. There is no
run_sql,query, orexecutetool. The compiler can only ever build aSELECT, so a write isn't blocked, it's unrepresentable. - PII refused before retrieval. Every field can carry a PII tag. If a request touches a category your policy blocks, sql-steward refuses with a structured reason before any SQL is compiled or run.
- Auditable. Every call, refusal and error can be recorded in a tamper-evident, hash-chained log via agent-blackbox, with
audit-verifyto prove nothing was rewritten.
See it in 10 seconds
pip install sql-steward # or: pipx install sql-steward
sql-steward demo # zero config, no API key, no agent, SQLite
1) get_metric('mrr_total', dimensions=['plan']) -> safe aggregate
compiled: SELECT subscriptions.plan, SUM(subscriptions.mrr) AS mrr_total
FROM subscriptions GROUP BY subscriptions.plan LIMIT 1000
{'plan': 'pro', 'mrr_total': 297.0}
{'plan': 'team', 'mrr_total': 598.0}
2) get_metric('mrr_total', dimensions=['customers.country']) -> auto-join
compiled: ... INNER JOIN customers ON subscriptions.customer_id = customers.id ...
3) get_records('customers', fields=['id','email']) -> PII refusal
refused: {"kind": "pii_blocked", "detail": "Field 'customers.email' is tagged
EMAIL_ADDRESS, which this policy refuses."}
The semantic layer
This YAML is the entire contract between the agent and your database. Review it like code.
dialect: postgres
entities:
customers:
table: customers
fields:
id: {type: int}
name: {type: text, pii: PERSON}
email: {type: text, pii: EMAIL_ADDRESS}
country: {type: text}
subscriptions:
table: subscriptions
fields:
customer_id: {type: int}
plan: {type: text}
mrr: {type: numeric}
joins: # nothing reachable that isn't listed here
- left: subscriptions
right: customers
on: subscriptions.customer_id = customers.id
metrics:
mrr_total: # the aggregation is fixed; the agent only
entity: subscriptions # chooses dimensions/filters from the lists
aggregate: sum
field: mrr
dimensions_allowed: [plan, status, customers.country]
filters_allowed: [status, customers.country]
policy:
block_pii: [EMAIL_ADDRESS, CREDIT_CARD]
max_rows: 1000
Ask for a join that isn't defined and you get unreachable_entity, not an invented relationship. Ask to group a metric by a dimension that isn't listed and you get dimension_not_allowed.
Tools exposed to the agent
| Tool | Purpose |
|---|---|
list_entities() |
What can be read, plus the available metrics |
describe_entity(entity) |
Fields, types and PII tags (blocked ones flagged) |
list_metrics() |
Metrics and the dimensions/filters each allows |
get_records(entity, fields, filters, order_by, limit) |
Read rows from one entity |
get_metric(metric, dimensions, filters, limit) |
Compute a pre-approved aggregate |
semantic_search(entity, query, k, filters) |
pgvector nearest-neighbour search over an entity's embedding column |
audit_verify() |
Verify the tamper-evident audit chain |
Filters are {field, op, value}; operators are =, !=, <, <=, >, >=, like, in, not in, is null, is not null. Values are always bound parameters, never inlined.
Wire it into an MCP client
servers.yaml lives wherever you point SQL_STEWARD_LAYER. Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"sql-steward": {
"command": "sql-steward",
"env": {
"SQL_STEWARD_LAYER": "/full/path/to/semantic.yaml",
"SQL_STEWARD_DB_URL": "postgresql+psycopg://readonly@db.internal/warehouse"
}
}
}
}
SQL_STEWARD_DB_URL is a SQLAlchemy URL, so the same server reads SQL Server (mssql+pyodbc://...), Postgres (postgresql+psycopg://...) or SQLite (sqlite:///path.db). Install the matching driver with the extras: pip install "sql-steward[postgres]" or "[mssql]".
Optional: the rest of the stack
The semantic layer is the primary control. These are extra layers, all opt-in, and no-ops if the library isn't installed:
pip install "sql-steward[rbac,mask,audit]"
export SQL_STEWARD_POLICY=/path/to/policy.yaml # query-warden second-pass role check
export SQL_STEWARD_ROLE=analyst
export SQL_STEWARD_MASK=1 # pii-veil masks anything left in results
export SQL_STEWARD_AUDIT_DB=logs/steward.db # agent-blackbox audit chain (on if installed)
export SQL_STEWARD_QUERY_BUDGET=200 # hard cap on queries per role per session
export SQL_STEWARD_EMBED_URL=http://localhost:11434/api/embeddings # local embeddings for semantic_search
export SQL_STEWARD_EMBED_MODEL=nomic-embed-text
Semantic search (pgvector)
Give an entity a search block pointing at a pgvector column and the agent gets a semantic_search tool, governed exactly like everything else (PII refused, results masked, calls audited):
entities:
documents:
table: documents
fields:
id: {type: int}
title: {type: text}
embedding: {type: vector}
search:
vector_column: embedding
dim: 768
returns: [id, title]
The query text is embedded locally (set SQL_STEWARD_EMBED_URL to a local Ollama endpoint, so nothing leaves the building), and matched with pgvector's <=> operator. PostgreSQL only. The embedding column is never returned.
- query-warden re-checks the compiled SQL against a role policy.
- pii-veil masks any PII that survives into result rows.
- agent-blackbox records every call in a hash-chained ledger;
sql-steward audit-verifychecks it.
How this is different
A typical SQL MCP validates arbitrary SQL the model wrote (a blocklist: catch what's bad). sql-steward compiles SQL from definitions you wrote (an allow-list: only what's described exists). The read-only and PII guarantees hold by construction rather than by inspection, and the query surface is the same across three engines.
Develop
git clone https://github.com/Pawansingh3889/sql-steward
cd sql-steward
pip install -e ".[dev]"
pytest -q
License
MIT
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.