sql-steward

sql-steward

A governed SQL gateway that exposes typed tools to AI agents, compiling safe read-only queries from a semantic layer while blocking PII before execution, supporting SQL Server, Postgres, and SQLite.

Category
Visit Server

README

sql-steward

License Python

Part of the Governed Agent Stack: free, on-prem building blocks for an AI agent you can point at a real database and audit.

A governed SQL gateway for AI agents, exposed over the Model Context Protocol. The agent never gets a connection string and never writes SQL. It calls typed tools; sql-steward compiles every query from a semantic layer you control, refuses blocked PII before the query runs, and returns rows. Same tools across SQL Server, Postgres and SQLite.

Most SQL MCP servers hand the model a run_sql tool and try to catch the bad queries on the way out. sql-steward removes the tool. There is no path from a prompt to raw SQL at your database, because the only thing the agent can do is name an entity or a metric and pick from allow-lists you wrote.

Three guarantees

  1. Read-only by construction. There is no run_sql, query, or execute tool. The compiler can only ever build a SELECT, so a write isn't blocked, it's unrepresentable.
  2. PII refused before retrieval. Every field can carry a PII tag. If a request touches a category your policy blocks, sql-steward refuses with a structured reason before any SQL is compiled or run.
  3. Auditable. Every call, refusal and error can be recorded in a tamper-evident, hash-chained log via agent-blackbox, with audit-verify to prove nothing was rewritten.

See it in 10 seconds

pip install sql-steward      # or: pipx install sql-steward
sql-steward demo            # zero config, no API key, no agent, SQLite
1) get_metric('mrr_total', dimensions=['plan'])  -> safe aggregate
   compiled: SELECT subscriptions.plan, SUM(subscriptions.mrr) AS mrr_total
             FROM subscriptions GROUP BY subscriptions.plan LIMIT 1000
   {'plan': 'pro', 'mrr_total': 297.0}
   {'plan': 'team', 'mrr_total': 598.0}

2) get_metric('mrr_total', dimensions=['customers.country'])  -> auto-join
   compiled: ... INNER JOIN customers ON subscriptions.customer_id = customers.id ...

3) get_records('customers', fields=['id','email'])  -> PII refusal
   refused: {"kind": "pii_blocked", "detail": "Field 'customers.email' is tagged
             EMAIL_ADDRESS, which this policy refuses."}

The semantic layer

This YAML is the entire contract between the agent and your database. Review it like code.

dialect: postgres

entities:
  customers:
    table: customers
    fields:
      id: {type: int}
      name: {type: text, pii: PERSON}
      email: {type: text, pii: EMAIL_ADDRESS}
      country: {type: text}
  subscriptions:
    table: subscriptions
    fields:
      customer_id: {type: int}
      plan: {type: text}
      mrr: {type: numeric}

joins:                              # nothing reachable that isn't listed here
  - left: subscriptions
    right: customers
    on: subscriptions.customer_id = customers.id

metrics:
  mrr_total:                        # the aggregation is fixed; the agent only
    entity: subscriptions           # chooses dimensions/filters from the lists
    aggregate: sum
    field: mrr
    dimensions_allowed: [plan, status, customers.country]
    filters_allowed: [status, customers.country]

policy:
  block_pii: [EMAIL_ADDRESS, CREDIT_CARD]
  max_rows: 1000

Ask for a join that isn't defined and you get unreachable_entity, not an invented relationship. Ask to group a metric by a dimension that isn't listed and you get dimension_not_allowed.

Tools exposed to the agent

Tool Purpose
list_entities() What can be read, plus the available metrics
describe_entity(entity) Fields, types and PII tags (blocked ones flagged)
list_metrics() Metrics and the dimensions/filters each allows
get_records(entity, fields, filters, order_by, limit) Read rows from one entity
get_metric(metric, dimensions, filters, limit) Compute a pre-approved aggregate
semantic_search(entity, query, k, filters) pgvector nearest-neighbour search over an entity's embedding column
audit_verify() Verify the tamper-evident audit chain

Filters are {field, op, value}; operators are =, !=, <, <=, >, >=, like, in, not in, is null, is not null. Values are always bound parameters, never inlined.

Wire it into an MCP client

servers.yaml lives wherever you point SQL_STEWARD_LAYER. Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "sql-steward": {
      "command": "sql-steward",
      "env": {
        "SQL_STEWARD_LAYER": "/full/path/to/semantic.yaml",
        "SQL_STEWARD_DB_URL": "postgresql+psycopg://readonly@db.internal/warehouse"
      }
    }
  }
}

SQL_STEWARD_DB_URL is a SQLAlchemy URL, so the same server reads SQL Server (mssql+pyodbc://...), Postgres (postgresql+psycopg://...) or SQLite (sqlite:///path.db). Install the matching driver with the extras: pip install "sql-steward[postgres]" or "[mssql]".

Optional: the rest of the stack

The semantic layer is the primary control. These are extra layers, all opt-in, and no-ops if the library isn't installed:

pip install "sql-steward[rbac,mask,audit]"

export SQL_STEWARD_POLICY=/path/to/policy.yaml   # query-warden second-pass role check
export SQL_STEWARD_ROLE=analyst
export SQL_STEWARD_MASK=1                         # pii-veil masks anything left in results
export SQL_STEWARD_AUDIT_DB=logs/steward.db       # agent-blackbox audit chain (on if installed)
export SQL_STEWARD_QUERY_BUDGET=200               # hard cap on queries per role per session
export SQL_STEWARD_EMBED_URL=http://localhost:11434/api/embeddings  # local embeddings for semantic_search
export SQL_STEWARD_EMBED_MODEL=nomic-embed-text

Semantic search (pgvector)

Give an entity a search block pointing at a pgvector column and the agent gets a semantic_search tool, governed exactly like everything else (PII refused, results masked, calls audited):

entities:
  documents:
    table: documents
    fields:
      id: {type: int}
      title: {type: text}
      embedding: {type: vector}
    search:
      vector_column: embedding
      dim: 768
      returns: [id, title]

The query text is embedded locally (set SQL_STEWARD_EMBED_URL to a local Ollama endpoint, so nothing leaves the building), and matched with pgvector's <=> operator. PostgreSQL only. The embedding column is never returned.

  • query-warden re-checks the compiled SQL against a role policy.
  • pii-veil masks any PII that survives into result rows.
  • agent-blackbox records every call in a hash-chained ledger; sql-steward audit-verify checks it.

How this is different

A typical SQL MCP validates arbitrary SQL the model wrote (a blocklist: catch what's bad). sql-steward compiles SQL from definitions you wrote (an allow-list: only what's described exists). The read-only and PII guarantees hold by construction rather than by inspection, and the query surface is the same across three engines.

Develop

git clone https://github.com/Pawansingh3889/sql-steward
cd sql-steward
pip install -e ".[dev]"
pytest -q

License

MIT

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured