TalkDB

TalkDB

Enables natural language querying of databases with multi-turn conversations, auto-generated charts, and proactive monitoring via scheduled queries and alerts.

Category
Visit Server

README

TalkDB

An open-source, MCP-native autonomous data analyst. Converts natural-language questions into validated SQL, then goes further: multi-turn conversations, statistical insight generation, proactive metric monitoring, dual-path result verification, and a learning loop from user corrections.

Not just text-to-SQL — a full analyst loop in a single tool.

What it does

  • Ask questions in plain English over Postgres, SQLite, and more. Answers include SQL, results, a narrative summary, and an auto-generated chart.
  • Multi-turn conversations. "Revenue by month" → "just Q4" → "break that down by region" — follow-ups resolve to the previous turn's context.
  • Semantic layer (YAML). Define business metrics once (revenue = SUM(orders.total_amount) WHERE status='completed'); the LLM uses your definitions instead of guessing.
  • Dual-path verification. Every novel query is generated two structurally different ways; results are compared. Divergence drops confidence and surfaces a warning — catching semantic errors that schema validation can't.
  • Confidence scoring with graceful refusal. Queries below threshold aren't silently wrong — they return a refusal explaining what's uncertain.
  • Statistical insight agent. After results land, a pandas-based analyzer detects trends, anomalies, and concentrations. A chart is auto-generated from data shape. An LLM narrator writes 2–4 sentences using only the analyzer's facts (no hallucinated numbers).
  • Proactive watchdog. Save any query as a scheduled watch. APScheduler runs it on your cadence, compares to a rolling baseline, and fires a webhook/Slack/stdout alert when conditions trigger.
  • Self-improving via corrections. correct_query(question, wrong_sql, correct_sql) stores the pattern and indexes it into retrieval, so future similar questions benefit.
  • Community registry. talkdb registry install stripe-semantic drops in a full semantic model for Stripe's schema — metrics, join rules, and proven query patterns — so you don't start from zero on common SaaS databases.

Interfaces

  • MCP server (primary). Works with Claude Desktop, Cursor, VS Code, and any MCP-compatible client. 16 tools: ask, analyze, follow_up, list_databases, describe_database, validate_sql, correct_query, watch, list_watches, remove_watch, run_watch, get_session, install_semantic_package, uninstall_semantic_package, list_installed_packages, search_registry.
  • REST API. FastAPI wrapper with 17 endpoints mirroring every MCP tool — for web clients, dashboards, and environments where MCP isn't available. talkdb api --port 8000.
  • CLI. talkdb ask, talkdb chat, talkdb init, talkdb index, talkdb watchdog add/list/remove/run/start, talkdb registry install/uninstall/list/search, talkdb serve, talkdb api.

Tech stack

Python 3.11+ · FastMCP · FastAPI · LiteLLM (Claude / GPT / Gemini / Ollama) · SQLAlchemy 2 · sqlglot · ChromaDB + BM25 hybrid retrieval · pandas + matplotlib + seaborn · APScheduler · Pydantic.

Quick start

# Install from PyPI
pip install talkdb-ai

# ...or from source
git clone https://github.com/Nitin-Gupta1109/TalkDB.git
cd TalkDB
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

cp .env.example .env
# Edit .env: set OPENAI_API_KEY or ANTHROPIC_API_KEY, and TALKDB_DEFAULT_DB

# Seed a demo ecommerce DB (optional — or point at your own)
python scripts/seed_example_db.py

# Index schema + semantic model for retrieval
talkdb index

# Ask a question
talkdb ask "What is our total revenue?"

# Multi-turn chat
talkdb chat

# Start MCP server (for Claude Desktop / Cursor)
talkdb serve --transport stdio

Wiring into Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "talkdb": {
      "command": "/absolute/path/to/.venv/bin/talkdb",
      "args": ["serve", "--transport", "stdio"],
      "cwd": "/absolute/path/to/project"
    }
  }
}

Restart Claude Desktop. The ask, analyze, follow_up, etc. tools appear in the tool picker.

Semantic model

Define business meaning once in semantic_models/<db>.yaml:

metrics:
  - name: revenue
    description: "Total revenue from completed orders."
    calculation: "SUM(orders.total_amount) WHERE orders.status = 'completed'"
    table: orders

tables:
  - name: customers
    columns:
      - name: tier
        valid_values: ["bronze", "silver", "gold", "platinum"]
        aliases: ["loyalty level", "membership tier"]

joins:
  - left: orders
    right: customers
    on: "orders.customer_id = customers.id"
    type: "INNER JOIN"

Run talkdb init --database mydb --output semantic_models/mydb.yaml to auto-generate a skeleton from your DB.

Watchdog example

talkdb watchdog add \
  --name "Revenue monitor" \
  --question "What is today's total revenue?" \
  --schedule "every 1 hour" \
  --alert "drops more than 20% below 7-day average" \
  --slack-webhook "https://hooks.slack.com/services/..."

talkdb watchdog list
talkdb watchdog start   # blocks; runs the scheduler

Alerts render like:

🔴 Revenue monitor — Current value: $38.2k — 27% below baseline $52.4k (7_day_avg). Suggested follow-up: "Why is today's revenue below baseline?"

Community registry

Install community-maintained semantic packages:

talkdb registry install stripe-semantic     # from the registry (when published)
talkdb registry install ./packages/stripe-semantic   # from a local directory
talkdb registry list
talkdb registry search "stripe"

Packages are YAML-only — metric definitions, table/column docs, join rules, and proven query patterns. No executable code, ever (security by design). Once installed, the retriever surfaces their definitions automatically when a question matches.

See packages/stripe-semantic/ for the reference package (5 metrics, 5 tables, 4 joins, 6 proven queries covering MRR, active subscriptions, net revenue, customer LTV).

Benchmark

Ships with a 47-case benchmark on the seeded DB for regression tracking:

python -m tests.benchmarks.run_benchmark

Current baseline: 37/47 (78%) execution accuracy, 42/47 (89%) lenient (containment match), 0 silent-wrong answers. gpt-4o-mini and gpt-4o tied at 78% on this suite — model upgrade alone didn't move the needle. Per-phase regression baselines are checked into tests/benchmarks/.

Project layout

src/talkdb/
├── core/          # Engine, SQL generator, intent classifier, dialect transpiler
├── conversation/  # Session state, rewriter, reference resolver
├── schema/        # Introspector, data models, semantic model loader
├── retrieval/     # ChromaDB + BM25 hybrid retriever, embeddings
├── validation/    # Schema, execution, shape, dual-path, confidence
├── insight/       # Analyzer (pandas), charter (matplotlib), narrator (LLM)
├── watchdog/      # Scheduler, baseline, alerter, storage
├── learning/      # Pattern store, feedback recorder
├── registry/      # Community package loader + local index + install client
├── connectors/    # Postgres, SQLite (more dialects via sqlglot)
├── server/        # FastMCP server + FastAPI REST wrapper
└── config/        # Pydantic settings

Design principles

  • Never dump full schema into prompts. Hybrid retrieval (BM25 + vector) surfaces only relevant context.
  • Never return results below confidence threshold. Refuse and explain — silent wrong answers destroy trust.
  • SELECT only. sqlglot AST walk rejects every mutating statement before execution.
  • Read-only validation. All validation queries run in READ ONLY transactions with LIMIT 10 and a 10-second timeout.
  • LLM-provider agnostic. Every LLM call goes through LiteLLM — swap Claude ↔ GPT ↔ Gemini ↔ Ollama with one config change.
  • Vector store abstraction. ChromaDB for dev, pgvector can plug in behind the same interface.
  • Temperature 0 for SQL generation. Deterministic output.
  • Insight stats are pandas, not LLM. Only narration uses the LLM (so the numbers in insights are always correct).
  • Dual-path uses structurally different prompts. Path A direct, Path B decompose-then-compose. Catches correlated errors that self-correction misses.
  • Conversation rewriting, not SQL mutation. Follow-ups are rewritten into standalone questions before SQL is regenerated from scratch.

License

MIT.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured