Pipetable
Query local CSV, Parquet, JSON and TSV files with real SQL via DuckDB. Gives your AI coding tool ground-truth data access instead of hallucinated answers.
Gives your AI coding tool real data access.
Point it at a folder of CSV, Parquet, JSON, or TSV files — your AI can now query them with real SQL instead of hallucinating.
Works as an MCP server for Claude Code, Cursor, RooCode, and Copilot. Also ships as a standalone CLI for interactive data exploration. Powered by DuckDB. Files never leave your machine.
MIT licensed.
Install
# macOS / Linux
curl -fsSL https://pipetable.com/install | sh
# Windows
irm https://pipetable.com/install.ps1 | iex
# Rust
cargo install pipetable
MCP server setup
Claude Code
claude mcp add pipetable pipetable mcp
Cursor / RooCode
{
  "mcpServers": {
    "pipetable": {
      "command": "pipetable",
      "args": ["mcp"]
    }
  }
}
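If the config file already lists other servers, the entry above has to be merged in rather than pasted over them. The following is a minimal sketch of that merge in Python. The function name `add_pipetable_server` is hypothetical, and the config file path is left as a parameter because its location depends on the client.

```python
import json
from pathlib import Path


def add_pipetable_server(config_path: Path) -> dict:
    """Merge the pipetable entry into an MCP config file, keeping other servers.

    Hypothetical helper: pass whatever config path your client actually reads.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)

    # Same shape as the Cursor / RooCode snippet: mcpServers -> pipetable.
    servers = config.setdefault("mcpServers", {})
    servers["pipetable"] = {"command": "pipetable", "args": ["mcp"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Running it twice is harmless: the second call rewrites the same `pipetable` entry and leaves every other server untouched.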
VS Code (Copilot)
{
  "servers": {
    "pipetable": {
      "type": "stdio",
      "command": "pipetable",
      "args": ["mcp"]
    }
  }
}
Once configured, your AI can:
- scan_folder: register all data files in a folder
- list_datasets: see schemas and column types
- get_schema: inspect a specific table with sample rows
- execute_sql: run real DuckDB SQL against your files
Results are ground truth from DuckDB, not generated.
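For a sense of what the AI client sends on the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds two such messages in Python. The tool names come from the list above, but the argument keys `path` and `sql` are assumptions. The real keys are defined by each tool's input schema, which the server reports via `tools/list`.

```python
import json


def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument keys "path" and "sql" are guesses, not confirmed pipetable parameters.
scan = tool_call(1, "scan_folder", {"path": "~/data"})
query = tool_call(2, "execute_sql", {"sql": "SELECT COUNT(*) FROM sales"})

print(scan)
print(query)
```

In normal use the MCP client builds these messages for you. The sketch is only meant to show that each tool call is a plain, inspectable request.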
CLI
pipetable ~/data/
Enter SQL or natural language at the > prompt. SQL always works. Natural language requires an LLM backend: an Anthropic or OpenAI-compatible API key, or Ollama running locally.
> SELECT region, SUM(revenue) AS total FROM sales GROUP BY 1 ORDER BY 2 DESC
4 row(s)
region total
─────────────
EU 141000
US 32000
APAC 17000
> show me top 5 customers by revenue
Using: customers, sales
Thinking.....
SELECT c.name, SUM(s.revenue) AS total FROM customers c
JOIN sales s ON s.customer_id = c.id
GROUP BY c.name ORDER BY total DESC LIMIT 5
...
→ piped as _last
Piping results
Every query saves its result as _last — a live DuckDB view you can query further:
> SELECT * FROM sales WHERE region = 'EU'
...
→ piped as _last
> show me top 3 from _last
Using: _last
Thinking.....
Dot commands
| Command | Description |
|---|---|
| `.scan <path>` | Load a folder or file (Tab completes) |
| `.datasets` | List loaded datasets |
| `.schema <name>` | Columns + sample rows |
| `.drop <name>` | Remove a dataset from the session |
| `.use <n1> <n2>` | Focus NL queries on specific datasets |
| `.remove <name>` | Remove from focus |
| `.clear` | Reset focus to all datasets |
| `.model <name>` | Switch Ollama model |
| `.help` | Show help |
Tab completes dataset names after FROM, JOIN, .schema, .drop, .use.
One-shot query
pipetable ask "who has the highest revenue?" ~/data/
pipetable ask "SELECT * FROM sales LIMIT 5" ~/data/
Natural language (optional)
Set any one of these — pipetable auto-detects:
# Claude (best quality)
export ANTHROPIC_API_KEY=sk-ant-...
# OpenAI or any compatible API (LM Studio, Groq, Together, etc.)
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=http://localhost:1234 # optional, for local endpoints
# Ollama (local, no key needed)
ollama pull qwen2.5-coder:1.5b
ollama serve
Priority: Anthropic → OpenAI-compatible → Ollama. SQL and MCP work without any of them.
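The priority order can be read as a simple first-match rule. Here is a minimal Python sketch of that rule, not pipetable's actual implementation. The function name `pick_backend` and the returned labels are hypothetical, and how pipetable detects a running Ollama instance is not documented here, so that check is passed in as a plain boolean.

```python
from typing import Mapping, Optional


def pick_backend(env: Mapping[str, str], ollama_running: bool) -> Optional[str]:
    """Return the first available backend in the documented priority order.

    Hypothetical helper; labels are illustrative, not pipetable identifiers.
    """
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai-compatible"
    if ollama_running:
        return "ollama"
    # No backend available: natural language is off, SQL and MCP still work.
    return None
```

Under this reading, setting both API keys selects Anthropic, so unsetting `ANTHROPIC_API_KEY` is the way to fall through to an OpenAI-compatible endpoint.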
Supported formats
CSV, Parquet, JSON, NDJSON, TSV, Excel (xlsx, xls, xlsm). Files up to 2 GB are supported. Folders are scanned up to 3 levels deep. Hidden files and common noise directories (node_modules, target, .git) are skipped automatically.
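To predict which files a scan will pick up, the rules above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not pipetable's scanner: it counts the scanned folder itself as level 1, reads "2GB" as 2 GiB, and also skips hidden directories. The names `find_data_files`, `DATA_EXTENSIONS`, and `NOISE_DIRS` are hypothetical.

```python
import os
from pathlib import Path
from typing import Iterator

DATA_EXTENSIONS = {".csv", ".parquet", ".json", ".ndjson", ".tsv",
                   ".xlsx", ".xls", ".xlsm"}
NOISE_DIRS = {"node_modules", "target", ".git"}
MAX_BYTES = 2 * 1024**3  # assumption: "2GB" read as 2 GiB
MAX_DEPTH = 3            # assumption: the root folder counts as level 1


def find_data_files(root: Path) -> Iterator[Path]:
    """Yield data files under root that pass the documented scan rules."""
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)  # 0 for root itself

        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if not d.startswith(".")
            and d not in NOISE_DIRS
            and depth + 1 < MAX_DEPTH
        )

        for name in sorted(filenames):
            path = Path(dirpath) / name
            if name.startswith("."):
                continue  # hidden file
            if path.suffix.lower() not in DATA_EXTENSIONS:
                continue  # not a supported format
            if path.stat().st_size > MAX_BYTES:
                continue  # over the size limit
            yield path
```

For example, `list(find_data_files(Path.home() / "data"))` gives a rough preview of what a scan of `~/data/` would register, subject to the assumptions above.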
License
MIT licensed.