DataSF MCP Server
A Model Context Protocol (MCP) server that provides LLMs with seamless access to San Francisco's open data portal (DataSF), powered by the Socrata platform.
Overview
This MCP server enables AI assistants like Claude to search, explore, and query San Francisco's public datasets through a simple, standardized interface. It handles the complexity of the Socrata API, provides intelligent column name correction, and includes schema caching for optimal performance.
Key Features
- Dataset Search & Discovery - Find datasets by keywords or browse by category
- Schema Retrieval - Get column names and data types before querying
- SoQL Query Execution - Run SQL-like queries against any dataset
- Fuzzy Column Matching - Auto-corrects typos in column names
- Schema Caching - Reduces API calls with intelligent caching
- Optional Authentication - Supports Socrata App Tokens for higher rate limits
- Property-Based Testing - Comprehensive correctness guarantees
Available Tools
1. search_datasf
Search for datasets by keywords.
Parameters:
- `query` (string, required): Search keywords (1-500 characters)
- `limit` (number, optional): Max results (default: 5, max: 20)
Example:
Search for police incident datasets
2. list_datasf
Browse available datasets, optionally filtered by category.
Parameters:
- `category` (string, optional): Filter by category
- `limit` (number, optional): Max results (default: 5, max: 20)
Example:
List recent public safety datasets
3. get_schema
Get the schema (columns and data types) for a specific dataset.
Parameters:
- `dataset_id` (string, required): Dataset 4x4 ID (format: `xxxx-xxxx`)
Example:
Get the schema for dataset wg3w-h783
4. query_datasf
Execute a SoQL (Socrata Query Language) query against a dataset.
Parameters:
- `dataset_id` (string, required): Dataset 4x4 ID
- `soql` (string, required): SoQL query (1-4000 characters)
- `auto_correct` (boolean, optional): Enable column name correction (default: true)
Example:
Query dataset wg3w-h783: SELECT incident_category, COUNT(*) GROUP BY incident_category LIMIT 10
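Under the hood, a SoQL query like the one above is sent to Socrata's Resource API, which accepts a full SoQL statement in the `$query` URL parameter. The helper below is an illustrative sketch, not the server's actual code:

```typescript
// Sketch: build the Resource API request URL for a SoQL query.
// The endpoint shape (/resource/{id}.json with $query) is Socrata's
// documented convention; the function name is hypothetical.
function buildQueryUrl(datasetId: string, soql: string): string {
  const base = `https://data.sfgov.org/resource/${datasetId}.json`;
  const params = new URLSearchParams({ $query: soql });
  return `${base}?${params.toString()}`;
}

const url = buildQueryUrl(
  "wg3w-h783",
  "SELECT incident_category, COUNT(*) GROUP BY incident_category LIMIT 10"
);
```

Note that `URLSearchParams` percent-encodes the query, so spaces and special characters are safe to pass through as-is.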
Installation
Prerequisites
- Node.js 18 or higher
- npm or yarn
Local Setup (Optional)
If you want to run or modify the server locally:
- Clone the repository:
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
- Install dependencies:
npm install
- Run the server:
npm start
The server uses tsx to run TypeScript directly without a build step.
Usage
Testing with MCP Inspector
For the MCP Inspector, you'll need to use the local installation:
# First, clone and install locally
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
npm install
# Then run the inspector
npx -y @modelcontextprotocol/inspector tsx src/index.ts
In the inspector UI, use:
- Command: `tsx`
- Arguments: `src/index.ts` (or an absolute path if running from outside the directory)
Quick Start with npx (Recommended)
The easiest way to use the server is directly from GitHub using npx:
{
"mcpServers": {
"datasf": {
"command": "npx",
"args": ["-y", "github:fwextensions/datasf-mcp"],
"env": {
"SOCRATA_APP_TOKEN": "your-optional-token"
}
}
}
}
This will automatically download and run the latest version from GitHub without any manual installation.
Local Installation
Alternatively, clone and install locally:
git clone https://github.com/fwextensions/datasf-mcp.git
cd datasf-mcp
npm install
Then use the absolute path in your MCP configuration (see below).
Configuration for Claude Desktop
Add to your Claude Desktop config file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Option 1: Using npx (recommended)
{
"mcpServers": {
"datasf": {
"command": "npx",
"args": ["-y", "github:fwextensions/datasf-mcp"],
"env": {
"SOCRATA_APP_TOKEN": "your-optional-token"
}
}
}
}
Option 2: Using local installation
{
"mcpServers": {
"datasf": {
"command": "npx",
"args": ["tsx", "/absolute/path/to/datasf-mcp/src/index.ts"],
"env": {
"SOCRATA_APP_TOKEN": "your-optional-token"
}
}
}
}
Important: Replace /absolute/path/to/datasf-mcp with the actual full path to where you cloned this project.
Configuration for Kiro IDE
Create or edit .kiro/settings/mcp.json:
Option 1: Using npx from GitHub (recommended)
{
"mcpServers": {
"datasf": {
"command": "npx",
"args": ["-y", "github:fwextensions/datasf-mcp"],
"env": {
"SOCRATA_APP_TOKEN": "your-optional-token"
},
"disabled": false,
"autoApprove": []
}
}
}
Option 2: Using local installation
{
"mcpServers": {
"datasf": {
"command": "npx",
"args": ["tsx", "src/index.ts"],
"env": {
"SOCRATA_APP_TOKEN": "your-optional-token"
},
"disabled": false,
"autoApprove": []
}
}
}
Getting a Socrata App Token
The server works without authentication for public data, but an App Token increases rate limits:
- Visit https://data.sfgov.org/
- Sign up for a free account
- Navigate to Developer Settings
- Create a new App Token
- Add it to your MCP configuration
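Socrata documents the `X-App-Token` request header for token-authenticated requests. A minimal sketch of how a client might attach it (the helper name is an assumption, not the server's actual code):

```typescript
// Sketch: build request headers, adding X-App-Token only when a token is set.
// Requests without a token still work for public data, just at lower rate limits.
function socrataHeaders(appToken?: string): Record<string, string> {
  const headers: Record<string, string> = { Accept: "application/json" };
  if (appToken) {
    headers["X-App-Token"] = appToken;
  }
  return headers;
}
```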
Development
Project Structure
datasf-mcp-server/
├── src/
│   ├── index.ts          # MCP server entry point
│   ├── socrataClient.ts  # Socrata API client
│   ├── validator.ts      # Input validation with Zod
│   ├── fuzzyMatcher.ts   # Column name auto-correction
│   ├── cache.ts          # Schema caching
│   ├── errorHandler.ts   # Error handling utilities
│   └── __tests__/
│       └── property/     # Property-based tests
├── dist/                 # Compiled JavaScript output
├── package.json
└── tsconfig.json
Available Scripts
- `npm run build` - Compile TypeScript to JavaScript
- `npm start` - Run the compiled server
- `npm test` - Run all tests
- `npm run test:watch` - Run tests in watch mode
Running Tests
npm test
The project uses property-based testing with fast-check to ensure correctness across a wide range of inputs.
Architecture
The server follows a modular architecture:
- MCP Server - Handles protocol communication via stdio
- Socrata Client - Manages HTTP requests to Socrata APIs
- Validator - Validates all inputs using Zod schemas
- Fuzzy Matcher - Corrects column name typos using Fuse.js
- Schema Cache - Caches dataset schemas in memory (5-minute TTL)
- Error Handler - Classifies and formats errors for LLM consumption
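The schema cache described above can be sketched as a small in-memory map with a 5-minute TTL. The real implementation lives in `src/cache.ts`; the names and shape below are illustrative only:

```typescript
// Sketch of an in-memory schema cache with a 5-minute TTL.
type ColumnInfo = { name: string; dataType: string };

class SchemaCache {
  private entries = new Map<string, { schema: ColumnInfo[]; expiresAt: number }>();

  constructor(private ttlMs = 5 * 60 * 1000) {}

  get(datasetId: string): ColumnInfo[] | undefined {
    const entry = this.entries.get(datasetId);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(datasetId); // evict stale entry
      return undefined;
    }
    return entry.schema;
  }

  set(datasetId: string, schema: ColumnInfo[]): void {
    this.entries.set(datasetId, { schema, expiresAt: Date.now() + this.ttlMs });
  }
}
```

A cache hit skips the Views API round trip entirely, which matters when a conversation issues several queries against the same dataset.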
Example Queries
Once configured in your LLM, you can ask questions like:
- "Search for datasets about housing in San Francisco"
- "What's the schema for the police incidents dataset (wg3w-h783)?"
- "Show me the top 10 incident categories from the police incidents dataset"
- "Find all building permits issued in 2024"
- "What datasets are available about transportation?"
API Endpoints Used
The server interacts with three Socrata APIs:
- Discovery API: `https://api.us.socrata.com/api/catalog/v1` - Dataset search and browsing
- Views API: `https://data.sfgov.org/api/views/{id}.json` - Schema retrieval
- Resource API: `https://data.sfgov.org/resource/{id}.json` - Data querying
Error Handling
The server provides descriptive error messages for:
- Validation errors - Invalid input format or length
- Not found - Dataset doesn't exist
- Rate limiting - Too many requests (add App Token to resolve)
- Timeouts - Request exceeded 30 seconds
- API errors - Socrata-specific errors (e.g., SoQL syntax errors)
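One plausible way to map HTTP responses onto the categories above is by status code; the server's actual `errorHandler.ts` may classify differently, so treat this as a hedged sketch:

```typescript
// Sketch: classify an HTTP status into one of the error categories above.
type ErrorKind = "validation" | "not_found" | "rate_limited" | "timeout" | "api_error";

function classifyStatus(status: number): ErrorKind {
  if (status === 400) return "validation";   // invalid input (or SoQL syntax)
  if (status === 404) return "not_found";    // dataset doesn't exist
  if (status === 429) return "rate_limited"; // too many requests
  if (status === 408 || status === 504) return "timeout";
  return "api_error";                        // other Socrata-side errors
}
```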
Contributing
Contributions are welcome! The project uses:
- TypeScript for type safety
- Zod for runtime validation
- fast-check for property-based testing
- Vitest as the test runner
License
MIT
Troubleshooting
Server not starting
- Ensure you ran `npm run build` first
- Check that Node.js 18+ is installed
Tools not showing up in LLM
- Verify the path in your config is absolute
- Restart your LLM application after adding the config
- Check the LLM's logs for connection errors
Rate limiting errors
- Add a Socrata App Token to your configuration
- Reduce the frequency of requests
Column name errors in queries
- Use `get_schema` first to see valid column names
- Enable `auto_correct: true` (the default) for automatic typo correction
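The auto-correction idea is simple: compare a misspelled column name against the dataset's known columns and suggest the closest match. The project uses Fuse.js; the dependency-free sketch below substitutes plain edit distance to show the concept, and all names in it are hypothetical:

```typescript
// Levenshtein edit distance between two strings (dynamic programming).
function editDistance(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                    // deletion
        dp[i][j - 1] + 1,                                    // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)   // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Suggest the closest known column for a misspelled name, if close enough.
function correctColumn(input: string, columns: string[], maxDistance = 2): string | undefined {
  let best: string | undefined;
  let bestDist = maxDistance + 1;
  for (const col of columns) {
    const d = editDistance(input.toLowerCase(), col.toLowerCase());
    if (d < bestDist) {
      best = col;
      bestDist = d;
    }
  }
  return best;
}
```

For example, `correctColumn("incident_catagory", schemaColumns)` would suggest `incident_category`, which is the kind of typo `auto_correct` is meant to absorb.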