source-coop-mcp

Enables AI agents to discover and access 800TB+ of public geospatial data from Source Cooperative, with tools for listing organizations, products, files, and fuzzy search.

Source Cooperative MCP Server

Discover and access 800TB+ of geospatial data through AI agents.

An MCP (Model Context Protocol) server for Source Cooperative - a collaborative repository with datasets from Maxar, Harvard, ESA, USGS, and 90+ organizations.

🏗️ Architecture Overview

graph TB
    subgraph "AI Clients"
        A1[Claude Desktop]
        A2[Claude Code]
        A3[Cursor]
        A4[Cline]
        A5[Zed]
        A6[Continue.dev]
    end

    subgraph "MCP Server"
        MCP[Source Cooperative MCP<br/>FastMCP + obstore]
    end

    subgraph "6 Available Tools"
        T1[list_accounts<br/>94+ orgs]
        T2[list_products<br/>hybrid S3+API]
        T3[get_product_details<br/>+ README]
        T4[list_product_files<br/>tree mode]
        T5[get_file_metadata<br/>no download]
        T6[search<br/>hybrid fuzzy]
    end

    subgraph "Data Sources"
        S1[HTTP API<br/>source.coop/api]
        S2[S3 Direct<br/>opendata.source.coop]
    end

    A1 -->|JSON-RPC| MCP
    A2 -->|JSON-RPC| MCP
    A3 -->|JSON-RPC| MCP
    A4 -->|JSON-RPC| MCP
    A5 -->|JSON-RPC| MCP
    A6 -->|JSON-RPC| MCP

    MCP --> T1
    MCP --> T2
    MCP --> T3
    MCP --> T4
    MCP --> T5
    MCP --> T6

    T1 --> S2
    T2 --> S1
    T2 --> S2
    T3 --> S1
    T3 --> S2
    T4 --> S2
    T5 --> S2
    T6 --> S1
    T6 --> S2

    style MCP fill:#4CAF50,stroke:#2E7D32,stroke-width:3px,color:#fff
    style S1 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
    style S2 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff

Key Features:

✅ Token Optimized - 72% reduction for large datasets
✅ Smart Partitions - Auto-detects Hive-style patterns
✅ Fuzzy Search - Handles typos and partial matches
✅ No Auth - All 800TB+ is public

🚀 Quick Start

Install

uvx source-coop-mcp

Configure Your AI Client

Claude Desktop / Claude Code / Cursor / Cline

Add to config file:

Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
Claude Code: VS Code settings.json
Cursor: Cursor settings
Cline: Cline MCP settings

{
  "mcpServers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}

Zed

Add to Zed settings:

{
  "context_servers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}

Continue.dev

Add to Continue config (~/.continue/config.json):

{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "uvx",
          "args": ["source-coop-mcp"]
        }
      }
    ]
  }
}

Restart your AI client and start exploring!

🛠️ Available Tools

| Tool | Purpose | Performance |
| --- | --- | --- |
| `list_accounts()` | Find all 94+ organizations | ~850ms |
| `list_products()` | Hybrid: S3 mode (default) for ALL datasets + file counts | ~240ms |
| `list_products(include_unpublished=False)` | API mode for published datasets with rich metadata | ~500ms |
| `get_product_details()` | Get metadata + README automatically | ~650ms |
| `list_product_files()` | List files with S3/HTTP paths | ~240ms |
| `list_product_files(show_tree=True)` | Tree view (72% token savings) | ~980ms |
| `get_file_metadata()` | Get file info without downloading | ~230ms |
| `search(query)` | Hybrid: Search accounts + products (published + unpublished), top 5 results | ~5-10s |
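
To illustrate how file info can be read without downloading the body, here is a minimal sketch of an HTTP HEAD request using only the standard library. It is not the server's implementation (which the diagram above attributes to obstore); the function names `parse_head_headers` and `head_metadata` are hypothetical, and the URL is assumed to be one of the HTTP URLs returned by `list_product_files()`.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def parse_head_headers(headers) -> dict:
    """Extract size, last-modified time, and ETag from response headers.

    `headers` is any object with a dict-style .get(), such as the
    headers of a urllib response or a plain dict.
    """
    size = headers.get("Content-Length")
    modified = headers.get("Last-Modified")
    etag = headers.get("ETag")
    return {
        "size_bytes": int(size) if size is not None else None,
        "last_modified": parsedate_to_datetime(modified) if modified else None,
        "etag": etag.strip('"') if etag else None,
    }


def head_metadata(url: str, timeout: float = 10.0) -> dict:
    """Issue a HEAD request, so only headers are transferred, never the file body."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_head_headers(response.headers)
```

The three fields returned are the ones this README lists for `get_file_metadata()`: size, last modified, and ETag.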

💡 What You Can Do

Discover Data

"List all organizations in Source Cooperative"
→ Returns 94+ organizations: maxar, planet, harvard, etc.

"Find all datasets for harvard-lil"
→ Discovers published + unpublished products

"Search for climate datasets"
→ Smart fuzzy search handles typos and partial matches

Access Files

"List files in harvard-lil/gov-data"
→ Returns S3 paths and HTTP URLs ready for analysis

"Show me the file tree with partition detection"
→ Smart visualization: year={2020,2021,...+5 more}/ [partitioned]

"Get file metadata without downloading"
→ Size, last modified, ETag
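
The partition summary shown above (`year={2020,2021,...+5 more}/ [partitioned]`) can be approximated by grouping sibling directories that follow the Hive-style `key=value` naming. This is a minimal sketch under that assumption, not the server's actual code; `collapse_partitions` and its `max_shown` parameter are hypothetical names.

```python
import re
from collections import defaultdict

# A Hive-style directory name looks like "key=value".
HIVE_SEGMENT = re.compile(r"^([^=/]+)=([^/]+)$")


def collapse_partitions(dir_names, max_shown=2):
    """Collapse sibling `key=value` directories into one summary line per key."""
    groups = defaultdict(list)  # partition key -> values in first-seen order
    plain = []                  # directories that are not partitions
    for name in dir_names:
        match = HIVE_SEGMENT.match(name)
        if match:
            key, value = match.groups()
            if value not in groups[key]:
                groups[key].append(value)
        else:
            plain.append(name)

    lines = []
    for key, values in groups.items():
        if len(values) < 2:
            # A lone key=value directory is listed as-is.
            lines.append(f"{key}={values[0]}/")
            continue
        shown = ",".join(values[:max_shown])
        hidden = len(values) - max_shown
        suffix = f",...+{hidden} more" if hidden > 0 else ""
        lines.append(f"{key}={{{shown}{suffix}}}/ [partitioned]")
    lines.extend(f"{name}/" for name in plain)
    return lines
```

Seven `year=` directories collapse into a single line, which is where the token savings for partitioned datasets come from.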

Smart Search

"Search for climte" (typo)
→ Finds "climate" datasets (fuzzy matching)

"Search for geo" (partial)
→ Finds "geospatial", "geocoding", etc.
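
The README does not say which matching algorithm the server uses. As an illustration only, the sketch below gets the same two behaviours (typo tolerance and partial matches) from the standard library's `difflib`. The names `score` and `fuzzy_search`, the 0.6 threshold, and the sample dataset names are all assumptions.

```python
import re
from difflib import SequenceMatcher


def score(query: str, candidate: str) -> float:
    """Return 1.0 for a substring hit, else the best similarity ratio."""
    q, c = query.lower(), candidate.lower()
    if q in c:
        return 1.0  # partial match, e.g. "geo" in "geospatial"
    # Compare against each word in the name as well as the whole name,
    # so a typo in one word still scores highly.
    tokens = re.split(r"[-_\s/]+", c)
    return max(SequenceMatcher(None, q, t).ratio() for t in tokens + [c])


def fuzzy_search(query: str, names, limit: int = 5, threshold: float = 0.6):
    """Return up to `limit` names scoring at least `threshold`, best first."""
    hits = [(score(query, n), n) for n in names]
    hits = [(s, n) for s, n in hits if s >= threshold]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [n for _, n in hits[:limit]]


# Hypothetical dataset names, for illustration only.
names = ["climate-models", "geospatial-tiles", "geocoding", "buildings"]
print(fuzzy_search("climte", names))  # typo for "climate"
print(fuzzy_search("geo", names))     # partial match
```

The `limit=5` default mirrors the "top 5 results" behaviour listed for `search(query)`.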

⚡ Features

| Feature | Description |
| --- | --- |
| Complete Discovery | Finds unpublished products the official API doesn't show |
| No Authentication | All 800TB+ data is public |
| Fast Performance | Rust-backed S3 client (9x faster than boto3) |
| Token Optimized | Tree mode: 72% token reduction for large datasets |
| Smart Partitions | Auto-detects patterns: `year={2020,2021,...}` |
| Fuzzy Search | Handles typos and partial matches |
| README Integration | Documentation automatically included |
| 800TB+ Data | 94+ organizations, geospatial datasets |

📋 Example Workflow

1. "List all organizations"
   → Get 94+ account names

2. "Show me all datasets from maxar"
   → Discover published + unpublished products

3. "Search for climate data"
   → Smart fuzzy search finds relevant datasets

4. "Get details for harvard-lil/gov-data"
   → Full metadata + README content

5. "List files in this dataset with tree view"
   → Token-optimized tree with partition detection

🎯 Why This Server?

Problem

Source Cooperative has 800TB+ of valuable data, but:

Official API only shows published products
No auto-discovery of organizations
Requires knowing what you're looking for

Solution

This MCP server provides:

✅ Complete auto-discovery (published + unpublished)
✅ Smart search with fuzzy matching
✅ Direct S3 access for all files
✅ Token-optimized outputs (72% reduction)
✅ Smart partition detection (10-88% additional savings)
✅ README documentation included automatically
✅ No authentication required

📊 Performance

Every operation except search completes in under 1 second; search takes roughly 5-10 seconds:

list_accounts():                          ~850ms  (94+ organizations)
list_products():                          ~240ms  (S3 mode - ALL datasets + file counts)
list_products(include_unpublished=False): ~500ms  (API mode - published with metadata)
list_product_files():                     ~240ms  (simple list)
list_product_files(show_tree=True):       ~980ms  (72% token savings)
get_file_metadata():                      ~230ms  (HEAD only)
search(query):                            ~5-10s  (hybrid search - 1 recursive S3 scan, top 5 enriched)

Token Optimization Impact

| Dataset Size | Without Tree | With Tree | Saved |
| --- | --- | --- | --- |
| 10 files | 1,500 tokens | 415 tokens | 72.3% |
| 100 files | 15,000 tokens | 4,150 tokens | 72.3% |
| 1,000 files | 150,000 tokens | 41,500 tokens | 72.3% |

With partition detection (1,000 partitions): 88% total savings!
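
The "Saved" column follows directly from the two token counts in each row. The short check below reproduces it from the table's own figures; `savings_percent` is a hypothetical helper name.

```python
def savings_percent(without_tree: int, with_tree: int) -> float:
    """Percentage of tokens saved by tree mode, rounded to one decimal place."""
    return round(100 * (1 - with_tree / without_tree), 1)


# (files, tokens without tree, tokens with tree), taken from the table above.
rows = [(10, 1_500, 415), (100, 15_000, 4_150), (1_000, 150_000, 41_500)]

for files, plain, tree in rows:
    print(f"{files:>5} files: {savings_percent(plain, tree)}% saved")
```

Both token counts in the table grow linearly with the number of files, so the ratio, and therefore the saving, is the same in every row.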

🔧 Requirements

Python: 3.11 or higher
Package Manager: uv (provides the uvx command)
Operating Systems: macOS, Linux, Windows

🤝 Development

See DEVELOPMENT.md for:

Architecture details
Testing instructions
Contributing guidelines
Performance benchmarks
Token optimization details

📝 Support

Issues: GitHub Issues

📄 License

MIT License - see LICENSE for details.
