StatCan Web Data Service MCP Server
An MCP server for using StatCan data
Aryan-Jhaveri
<div align="center"> <img src="./assets/StatCan-Header.png" alt="Statistics Canada Logo" width="600"/> </div>
An MCP (Model Context Protocol) server that provides access to Statistics Canada's Web Data Service, enabling AI assistants to discover, explore, analyze, and cite Canadian statistical data through natural language.
Project Overview
This server addresses several technical challenges in accessing Statistics Canada's Web Data Service (WDS) API:
- API Format Requirements: Uses the request formats that the StatCan WDS API endpoints expect, which resolves HTTP 406 errors
- Resilient Data Access: Implements multi-tier caching and fallbacks for API limitations
- Enhanced Metadata: Provides rich context for statistical interpretation and proper citation
- Analysis Capabilities: Includes statistical analysis, visualization, and forecasting features
- MCP Integration: Connects with other MCP servers for expanded functionality
Features
- 🔍 Dataset Discovery: Search and browse StatCan datasets by keywords, themes, or geography
- 📊 Data Retrieval: Extract time series data with proper formatting for key vectors
- 📝 Metadata Exploration: Access detailed information about dataset structure and content
- 💾 Persistent Storage: Store datasets for future use with SQLite backend
- 📊 Advanced Analysis: Perform comprehensive statistical analysis, trend detection, seasonality analysis, and forecasting
- 📈 Visualizations: Generate data visualizations with integration to Vega-Lite
- 📑 Citations: Generate properly formatted citations for StatCan data
- 🖼️ Figure References: Track and reference figures created from StatCan data
- 🔄 API Resilience: Robust error handling with fallbacks for API limitations
Current Limitations
The StatCan WDS API has several limitations that this server addresses:
- Data Retrieval Constraints: Some API endpoints remain problematic despite correct formatting
- Format Sensitivity: Vector IDs must be numeric without the 'v' prefix, and payloads must be in array format
- Coordinate Access: Vector-based queries are more reliable than coordinate-based queries
- Performance Issues: Some API calls may time out for large requests or during peak times
- Rate Limiting: High-volume queries may be throttled by the StatCan WDS API
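The format rules in the list above can be sketched in a few lines of Python. The helper names `normalize_vector_id` and `build_latest_n_payload` are hypothetical and are not taken from this repository. The `vectorId` and `latestN` field names are assumed to match the WDS `getDataFromVectorsAndLatestNPeriods` method and should be checked against the current WDS user guide.

```python
import json


def normalize_vector_id(vector) -> int:
    """Return a numeric vector ID, dropping an optional leading 'v' or 'V'."""
    text = str(vector).strip()
    if text[:1] in ("v", "V"):
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a valid vector ID: {vector!r}")
    return int(text)


def build_latest_n_payload(vectors, latest_n: int = 5) -> str:
    """Build a JSON array body with one object per vector (assumed field names)."""
    body = [
        {"vectorId": normalize_vector_id(v), "latestN": latest_n}
        for v in vectors
    ]
    return json.dumps(body)


if __name__ == "__main__":
    # Mixed input styles are accepted; the output is always a JSON array.
    print(build_latest_n_payload(["v41690973", 21581063], latest_n=3))
```

Normalizing IDs in one place means a user can type "v41690973" or "41690973" and the request body still carries a bare integer inside an array.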
The current implementation uses these strategies to work around these limitations:
- Multi-tier caching system at metadata, vector, and cube levels
- Local fallbacks for common statistical indicators
- Automatic format adjustment and retries with exponential backoff
- Graceful degradation to cached data when API endpoints fail
See docs/implementation_status.md and docs/api_connection_guide.md for details.
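As a rough illustration of the retry and fallback strategies listed above, the sketch below retries a failing call with exponential backoff and then degrades to a cached value. It is not the server's actual code: `fetch_with_fallback`, its parameters, and the plain dictionary used as a cache are all assumptions made for the example.

```python
import time


def fetch_with_fallback(fetch, key, cache, retries=3, base_delay=0.5,
                        sleep=time.sleep):
    """Call fetch(key) with exponential backoff; fall back to cache on failure.

    Returns (value, source), where source is "api" or "cache".
    """
    last_error = None
    for attempt in range(retries):
        try:
            value = fetch(key)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < retries - 1:
                # Delay doubles on each failed attempt: 0.5s, 1s, 2s, ...
                sleep(base_delay * 2 ** attempt)
        else:
            cache[key] = value  # refresh the cache on every success
            return value, "api"
    if key in cache:
        return cache[key], "cache"  # graceful degradation to stale data
    raise RuntimeError(f"no live or cached data for {key!r}") from last_error
```

Returning the source alongside the value lets the caller tell the user when an answer came from cached data instead of a live API response.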
Quick Start
# Clone the repository
git clone https://github.com/yourusername/mcp-statcan.git
cd mcp-statcan
# Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Start the MCP server
python -m src
Dependencies
pip install sqlitedict aiohttp mcp pydantic python-dotenv pandas numpy
Usage with Claude
- Open Claude Desktop App
- Go to Settings > MCP Servers
- Add a new server with the following configuration:
- Name: StatCan Data
- Command:
path/to/venv/bin/python -m src
- Start chatting with Claude and ask about Canadian statistics!
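Claude Desktop can also read MCP servers from its `claude_desktop_config.json` file, under an `mcpServers` key. The sketch below prints an entry equivalent to the settings above. The `statcan` label and the `statcan_server_entry` helper are hypothetical, and the interpreter path is the same placeholder used in the steps above, so replace it with the path to your own virtual environment.

```python
import json


def statcan_server_entry(python_path: str) -> dict:
    """Build an mcpServers entry that launches the server with `python -m src`."""
    return {
        "mcpServers": {
            # "statcan" is an arbitrary label chosen for this example.
            "statcan": {"command": python_path, "args": ["-m", "src"]}
        }
    }


if __name__ == "__main__":
    # Placeholder path from the steps above; substitute your real venv path.
    print(json.dumps(statcan_server_entry("path/to/venv/bin/python"), indent=2))
```

If the config file already lists other servers, merge the printed `statcan` entry into the existing `mcpServers` object instead of overwriting the file.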
Working Example Queries
Here are queries that work reliably with the current implementation:
Basic Data Discovery
- "Find datasets about consumer prices in Canada"
- "What datasets do you have about employment?"
- "Show me the latest CPI data"
Vector-Based Data Retrieval
- "Get data for CPI vector 41690973"
- "Retrieve GDP data from vector 21581063"
- "Get the latest values for employment vector 111955426"
Analysis and Visualization
- "Generate a line chart for CPI data over the last 5 years"
- "Analyze the trend in GDP for the past 10 quarters"
- "Create a visualization of unemployment rate changes"
Citations
- "Generate a citation for the Consumer Price Index dataset"
- "How should I cite Statistics Canada's GDP data in APA format?"
- "Create a reference for the Labour Force Survey"
Testing
To verify the API client works correctly:
python -m tests.api.api_connection_steps
This runs step-by-step tests for:
- API connectivity
- Metadata retrieval
- Vector data access
- Format requirements
Project Structure
- /src: Core server implementation
- /docs: Documentation and guides
- /tests: Test suite for API and functionality
- /docs/references: API specifications and code sets
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
This project uses data from Statistics Canada, accessed via their Web Data Service API. It is not affiliated with or endorsed by Statistics Canada.
The Statistics Canada logo is used for informational purposes only to indicate the data source.