mcp-ibge

Exposes official IBGE data as MCP tools, including Brazilian localities, SIDRA statistical aggregates, and population indicators.

mcp-data-br

A collection of Model Context Protocol (MCP) servers for Brazilian public data — typed, traceable, and safe to call from AI agents.

What is mcp-data-br?

Brazilian public institutions (IBGE, INEP, Banco Central, dados.gov.br, state and city open data portals, ...) publish a large amount of free data that needs no API key. That data is spread across many APIs with inconsistent shapes, encodings and documentation, which makes it hard for an LLM to use safely.

mcp-data-br is a growing collection of small, focused MCP servers — one per data source — that turn those public APIs into typed, traceable tools an agent (Claude Desktop, Cursor, or any MCP-compatible client) can call directly. Every tool across every module follows the same conventions:

  • Typed, validated responses — every tool is backed by Pydantic models.
  • Traceable by design — every response is {"ok": ..., "data": ..., "metadata": {...}, "warnings": [...], "errors": [...]}, with metadata (source_name, source_url, official_source, endpoint, params, retrieved_at, period, territorial_level, license_note, version, cache_hit) so any number can be checked against its official source.
  • Safe by default — no shell execution, no arbitrary file/URL access, outbound requests restricted to an allowlist of official domains, input validation before any network call. See docs/security.md.
  • Local-first — runs over stdio, no API keys, no external services beyond the public data source itself.
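The response shape described above can be sketched with stdlib dataclasses. This is an illustration only: the project itself uses Pydantic models, the names `Metadata`, `Envelope` and `make_envelope` are hypothetical, only a subset of the metadata fields is shown, and the element type of `warnings` and `errors` is an assumption.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class Metadata:
    # Subset of the metadata fields named in the README; the others
    # (period, territorial_level, license_note, ...) are omitted for brevity.
    source_name: str
    source_url: str
    endpoint: str
    params: dict[str, Any]
    retrieved_at: str
    cache_hit: bool = False


@dataclass
class Envelope:
    ok: bool
    data: Any
    metadata: Metadata
    # Assumption: warnings and errors are lists; their element type is not
    # specified in the README, so Any is used here.
    warnings: list[Any] = field(default_factory=list)
    errors: list[Any] = field(default_factory=list)


def make_envelope(data: Any, *, endpoint: str, params: dict[str, Any],
                  source_url: str) -> dict[str, Any]:
    """Hypothetical helper: wrap a payload in the shared response shape."""
    meta = Metadata(
        source_name="IBGE",
        source_url=source_url,
        endpoint=endpoint,
        params=params,
        retrieved_at=datetime.now(timezone.utc).isoformat(),
    )
    # asdict() converts the nested Metadata dataclass to a plain dict too.
    return asdict(Envelope(ok=True, data=data, metadata=meta))
```

Because every tool returns the same five top-level keys, a client can check `ok`, read `warnings`, and follow `metadata` back to the source without knowing which tool produced the response.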

The project is organized as a single uv workspace (monorepo): each data source gets its own installable package under packages/, all sharing the same conventions and tooling.

30-second demo

[Demo GIF: comparing Rio de Janeiro, Niterói and Maricá with mcp-data-br]

🎥 Generated from a real comparar_municipios call — see Regenerating the demo GIF below.

Prompt (typed into Claude Desktop / Cursor / any MCP client):

"Compare Rio de Janeiro, Niterói and Maricá using official Brazilian public data."

What the agent does — no API keys, no scraping, just typed tool calls over stdio:

# 1. Resolve each name to an IBGE municipality (fuzzy match, accent/case-insensitive)
buscar_municipio(nome="Rio de Janeiro")
buscar_municipio(nome="Niterói")
buscar_municipio(nome="Maricá")

# 2. Get the 7-digit IBGE codes
obter_codigo_municipio(nome="Rio de Janeiro", uf="RJ")  # -> 3304557
obter_codigo_municipio(nome="Niterói", uf="RJ")          # -> 3303302
obter_codigo_municipio(nome="Maricá", uf="RJ")           # -> 3302700

# 3. Check which indicators are available before asking for them
#    (today: estimated population, agregado SIDRA 6579)

# 4. One call does the rest: resolves + compares + cites sources
comparar_municipios(
    municipios=[
        {"nome": "Rio de Janeiro", "uf": "RJ"},
        {"nome": "Niterói", "uf": "RJ"},
        {"nome": "Maricá", "uf": "RJ"},
    ],
    indicadores=["populacao_estimada"],
)
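Step 1 relies on accent- and case-insensitive matching. How buscar_municipio implements it is not shown here; the following is a minimal stdlib sketch of one common approach (Unicode decomposition, dropping combining marks, then case folding). The function names `normalize_name` and `find_matches` are hypothetical.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Strip accents, case differences and extra whitespace for comparison."""
    # NFKD splits "ó" into "o" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is a stronger lower(); split/join collapses whitespace.
    return " ".join(without_marks.casefold().split())


def find_matches(query: str, candidates: list[str]) -> list[str]:
    """Return exact normalized matches, falling back to substring matches."""
    key = normalize_name(query)
    exact = [c for c in candidates if normalize_name(c) == key]
    if exact:
        return exact
    return [c for c in candidates if key in normalize_name(c)]
```

Returning every candidate rather than the first one lets the caller report an ambiguous name instead of silently picking a municipality.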

Final answer, straight from comparar_municipios:

| Município | UF | Estimated population (2025) |
| --- | --- | --- |
| Rio de Janeiro | RJ | 6,730,729 |
| Niterói | RJ | 516,787 |
| Maricá | RJ | 212,470 |

These values were live when docs/assets/demo.gif was generated; run the tool yourself for the current SIDRA period.

  • Source: IBGE — Agregados/SIDRA, table 6579 (Estimativas de População), period 2025 — every row in data.fontes is a direct, openable URL on servicodados.ibge.gov.br.
  • Warnings: none for this query — but if a municipality name were ambiguous, not found, or an indicator weren't implemented yet, that would show up explicitly in warnings / data.municipios_nao_resolvidos / data.indicadores_nao_implementados instead of a guessed number. See the compare_municipalities recipe for the full request/response and error cases.
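On the client side, the explicit failure fields mentioned above can be checked before any number is shown to a user. The sketch below uses only the keys named in this README (`ok`, `warnings`, `errors`, `data.municipios_nao_resolvidos`, `data.indicadores_nao_implementados`); the helper name `collect_problems` is hypothetical, and the items inside each list are treated as opaque values because their exact shape is not documented here.

```python
def collect_problems(response: dict) -> list[str]:
    """List every reason a comparar_municipios answer needs a caveat."""
    problems: list[str] = []
    if not response.get("ok", False):
        problems.append("response not ok")
    problems += [f"error: {e}" for e in response.get("errors", [])]
    problems += [f"warning: {w}" for w in response.get("warnings", [])]

    # data may be missing or None on a failed call, so fall back to {}.
    data = response.get("data") or {}
    for item in data.get("municipios_nao_resolvidos", []):
        problems.append(f"unresolved municipality: {item}")
    for item in data.get("indicadores_nao_implementados", []):
        problems.append(f"indicator not implemented: {item}")
    return problems
```

An empty list means the answer can be presented as-is; anything else should be surfaced alongside the table rather than hidden.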

In other words:

  • Local-first — runs over stdio on your machine, no hosted backend.
  • No API keys — every data source is a free, public government API.
  • Official data sources — every value is traceable to a servicodados.ibge.gov.br endpoint via metadata/data.fontes.
  • Structured responses: {"ok", "data", "metadata", "warnings", "errors"} every time, ready for an agent to parse and act on.
  • Safe by default — no shell access, no arbitrary URLs, outbound requests restricted to an allowlist (see docs/security.md).
  • Agent-ready — typed tools an LLM can call directly, with ready-made recipes in examples/agent_recipes/.

Try it yourself

git clone https://github.com/FilipePessoa30/mcp-data-br.git mcp-data-br
cd mcp-data-br
uv sync --all-extras
uv run mcp-ibge

Minimal MCP client config (e.g. claude_desktop_config.json):

{
  "mcpServers": {
    "ibge": {
      "command": "uvx",
      "args": ["mcp-ibge"]
    }
  }
}

Then ask the prompt above. More configs (Cursor, Open WebUI, dev/local checkout) are in examples/; more ready-made prompts are in examples/agent_recipes/.

Regenerating the demo GIF

docs/assets/demo.gif is generated by scripts/generate_demo_gif.py, which:

  1. Connects to the local mcp-ibge over stdio and calls comparar_municipios for Rio de Janeiro, Niterói and Maricá (scripts/demo_compare_municipios.py) — so the numbers in the GIF are live data from the IBGE API, not hand-typed values.
  2. Renders a terminal-style animation (typing effect + the resulting table) with Pillow and saves it as a GIF.

To regenerate it:

uv run --with pillow python scripts/generate_demo_gif.py

No extra system dependencies (ffmpeg, ttyd, VHS, asciinema) are required — everything runs through uv. If you'd rather record a real terminal/MCP client session instead, VHS or asciinema + agg both work too; just save the result to docs/assets/demo.gif and keep it under ~30 seconds so it stays small enough for the README to load quickly.

Available modules

| Module | Status | Data | Docs |
| --- | --- | --- | --- |
| mcp-ibge | Stable | IBGE: geographic locations (regions, states, municipalities, districts) and Agregados/SIDRA statistical aggregates | README · docs |
| mcp-dados-gov-br | Beta | dados.gov.br (Portal Brasileiro de Dados Abertos): dataset, organization, group and tag catalog search (CKAN API) | README · docs |
| mcp-inep | Planning (scaffold, only status tool) | INEP: Censo Escolar, Ideb, Saeb, Enem, schools by município, education indicators | README · roadmap |
| mcp-bcb | Planning (scaffold, only status tool) | Banco Central do Brasil: SGS time series, exchange rates (PTAX), Selic | README · roadmap |
| mcp-rio | Planning (scaffold, only status tool) | Data.Rio: open data from the City of Rio de Janeiro | README · roadmap |
| mcp-saude | Planning (scaffold, only status tool) | DataSUS / Ministério da Saúde: health facilities (CNES) and indicators by município/UF | README · roadmap |
| mcp-transparencia | Planning (scaffold, only status tool) | Portal da Transparência (CGU): public spending, contracts and sanctions | README · roadmap |
| mcp-tesouro | Planning (scaffold, only status tool) | Tesouro Nacional (SICONFI): fiscal data for União, states and municípios | README · roadmap |

Secure by default

Every module in mcp-data-br follows the same security baseline — see docs/security.md for the full model and packages/mcp_ibge/docs/security.md for the mcp-ibge implementation details:

  1. No shell execution — pure Python + httpx, no subprocess/os.system.
  2. No local file access — tools never read or write user files; the only file I/O is loading .env config at startup.
  3. No arbitrary URLs — tools take structured identifiers (codes, names, periods), never a full URL.
  4. Allowlisted domains only — outbound requests are restricted to a fixed set of official hosts per module (e.g. https://servicodados.ibge.gov.br for mcp-ibge, https://dados.gov.br for mcp-dados-gov-br), checked both at startup and before every request.
  5. Timeouts on every outbound HTTP call.
  6. Response size limits — oversized responses are aborted, not buffered.
  7. Input validation before any network call — invalid parameters return a structured error instead of reaching the upstream API.
  8. No stack traces in errors — the MCP client gets a short error message; full tracebacks stay in stderr logs.
  9. stdio-safe logging — all logs go to stderr, never stdout.
  10. No required API keys — every data source works unauthenticated for read access. mcp-dados-gov-br accepts an optional consumer token (DADOS_GOV_BR_API_TOKEN) for the few resources that require it; if unset, those tools return a clear error instead of failing silently.
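Item 4 (allowlisted domains) can be illustrated with a small stdlib check. This is a sketch under assumptions, not the project's assert_allowed_url: the names `ALLOWED_HOSTS`, `DisallowedURLError` and `check_allowed_url` are hypothetical, and the https-only and no-credentials rules are extra precautions chosen for this example.

```python
from urllib.parse import urlsplit

# Host taken from the README; a real module would define its own set.
ALLOWED_HOSTS = frozenset({"servicodados.ibge.gov.br"})


class DisallowedURLError(ValueError):
    """Raised when a URL points outside the allowlist."""


def check_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> str:
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise DisallowedURLError("only https is allowed")
    # "https://allowed-host@evil.example/" parses with evil.example as host,
    # so embedded credentials are rejected outright.
    if parts.username or parts.password:
        raise DisallowedURLError("credentials in URL are not allowed")
    host = (parts.hostname or "").lower()
    # Exact match, not a suffix or substring test, so that
    # "servicodados.ibge.gov.br.evil.example" is rejected.
    if host not in allowed_hosts:
        raise DisallowedURLError(f"host not in allowlist: {host!r}")
    return url
```

Comparing the parsed hostname for equality avoids the usual bypasses of prefix or substring checks on the raw URL string.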

This is implemented by a small, centralized mcp_ibge.security module (assert_allowed_url, response_size_guard, safe_error_response), covered by tests/test_security.py (external URL attempts, oversized responses, malicious inputs) — future modules follow the same pattern.
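The "aborted, not buffered" behavior for oversized responses can be sketched as a cap applied while chunks arrive. This is not the project's response_size_guard; `read_capped` and `ResponseTooLargeError` are hypothetical names, and the chunk source is any iterable of bytes.

```python
from typing import Iterable


class ResponseTooLargeError(Exception):
    """Raised when a response body exceeds the configured limit."""


def read_capped(chunks: Iterable[bytes], max_bytes: int) -> bytes:
    """Concatenate chunks, aborting once the running total would exceed max_bytes."""
    received = bytearray()
    for chunk in chunks:
        # Check before appending so the oversized chunk is never stored.
        if len(received) + len(chunk) > max_bytes:
            raise ResponseTooLargeError(
                f"response exceeded {max_bytes} bytes"
            )
        received += chunk
    return bytes(received)
```

With httpx, the chunks could come from `response.iter_bytes()` inside a `client.stream("GET", url)` context, so that leaving the context after the exception closes the connection without downloading the rest of the body.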

Planned modules

mcp-data-br is designed to grow. Beyond mcp-ibge (stable) and mcp-dados-gov-br (beta), six modules already exist as scaffolds — installable packages exposing only the shared status tool, with their data tools, sources and challenges documented in docs/modules/: mcp-inep (education — Censo Escolar, Ideb, Saeb, Enem), mcp-bcb (Banco Central indicators), mcp-rio (Rio de Janeiro open data), mcp-saude (DataSUS health data), mcp-transparencia (Portal da Transparência) and mcp-tesouro (Tesouro Nacional / SICONFI). mcp-sidra (a dedicated SIDRA module, split out of mcp-ibge) remains a documentation-only proposal — see docs/modules/sidra.md. See docs/roadmap.md for details — the workspace is structured so each scaffold's data tools can be implemented independently, without touching other modules.

Quick start

Requires Python 3.11+ and uv.

git clone https://github.com/FilipePessoa30/mcp-data-br.git mcp-data-br
cd mcp-data-br
uv sync --all-extras
uv run mcp-ibge

This starts the mcp-ibge server over stdio. For ready-to-use MCP client configs (Claude Desktop, Cursor, Open WebUI) and example prompts, see examples/ and packages/mcp_ibge/docs/client_setup.md.

Try it without an MCP client

The mcp-data-br CLI calls the same tools directly from the terminal and prints JSON — handy for testing without configuring Claude Desktop or another MCP client:

uv run mcp-data-br ibge estados
uv run mcp-data-br ibge municipios --uf RJ
uv run mcp-data-br sidra metadados --agregado 6579 --pretty

See packages/mcp_ibge/README.md#cli-mcp-data-br for the full command reference.

Run it in Docker

A Dockerfile and docker-compose.yml are provided at the repository root, so the server can run in an isolated container without installing Python or uv on the host:

docker build -t mcp-ibge .
docker run -i --rm mcp-ibge                  # stdio (default)
docker compose up -d                         # streamable-http

See docs/docker.md for the full guide (environment variables, healthcheck, MCP client config for docker run).

For the full feature list, available tools, configuration options and roadmap of the IBGE module, see packages/mcp_ibge/README.md.

Project layout

mcp-data-br/
├── pyproject.toml          # uv workspace root (virtual project)
├── packages/
│   ├── mcp_ibge/             # mcp-ibge: IBGE Localidades + Agregados/SIDRA (stable)
│   │   ├── src/mcp_ibge/
│   │   ├── tests/
│   │   ├── docs/
│   │   └── README.md
│   ├── mcp_dados_gov_br/     # mcp-dados-gov-br: dados.gov.br catalog (beta)
│   ├── mcp_inep/             # mcp-inep: INEP education data (scaffold, only `status`)
│   ├── mcp_bcb/              # mcp-bcb: Banco Central indicators (scaffold, only `status`)
│   ├── mcp_rio/              # mcp-rio: Data.Rio open data (scaffold, only `status`)
│   ├── mcp_saude/            # mcp-saude: DataSUS health data (scaffold, only `status`)
│   ├── mcp_transparencia/    # mcp-transparencia: Portal da Transparência (scaffold, only `status`)
│   └── mcp_tesouro/          # mcp-tesouro: Tesouro Nacional / SICONFI (scaffold, only `status`)
├── docs/                    # Monorepo-level docs (architecture, roadmap, security, data sources)
├── examples/                # MCP client configs (Claude Desktop, Cursor, Open WebUI) and prompts
└── evals/                   # Evaluation datasets and reports (placeholder)

Every scaffold package (mcp_inep and newer) follows the same internal layout: src/<pkg>/{server.py, config.py, schemas/, clients/, services/, tools/, utils/} plus tests/ — see docs/architecture.md#adding-a-new-module.

See docs/architecture.md for how the workspace and modules are organized.

Contributing

Contributions are welcome — bug reports, new tools, new modules, documentation and tests. See CONTRIBUTING.md for the development setup (uv workspace, lint/format/test commands) and guidelines.

🇧🇷 About the project (summary translated from Portuguese)

mcp-data-br is a collection of MCP servers for Brazilian public data, organized as a single workspace (monorepo) in which each data source gets its own package under packages/. The stable deliverable is mcp-ibge, with IBGE locality data and SIDRA aggregates; see packages/mcp_ibge/README.md. mcp-dados-gov-br, in beta, exposes search and detail lookup for datasets, organizations, groups and tags from the Portal Brasileiro de Dados Abertos (CKAN); see packages/mcp_dados_gov_br/README.md. In addition, the workspace already includes six modules at the scaffold stage (an installable package with only the status tool, plus documentation and a roadmap): mcp-inep (education/INEP), mcp-bcb (Banco Central), mcp-rio (Rio de Janeiro open data), mcp-saude (DataSUS), mcp-transparencia (Portal da Transparência) and mcp-tesouro (Tesouro Nacional/SICONFI). All of them follow the same conventions for typed, traceable and safe responses; see docs/roadmap.md and docs/modules/.

License

MIT
