timesfm-mcp

Local MCP server for GPU-backed TimesFM 2.5 forecasting, enabling zero-shot time-series forecasting, covariate forecasting, anomaly detection, and CSV forecasting via MCP tools.

This package exposes Google's TimesFM model to MCP clients over stdio, sse, or streamable-http. It is intended for local or trusted-network serving where agents need zero-shot time-series forecasts, prediction intervals, covariate forecasting, CSV forecasting, and interval-based anomaly scoring.

What It Provides

guidance: agent-facing usage guide for safe TimesFM forecasting.
health: package, CUDA, system, and model-state report with a CUDA matmul probe.
estimate_memory: rough dataset memory estimate before loading large jobs.
warmup: lazy-load and compile TimesFM on the GPU before live requests.
forecast_values: forecast in-memory numeric series.
forecast_csv: forecast numeric columns in a local CSV and write CSV or JSON.
forecast_with_covariates_values: TimesFM 2.5 XReg forecasts with known future covariates.
detect_anomalies: compare future actuals against q20/q80 and q10/q90 forecast bands.
timesfm://forecasting/guide: MCP resource with the same operational guidance.
timesfm_forecasting_guide: MCP prompt for agents before planning a forecast.

The model is a process singleton. It is loaded on first warmup or forecast call and remains in memory until the MCP server process exits. If a request changes model settings such as max_context, max_horizon, batch_size, or infer_is_positive, the server reloads the model with the new settings.
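The reload-on-change behavior described above can be pictured with a small sketch. This is not the server's actual code: `ModelSettings`, `ModelCache`, and the stand-in loader are hypothetical names used only for illustration, and the field defaults are taken from the environment-variable defaults documented in this README.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class ModelSettings:
    # Hypothetical container for the settings that trigger a reload.
    max_context: int = 1024
    max_horizon: int = 256
    batch_size: int = 64
    infer_is_positive: bool = True


class ModelCache:
    """Hold at most one loaded model and reload it when the settings change."""

    def __init__(self, loader: Callable[[ModelSettings], Any]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._settings: Optional[ModelSettings] = None
        self._model: Any = None
        self.load_count = 0

    def get(self, settings: ModelSettings) -> Any:
        with self._lock:
            # The first call (no stored settings) and any changed field both load.
            if settings != self._settings:
                self._model = self._loader(settings)
                self._settings = settings
                self.load_count += 1
            return self._model


# Stand-in loader so the sketch runs without a GPU or model weights.
cache = ModelCache(loader=lambda s: {"compiled_for": s})
cache.get(ModelSettings())                # first call loads
cache.get(ModelSettings())                # same settings: reused
cache.get(ModelSettings(batch_size=128))  # changed setting: reloaded
print(cache.load_count)  # 2
```

The practical consequence is that alternating between two sets of model settings across requests pays the load cost every time, so agents should keep these settings stable within a session.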

When To Use

Use this server for zero-shot univariate time-series forecasting:

Sales, demand, revenue, traffic, inventory, and capacity planning.
Sensor readings, vitals, load, weather, prices, and measurements.
Probabilistic forecasts where q10 through q90 prediction bands matter.
Known-future-covariate forecasts, such as price, promotion, holiday, weather, store attributes, product family, or region effects.
Forecast-vs-actual anomaly review using prediction intervals.

Do not use it for classification, clustering, causal interpretation, coefficient analysis, general tabular prediction, or model fine-tuning. Fine-tuning is a training workflow and is intentionally not exposed by this inference MCP server.

GPU Setup With uv

For the RTX 5090 and other recent NVIDIA GPUs, install a PyTorch wheel built for that GPU architecture before installing this package. CUDA 12.8 wheels (the cu128 index used below) are the recommended starting point for this machine class.

git clone https://github.com/chokukil/timesfm-mcp.git
cd timesfm-mcp

uv venv .venv-gpu --python 3.10
source .venv-gpu/bin/activate

uv pip install --upgrade --reinstall \
  torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/cu128

uv pip install -e ".[gpu]"
uv pip check

The gpu extra (.[gpu]) installs the TimesFM torch, XReg, and Flax-related dependencies, including einshape. If you only need standard torch forecasting without the XReg and Flax dependencies, install .[torch]. If you need XReg but not Flax, install .[xreg].

Validate CUDA before starting MCP:

uv run --python .venv-gpu/bin/python python - <<'PY'
import torch
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))
x = torch.randn((512, 512), device="cuda")
y = x @ x
torch.cuda.synchronize()
print("cuda matmul ok")
PY

Start The Server

Start with a conservative batch size. Do not set CUDA_VISIBLE_DEVICES to an empty value unless you intentionally want to hide the GPU from the server.

export TIMESFM_BATCH_SIZE=64
export PYTHONNOUSERSITE=1

timesfm-mcp --transport sse --host 0.0.0.0 --port 8765

The SSE endpoint will be:

http://<host>:8765/sse

For local stdio clients:

timesfm-mcp --transport stdio

For streamable HTTP:

timesfm-mcp --transport streamable-http --host 0.0.0.0 --port 8765

Environment Variables

| Variable | Default | Meaning |
| --- | --- | --- |
| `TIMESFM_MODEL_ID` | `google/timesfm-2.5-200m-pytorch` | Hugging Face model id or local model path. |
| `TIMESFM_MAX_CONTEXT` | `1024` | Maximum context points used by the compiled model. |
| `TIMESFM_MAX_HORIZON` | `256` | Maximum forecast horizon. |
| `TIMESFM_BATCH_SIZE` | `64` | TimesFM `per_core_batch_size`; raise only after memory is stable. |
| `TIMESFM_NORMALIZE_INPUTS` | `true` | Normalize each input series before forecasting. |
| `TIMESFM_CONTINUOUS_QUANTILE_HEAD` | `true` | Use continuous quantile head for better bands. |
| `TIMESFM_FORCE_FLIP_INVARIANCE` | `true` | Enforce sign symmetry. |
| `TIMESFM_INFER_IS_POSITIVE` | `true` | Clamp positive-only series to nonnegative outputs. |
| `TIMESFM_FIX_QUANTILE_CROSSING` | `true` | Enforce monotonic quantiles. |
| `TIMESFM_RETURN_BACKCAST` | `false` | Internal default; XReg requests force this to `true`. |
| `TIMESFM_TORCH_COMPILE` | `false` | Enable PyTorch compile when loading the model. |
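As an illustration of how a few of these variables and their defaults could be read, here is a minimal sketch. The helper names (`parse_bool`, `load_settings`) are hypothetical, and the accepted boolean spellings are an assumption; the server's own parsing may differ.

```python
import os
from typing import Any, Dict, Mapping

# A subset of the defaults from the table above.
DEFAULTS: Dict[str, Any] = {
    "TIMESFM_MODEL_ID": "google/timesfm-2.5-200m-pytorch",
    "TIMESFM_MAX_CONTEXT": 1024,
    "TIMESFM_MAX_HORIZON": 256,
    "TIMESFM_BATCH_SIZE": 64,
    "TIMESFM_INFER_IS_POSITIVE": True,
    "TIMESFM_TORCH_COMPILE": False,
}


def parse_bool(raw: str) -> bool:
    # Assumption: these spellings are accepted; only true/false appear in the table.
    value = raw.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    settings: Dict[str, Any] = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            settings[name] = default
        elif isinstance(default, bool):  # check bool before int: bool is an int subclass
            settings[name] = parse_bool(raw)
        elif isinstance(default, int):
            settings[name] = int(raw)
        else:
            settings[name] = raw
    return settings


example = load_settings({"TIMESFM_BATCH_SIZE": "128", "TIMESFM_INFER_IS_POSITIVE": "false"})
print(example["TIMESFM_BATCH_SIZE"], example["TIMESFM_INFER_IS_POSITIVE"])  # 128 False
```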

Set infer_is_positive=false per request, or TIMESFM_INFER_IS_POSITIVE=false for the process, when the metric can go below zero: returns, residuals, PnL, temperature anomalies, z-scores, or signed deltas.
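A client could pick the per-request flag with a simple check such as the one below. The function name is hypothetical and the rule is only a heuristic: a history with no negative values does not prove the metric can never go below zero, so domain knowledge should override it.

```python
from typing import Sequence


def suggest_infer_is_positive(series: Sequence[float]) -> bool:
    """Return False if any observed value is negative, True otherwise."""
    return not any(value < 0 for value in series)


print(suggest_infer_is_positive([10, 12, 11, 13]))      # True
print(suggest_infer_is_positive([0.4, -0.2, 0.1, 0.0]))  # False
```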

Agent Workflow

1. Read guidance or the timesfm://forecasting/guide resource.
2. Call health. If cuda_probe.passed is false, fix PyTorch/CUDA before loading.
3. Call estimate_memory for large workloads.
4. Call warmup once if latency matters.
5. Use the forecasting tool that matches the input shape.
6. Interpret quantiles carefully: on the quantile axis, index 0 is the mean and indices 1 through 9 are q10 through q90.

q10 and q90 form the central 80 percent prediction interval. q20 and q80 form the central 60 percent interval.
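The index layout and the two intervals can be made concrete with a short sketch. It assumes one forecast step arrives as a flat list of 10 numbers in the order described above; the helper names and the sample values are made up for illustration.

```python
from typing import Dict, Sequence, Tuple

# Index 0 is the mean; indices 1 through 9 are q10 through q90.
QUANTILE_LABELS = ("mean", "q10", "q20", "q30", "q40", "q50", "q60", "q70", "q80", "q90")


def label_quantiles(step: Sequence[float]) -> Dict[str, float]:
    if len(step) != len(QUANTILE_LABELS):
        raise ValueError(f"expected {len(QUANTILE_LABELS)} values, got {len(step)}")
    return dict(zip(QUANTILE_LABELS, step))


def central_interval(step: Sequence[float], coverage: int) -> Tuple[float, float]:
    """Return the (lower, upper) band for 80 or 60 percent central coverage."""
    bounds = {80: ("q10", "q90"), 60: ("q20", "q80")}
    lower, upper = bounds[coverage]
    labeled = label_quantiles(step)
    return labeled[lower], labeled[upper]


# Made-up values for a single forecast step.
step = [50.0, 42.0, 45.0, 47.0, 49.0, 50.0, 51.0, 53.0, 55.0, 58.0]
print(central_interval(step, 80))  # (42.0, 58.0)
print(central_interval(step, 60))  # (45.0, 55.0)
```

Treating index 0 as q0 or as the median is the most likely mistake here; labeling the axis once, as above, avoids off-by-one errors downstream.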

Tool Examples

Forecast Values

{
  "inputs": [[10, 12, 11, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30, 31, 33, 34, 36, 37, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 55]],
  "horizon": 7,
  "names": ["sales"],
  "infer_is_positive": true
}

Forecast With Covariates

Dynamic covariates must have length len(input_series) + horizon for each series. The tail values are known future covariates.

{
  "inputs": [[100, 101, 103, 105, 104, 106, 108, 109, 111, 113, 112, 114, 116, 118, 119, 120, 122, 124, 123, 125, 127, 129, 130, 132, 133, 135, 137, 138, 140, 141, 143, 145]],
  "horizon": 4,
  "dynamic_numerical_covariates": {
    "price": [[9.9, 9.9, 9.8, 9.8, 9.7, 9.7, 9.7, 9.6, 9.6, 9.6, 9.5, 9.5, 9.5, 9.5, 9.4, 9.4, 9.4, 9.3, 9.3, 9.3, 9.2, 9.2, 9.2, 9.1, 9.1, 9.1, 9.0, 9.0, 9.0, 8.9, 8.9, 8.9, 8.8, 8.8, 8.8, 8.8]]
  },
  "static_categorical_covariates": {
    "region": ["seoul"]
  },
  "xreg_mode": "xreg + timesfm"
}
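Before sending a covariate request, a client can check the length rule locally. The sketch below is a hypothetical pre-flight helper, not part of the server; it only encodes the rule stated above (each dynamic covariate series must have length len(input_series) + horizon), plus the assumption that each covariate supplies one series per input.

```python
from typing import Dict, List, Sequence


def check_dynamic_covariates(
    inputs: Sequence[Sequence[float]],
    horizon: int,
    dynamic_numerical_covariates: Dict[str, Sequence[Sequence[float]]],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the shapes line up."""
    problems: List[str] = []
    for name, per_series in dynamic_numerical_covariates.items():
        if len(per_series) != len(inputs):
            problems.append(f"{name}: expected {len(inputs)} series, got {len(per_series)}")
            continue
        for index, (series, covariate) in enumerate(zip(inputs, per_series)):
            expected = len(series) + horizon
            if len(covariate) != expected:
                problems.append(
                    f"{name}[{index}]: expected length {expected}, got {len(covariate)}"
                )
    return problems


history = [[float(v) for v in range(32)]]  # one series with 32 points
ok = check_dynamic_covariates(history, 4, {"price": [[9.9] * 36]})
bad = check_dynamic_covariates(history, 4, {"price": [[9.9] * 32]})
print(ok)   # []
print(bad)  # ['price[0]: expected length 36, got 32']
```

In the request above, the input series has 32 points and the horizon is 4, so the price covariate carries 36 values; the last 4 are the known future prices.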

Detect Anomalies

{
  "inputs": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]],
  "actuals": [[33, 60]],
  "horizon": 2,
  "names": ["metric"]
}

Severity rules:

normal: actual is inside q20 to q80.
warning: actual is outside q20 to q80 but inside q10 to q90.
critical: actual is outside q10 to q90.
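The severity rules translate directly into a small function. This is a sketch rather than the server's implementation: the function name is hypothetical, and treating values exactly on a band edge as inside the band is an assumption, since the rules above do not say how boundaries are handled.

```python
def classify_severity(actual: float, q10: float, q20: float, q80: float, q90: float) -> str:
    """Map one actual value to normal, warning, or critical using the forecast bands."""
    if q20 <= actual <= q80:  # assumption: band edges count as inside
        return "normal"
    if q10 <= actual <= q90:
        return "warning"
    return "critical"


# Made-up bands for one forecast step.
bands = {"q10": 30.0, "q20": 31.0, "q80": 35.0, "q90": 36.0}
print(classify_severity(33.0, **bands))  # normal
print(classify_severity(35.5, **bands))  # warning
print(classify_severity(60.0, **bands))  # critical
```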

Security

The SSE and streamable HTTP transports do not add authentication by themselves. Bind to 127.0.0.1 for local-only use. Bind to 0.0.0.0 only on a trusted network or behind your own authentication, firewall, or reverse proxy.
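A launcher script could guard against accidental exposure by checking the bind address before starting a network transport. The helper below is a hypothetical sketch built on Python's standard ipaddress module; it is not a feature of timesfm-mcp.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True only when the host is a loopback address (or the name localhost)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-loopback to stay on the safe side.
        return False


for host in ("127.0.0.1", "::1", "0.0.0.0"):
    label = "local only" if is_loopback_bind(host) else "reachable from the network"
    print(f"{host}: {label}")
```

A wrapper might refuse to start, or require an explicit override, when the check returns False and no authentication layer sits in front of the server.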

Development

uv pip install -e ".[gpu,dev]"
pytest -q
ruff check .

License

Apache-2.0. This repository wraps TimesFM and depends on the upstream timesfm Python package and model weights.
