timesfm-mcp
Local MCP server for GPU-backed TimesFM 2.5 forecasting, enabling zero-shot time-series forecasting, covariate forecasting, anomaly detection, and CSV forecasting via MCP tools.
Local MCP server for GPU-backed TimesFM 2.5 forecasting.
This package exposes Google's TimesFM model to MCP clients over stdio, sse,
or streamable-http. It is intended for local or trusted-network serving where
agents need zero-shot time-series forecasts, prediction intervals, covariate
forecasting, CSV forecasting, and interval-based anomaly scoring.
What It Provides
- guidance: agent-facing usage guide for safe TimesFM forecasting.
- health: package, CUDA, system, and model-state report with a CUDA matmul probe.
- estimate_memory: rough dataset memory estimate before loading large jobs.
- warmup: lazy-load and compile TimesFM on the GPU before live requests.
- forecast_values: forecast in-memory numeric series.
- forecast_csv: forecast numeric columns in a local CSV and write CSV or JSON.
- forecast_with_covariates_values: TimesFM 2.5 XReg forecasts with known future covariates.
- detect_anomalies: compare future actuals against q20/q80 and q10/q90 forecast bands.
- timesfm://forecasting/guide: MCP resource with the same operational guidance.
- timesfm_forecasting_guide: MCP prompt for agents before planning a forecast.
The model is a process singleton. It is loaded on first warmup or forecast call
and remains in memory until the MCP server process exits. If a request changes
model settings such as max_context, max_horizon, batch_size, or
infer_is_positive, the server reloads the model with the new settings.
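The reload rule above can be pictured with a minimal sketch of a settings-keyed singleton. ModelSettings, ModelHolder, and the stand-in loader are hypothetical names chosen for this example, not the package's actual classes; the real server would load and compile TimesFM at the point where the loader is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class ModelSettings:
    """Settings whose change triggers a reload (field names follow the README)."""
    max_context: int = 1024
    max_horizon: int = 256
    batch_size: int = 64
    infer_is_positive: bool = True


class ModelHolder:
    """Hypothetical process-wide cache: one model, rebuilt only when settings differ."""

    def __init__(self, loader: Callable[[ModelSettings], Any]) -> None:
        self._loader = loader
        self._settings: Optional[ModelSettings] = None
        self._model: Any = None
        self.load_count = 0

    def get(self, settings: ModelSettings) -> Any:
        # Load on first use, or reload when any setting changed.
        if self._model is None or settings != self._settings:
            self._model = self._loader(settings)
            self._settings = settings
            self.load_count += 1
        return self._model


# Stand-in loader for illustration; it only records the settings it was given.
holder = ModelHolder(loader=lambda s: {"settings": s})
first = holder.get(ModelSettings())
second = holder.get(ModelSettings())              # same settings: model is reused
third = holder.get(ModelSettings(batch_size=32))  # changed setting: model is reloaded
```

Because a reload repeats the load and compile step, clients that care about latency would presumably keep these settings stable across requests.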
When To Use
Use this server for zero-shot univariate time-series forecasting:
- Sales, demand, revenue, traffic, inventory, and capacity planning.
- Sensor readings, vitals, load, weather, prices, and measurements.
- Probabilistic forecasts where q10 through q90 prediction bands matter.
- Known-future-covariate forecasts, such as price, promotion, holiday, weather, store attributes, product family, or region effects.
- Forecast-vs-actual anomaly review using prediction intervals.
Do not use it for classification, clustering, causal interpretation, coefficient analysis, general tabular prediction, or model fine-tuning. Fine-tuning is a training workflow and is intentionally not exposed by this inference MCP server.
GPU Setup With uv
For RTX 5090 and other new NVIDIA GPUs, install a PyTorch wheel that supports the GPU architecture before installing this package. CUDA 12.8 wheels are the recommended starting point for this machine class.
git clone https://github.com/chokukil/timesfm-mcp.git
cd timesfm-mcp
uv venv .venv-gpu --python 3.10
source .venv-gpu/bin/activate
uv pip install --upgrade --reinstall \
torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu128
uv pip install -e ".[gpu]"
uv pip check
.[gpu] installs the TimesFM torch, XReg, and Flax-related extras, including
einshape. If you only need standard torch forecasting without XReg/Flax
dependencies, install .[torch]. If you need XReg but not Flax, install
.[xreg].
Validate CUDA before starting MCP:
uv run --python .venv-gpu/bin/python python - <<'PY'
import torch
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))
x = torch.randn((512, 512), device="cuda")
y = x @ x
torch.cuda.synchronize()
print("cuda matmul ok")
PY
Start The Server
Start with a conservative batch size. Do not set CUDA_VISIBLE_DEVICES to an
empty value (CUDA_VISIBLE_DEVICES=) unless you intentionally want to hide the GPU.
export TIMESFM_BATCH_SIZE=64
export PYTHONNOUSERSITE=1
timesfm-mcp --transport sse --host 0.0.0.0 --port 8765
The SSE endpoint will be:
http://<host>:8765/sse
For local stdio clients:
timesfm-mcp --transport stdio
For streamable HTTP:
timesfm-mcp --transport streamable-http --host 0.0.0.0 --port 8765
Environment Variables
| Variable | Default | Meaning |
|---|---|---|
| TIMESFM_MODEL_ID | google/timesfm-2.5-200m-pytorch | Hugging Face model id or local model path. |
| TIMESFM_MAX_CONTEXT | 1024 | Maximum context points used by the compiled model. |
| TIMESFM_MAX_HORIZON | 256 | Maximum forecast horizon. |
| TIMESFM_BATCH_SIZE | 64 | TimesFM per_core_batch_size; raise only after memory is stable. |
| TIMESFM_NORMALIZE_INPUTS | true | Normalize each input series before forecasting. |
| TIMESFM_CONTINUOUS_QUANTILE_HEAD | true | Use continuous quantile head for better bands. |
| TIMESFM_FORCE_FLIP_INVARIANCE | true | Enforce sign symmetry. |
| TIMESFM_INFER_IS_POSITIVE | true | Clamp positive-only series to nonnegative outputs. |
| TIMESFM_FIX_QUANTILE_CROSSING | true | Enforce monotonic quantiles. |
| TIMESFM_RETURN_BACKCAST | false | Internal default; XReg requests force this to true. |
| TIMESFM_TORCH_COMPILE | false | Enable PyTorch compile when loading the model. |
Set infer_is_positive=false per request, or TIMESFM_INFER_IS_POSITIVE=false
for the process, when the metric can go below zero: returns, residuals, PnL,
temperature anomalies, z-scores, or signed deltas.
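A client could pick the per-request flag with a simple history check, sketched below. The helper name suggest_infer_is_positive is hypothetical and not part of this package. A nonnegative history does not prove the metric can never go below zero, so domain knowledge should override the suggestion.

```python
from typing import Iterable, Sequence


def suggest_infer_is_positive(inputs: Iterable[Sequence[float]]) -> bool:
    """Return False if any observed value is below zero, otherwise True."""
    return all(value >= 0 for series in inputs for value in series)


# Illustrative data only.
sales = [[10, 12, 11, 13]]
returns = [[0.4, -1.2, 0.7, -0.3]]

sales_flag = suggest_infer_is_positive(sales)      # no negative values observed
returns_flag = suggest_infer_is_positive(returns)  # negative values present
```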
Agent Workflow
- Read guidance or the timesfm://forecasting/guide resource.
- Call health. If cuda_probe.passed is false, fix PyTorch/CUDA before loading.
- Call estimate_memory for large workloads.
- Call warmup once if latency matters.
- Use the forecasting tool that matches the input shape.
- Interpret quantiles carefully: index 0 is the mean, then q10 through q90. q10 and q90 form the central 80 percent prediction interval. q20 and q80 form the central 60 percent interval.
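The quantile indexing described in the last step can be sketched as follows. The sketch assumes each forecast step arrives as a row of ten values laid out as [mean, q10, q20, ..., q90]; the exact response shape is not confirmed here, and quantile_index and central_interval are hypothetical helper names.

```python
from typing import Sequence, Tuple

# Assumed layout for one forecast step: [mean, q10, q20, ..., q90].
VALID_COVERAGES = (20, 40, 60, 80)


def quantile_index(level: int) -> int:
    """Position of decile `level` (10..90) in a row whose index 0 is the mean."""
    if level not in range(10, 100, 10):
        raise ValueError(f"unsupported quantile level: {level}")
    return level // 10


def central_interval(row: Sequence[float], coverage: int) -> Tuple[float, float]:
    """Return (lower, upper) for a central interval, e.g. 80 -> (q10, q90)."""
    if coverage not in VALID_COVERAGES:
        raise ValueError(f"unsupported coverage: {coverage}")
    tail = (100 - coverage) // 2
    return row[quantile_index(tail)], row[quantile_index(100 - tail)]


# Made-up numbers for a single step, only to show the indexing.
row = [50.0, 41.0, 44.0, 46.0, 48.0, 50.0, 52.0, 54.0, 56.0, 59.0]
band_80 = central_interval(row, 80)  # (q10, q90)
band_60 = central_interval(row, 60)  # (q20, q80)
```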
Tool Examples
Forecast Values
{
"inputs": [[10, 12, 11, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30, 31, 33, 34, 36, 37, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 55]],
"horizon": 7,
"names": ["sales"],
"infer_is_positive": true
}
Forecast With Covariates
Dynamic covariates must have length len(input_series) + horizon for each
series. The tail values are known future covariates.
{
"inputs": [[100, 101, 103, 105, 104, 106, 108, 109, 111, 113, 112, 114, 116, 118, 119, 120, 122, 124, 123, 125, 127, 129, 130, 132, 133, 135, 137, 138, 140, 141, 143, 145]],
"horizon": 4,
"dynamic_numerical_covariates": {
"price": [[9.9, 9.9, 9.8, 9.8, 9.7, 9.7, 9.7, 9.6, 9.6, 9.6, 9.5, 9.5, 9.5, 9.5, 9.4, 9.4, 9.4, 9.3, 9.3, 9.3, 9.2, 9.2, 9.2, 9.1, 9.1, 9.1, 9.0, 9.0, 9.0, 8.9, 8.9, 8.9, 8.8, 8.8, 8.8, 8.8]]
},
"static_categorical_covariates": {
"region": ["seoul"]
},
"xreg_mode": "xreg + timesfm"
}
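Before sending a covariate request, a client may want to verify the length rule stated above, namely that each dynamic covariate has len(input_series) + horizon values. The following is a minimal client-side sketch; check_dynamic_covariates is a hypothetical helper, not a function exported by this package, and the server may apply additional validation of its own.

```python
from typing import Dict, List, Sequence


def check_dynamic_covariates(
    inputs: Sequence[Sequence[float]],
    horizon: int,
    dynamic_numerical_covariates: Dict[str, Sequence[Sequence[float]]],
) -> List[str]:
    """Return human-readable problems; an empty list means the lengths line up."""
    errors: List[str] = []
    for name, per_series in dynamic_numerical_covariates.items():
        # One covariate sequence is expected per input series.
        if len(per_series) != len(inputs):
            errors.append(
                f"{name}: expected {len(inputs)} series, got {len(per_series)}"
            )
            continue
        for i, (series, covariate) in enumerate(zip(inputs, per_series)):
            expected = len(series) + horizon
            if len(covariate) != expected:
                errors.append(
                    f"{name}[{i}]: expected {expected} values, got {len(covariate)}"
                )
    return errors


# Illustrative data: 4 history points and a horizon of 2 need 6 covariate values.
inputs = [[1.0, 2.0, 3.0, 4.0]]
ok = check_dynamic_covariates(inputs, 2, {"price": [[9, 9, 9, 9, 8, 8]]})
bad = check_dynamic_covariates(inputs, 2, {"price": [[9, 9, 9, 9]]})
```

In the request above, the price covariate has 36 values for a 32-point series and a horizon of 4, so it satisfies this check.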
Detect Anomalies
{
"inputs": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]],
"actuals": [[33, 60]],
"horizon": 2,
"names": ["metric"]
}
Severity rules:
- normal: actual is inside q20 to q80.
- warning: actual is outside q20 to q80 but inside q10 to q90.
- critical: actual is outside q10 to q90.
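The severity rules can be written as a small function, sketched here for illustration. The function name classify_severity is hypothetical, and treating values exactly on a band edge as inside the band is an assumption; the server's own boundary handling may differ.

```python
def classify_severity(
    actual: float, q10: float, q20: float, q80: float, q90: float
) -> str:
    """Map an actual value to a severity label using the two forecast bands."""
    if q20 <= actual <= q80:
        return "normal"    # inside the central 60 percent band
    if q10 <= actual <= q90:
        return "warning"   # outside q20..q80 but inside the central 80 percent band
    return "critical"      # outside q10..q90


# Made-up band values for a single forecast step.
bands = {"q10": 40.0, "q20": 45.0, "q80": 55.0, "q90": 60.0}
inside = classify_severity(50.0, **bands)
edge = classify_severity(58.0, **bands)
far = classify_severity(70.0, **bands)
```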
Security
The SSE and streamable HTTP transports do not add authentication by themselves.
Bind to 127.0.0.1 for local-only use. Bind to 0.0.0.0 only on a trusted
network or behind your own authentication, firewall, or reverse proxy.
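A launcher script could guard against accidental exposure by checking the bind address before starting the server. The sketch below uses only the Python standard library; is_loopback_bind is a hypothetical helper and is not part of this package.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True when the host binds to a loopback address only."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example, another hostname): treat as non-local.
        return False


local_only = is_loopback_bind("127.0.0.1")
all_interfaces = is_loopback_bind("0.0.0.0")
```

A wrapper might refuse to start, or print a warning, when the check returns False and no authentication layer is configured.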
Development
uv pip install -e ".[gpu,dev]"
pytest -q
ruff check .
License
Apache-2.0. This repository wraps TimesFM and depends on the upstream
timesfm Python package and model weights.