GVNR
Budget Governor gives autonomous AI agents a hard spend limit. One budget_clear call before each LLM request checks the agent's envelope and deducts the estimated cost. Stops runaway billing before it starts. No self-hosting required.
Gvnr
Substrate primitives for AI agents — spend caps, rate limits, idempotency, post-call reconciliation, human approval bridges. One MCP endpoint, one credit pool, no infrastructure to deploy.
Listed on the Official MCP Registry as dev.gvnr/gvnr.
No deployment. No proxy. No self-hosting.
The problem
Agents cost 10–12x more than estimated in production. System prompts, retry loops, and tool calls multiply fast. A runaway agent can generate a $47,000 bill in 11 days. The common fix — self-hosting LiteLLM — requires running infrastructure most developers won't set up.
Gvnr is the hosted alternative: an external authority your agent checks before spending.
How it works
- Your agent calls budget_clear (MCP tool or REST) before each LLM request
- The governor checks your account credit balance and the agent's spend envelope
- It returns { approved: true } or { approved: false, reason: "..." }
- If denied, your agent skips the call
The envelope is configured by you (per-agent daily or session cap). The credit balance is topped up via USDC on Base.
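The flow above can be sketched as a small gate around the LLM call. This is a minimal illustration, not an official client: `post_json` and `guarded_call` are hypothetical helper names, and the sketch assumes a denial arrives as an ordinary JSON body rather than an HTTP error status. The endpoint and request fields are the ones shown in the quick start.

```python
import json
import urllib.request

GVNR_CLEAR_URL = "https://gvnr.dev/v1/budget/clear"  # endpoint from the quick start


def post_json(url, api_key, payload):
    """POST a JSON body with a Bearer key and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def guarded_call(api_key, agent_id, model, estimated_tokens, llm_call, post=post_json):
    """Run llm_call only if the governor approves; return None when denied.

    `post` is injectable so the gate can be exercised without network access.
    """
    decision = post(
        GVNR_CLEAR_URL,
        api_key,
        {"agent_id": agent_id, "model": model, "estimated_tokens": estimated_tokens},
    )
    if not decision.get("approved", False):
        return None  # denied: skip the LLM request entirely
    return llm_call()
```

Treating a missing `approved` field as a denial keeps the gate fail-closed if the response is malformed.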
Quick start
1. Provision an account
curl -X POST https://gvnr.dev/v1/account
# { "api_key": "bg_...", "account_id": "..." }
2. Top up credits
Open the payment page for your chosen pack, passing your API key as a query parameter:
https://gvnr.dev/pay/starter?api_key=bg_YOUR_KEY
Send USDC on Base to the address shown, then paste your tx hash. Credits are added after on-chain verification.
To top up programmatically instead, POST the tx hash directly:
curl -X POST \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"tx_hash":"0x..."}' \
https://gvnr.dev/v1/account/topup-verify/starter
3. Set a spend envelope for your agent
curl -X PUT \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"agent_id":"my-agent","limit_usd":5,"window":"daily"}' \
https://gvnr.dev/v1/budget/envelope
# { "success": true, "agent_id": "my-agent", "limit_usd": 5, "window": "daily" }
4. Set a rate envelope per (agent, provider, model)
curl -X PUT \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"agent_id":"my-agent","provider":"anthropic","model":"claude-sonnet-4-6","requests_per_minute":30}' \
https://gvnr.dev/v1/rate/envelope
# { "success": true, ... }
5. Before each LLM request: budget_clear then rate_check
curl -X POST \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"agent_id":"my-agent","model":"claude-sonnet-4-6","estimated_tokens":2000}' \
https://gvnr.dev/v1/budget/clear
# { "approved": true, "remaining_usd": 4.994 }
curl -X POST \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"agent_id":"my-agent","provider":"anthropic","model":"claude-sonnet-4-6"}' \
https://gvnr.dev/v1/rate/check
# { "allowed": true, "requests_remaining_this_minute": 29 }
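In agent code, the two pre-flight checks in this step might be combined as in the sketch below. `preflight` is a hypothetical helper, and `post` stands for any caller-supplied function that sends a JSON body to the given path with your API key and returns the decoded response. The paths and field names are taken from the curl examples above.

```python
def preflight(post, agent_id, provider, model, estimated_tokens):
    """Run budget_clear, then rate_check. Return (ok, detail).

    `post(path, payload)` is supplied by the caller (an assumption of this
    sketch); it should return the decoded JSON response as a dict.
    """
    budget = post(
        "/v1/budget/clear",
        {"agent_id": agent_id, "model": model, "estimated_tokens": estimated_tokens},
    )
    if not budget.get("approved"):
        # Budget denied: do not consume a rate slot for a call that will not happen.
        return False, budget.get("reason", "denied")

    rate = post(
        "/v1/rate/check",
        {"agent_id": agent_id, "provider": provider, "model": model},
    )
    if not rate.get("allowed"):
        return False, "rate_limited"

    return True, "ok"
```

The order follows the step heading: budget first, rate second.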
6. (optional) Dedupe retries with idempotency_check
curl -X POST \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"key":"job-abc-123","ttl_seconds":3600}' \
https://gvnr.dev/v1/idempotency/check
# First call: { "is_first_call": true, "ttl_remaining_seconds": 3600 }
# Replay: { "is_first_call": false, "ttl_remaining_seconds": 3598 }
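The key is caller-supplied, so a retry only dedupes if it produces the same key as the first attempt. One way to get that, sketched here with hypothetical helper names (`idempotency_key`, `run_once`), is to hash a canonical form of the job parameters. `post` is again any caller-supplied function that sends JSON to the given path and returns the decoded response.

```python
import hashlib
import json


def idempotency_key(job_id, step, params):
    """Derive a stable key so a retried step maps to the same key.

    Sorting keys makes the hash independent of dict ordering.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{job_id}:{step}:{digest}"


def run_once(post, key, action, ttl_seconds=3600):
    """Run `action` only when the service reports this key as a first call."""
    result = post("/v1/idempotency/check", {"key": key, "ttl_seconds": ttl_seconds})
    if not result.get("is_first_call"):
        return None  # replay within the TTL: skip the duplicate work
    return action()
```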
7. After the LLM responds, reconcile against actual usage
curl -X POST \
-H "Authorization: Bearer bg_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"agent_id":"my-agent","actual_input_tokens":1800,"actual_output_tokens":2400}' \
https://gvnr.dev/v1/budget/reconcile
# { "ok": true, "drift_usd": 0.003, "remaining_usd": 4.991, "balance_usd": 9.991 }
reconcile is optional but keeps the envelope honest. Anthropic, OpenAI, and Gemini all return usage fields with the actual token counts; pass those in.
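As a rough sketch, the token counts can be pulled from each provider's raw JSON response and mapped onto the reconcile body. The helper names are hypothetical. The field names below are those of the Anthropic Messages API, the OpenAI Chat Completions API, and the Gemini REST API; SDK objects expose the same values as attributes rather than dict keys, so adapt accordingly.

```python
def extract_usage(provider, response):
    """Return (input_tokens, output_tokens) from a raw provider JSON response."""
    if provider == "anthropic":
        usage = response["usage"]
        return usage["input_tokens"], usage["output_tokens"]
    if provider == "openai":
        # Chat Completions naming; other OpenAI endpoints may differ.
        usage = response["usage"]
        return usage["prompt_tokens"], usage["completion_tokens"]
    if provider == "gemini":
        usage = response["usageMetadata"]
        return usage["promptTokenCount"], usage["candidatesTokenCount"]
    raise ValueError(f"unknown provider: {provider}")


def reconcile_payload(agent_id, provider, response):
    """Build the JSON body for POST /v1/budget/reconcile."""
    input_tokens, output_tokens = extract_usage(provider, response)
    return {
        "agent_id": agent_id,
        "actual_input_tokens": input_tokens,
        "actual_output_tokens": output_tokens,
    }
```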
MCP setup
Add to Claude Desktop or any MCP-compatible client:
https://gvnr.dev/mcp?api_key=bg_YOUR_KEY
Claude Code
claude mcp add gvnr --transport http \
"https://gvnr.dev/mcp?api_key=bg_YOUR_KEY"
MCP tools
| Tool | Description |
|---|---|
| budget_clear(agent_id, model, estimated_tokens) | Check clearance and deduct estimated cost |
| set_envelope(agent_id, limit_usd, window?) | Create or update an agent's spend envelope |
| get_balance() | Get current account credit balance |
| reconcile(agent_id, actual_input_tokens, actual_output_tokens) | Apply the drift between estimated and actual cost after the LLM responds |
| set_rate_envelope(agent_id, provider, model, requests_per_minute) | Allocate a per-(agent, provider, model) rate share |
| rate_check(agent_id, provider, model) | Approve or deny based on the rate envelope; returns retry_after_ms on denial |
| idempotency_check(key, ttl_seconds?) | Dedupe retries on a caller-supplied key; returns is_first_call |
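Since rate_check returns retry_after_ms on denial, an agent can wait and retry instead of dropping the request. The sketch below is one possible policy, not a documented one: `wait_for_rate_slot`, the attempt cap, and the fallback wait are all assumptions. `check` is any zero-argument callable that performs rate_check and returns its JSON response as a dict.

```python
import time


def wait_for_rate_slot(check, max_attempts=3, default_wait_ms=1000, sleep=time.sleep):
    """Call `check` until it allows the request or attempts run out.

    Sleeps for the server-suggested retry_after_ms between attempts and
    falls back to default_wait_ms if the field is absent.
    """
    for attempt in range(max_attempts):
        result = check()
        if result.get("allowed"):
            return True
        if attempt < max_attempts - 1:
            sleep(result.get("retry_after_ms", default_wait_ms) / 1000.0)
    return False
```

Capping the attempts keeps a saturated rate envelope from stalling the agent indefinitely.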
REST API
All endpoints (except POST /v1/account) require Authorization: Bearer bg_YOUR_KEY.
Account
| Method | Path | Description |
|---|---|---|
| POST | /v1/account | Provision account; returns api_key |
| GET | /v1/account/balance | Current credit balance |
| GET | /v1/packs/:pack/info | Public: pack details, USDC address, raw amount |
| POST | /v1/account/topup-verify/:pack | Submit tx hash → verify on-chain → credit account |
| POST | /v1/account/topup/:pack | x402-gated credit top-up (machine clients) |
Budget
| Method | Path | Description |
|---|---|---|
| POST | /v1/budget/clear | Clearance call: approve or deny |
| PUT | /v1/budget/envelope | Create or update agent envelope |
| GET | /v1/budget/envelope/:agent_id | Read envelope state |
Clearance response
{ "approved": true, "remaining_usd": 4.994 }
{ "approved": false, "remaining_usd": 0, "reason": "envelope_exceeded" }
Denial reasons: no_credits · no_envelope · envelope_exceeded
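Each denial reason points at a different remedy, so an agent loop may want to branch on it. The mapping below is only a suggestion: the action labels ("proceed", "top_up", "set_envelope", "stop") are hypothetical and not part of the API.

```python
def next_step(decision):
    """Map a clearance response to a coarse action for the agent loop."""
    if decision.get("approved"):
        return "proceed"
    reason = decision.get("reason")
    if reason == "no_credits":
        return "top_up"        # account credit balance is empty
    if reason == "no_envelope":
        return "set_envelope"  # this agent has no spend envelope yet
    if reason == "envelope_exceeded":
        return "stop"          # cap reached for the current window
    return "stop"              # unknown or missing reason: fail closed
```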
Credit packs
Top up at GET /pay/:pack?api_key=bg_YOUR_KEY. Send USDC on Base mainnet — credits added after on-chain verification.
| Pack | Price | Clearances | Link |
|---|---|---|---|
| starter | $19 | ~10k/month | /pay/starter |
| growth | $39 | ~30k/month | /pay/growth |
| studio | $79 | ~100k/month | /pay/studio |
Envelope windows
- daily: resets at UTC midnight each day
- session: never resets (use for one-shot tasks)
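For a daily envelope, an agent that hits its cap can work out how long to pause before the next reset. A minimal sketch, assuming the reset lands exactly at 00:00 UTC as stated above; `seconds_until_daily_reset` is a hypothetical helper, and `now` must be a timezone-aware datetime if supplied.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_daily_reset(now=None):
    """Seconds from `now` until the next UTC midnight.

    `now` defaults to the current time and must be timezone-aware;
    it is converted to UTC before the calculation.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```

A session envelope never resets, so no equivalent wait applies there.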
Supported models
Model pricing is a static lookup on the hot path — no external calls. Includes claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5, gpt-4o, gpt-4o-mini, and others. Embedding models (text-embedding-3-small/large, gemini-embedding-001/2) are billed input-only — pass input tokens to budget_clear for those. Unknown models fall back to a conservative default.
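A static lookup with a fallback can be pictured roughly as follows. This is not the service's actual table. The claude-sonnet-4-6 rate is only inferred from the quick start example (a 2000-token clearance moved a $5 envelope to $4.994, i.e. $0.006), and the default rate is a placeholder chosen for illustration.

```python
# USD per million tokens. Illustrative values only, not the hosted price table.
RATES_USD_PER_MTOK = {
    # Inferred from the quick start: 2000 tokens -> $0.006 deducted.
    "claude-sonnet-4-6": 3.00,
}

# Hypothetical "conservative" fallback for models missing from the table.
DEFAULT_RATE_USD_PER_MTOK = 15.00


def estimate_cost_usd(model, estimated_tokens):
    """Estimate cost from a static table, with a fallback for unknown models."""
    rate = RATES_USD_PER_MTOK.get(model, DEFAULT_RATE_USD_PER_MTOK)
    return estimated_tokens * rate / 1_000_000
```

Because the lookup is a plain dictionary access, it adds no network round trip to the clearance path, which matches the "no external calls" point above.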
Network
| X402_NETWORK | Chain | Notes |
|---|---|---|
| eip155:84532 | Base Sepolia | Testnet, safe for development |
| eip155:8453 | Base mainnet | Real USDC |
Current deployment: Base mainnet.
License
MIT — see LICENSE.
The canonical hosted service is at https://gvnr.dev. Self-hosted instances are unaffiliated.