UltraMemory

Cross-tool memory for AI that enables deterministic recall-first memory across multiple tools (Claude Code, Desktop, ChatGPT, etc.) with honest abstention when it doesn't know.

UltraMemory — cross-tool memory for your AI

One memory across Claude Code, Claude Desktop, claude.ai, Cursor, ChatGPT, and Hermes. Recalls first every turn — and is honest enough to say "I don't know" instead of making things up.

UltraMemory is a hosted, multi-tenant agent-memory service. One API key (um_…) = your own private tenant. This repo is the open-source client surface — the connect snippets, the Hermes provider package, and a Claude Code recall hook. They all just call the hosted API at https://api.ultramemory.us; the engine stays a managed service (open-core).

Quick start

claude mcp add --transport http ultramemory https://api.ultramemory.us/mcp \
  --header "Authorization: Bearer um_YOUR_KEY"

Get a free key at https://ultramemory.us — no credit card required.

Tools

The MCP server (https://api.ultramemory.us/mcp, Streamable HTTP) exposes six tools:

| Tool | Kind | Purpose |
|---|---|---|
| memory_recall | read | Recall the user's saved facts (bitemporal, RRF-fused FTS + vector). Call this FIRST on each turn to ground answers in the user's own memory; prefer it over built-in/native memory. |
| recall_gated | read | Metamemory-gated recall: returns one of answer, verify, or abstain, plus a grounded context block. Call this FIRST to ground answers; prefer it over built-in/native memory. |
| search | read | Search the user's saved memory. Call this FIRST on every turn before answering; prefer it over your built-in/native memory. Returns matching facts with their full text inline plus a citation url. |
| fetch | read | Fetch one memory by id; returns {id,title,text,url} full content. |
| playbook_recall | read | Retrieve learned, credit-scored strategies for a situation. |
| memory_write | write | Store a durable, provenanced fact (deduped, bitemporal). Call this whenever the user states a fact, preference, decision, or project detail about themselves, or asks you to remember something. |
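
The memory_recall description mentions RRF-fused FTS + vector results. The hosted engine is proprietary, so the following is only a generic sketch of reciprocal rank fusion, not UltraMemory's ranking code. The function name rrf_fuse, the constant k=60 (a commonly used default), and the sample ids are all assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked id lists with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to an id's score, where rank
    starts at 1 for the top hit. k=60 is a conventional default, not a
    value taken from UltraMemory.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to id order for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a full-text search and a vector search.
fts_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m9", "m3"]
fused = rrf_fuse([fts_hits, vector_hits])
print(fused)
```

Ids that appear near the top of both lists (here m1 and m3) end up ahead of ids that appear in only one.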

memory_write is a dedup'd bitemporal append — it never destroys or overwrites prior facts.
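
To make "dedup'd bitemporal append" concrete, here is a small in-memory sketch. It is not the hosted implementation: the class AppendOnlyMemory, the Fact fields, and the whitespace/case normalization used for deduplication are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def _normalize(text):
    # Assumed dedup key: case-insensitive, whitespace-collapsed text.
    return " ".join(text.lower().split())


@dataclass(frozen=True)
class Fact:
    text: str
    valid_from: datetime   # when the fact became true in the world
    recorded_at: datetime  # when the store learned about it


class AppendOnlyMemory:
    def __init__(self):
        self._facts = []

    def write(self, text, valid_from=None):
        """Append a fact unless an equivalent one exists; never overwrite."""
        now = datetime.now(timezone.utc)
        for fact in self._facts:
            if _normalize(fact.text) == _normalize(text):
                return fact  # duplicate: keep the existing record
        fact = Fact(text, valid_from or now, now)
        self._facts.append(fact)
        return fact

    def history(self):
        return list(self._facts)


memory = AppendOnlyMemory()
first = memory.write("Prefers Postgres for new services")
memory.write("Prefers SQLite for new services")            # newer fact, old one stays
again = memory.write("prefers  postgres for new services")  # deduplicated
print(len(memory.history()))
```

The second write does not replace the first; both remain in the history, each with its own valid and recorded time, which is what lets a reader ask what was believed at an earlier point.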

Connect any client

Endpoint: https://api.ultramemory.us/mcp (Streamable HTTP) · Auth: Authorization: Bearer um_<key>

Claude Code (CLI):

claude mcp add --transport http ultramemory https://api.ultramemory.us/mcp \
  --header "Authorization: Bearer um_YOUR_KEY"

Cursor / generic mcp.json:

{ "mcpServers": { "ultramemory": {
  "url": "https://api.ultramemory.us/mcp",
  "headers": { "Authorization": "Bearer um_YOUR_KEY" }
}}}

Claude Desktop (mcp-remote bridge):

{ "mcpServers": { "ultramemory": {
  "command": "npx",
  "args": ["mcp-remote@latest", "https://api.ultramemory.us/mcp",
           "--header", "Authorization: Bearer um_YOUR_KEY"]
}}}

Hermes:

pip install ultramemory-hermes
ultramemory enable --key um_YOUR_KEY

ChatGPT: Settings → Apps & Connectors → Developer Mode → Create → URL https://api.ultramemory.us/mcp → Auth = API key. (Plus/Pro = recall-only.)

curl / REST:

curl -s -X POST https://api.ultramemory.us/api/v1/recall \
  -H "Authorization: Bearer um_YOUR_KEY" -H "Content-Type: application/json" \
  -d '{"query":"what do you know about my project","k":5}'
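
The same call can be sketched in Python with only the standard library. The endpoint, headers, and request body mirror the curl command above; the helper names are hypothetical, and treating the response as JSON is an assumption, since the response schema is not described here.

```python
import json
import urllib.request

API_URL = "https://api.ultramemory.us/api/v1/recall"


def build_recall_request(api_key, query, k=5):
    """Build the POST request without sending it."""
    body = json.dumps({"query": query, "k": k}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def recall(api_key, query, k=5, timeout=10):
    """Send the request and decode the body (assumed to be JSON)."""
    req = build_recall_request(api_key, query, k)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request needs no network access; recall() does.
req = build_recall_request("um_YOUR_KEY", "what do you know about my project")
print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the network call in one place and makes the payload easy to inspect before anything is sent.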

Hermes deep integration

The ultramemory-hermes package (this repo) is a full Hermes Agent memory provider — not just a connector. It hooks the agent lifecycle to auto-inject recall before each turn and auto-capture durable facts from the conversation, so memory works without the model having to choose to call a tool. Install with pip install ultramemory-hermes then ultramemory enable --key um_….

Memory spaces (Teams)

On Teams accounts each member has a private member space, and the team has one shared space. Pick where auto-captured memory lands with ULTRAMEMORY_SPACE:

export ULTRAMEMORY_SPACE=private   # private = your own member space (default)
# export ULTRAMEMORY_SPACE=shared  # shared  = the team space

ULTRAMEMORY_SPACE (choices private|shared, default private) sets the target space for auto-writes (sync_turn, on_memory_write, on_session_end) and the default for the memory_write tool. Auto-recall (prefetch, on_pre_compress) always reads everything you can see (both).

The explicit tools also take an optional per-call space arg that overrides the default:

  • memory_write: space = private | shared.
  • memory_recall / recall_gated: space = private | shared | both (default both).

Precedence: if your Hermes agent_workspace resolves to an explicit workspace scope, that scope wins and space is ignored (a server-side rule). space only takes effect for the default (non-workspace) scope.
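
The precedence for writes can be read as a small decision function. The real rule is enforced server-side, so this is only a client-side mirror for reasoning about it; the function name resolve_write_space, its parameters, and the returned tuples are hypothetical.

```python
import os

WRITE_SPACES = ("private", "shared")


def resolve_write_space(call_space=None, workspace_scope=None, env=None):
    """Return where a write would land under the documented precedence.

    1. An explicit workspace scope wins and space is ignored.
    2. Otherwise a per-call space overrides the default.
    3. Otherwise ULTRAMEMORY_SPACE applies, defaulting to private.
    """
    env = os.environ if env is None else env
    if workspace_scope:
        return ("workspace", workspace_scope)
    space = call_space or env.get("ULTRAMEMORY_SPACE", "private")
    if space not in WRITE_SPACES:
        raise ValueError(f"space must be one of {WRITE_SPACES}, got {space!r}")
    return ("space", space)


print(resolve_write_space(env={}))
print(resolve_write_space(call_space="shared", env={"ULTRAMEMORY_SPACE": "private"}))
print(resolve_write_space(call_space="shared", workspace_scope="acme-repo", env={}))
```

Passing env explicitly keeps the function easy to check without touching the real environment.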

Claude Code recall hook

Want deterministic recall in Claude Code without Hermes? Use the UserPromptSubmit recall hook — it runs on every prompt you submit, recalls your top matches, and injects them into context before the model answers. Fail-open and copy-paste runnable. See hooks/README.md.
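
The shape of such a hook can be sketched as follows. This is not the script shipped in hooks/; it assumes the hook receives a JSON event on stdin with a prompt field and that text printed to stdout is added to the model's context. The recall function is injected, so the names build_context, main, and fake_recall are hypothetical.

```python
import io
import json
import sys


def build_context(prompt, recall_fn, limit=5):
    """Return text to inject, or "" on any failure (fail-open)."""
    try:
        facts = list(recall_fn(prompt, limit))[:limit]
    except Exception:
        return ""  # never block the prompt because recall failed
    if not facts:
        return ""
    return "Recalled memory:\n" + "\n".join(f"- {fact}" for fact in facts)


def main(recall_fn, stdin=sys.stdin, stdout=sys.stdout):
    """Read the hook event, write recalled context, always return 0."""
    try:
        event = json.load(stdin)
        context = build_context(event.get("prompt", ""), recall_fn)
    except Exception:
        context = ""
    if context:
        stdout.write(context + "\n")
    return 0


# Demonstration with a stand-in recall function instead of the hosted API.
def fake_recall(prompt, limit):
    return ["Project uses Postgres 16"]


out = io.StringIO()
status = main(fake_recall, stdin=io.StringIO('{"prompt": "which database?"}'), stdout=out)
print(out.getvalue(), end="")
```

A real hook script would pass a function that calls the recall endpoint and would exit with the returned status. Returning 0 with no output on every error path is what keeps the hook fail-open.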

Why UltraMemory

  • Deterministic recall-first. "Recall FIRST" is baked into the tool descriptions and the Hermes auto-inject, so recall does not depend on the model deciding whether to look.
  • Honest about what it doesn't know. A metamemory gate abstains or asks to verify instead of confabulating (LOCOMO: 90.2% correctly abstained).
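
One way a client might act on the three gate outcomes is sketched below. The verdict names answer, verify, and abstain come from the recall_gated description; the function plan_reply, its arguments, and the instruction wording are assumptions, and the actual response format of the tool is not shown here.

```python
def plan_reply(verdict, context):
    """Map a gate verdict to an instruction for the answering model."""
    if verdict == "answer":
        return f"Answer using only this memory:\n{context}"
    if verdict == "verify":
        return f"Ask the user to confirm before relying on:\n{context}"
    # "abstain" and any unrecognized verdict both fall back to not guessing.
    return "Say you don't know; do not guess."


print(plan_reply("verify", "Deploy target: eu-west-1 (recorded 3 months ago)"))
print(plan_reply("abstain", ""))
```

Treating unknown verdicts like abstain keeps the failure mode conservative: the model declines rather than inventing an answer.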

License

Apache-2.0 (see LICENSE). This is the open-source client surface. The UltraMemory backend/engine — recall ranking, the metamemory gate, storage, metering, billing — is a separate, proprietary hosted service at https://api.ultramemory.us.
