ai-memory

ai-memory

Turn editor chat history (Cursor/Claude Code/Windsurf/Copilot/Codex CLI) into typed Markdown memories + AGENTS.md + Cursor Rules via MCP. Local-first, git-trackable.

Category
Visit Server

README

ai-memory

npm version CI License: MIT

Turn AI editor chat history into typed Markdown + AGENTS.md rules — local-first, git-trackable, zero .remember() calls.

ai-memory in 30 seconds

npx ai-memory-cli extract                     # read your editor's chat history → typed Markdown
npx ai-memory-cli rules --target agents-md    # → AGENTS.md (Cursor / Claude / Windsurf / Copilot / Codex all read it)
npx ai-memory-cli recall "OAuth"              # show the full git lineage of any decision
npx ai-memory-cli context --copy              # resume any session with full context

Every other "AI memory" tool starts with a remember() API and asks you to instrument your code. ai-memory reads your editor's chat history directly — Cursor, Claude Code, Windsurf, Copilot Chat, Codex CLI — and turns it into typed, git-trackable Markdown that every AI editor reads back via AGENTS.md. No new API surface to learn, no runtime memory store to keep alive between sessions.

Local-first by default. Conversations never leave your machine; the only network call is to whichever LLM provider you've configured for extraction. Or use Ollama / LM Studio for fully offline operation.

中文文档


What only ai-memory does

Four things you won't find together anywhere else. The first three are structural; the fourth is engineering investment nobody else is making.

  1. Zero .remember() boilerplate. We read what you've already written — the Cursor / Claude Code / Windsurf / Copilot Chat / Codex CLI transcripts that already live on your disk. No SDK to import, no runtime memory store to keep alive. Compare with mem0 / Letta / Zep / cortexmem, which require client.add(...) calls from your application code.

  2. Native AGENTS.md output. ai-memory rules --target agents-md writes the cross-tool standard rules file that Cursor, Claude Code, Windsurf, Copilot, and OpenAI Codex CLI all consume. The merge is idempotent: only the section between <!-- ai-memory:managed-section start --> ... end --> is touched; any hand-written content in your AGENTS.md is preserved byte-for-byte. AGENTS.md adoption crossed 60K repos and is now under Linux Foundation stewardship — most projects hand-write theirs from scratch; we generate it from your conversation history.

  3. Plain Markdown in git — no database. .ai-memory/ is the source of truth: Markdown files you git diff, code-review, branch, and revert. Other tools that advertise "git-trackable" memory ship git-tracked snapshots of their internal store; we ship the human-readable file format and let git own everything. Cross-machine sync is git pull.

  4. Time-travel recall via git history. ai-memory recall <query> shows the full commit-by-commit lineage of every memory: what the decision said on April 1, what it said on April 15, what changed and who changed it. Every other memory tool returns "the latest" only — superseded versions are silently overwritten. No new runtime dep: recall shells out to your existing git with a 10-second timeout.

We measure ourselves

CCEB — Cursor Conversation Extraction Benchmark, gpt-4o-mini, 30 hand-curated fixtures (v1.1 expansion):

Metric v1.1 (2026-04-27, 30 fixtures) v1.0 / v2.5-01 (2026-04-26, 9 fixtures) v2.4 (2026-04-25, 9 fixtures)
Overall F1 64.1% (P 56.8% / R 73.5%) 76.2% (P 66.7% / R 88.9%) 56.0% (P 43.8% / R 77.8%)
decision / issue F1 78.3% / 100% 75.0% / 100% 66.7% / 66.7%
architecture F1 72.7% (recall the new bottleneck) 100% 50%
Noise rejection (chit-chat / deferred / hypothetical) 100% — no hallucinated memories on any of the 4 noise fixtures 100% (2 fixtures) 100% (2 fixtures)
Wall-clock 239.7 s 47.9 s 70.5 s
Spend ≈ $0.02 ≈ $0.006 ≈ $0.005

The v1.1 expansion (cceb-001 — cceb-030) deliberately added harder cases v1.0 didn't exercise: multi-memory-per-conversation (architecture + convention together), commitment-shape ambiguity (process vs. technical TODOs), CJK/mixed-language conversations, and decision-impact-vs-followup-TODO triage. F1 dropped 12 pp from the 9-fixture row above; that's not a model regression — running the v1.0 fixtures alone against the same prompt still scores 76%. The 64% is the more honest measurement of the same extractor on a less cherry-picked fixture distribution. The biggest remaining lever is todo precision (11 of the 19 false positives are TODOs); per the baseline-doc analysis the next move is a post-extract pairwise-content dedup pass, tracked for v2.6.

Sample misses, sample false positives, the per-fixture detail, the v1.0 → v1.1 delta analysis, and the methodology are all in the baseline doc. We'd rather publish numbers we can defend on cross-examination than shop a leaderboard score that drifts the moment the upstream model updates.

LongMemEval-50 (cross-corpus sanity check, bench/longmemeval/): on a deterministic 50-question subset of LongMemEval-S-cleaned, our literal-token evidence-preservation rubric scores 0 / 50 full + 2 / 50 partial with gpt-4o-mini (~12 min, ~$0.40). This is a deliberately strict proxy ("did every key token of the upstream answer survive into our extracted memories?", not LongMemEval native QA correctness — see the spike doc §4.3 for the rubric); 0/50 says ai-memory is not pointed at open-domain QA over a 500-turn haystack, and the per-question matched/total counts in the baseline doc show where partial signal does land (single-session-preference: 3-6 of 17-43 tokens consistently). LongMemEval, LoCoMo, et al. measure runtime recall (did the agent remember a fact); we measure extraction (did we get the right structured artefact out of the chat). Different layer, different question — see also the category-positioning ADR.

Other things it handles

  • Token savingscontext compresses thousands of turns into a focused prompt (typically 90%+ reduction vs. pasting raw history).
  • Team-aware — per-author subdirectories under .ai-memory/{author}/, no merge conflicts when two people commit memories from the same project.
  • Cross-device portableexport / import round-trip the whole store as a versioned JSON bundle.
  • Zero confignpx ai-memory-cli init --with-mcp and you're done.

FAQ

"Doesn't 1M-token context obsolete you?"

Short answer: long context and ai-memory solve different parts of the same problem. 1M-token windows let the model see a long conversation in one query; ai-memory makes that conversation's decisions persistent, reviewable, and shareable across sessions, machines, and teammates. We answer the question seriously below because it's the most-cited objection on HN to any structured-memory tool.

Cost compounds when you re-ship history every query. Frontier input pricing as of 2026-04 sits at ~$1–$3 per 1M tokens (Anthropic / OpenAI / Google AI). A two-week Cursor session reliably runs 100–300K tokens once tool-call payloads and file diffs are included; pasting that into every turn costs $0.20–$0.60 per query before you've asked anything. An AGENTS.md generated from the same conversation is on the order of 1–5K tokens, loaded once per session. Multiply by your team size and queries-per-day; the gap is two orders of magnitude.

Long-context retrieval still degrades on non-headline information. "Lost in the middle" (Liu et al. 2023) and needle-in-haystack at 1M scale (BABILong, Kuratov et al. 2024) both show measurable recall drop on multi-hop retrieval past ~128–256K tokens, even on models that advertise 1M-token windows. Long context works well for the most-recent and most-prominent turns; it degrades on the everyday "wait, what did we decide about X three weeks ago?" question — exactly the queries memory tools are designed for. Extraction is lossless on the only signal that matters (the typed decision / convention / architecture).

Long context is per-machine; AGENTS.md is per-repo. Your laptop's chat history doesn't help your teammate's first day. A .ai-memory/ directory committed to git does — it's reviewable in PRs, branchable, revertable, and re-readable by every editor on every machine that clones the repo. See What only ai-memory does — points 3 and 4 are the long form.

We'll re-spike this FAQ if (a) sub-$0.50/M frontier pricing ships, (b) long-context benchmarks show <5% retrieval degradation past 500K, or (c) editors start shipping native cross-session conversation compression. Trigger list and full reasoning are in docs/1m-context-faq-spike-2026-04-27.md.


Quick Start

# 30-second demo — no API key required.
# Bootstraps a 3-memory hand-curated store in a tmp dir and prints the
# AGENTS.md it generates (the file Cursor / Codex / Windsurf / Copilot all
# read at session start). Cleans up afterwards.
npx ai-memory-cli try

# Set up API key (any OpenAI-compatible provider)
export AI_REVIEW_API_KEY=sk-...    # or OPENAI_API_KEY

# Initialise project (optionally register ai-memory as an MCP server)
npx ai-memory-cli init --with-mcp

# One-shot health check — verifies editors, API key, store, MCP config
npx ai-memory-cli doctor

# Extract knowledge from all conversations
npx ai-memory-cli extract

# Search your knowledge base
npx ai-memory-cli search "authentication"

# Generate Cursor Rules from conventions
npx ai-memory-cli rules

# Generate a context prompt and copy to clipboard
npx ai-memory-cli context --copy

# Commit to git
git add .ai-memory/ && git commit -m "chore: add ai-memory knowledge base"

Commands

try — No-API-key demo (30 seconds, zero credentials)

Bootstraps a hand-curated 3-memory store in a tmp dir, runs the real rules --target agents-md pipeline against it, and prints the generated AGENTS.md inline. No LLM call, no API key, no changes to your working directory — just a concrete answer to "what does this thing actually produce?" before you commit to setup.

npx ai-memory-cli try                     # full demo, tmp dir cleaned up afterwards
npx ai-memory-cli try --keep              # leave the tmp scenario on disk for inspection
npx ai-memory-cli try --json              # structured output (counts, AGENTS.md content, paths)

The bundled scenario contains 1 decision (PKCE auth flow), 1 architecture record (event-sourced billing audit log), and 1 convention (Relay-style cursor pagination) across two authors. Only conventions and decisions land in AGENTS.md — the same filter the real rules command uses on your own memories.

doctor — One-shot health check

Run this after try if you decide to set ai-memory up against your real chat history. It diagnoses the six most common setup problems and tells you exactly how to fix each one.

npx ai-memory-cli doctor                 # human-readable report
npx ai-memory-cli doctor --no-llm-check  # skip live API call (offline / CI)
npx ai-memory-cli doctor --json          # structured output for automation / bug reports

Checks cover: Node.js version, detected editors (Cursor / Claude Code / Windsurf / Copilot / Codex CLI + conversation counts), LLM provider + live connectivity probe, memory store + author resolution, embeddings freshness, and MCP config registration. Exit code is 0 if everything passes, 1 if any check fails. When no API key is configured, doctor now points at try as the no-key fast path.

list — Show available conversations

npx ai-memory-cli list                            # show all conversations
npx ai-memory-cli list --source cursor             # only Cursor conversations
npx ai-memory-cli list --json                      # JSON output

extract — Extract memories from conversations

npx ai-memory-cli extract                          # extract all conversations
npx ai-memory-cli extract --incremental             # only new conversations
npx ai-memory-cli extract --pick 3                  # only conversation #3
npx ai-memory-cli extract --pick 1,4,7              # multiple conversations
npx ai-memory-cli extract --id b5677be8             # match by ID prefix
npx ai-memory-cli extract --since "3 days ago"      # only recent conversations
npx ai-memory-cli extract --type decision,todo      # only specific types
npx ai-memory-cli extract --dry-run                 # preview without writing
npx ai-memory-cli extract --force                   # overwrite existing files
npx ai-memory-cli extract --author "alice"          # override author name
npx ai-memory-cli extract --redact                  # scrub secrets / PII before LLM call (v2.5+)
npx ai-memory-cli extract --verbose                 # show LLM request details
npx ai-memory-cli extract --json                    # JSON output (CI friendly)

--redact — scrub secrets / PII / internal hostnames before sending to the LLM (v2.5+)

extract, summary, and context --summarize ship conversation excerpts to your configured LLM provider. "Local-first" applies to the storage layer.ai-memory/ is plain Markdown that we never upload — but the extraction call is necessarily an outbound HTTPS request. If your chat history contains accidentally pasted API keys, internal hostnames in stack traces, or customer email addresses in logs, --redact scrubs them before the request leaves your machine.

$ ai-memory extract --redact
   ...
Redaction: 5 items scrubbed before LLM (118 chars) — 3 openai-key, 2 email

Default-on rules (10): OpenAI / Anthropic / AWS / GitHub / Slack / GCP / Stripe API keys, RFC5322 emails, and *.internal / *.corp / *.local / *.lan / *.intra hostnames. Two opt-in rules (jwt, aws-secret-key) are off by default because they have high false-positive rates against long base64 strings; enable them via .ai-memory/.config.json:

{
  "redact": {
    "enabled": true,
    "enableOptional": ["jwt"],
    "rules": [{ "name": "internal-jira", "pattern": "JIRA-[0-9]{4,}" }]
  }
}

CLI overrides config: --no-redact always disables, --redact always enables. The audit trail (per-rule hit counts) is always on when redaction is on, in both human and --json output. The matched value is never logged — that would defeat the purpose.

Threat model. Defense-in-depth, not a substitute for proper secrets management. The full policy doc — including out-of-scope items (image attachments, retroactive scrubbing of pre-existing memories, structured-PII vault inspection) and the threat-model boundaries — lives at docs/redaction-policy-2026-04-26.md.

search — Search through extracted memories

npx ai-memory-cli search "OAuth"                   # keyword search across all memories
npx ai-memory-cli search "payment" --type decision  # filter by type
npx ai-memory-cli search "auth" --author alice      # filter by author
npx ai-memory-cli search "API" --include-resolved   # include resolved memories
npx ai-memory-cli search "config" --json            # JSON output

Results are ranked by relevance (title matches > content > context) with highlighted keywords.

recall — Time-travel a memory through git history

Every other "memory" tool flattens its store down to "the latest" — every superseded version is silently overwritten. Because .ai-memory/ is plain Markdown in a git repo, the full lineage of every fact is already on disk; recall exposes it as a first-class command.

npx ai-memory-cli recall "OAuth"                   # show how the OAuth decision evolved
npx ai-memory-cli recall "OAuth" --include-resolved # include superseded / resolved memories
npx ai-memory-cli recall "API" --type decision      # filter by type
npx ai-memory-cli recall "auth" --all-authors       # search across the whole team
npx ai-memory-cli recall "OAuth" --json             # structured output (one entry per memory + its commit list)

Output looks like:

Recall: "OAuth" — 1 memory, 4 commits of lineage

[+] CURRENT  Use OAuth 2.0 PKCE for SPA  @conor (2026-04-20)
    .ai-memory/conor/decisions/2026-04-20-use-oauth-pkce.md
    History (4 commits):
      a1b2c3d  2026-04-20  conor   ~ Tighten OAuth PKCE: require HTTPS-only token endpoint
      e4f5g6h  2026-04-15  conor   ~ Switch from implicit flow to PKCE
      i7j8k9l  2026-03-20  conor   + Add OAuth library notes
    > git log --follow .ai-memory/conor/decisions/2026-04-20-use-oauth-pkce.md  for full diffs
  • Uses git log --follow so renames inside .ai-memory/ are tracked transparently.
  • Each line shows short SHA, ISO date, author, status code (+ added, ~ modified, - deleted, R renamed), and commit subject.
  • Soft fallback — outside a git repo, or before the first commit of .ai-memory/, recall still returns the matching memories with a hint explaining what's missing. There is no scenario where recall is worse than search.
  • No new runtime dep — pure node:child_process.execFile against your existing git with bounded 10s timeouts.

rules — Export conventions as Cursor Rules, AGENTS.md, and Anthropic Skills

Generate editor rules that every AI tool reads natively:

npx ai-memory-cli rules                            # default: .cursor/rules/ai-memory-conventions.mdc
npx ai-memory-cli rules --target agents-md         # AGENTS.md (Codex / Cursor / Windsurf / Copilot / Amp)
npx ai-memory-cli rules --target skills            # Anthropic Skills (Claude Code) — v2.5+
npx ai-memory-cli rules --target both              # write Cursor Rules + AGENTS.md at default paths
npx ai-memory-cli rules --output my-rules.mdc      # custom output (single-target only)
npx ai-memory-cli rules --all-authors              # include team conventions

--target agents-md performs an idempotent merge: only the section between <!-- ai-memory:managed-section start --> ... end --> is touched, so any hand-written content in your AGENTS.md is preserved byte-for-byte. Re-running with no new memories is a no-op (already-up-to-date); malformed markers from a partial edit are reported as a conflict and the file is left untouched.

--target skills writes Anthropic Skills under .claude/skills/. Three skills get generated, one per long-lived memory type:

Skill Source What it tells Claude
.claude/skills/ai-memory-coding-conventions/SKILL.md convention memories When writing new code / naming things / designing APIs
.claude/skills/ai-memory-decision-log/SKILL.md decision memories (status ≠ resolved) When proposing architectural changes / asked why a choice was made
.claude/skills/ai-memory-system-architecture/SKILL.md architecture memories When implementing cross-component features / debugging integration

Skills are loaded dynamically by Claude Code based on the YAML frontmatter description matching your request — unlike AGENTS.md (always-on context), the body only enters context when relevant. The schema we target (frozen 2026-04-26) lives at docs/skills-schema-snapshot-2026-04-26.md. The ai-memory- prefix on skill names is an ownership signal: anything inside .claude/skills/ai-memory-*/ is fully regenerated every run; user-authored skills under any other directory name are left alone.

This is the conversation-to-rules pipeline — extract conventions from chat history, auto-generate the rules files every AI editor reads. No other tool emits all three of Cursor Rules + AGENTS.md + Anthropic Skills from a single chat-history input.

resolve — Mark memories as resolved

Decisions get overturned. TODOs get completed. Keep your knowledge base fresh:

npx ai-memory-cli resolve "OAuth"                  # mark matching memories as resolved
npx ai-memory-cli resolve "OAuth" --undo           # reactivate resolved memories

Resolved memories are automatically excluded from context, summary, and search results. Use --include-resolved to force inclusion.

summary — Generate a project-level summary

npx ai-memory-cli summary                          # write/update SUMMARY.md
npx ai-memory-cli summary --output MEMORY.md       # custom output path
npx ai-memory-cli summary --focus "payment module"  # focus on a topic
npx ai-memory-cli summary --all-authors             # include all team members
npx ai-memory-cli summary --include-resolved        # include resolved memories

# Scope summary to a single conversation (no LLM cost for the wrong chats)
npx ai-memory-cli summary --list-sources           # list conversations first
npx ai-memory-cli summary --source-id b5677be8     # summarize ONE chat
npx ai-memory-cli summary --convo "payment refactor"

context — Generate a continuation prompt

For seamlessly resuming work in a new conversation or on another machine:

npx ai-memory-cli context                          # generate context block (instant, no LLM)
npx ai-memory-cli context --copy                   # generate and copy to clipboard
npx ai-memory-cli context --topic "coupon system"  # focus on a specific topic
npx ai-memory-cli context --recent 7               # only last 7 days of memories
npx ai-memory-cli context --output CONTEXT.md      # write to file
npx ai-memory-cli context --summarize              # use LLM for condensed prose summary
npx ai-memory-cli context --all-authors            # include all team members
npx ai-memory-cli context --include-resolved       # include resolved memories

Scope context to a single conversation — because in real life you usually want to resume one chat, not dump everything:

# 1. See which conversations produced memories
npx ai-memory-cli context --list-sources
#  #  Date        Source        ID        Count  Types              Title
#  ------------------------------------------------------------------------------
#   1  2026-04-01  cursor        b5677be8    12  D:4 A:3 C:5        resume tool
#   2  2026-03-28  claude-code   ff12abc3     7  A:4 T:3            ai-lab

# 2. Copy context from ONE conversation (ID prefix, like git short hash)
npx ai-memory-cli context --source-id b5677be8 --copy

# 3. Or match by conversation title — picks the most recent if multiple match
npx ai-memory-cli context --convo "resume tool" --copy
npx ai-memory-cli context --convo "resume" --all-matching --copy   # include every "resume*" chat

link — Link memories to the commits that implement them (v2.6)

npx ai-memory-cli link                             # scan last 30 days of commits
npx ai-memory-cli link --since "7 days ago"        # custom time window
npx ai-memory-cli link --dry-run                   # preview links without writing
npx ai-memory-cli link --clear-auto                # remove all auto-generated links

Scans your git log and scores each (memory, commit) pair using weighted token overlap: memory title × 3, type × 2, content × 1 vs commit subject × 3, changed paths × 2, body × 1. High-confidence matches (score ≥ 0.70) are written into the memory file as an invisible HTML comment block that the dashboard can surface. The default threshold is conservative — a bad auto-link is worse than no link. Use --dry-run first on a real repo to calibrate.

init — Initialize configuration

npx ai-memory-cli init                             # detect editors, create config
npx ai-memory-cli init --with-mcp                  # also register ai-memory as MCP server
npx ai-memory-cli init --schedule                  # register a daily extract --incremental cron job
npx ai-memory-cli init --unschedule                # remove the scheduled task

With --with-mcp, ai-memory writes / merges .cursor/mcp.json and .windsurf/mcp.json so your editor picks up the MCP server automatically. With --schedule, a daily extraction job is registered with the OS-native scheduler (launchd on macOS, crontab on Linux, Task Scheduler on Windows) — so your knowledge base stays fresh without any manual runs. Both flags are idempotent and safe.

export / import — Move memories between machines (NEW)

Cursor / Claude Code conversations live in each machine's local state, so a new laptop starts with no history. export and import create a portable JSON bundle that round-trips cleanly — same files, same conversation grouping, same context --source-id behavior on the destination.

# On the old machine — export everything (or scope with --source-id / --convo / --type)
npx ai-memory-cli export --output backup.ai-memory.json
npx ai-memory-cli export --source-id b5677be8 --output resume-tool.json  # one chat only
npx ai-memory-cli export --convo "coupon" --output coupons.json          # match by title

# Copy / commit / share the bundle (it's a plain JSON file)

# On the new machine — preview first, then apply
npx ai-memory-cli import backup.ai-memory.json --dry-run
npx ai-memory-cli import backup.ai-memory.json               # default: skip duplicates
npx ai-memory-cli import teammate-bundle.json --author me    # remap teammate's memories
npx ai-memory-cli import stale.json --overwrite              # replace local copies

# Rebuild embeddings so semantic search/MCP work on the imported memories
npx ai-memory-cli reindex

Bundle format is versioned (version: 1) and import is idempotent — running the same import twice is a no-op (dedup on author + type + date + title).


MCP Server (NEW)

ai-memory can run as an MCP server, giving AI editors (Cursor, Claude Code) direct access to your knowledge base — no manual commands needed.

Setup

Add to your Cursor MCP config (.cursor/mcp.json):

{
  "mcpServers": {
    "ai-memory": {
      "command": "npx",
      "args": ["ai-memory-cli", "serve"]
    }
  }
}

Or for Claude Code (.claude/claude_desktop_config.json):

{
  "mcpServers": {
    "ai-memory": {
      "command": "npx",
      "args": ["ai-memory-cli", "serve"]
    }
  }
}

What the AI gets

MCP Capability What it does
remember tool AI stores decisions/conventions/todos during conversations (auto-indexed)
recall tool AI retrieves relevant memories using semantic + keyword hybrid search
search_memories tool Full search with type/author/resolved filtering, semantic-aware
project-context resource Auto-provides project context when starting a conversation

Once configured, the AI can automatically remember important decisions and recall them in future sessions — without you running any commands.

Semantic Search

ai-memory uses hybrid search combining semantic similarity (via embeddings), keyword matching, and time decay. This means you can search by meaning, not just exact keywords.

# Build search index (uses your existing LLM API for embeddings)
npx ai-memory-cli reindex

# Now search works semantically — "database choice" finds "PostgreSQL decision"
npx ai-memory-cli search "database choice"

The MCP recall and search_memories tools use hybrid search automatically. Embeddings are stored locally in .ai-memory/.embeddings.json and auto-indexed when using the remember tool.

Manual start (for testing)

npx ai-memory-cli serve           # start MCP server
npx ai-memory-cli serve --debug   # with debug logging

Watch Mode (NEW)

Automatically extract knowledge when conversations change — zero manual effort:

npx ai-memory-cli watch

Watch mode monitors all detected sources for new conversation activity and runs extraction automatically. It uses file system events (for Cursor/Claude Code) and periodic polling (for all sources) to detect changes.

ai-memory watch — auto-extract on conversation changes

   Author: conor
   Output: .ai-memory/
   [+] Watching: Cursor
   [+] Watching: Claude Code

Initial scan complete — watching for changes...

10:15:32 [Cursor] "OAuth refactor discussion" (+8 turns) — extracting...
10:15:37 [+] 2 decision, 1 convention

Press Ctrl+C to stop.


Dashboard (NEW)

Browse, search, and visualize your knowledge base in a local web UI:

npx ai-memory-cli dashboard

Opens http://localhost:3141 with:

  • Overview — stats cards, monthly timeline chart, author breakdown, recent activity
  • Memory browser — real-time search, filter by type/author/status, detail modal
  • Conversations — one card per chat window that produced memories, with a one-click context --source-id copy so you can jump from "which chat did I make that decision in?" to "resume that chat in a new session"
  • Knowledge graph — interactive D3.js force-directed graph (nodes colored by type, edges by shared conversation or keywords)
  • Quality — specificity histogram, vague content list, duplicate/subsumed pairs (powered by the v2.2 algorithm stack)
  • Export — download as JSON, Obsidian vault (with YAML frontmatter), or copy to clipboard
npx ai-memory-cli dashboard --port 8080   # custom port

Clean up existing memories with the new algorithms

If you upgraded from an older version and want to retroactively apply the v2.2 quality algorithms to remove vague/duplicate memories you accumulated earlier:

npx ai-memory-cli reindex --dedup --dry-run   # preview what would be deleted
npx ai-memory-cli reindex --dedup             # actually delete + update index

Typical cleanup on a 200+ memory store removes 20–30% as vague/duplicate/subsumed.


Local LLM Support (NEW)

Use Ollama or LM Studio instead of cloud APIs — no API key needed:

Ollama

# Install Ollama: https://ollama.ai
ollama pull llama3.2              # download a model
ollama pull nomic-embed-text      # (optional) for semantic search

export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODEL=llama3.2      # extraction model
npx ai-memory-cli extract

LM Studio

# Start LM Studio and load a model
export LM_STUDIO_BASE_URL=http://localhost:1234/v1
export LM_STUDIO_MODEL=your-model-name
npx ai-memory-cli extract

Cloud API keys always take priority over local LLM. If you have OPENAI_API_KEY or AI_REVIEW_API_KEY set, those will be used.

Variable Description
OLLAMA_HOST Ollama server URL (default: http://localhost:11434)
OLLAMA_MODEL Model for extraction (default: llama3.2)
OLLAMA_EMBEDDING_MODEL Model for semantic search (default: nomic-embed-text)
LM_STUDIO_BASE_URL LM Studio server URL (default: http://localhost:1234/v1)
LM_STUDIO_MODEL Model name

Supported Sources

Source Data location Status
Cursor ~/.cursor/projects/{name}/agent-transcripts/ Stable
Claude Code ~/.claude/projects/{path}/*.jsonl Stable
Windsurf ~/AppData/Windsurf/User/workspaceStorage/*/state.vscdb Beta
VS Code Copilot ~/AppData/Code/User/workspaceStorage/*/chatSessions/*.json Beta
Codex CLI ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl Beta — v2.5+

Typical Workflow

First extraction

npx ai-memory-cli list                    # see what conversations are available
npx ai-memory-cli extract                 # extract everything (few minutes on first run)
npx ai-memory-cli rules                   # generate Cursor Rules
git add .ai-memory/ .cursor/rules/
git commit -m "chore: add ai-memory knowledge base"

Daily use (incremental)

npx ai-memory-cli extract --incremental   # after a productive coding session
npx ai-memory-cli rules                   # refresh Cursor Rules
git add .ai-memory/ && git commit -m "chore: update memories"

Starting a new conversation

npx ai-memory-cli context --copy          # copy context to clipboard
# Paste into new Cursor/Claude Code session

The output looks like:

## Project Context

### Key Decisions (follow without re-discussion)
- **Use OAuth Bridge pattern**: WebView cannot receive redirect directly...

### Conventions (always follow)
- **Never call getServerSideProps in this project**: ...

### Active TODOs
- [ ] Add retry logic to payment webhook handler

Finding specific knowledge

npx ai-memory-cli search "payment"        # find all payment-related memories
npx ai-memory-cli search "auth" --type decision  # only decisions about auth

Team Workflow

When multiple people use ai-memory in the same git repo, each person's memories are automatically stored in their own subdirectory.

How it works

Author identity is auto-detected (priority: --author CLI flag > config.author > git config user.name > OS username). No manual setup needed.

.ai-memory/
├── conor/
│   ├── decisions/
│   │   └── 2026-04-15-oauth-bridge.md
│   └── todos/
│       └── 2026-04-15-add-retry.md
├── alice/
│   ├── decisions/
│   │   └── 2026-04-16-payment-design.md
│   └── architecture/
│       └── 2026-04-16-module-split.md
└── .config.json

Usage

# Everyone extracts normally — writes to their own directory
npx ai-memory-cli extract --incremental

# Generate your own context (default: only your memories)
npx ai-memory-cli context --copy

# Include the whole team's memories
npx ai-memory-cli summary --all-authors
npx ai-memory-cli context --all-authors --copy

# Override author name
npx ai-memory-cli extract --author "alice"

Upgrading existing projects

Memories created before v1.3 are stored in flat directories (.ai-memory/decisions/). After upgrading:

  • Old files are still read normally (backwards compatible), with author empty
  • New extractions go to .ai-memory/{author}/decisions/ etc.
  • No manual migration required

Cross-Device Workflow

Work machine                                   Home machine
────────────                                   ────────────
Cursor / Claude Code dev work
        -> npx ai-memory-cli extract --incremental
        -> git add .ai-memory/
git commit && git push
                                               git pull
                                               -> npx ai-memory-cli context --topic "today's work"
                                               -> Paste context into new conversation
                                               -> Seamlessly resume

Configuration

ai-memory works with zero config. To customize, run npx ai-memory-cli init or create .ai-memory/.config.json manually:

{
  "sources": {
    "cursor": { "enabled": true, "projectName": "my-project" },
    "claudeCode": { "enabled": true },
    "windsurf": { "enabled": true },
    "copilot": { "enabled": true }
  },
  "extract": {
    "types": ["decision", "architecture", "convention", "todo", "issue"],
    "ignoreConversations": [],    // conversation UUIDs to skip
    "minConversationLength": 5   // skip very short conversations
  },
  "output": {
    "dir": ".ai-memory",
    "summaryFile": "SUMMARY.md",
    "language": "zh"             // "zh" or "en" — output language for summaries
  },
  "model": "",                   // leave empty for auto-selection
  "author": ""                   // leave empty to auto-detect from git config user.name
}

Environment variables

Variable Description
AI_REVIEW_API_KEY API key (preferred, shared with ai-review-pipeline)
OPENAI_API_KEY OpenAI API key
OPENAI_BASE_URL Custom OpenAI-compatible API base URL
OPENAI_MODEL Model override for OpenAI
ANTHROPIC_API_KEY Anthropic API key (requires compatible proxy)
ANTHROPIC_BASE_URL Anthropic proxy base URL
AI_REVIEW_BASE_URL Custom API base URL
AI_REVIEW_MODEL Model to use (default: gpt-4o-mini)
OLLAMA_HOST Ollama server URL (default: http://localhost:11434)
OLLAMA_MODEL Ollama model for extraction
OLLAMA_EMBEDDING_MODEL Ollama model for semantic search embeddings
LM_STUDIO_BASE_URL LM Studio API URL
LM_STUDIO_MODEL LM Studio model name

Output Structure

Each memory is its own file, organized by author and type:

.ai-memory/
├── SUMMARY.md                              # Project summary (from `summary` command)
├── conor/                                  # Per-author subdirectory
│   ├── decisions/
│   │   ├── 2026-04-12-oauth-bridge-pattern.md
│   │   └── 2026-04-13-async-job-queue-design.md
│   ├── architecture/
│   │   └── 2026-04-10-payment-module-design.md
│   ├── conventions/
│   │   └── 2026-04-08-coding-conventions.md
│   ├── todos/
│   │   └── 2026-04-12-add-retry-logic.md
│   └── issues/
│       └── 2026-04-11-sqlite-locking-fix.md
├── .index/                                 # Extraction index (auto-managed)
├── .config.json                            # Configuration (commit this)
└── .state.json                             # Extraction state (add to .gitignore)

Add .ai-memory/.state.json to .gitignore — it tracks which conversations have been processed and is machine-specific.


CI Integration

# .github/workflows/memory.yml
- name: Extract AI memories
  run: npx ai-memory-cli extract --incremental --json
  env:
    AI_REVIEW_API_KEY: ${{ secrets.AI_REVIEW_API_KEY }}

Requirements

  • Node.js >= 18
  • An API key for any OpenAI-compatible provider, or a local LLM (Ollama / LM Studio)

Tip: Node.js 22+ enables richer conversation titles by reading Cursor/Windsurf's database. On Node 18-20, titles are extracted from the first message (still works fine).

License

MIT — Conor Liu

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured