Flux Memory
Flux Memory is an MCP server that provides a self-organizing weighted graph memory for AI agents, enabling persistent knowledge storage, retrieval, and feedback-based learning.
Self-organizing retrieval fabric for AI memory.
Flux Memory is an AI memory system that persists knowledge as a self-modifying weighted graph. It learns which memories matter through feedback signals - reinforcing useful grains, decaying stale ones, and automatically clustering related knowledge.
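To make the reinforce/decay idea concrete, here is a minimal sketch of lazy decay combined with feedback on a single weighted edge. It is not Flux's implementation: the Conduit class, the half-life, and the learning rate are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

# Assumed constants, not Flux defaults.
HALF_LIFE_SECONDS = 7 * 24 * 3600  # weight halves after a week without use
LEARNING_RATE = 0.2                # step size applied per feedback signal


@dataclass
class Conduit:
    """Hypothetical weighted edge between two grains."""
    weight: float
    last_touched: float  # timestamp in seconds


def decayed_weight(conduit: Conduit, now: float) -> float:
    """Lazy decay: the current weight is computed only when the edge is read."""
    elapsed = max(0.0, now - conduit.last_touched)
    return conduit.weight * math.pow(0.5, elapsed / HALF_LIFE_SECONDS)


def apply_feedback(conduit: Conduit, useful: bool, now: float) -> None:
    """Settle the pending decay, then move the weight toward 1.0 or 0.0."""
    current = decayed_weight(conduit, now)
    target = 1.0 if useful else 0.0
    conduit.weight = current + LEARNING_RATE * (target - current)
    conduit.last_touched = now
```

Because decay is evaluated on read, no background job has to touch every edge; stale edges simply look weaker the next time they are visited.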
Features
- Graph-based memory - grains connected by weighted conduits, propagated via signal attenuation
- Self-organizing - lazy decay, Louvain clustering, automatic promotion/demotion, shortcut reinforcement
- Three access paths - MCP server (for AI agents), REST API (HTTP), Python SDK
- Booth architecture - concurrent read workers, serial write queue, async feedback queue
- Per-caller rate limiting - 500 grains/min default, configurable per instance
- Admin authentication - argon2 password hashing, TOTP 2FA (RFC 6238), session tokens
- Two operating modes - flux_extracts (local Ollama LLM) or caller_extracts (AI provides features)
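The first feature, signal propagation with attenuation over weighted conduits, can be sketched as a bounded spreading-activation walk. This is an illustration only: the function name, the attenuation factor, the floor, and the hop limit are assumptions, and Flux's actual retrieval may differ.

```python
from collections import defaultdict

# node -> {neighbor: conduit weight in (0, 1]}
Graph = dict[str, dict[str, float]]


def propagate(
    graph: Graph,
    seeds: dict[str, float],
    attenuation: float = 0.5,  # assumed per-hop damping factor
    floor: float = 0.05,       # assumed cutoff below which a signal is dropped
    max_hops: int = 3,         # assumed traversal depth
) -> dict[str, float]:
    """Spread signal from seed grains, keeping the strongest signal per grain."""
    activation: dict[str, float] = defaultdict(float)
    for node, signal in seeds.items():
        activation[node] = max(activation[node], signal)

    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier: dict[str, float] = {}
        for node, signal in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = signal * weight * attenuation
                if passed < floor:
                    continue  # too weak to matter; stop following this path
                if passed > activation[neighbor]:
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return dict(activation)
```

With strong conduits a distant grain can still surface, while weak conduits cut the signal off after one or two hops.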
Quick Start
Install
pip install flux-memory
Windows fallback if flux is not on PATH:
python -m flux --help
python -m flux init --name my-memory
For CLI-first installs, pipx install flux-memory is recommended because it manages command shims and PATH setup.
Initialize an instance
flux init --name my-memory
This prompts for:
- Operating mode (caller_extracts or flux_extracts)
- Admin password (argon2-hashed)
- Optional TOTP two-factor authentication, with terminal QR and first-code verification
Initialization also writes MCP client snippets under:
~/.flux/<name>/integrations/
Start services
flux start --name my-memory
Starts:
- REST API health endpoint at http://localhost:7465/health
- Dashboard at http://localhost:7462
To view the dashboard from a phone on the same local network, start with:
flux start --name my-memory --broadcast
This binds the dashboard to 0.0.0.0, prints LAN URLs such as
http://192.168.x.x:7462, and serves a device-frame preview at
/mobile-preview. The REST API remains local-only by default.
For private access from outside the local network, use a tailnet/VPN such as Tailscale instead of router port forwarding. Keep Flux running locally, sign in to Tailscale on this machine and the remote device, then publish only the dashboard to your private tailnet:
flux start --name my-memory
tailscale serve --http=7462 http://127.0.0.1:7462
tailscale serve status
Then open http://<tailscale-device-name>:7462 from another signed-in
Tailscale device. This keeps the dashboard private to your tailnet and does not
expose the REST API or MCP transport to the public internet.
If you prefer not to install an overlay app, the private alternative is to run
your own VPN endpoint, usually on your router or firewall. Connect to that VPN
from the remote device using an OS-supported VPN profile, then open the
dashboard at the LAN address printed by flux start --name my-memory --broadcast.
Do not forward port 7462 directly from the router to the internet.
flux start does not make the stdio MCP server discoverable by itself. MCP clients launch stdio servers directly. Use the generated snippet or run:
flux mcp --name my-memory
from your MCP client configuration.
In other words, flux start starts only the REST API and dashboard. It does
not start a background network MCP server that Codex, Claude, Cursor, or other
clients can auto-detect. Each MCP client must have its own config entry.
Stop services
flux stop --name my-memory
Check status
flux status --name my-memory
MCP Integration
Connect Flux Memory to any MCP-compatible AI agent. Flux uses stdio MCP by default, so the client must launch Flux.
Generate or refresh client snippets:
flux mcp-config --name my-memory
Codex example:
[mcp_servers."flux-my-memory"]
command = "python"
args = ["-m", "flux.cli", "mcp", "--name", "my-memory"]
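Several other MCP clients read a JSON file with an mcpServers map instead of TOML. The sketch below builds such an entry by mirroring the Codex command above. The JSON shape and the helper name mcp_server_entry are assumptions; prefer the snippets generated under ~/.flux/<name>/integrations/ when they cover your client.

```python
import json


def mcp_server_entry(instance_name: str) -> dict:
    """Build a JSON-style MCP client entry (assumed shape, hypothetical helper)."""
    return {
        "mcpServers": {
            f"flux-{instance_name}": {
                "command": "python",
                # Same launch command as the Codex example.
                "args": ["-m", "flux.cli", "mcp", "--name", instance_name],
            }
        }
    }


print(json.dumps(mcp_server_entry("my-memory"), indent=2))
```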
On first connection, call flux_onboard to receive integration instructions:
flux_onboard() -> returns workflow instructions + operating mode
Standard workflow per conversation turn:
1. flux_retrieve(query) - fetch relevant memories before responding
2. flux_store(content, provenance) - save new facts after responding
3. flux_feedback(trace_id, grain_id, useful) - rate each retrieved grain
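The per-turn workflow can be sketched as one function that wraps whatever produces the reply. The FluxTools protocol and run_turn are hypothetical, and the result shape (a dict with trace_id and grains) is an assumption; only the three tool names and their parameters come from this README.

```python
from typing import Callable, Protocol


class FluxTools(Protocol):
    """Hypothetical wrapper around the three MCP tools used each turn."""

    def flux_retrieve(self, query: str) -> dict: ...
    def flux_store(self, content: str, provenance: str) -> str: ...
    def flux_feedback(self, trace_id: str, grain_id: str, useful: bool) -> None: ...


# respond(message, grains) -> (reply, ids of grains actually used, new facts)
Responder = Callable[[str, list[dict]], tuple[str, set[str], list[str]]]


def run_turn(tools: FluxTools, user_message: str, respond: Responder) -> str:
    # 1. Retrieve before responding.
    result = tools.flux_retrieve(user_message)
    grains = result["grains"]

    reply, used_ids, new_facts = respond(user_message, grains)

    # 2. Store new facts after responding.
    for fact in new_facts:
        tools.flux_store(fact, provenance="user_stated")

    # 3. Rate every retrieved grain, useful or not.
    for grain in grains:
        tools.flux_feedback(result["trace_id"], grain["id"], grain["id"] in used_ids)
    return reply
```

Rating every grain, including the unhelpful ones, matters here: negative feedback is as much a learning signal as positive feedback.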
Every client should also send a portable caller identity:
- client: any stable AI/tool name, such as codex, claude, or local-agent-1
- role: one of chat, memory_writer, background_lookup, system, admin, test
Use caller_id="<client>:<role>", for example local-agent-1:chat.
MCP clients may instead send separate client and role fields.
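A small helper keeps caller identities consistent across tools. This is a sketch: make_caller_id and parse_caller_id are hypothetical names, and rejecting a colon inside the client name is an assumption made so the combined string can be split again.

```python
# Roles listed in this README.
ROLES = {"chat", "memory_writer", "background_lookup", "system", "admin", "test"}


def make_caller_id(client: str, role: str) -> str:
    """Combine a client name and role into "<client>:<role>"."""
    client = client.strip()
    if not client or ":" in client:
        raise ValueError("client must be non-empty and must not contain ':'")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return f"{client}:{role}"


def parse_caller_id(caller_id: str) -> tuple[str, str]:
    """Split "<client>:<role>" back into its two parts, validating both."""
    client, sep, role = caller_id.rpartition(":")
    if not sep:
        raise ValueError("caller_id must look like '<client>:<role>'")
    make_caller_id(client, role)  # reuse the same validation rules
    return client, role
```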
For clients such as Codex, save the flux_onboard instructions into an
always-loaded instruction surface, such as a project or user AGENTS.md.
Saving the workflow only as a memory note is not enough, because the agent must
already remember to use Flux before it can retrieve that note.
Available MCP tools:
| Tool | Description |
|---|---|
| flux_store | Store a memory grain |
| flux_retrieve | Retrieve relevant memories |
| flux_feedback | Rate a retrieved grain (learning signal) |
| flux_health | Current health and signal statistics |
| flux_list_grains | List grains by status (active/dormant/quarantined/archived) |
| flux_onboard | Get integration instructions for this instance |
REST API
POST /store {"content": "...", "provenance": "user_stated"}
POST /store/batch {"items": [{"content": "..."}]}
POST /retrieve {"query": "..."}
POST /feedback {"trace_id": "...", "grain_id": "...", "useful": true}
GET /health
GET /grains?status=active&limit=50
Pass X-Flux-Client and X-Flux-Role headers for caller attribution, or use
legacy X-Caller-Id: <client>:<role>. Dashboard compliance groups calls by
client and role.
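As an illustration, the endpoints can be called with the standard library alone. The sketch below assumes the default REST port shown earlier (7465) and assumes that responses are JSON; build_request and post_json are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7465"  # default REST host and port from this README


def build_request(path: str, payload: dict, client: str, role: str) -> urllib.request.Request:
    """Prepare a JSON POST carrying the caller attribution headers."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Flux-Client": client,
            "X-Flux-Role": role,
        },
        method="POST",
    )


def post_json(path: str, payload: dict, client: str = "local-agent-1", role: str = "chat"):
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_request(path, payload, client, role)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With an instance running, post_json("/retrieve", {"query": "French capital"}) would issue the retrieve call attributed to local-agent-1 in the chat role.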
Python SDK
from flux.storage import FluxStore
from flux.service import FluxService
from flux.config import Config
store = FluxStore("~/.flux/my-memory/flux.db")
svc = FluxService(store, cfg=Config())
svc.start()
grain_id = svc.store("Paris is the capital of France", provenance="user_stated")
result = svc.retrieve("French capital")
svc.feedback(result.trace_id, result.grains[0]["id"], useful=True)
svc.stop()
store.close()
Configuration
Instance config lives at ~/.flux/<name>/config.yaml. Key parameters:
| Parameter | Default | Description |
|---|---|---|
| OPERATING_MODE | flux_extracts | LLM extraction mode |
| MCP_HOST | 127.0.0.1 | Reserved for network MCP transports |
| MCP_PORT | 7464 | Reserved MCP port |
| REST_HOST | 127.0.0.1 | REST bind host |
| REST_PORT | 7465 | REST API port |
| DASHBOARD_HOST | 127.0.0.1 | Dashboard bind host |
| DASHBOARD_PORT | 7462 | Dashboard port |
| READ_WORKERS | 3 | Concurrent read workers |
| MAX_GRAINS_PER_CALL | 100 | Batch ingestion cap |
| MAX_GRAINS_PER_MINUTE | 500 | Per-caller rate limit |
| MAX_WRITE_QUEUE_DEPTH | 1000 | Write queue backpressure cap |
| FEEDBACK_ENFORCEMENT_ENABLED | true | Require callers to submit feedback before repeated retrieval |
| FEEDBACK_ENFORCEMENT_GRACE_SECONDS | 60 | Delay before missing feedback blocks the same caller |
| FEEDBACK_ENFORCEMENT_MAX_BLOCK_SECONDS | 86400 | Maximum time a stale missing-feedback item can block retrieval |
| LLM_MODEL | llama3.1:8b | Ollama model (flux_extracts mode) |
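The three feedback-enforcement parameters interact, so a sketch of one plausible reading may help. It assumes both windows are measured from the time of the retrieval that still lacks feedback; the function name and that interpretation are assumptions, not documented Flux behavior.

```python
GRACE_SECONDS = 60         # FEEDBACK_ENFORCEMENT_GRACE_SECONDS default
MAX_BLOCK_SECONDS = 86400  # FEEDBACK_ENFORCEMENT_MAX_BLOCK_SECONDS default


def retrieval_blocked(pending_since: list[float], now: float, enabled: bool = True) -> bool:
    """Return True if any of the caller's unrated retrievals currently blocks a new one.

    pending_since holds the timestamps of retrievals that still lack feedback.
    An item blocks only after the grace period and stops blocking once stale.
    """
    if not enabled:
        return False
    return any(GRACE_SECONDS <= now - started < MAX_BLOCK_SECONDS for started in pending_since)
```

Under this reading, a caller has one minute to submit feedback before further retrievals are refused, and a forgotten item cannot lock the caller out for more than a day.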
Admin
flux admin --name my-memory
Opens an interactive menu gated by password and TOTP. From it you can search, purge, or restore grains, view the audit log, change the password, and open the dashboard.
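For readers unfamiliar with TOTP, the code check follows RFC 6238: an HMAC over the current 30-second time step, truncated to a short decimal code. The sketch below uses only the standard library and is not Flux's implementation; the one-step tolerance window in verify is an assumption.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1 variant)."""
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, candidate: str, unix_time: int, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps on either side."""
    return any(
        hmac.compare_digest(totp(secret, unix_time + drift * 30), candidate)
        for drift in range(-window, window + 1)
    )
```

Authenticator apps usually exchange the secret as a base32 string, so it would be decoded with base64.b32decode before being passed to these functions.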
Requirements
- Python 3.10+
- SQLite 3.35+ (WAL mode)
- For flux_extracts mode: Ollama with any configured local model
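The SQLite requirement can be checked from Python before installing. This sketch inspects the SQLite library linked into the interpreter and confirms that a database file can be switched to WAL mode; both function names are hypothetical.

```python
import sqlite3


def sqlite_is_supported(minimum: tuple[int, int, int] = (3, 35, 0)) -> bool:
    """Compare the linked SQLite library version against the required minimum."""
    return sqlite3.sqlite_version_info >= minimum


def enable_wal(path: str) -> str:
    """Switch a database file to WAL mode and return the resulting journal mode."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        conn.close()


print("SQLite", sqlite3.sqlite_version, "supported:", sqlite_is_supported())
```

Note that sqlite3.sqlite_version reports the SQLite library itself, not the version of the Python sqlite3 module.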
Development
git clone https://github.com/harsh5i/flux-memory
cd flux-memory
pip install -e ".[test]"
pytest tests/
License
MIT - see LICENSE