MCP Servers

engramia

The memory operations platform for production AI agents — eval-weighted recall, multi-evaluator consensus, GDPR Art. 17/20 governance, multi-tenant RBAC, and audit logging built in.

README

Engramia

Reusable execution memory and evaluation infrastructure for AI agent frameworks.

Status: v0.6.5 — 1200+ tests / 80%+ coverage. Core library · REST API · 9 framework integrations · multi-tenancy · RBAC · async jobs · observability · data governance · ROI analytics. See GitHub Issues for planned work and discussion.

What it is

Engramia solves the problem every agent framework has: agents don't learn from previous runs.

LangChain, CrewAI, OpenAI Agents SDK, Pydantic AI, AutoGen, and similar frameworks are stateless — every run starts from scratch. Engramia is a memory layer you add beneath any framework, and it:

Remembers what worked (success patterns with time-decay)
Finds relevant agents for a new task (semantic search + eval weighting)
Evaluates code quality (multi-evaluator with variance detection)
Improves automatically (feedback injection, pattern aging)
Composes multi-agent pipelines from proven components (Experimental)

Extracted from Agent Factory V2 — a system that reached a 93% task success rate over 254 runs using Engramia as its memory layer.

Who this is for

Primary users:

AI platform teams building multi-agent pipelines
Agent builders using LangChain, CrewAI, OpenAI Agents SDK, Pydantic AI, AutoGen, or custom frameworks
Automation studios running repeated agentic workflows

Not designed for:

End users without agent systems
Pure ML/training workflows
Single-run, one-shot LLM tasks

Feature maturity

Feature	Maturity
`learn` — store run results as success patterns	Stable
`recall` — semantic search over stored patterns	Stable
`evaluate` — multi-LLM scoring with variance detection	Stable
`get_feedback` — recurring quality issues for prompt injection	Stable
`run_aging` — time-decay + prune of stale patterns	Stable
`delete_pattern` — remove a stored pattern	Stable
`metrics` — aggregate run statistics	Stable
`export` / `import_data` — backup and migration	Stable
`register_skills` / `find_by_skills` — capability tagging	Stable
Tenant / project isolation (contextvars scope propagation)	Stable
RBAC (owner / admin / editor / reader)	Stable
DB API key management — bootstrap, create, rotate, revoke	Stable
OIDC SSO — JWT validation, JWKS, role + scope mapping	Stable
Async job layer — `Prefer: respond-async`, job status polling	Stable
Observability — OTel traces, Prometheus `/metrics`, JSON logs	Stable
Deep health — `GET /v1/health/deep` (storage + LLM + embeddings)	Stable
Data governance — PII redaction, retention, GDPR delete/export	Stable
ROI analytics — event collection, rollup API, composite score	Stable
Admin dashboard — separate app, see engramia/dashboard	Stable
`compose` — LLM pipeline decomposition from patterns	Experimental
`evolve_prompt` — LLM-based prompt improvement	Experimental
`analyze_failures` — failure pattern clustering	Experimental

Installation

# Base (JSON storage, no LLM/embeddings provider)
pip install engramia

# With OpenAI provider (recommended to start)
pip install "engramia[openai]"

# REST API + PostgreSQL
pip install "engramia[openai,api,postgres]"

Optional extras

Extra	Contents	Status
`openai`	OpenAI LLM + embeddings provider	✅
`postgres`	PostgreSQL + pgvector storage backend	✅
`api`	FastAPI REST server	✅
`anthropic`	Anthropic/Claude LLM provider	✅
`local`	sentence-transformers embeddings, no API key	✅
`langchain`	LangChain EngramiaCallback	✅
`crewai`	CrewAI EngramiaCrewCallback	✅
`openai-agents`	OpenAI Agents SDK RunHooks + dynamic instructions	✅
`anthropic-agents`	Anthropic Agent SDK query wrapper + hooks	✅
`pydantic-ai`	Pydantic AI Capability (before/after run)	✅
`autogen`	AutoGen Memory interface for AssistantAgent	✅
`cli`	CLI tool (Typer + Rich)	✅
`mcp`	MCP server (Claude Desktop, Cursor, Windsurf)	✅
`oidc`	SSO via OIDC JWT validation (Okta, Azure AD, Auth0, Keycloak)	✅
`dev`	pytest, coverage, development tools	✅

Quick start

from engramia import Memory
from engramia.providers import OpenAIProvider, OpenAIEmbeddings, JSONStorage

mem = Memory(
    llm=OpenAIProvider(model="gpt-4.1"),
    embeddings=OpenAIEmbeddings(),
    storage=JSONStorage(path="./engramia_data"),
)

Python API Reference

`mem.learn(task, code, eval_score, output=None) → LearnResult`

Records the result of a run. Stores the success pattern and updates metrics.

result = mem.learn(
    task="Parse CSV file and compute statistics",
    code="import csv\nimport statistics\n...",
    eval_score=8.5,
    output="mean=42.3, std=7.1",  # optional: agent stdout
)
print(result.stored)        # True
print(result.pattern_count) # total number of patterns

eval_score — number 0–10, how well the agent completed the task
Pattern is automatically deduplicated against existing similar patterns (Jaccard > 0.7)

`mem.recall(task, limit=5, deduplicate=True, eval_weighted=True) → list[Match]`

Finds relevant success patterns for a new task using semantic search.

matches = mem.recall(task="Read CSV and calculate averages", limit=5)

for m in matches:
    print(f"{m.similarity:.2f} | score={m.pattern.success_score:.1f} | {m.pattern.task}")
    print(f"  key: {m.pattern_key}")   # use for delete_pattern()

Each Match contains:

similarity — cosine similarity of embeddings (0.0–1.0)
reuse_tier — "duplicate" / "adapt" / "fresh" based on similarity thresholds
pattern_key — storage key for delete_pattern()
pattern — Pattern object with task, design, success_score, reuse_count

Parameters:

deduplicate=True — groups patterns of the same task (Jaccard > 0.7), returns only top-scoring per group
eval_weighted=True — similarity is multiplied by a multiplier [0.5, 1.0] based on eval score; unrated patterns receive 0.75

`mem.evaluate(task, code, output=None, num_evals=3) → EvalResult`

Runs N independent LLM evaluations and aggregates the results. Requires an llm provider.

result = mem.evaluate(
    task="Parse CSV file",
    code="import csv\n...",
    output="done",    # optional
    num_evals=3,      # number of parallel LLM evaluations (min 1)
)

print(result.median_score)       # aggregated score (0–10)
print(result.variance)           # score variance across runs
print(result.high_variance)      # True if variance > 1.5
print(result.feedback)           # recommendation from the worst run
print(result.adversarial_detected)  # True if code contains hardcoded output

Evaluations run in parallel (ThreadPoolExecutor)
Feedback comes from the worst run (most relevant for improvement)
Adversarial code detection (hardcoded output instead of computation)

`mem.compose(task) → Pipeline` (Experimental)

Decomposes a task into a staged pipeline from existing success patterns. Requires an llm provider.

Experimental: This feature works best as an assistive tool. Pipeline validity depends on pattern coverage and LLM output quality. Do not treat composed pipelines as guaranteed production-ready outputs.

pipeline = mem.compose(task="Fetch stock data, compute moving average, write report")

print(f"valid={pipeline.valid}, errors={pipeline.contract_errors}")
for stage in pipeline.stages:
    print(f"[{stage.task}]  reads={stage.reads}  writes={stage.writes}")

LLM decomposes the task into 2–4 stages
Each stage is matched against success patterns via semantic search
Contract validation verifies data flow consistency (reads/writes chain) including cycle detection
Falls back to a single-stage pipeline if the LLM fails

`mem.get_feedback(task_type=None, limit=5) → list[str]`

Returns recurring feedback patterns for injection into prompts.

feedback = mem.get_feedback(limit=4)
# ["Add error handling for missing input files.",
#  "Validate CSV headers before processing.", ...]

Returns only feedback with count >= 2 (recurring issues)
Sorted by frequency and recency (score × count)
Suitable for automatic injection into the coder's system prompt

`mem.delete_pattern(pattern_key) → bool`

Permanently deletes a stored pattern. Returns True if the pattern existed.

matches = mem.recall(task="Parse CSV")
deleted = mem.delete_pattern(matches[0].pattern_key)
print(deleted)  # True

`mem.run_aging() → int`

Applies time-decay to all success patterns. Returns the number of removed patterns.

pruned = mem.run_aging()
print(f"Removed {pruned} outdated patterns")

Decay: 2% per week (success_score *= 0.98^weeks)
Pattern is removed if success_score < 0.1
Recommended to run periodically (e.g., once a week)

`mem.metrics → Metrics`

Current metrics of the memory instance.

m = mem.metrics

print(m.runs)            # total number of recorded runs
print(m.success_rate)    # fraction of successful runs
print(m.avg_eval_score)  # average eval score (None if no evals)
print(m.pattern_count)   # current number of success patterns
print(m.pipeline_reuse)  # number of runs where an existing pattern was used

`mem.evolve_prompt(role, current_prompt) → EvolutionResult` (Experimental)

Generates an improved prompt based on recurring quality issues.

Experimental: Returns a candidate for manual review and A/B testing — not for direct automatic deployment.

result = mem.evolve_prompt(role="coder", current_prompt="You are a coder...")
if result.accepted:
    print(result.improved_prompt)
    print(f"Changes: {result.changes}")

Analyzes top feedback patterns from eval history
LLM generates an improved version of the prompt
Returns a candidate for manual/automated A/B testing

`mem.analyze_failures(min_count=1) → list[FailureCluster]` (Experimental)

Groups recurring errors into clusters for identifying systemic issues.

Experimental: Cluster quality depends on pattern volume and LLM classification accuracy.

clusters = mem.analyze_failures(min_count=2)
for c in clusters:
    print(f"{c.representative} (count={c.total_count}, members={len(c.members)})")

`mem.register_skills(pattern_key, skills)` / `mem.find_by_skills(required)`

Skill registry for capability-based pattern search.

# Register
matches = mem.recall(task="Parse CSV")
mem.register_skills(matches[0].pattern_key, ["csv_parsing", "statistics"])

# Find
results = mem.find_by_skills(["csv_parsing"], match_all=True)

`mem.export() → list[dict]` / `mem.import_data(records, overwrite=False) → int`

Backup and migration of patterns (JSON storage → PostgreSQL or vice versa).

# Export all patterns to a JSONL file
import json

records = mem.export()
with open("backup.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# Import from backup into a new instance
with open("backup.jsonl") as f:
    records = [json.loads(line) for line in f]

new_mem = Memory(embeddings=embeddings, storage=postgres_storage)
imported = new_mem.import_data(records)
print(f"Imported {imported} patterns")

Exceptions

Engramia uses a custom exception hierarchy for precise error handling:

from engramia import (
    EngramiaError, ProviderError, ValidationError,
    StorageError, QuotaExceededError, AuthorizationError,
)

try:
    result = mem.evaluate(task, code)
except ProviderError:
    # LLM provider is not configured
    pass
except ValidationError:
    # Invalid input (empty task, code too long, ...)
    pass
except QuotaExceededError:
    # Billing quota exceeded
    pass
except EngramiaError:
    # Any Engramia exception
    pass

Framework integrations

Engramia integrates with every major agent framework. Each integration automatically recalls relevant patterns before a run and learns from the result after.

OpenAI Agents SDK

pip install "engramia[openai-agents]"

from agents import Agent, Runner
from engramia.sdk.openai_agents import EngramiaRunHooks, engramia_instructions

agent = Agent(
    name="coder",
    instructions=engramia_instructions(mem, base="You are a senior developer."),
)
result = await Runner.run(agent, "Build a CSV parser", hooks=EngramiaRunHooks(mem))

Anthropic Agent SDK

pip install "engramia[anthropic-agents]"

from engramia.sdk.anthropic_agents import engramia_query

async for message in engramia_query(mem, prompt="Build a CSV parser"):
    print(message)
# Automatically recalls context → injects into system_prompt → learns from result.

Pydantic AI

pip install "engramia[pydantic-ai]"

from pydantic_ai import Agent
from engramia.sdk.pydantic_ai import EngramiaCapability

agent = Agent('openai:gpt-4o', capabilities=[EngramiaCapability(mem)])
result = agent.run_sync("Build a CSV parser")

AutoGen

pip install "engramia[autogen]"

from autogen_agentchat.agents import AssistantAgent
from engramia.sdk.autogen import EngramiaMemory, learn_from_result

agent = AssistantAgent(name="coder", model_client=client, memory=[EngramiaMemory(mem)])
result = await agent.run(task="Build a CSV parser")
learn_from_result(mem, task="Build a CSV parser", result=result)

LangChain

pip install "engramia[langchain]"

from engramia.sdk.langchain import EngramiaCallback

callback = EngramiaCallback(mem, auto_learn=True, auto_recall=True)
chain.invoke(input, config={"callbacks": [callback]})

CrewAI

pip install "engramia[crewai]"

from engramia.sdk.crewai import EngramiaCrewCallback

callback = EngramiaCrewCallback(mem, auto_learn=True, auto_recall=True)
result = callback.kickoff(crew)

REST API client (any language)

from engramia.sdk.webhook import EngramiaWebhook

client = EngramiaWebhook(url="http://localhost:8000", api_key="sk-...")
client.learn(task="Parse CSV", code=code, eval_score=8.5)
matches = client.recall(task="Read CSV and compute averages")

Generic framework (EngramiaBridge)

from engramia.sdk.bridge import EngramiaBridge

bridge = EngramiaBridge(data_path="./engramia_data")
context = bridge.before_run("Build a CSV parser")
result = my_agent(task, system_context=context)
bridge.after_run("Build a CSV parser", code=result, eval_score=8.0)

REST API

Starting up

# JSON storage (dev, no DB)
docker compose up

# PostgreSQL storage (prod)
ENGRAMIA_STORAGE=postgres \
ENGRAMIA_DATABASE_URL=postgresql://user:pass@localhost:5432/engramia \
OPENAI_API_KEY=sk-... \
docker compose up

After startup: Swagger UI at http://localhost:8000/docs

Configuration (env vars)

Variable	Default	Description
`ENGRAMIA_STORAGE`	`json`	`json` or `postgres`
`ENGRAMIA_DATA_PATH`	`./engramia_data`	Path for JSON storage
`ENGRAMIA_DATABASE_URL`	—	PostgreSQL URL (only for `postgres`)
`ENGRAMIA_LLM_PROVIDER`	`openai`	LLM provider
`ENGRAMIA_LLM_MODEL`	`gpt-4.1`	Model ID
`OPENAI_API_KEY`	—	OpenAI API key
`ENGRAMIA_EMBEDDING_MODEL`	`text-embedding-3-small`	Embedding model
`ENGRAMIA_AUTH_MODE`	`auto`	Auth mode: `auto` / `env` / `db` / `dev`
`ENGRAMIA_API_KEYS`	(empty)	Bearer tokens — used when `AUTH_MODE=env`
`ENGRAMIA_ENVIRONMENT`	(empty)	`local` / `development` / `staging` / `production` — guards dev auth mode
`ENGRAMIA_PORT`	`8000`	Port
`ENGRAMIA_MAINTENANCE`	`false`	Maintenance mode (all endpoints → 503 except health)

Endpoints

All endpoints are available under the /v1/ prefix:

Method	Path	Description
`POST`	`/v1/learn`	Stores a success pattern
`POST`	`/v1/recall`	Finds relevant patterns
`POST`	`/v1/compose`	Assembles a pipeline (Experimental)
`POST`	`/v1/evaluate`	Multi-eval scoring
`POST`	`/v1/aging`	Runs pattern aging (decay + prune)
`POST`	`/v1/feedback/decay`	Runs feedback decay
`POST`	`/v1/evolve`	Generates an improved prompt (Experimental)
`POST`	`/v1/analyze-failures`	Groups failure patterns (Experimental)
`POST`	`/v1/skills/register`	Registers skill tags on a pattern
`POST`	`/v1/skills/search`	Searches patterns by skill tags
`POST`	`/v1/import`	Bulk import patterns
`GET`	`/v1/feedback`	Top recurring feedback
`GET`	`/v1/metrics`	Statistics
`GET`	`/v1/health`	Health check + storage type
`GET`	`/v1/health/deep`	Deep probe: storage + LLM + embeddings latency
`GET`	`/v1/metrics`	Prometheus metrics (if `ENGRAMIA_METRICS=true`)
`DELETE`	`/v1/patterns/{key}`	Deletes a pattern
`POST`	`/v1/keys/bootstrap`	One-time owner key setup
`POST`	`/v1/keys`	Create API key (admin+)
`GET`	`/v1/keys`	List API keys (admin+)
`DELETE`	`/v1/keys/{id}`	Revoke key (admin+)
`POST`	`/v1/keys/{id}/rotate`	Rotate key (admin+)
`GET`	`/v1/jobs`	List async jobs
`GET`	`/v1/jobs/{id}`	Get job status + result
`POST`	`/v1/jobs/{id}/cancel`	Cancel pending job
`GET`	`/v1/governance/retention`	Get retention policy
`PUT`	`/v1/governance/retention`	Set retention policy
`POST`	`/v1/governance/retention/apply`	Run retention cleanup
`GET`	`/v1/governance/export`	NDJSON data export (GDPR Art. 20)
`PUT`	`/v1/governance/patterns/{key}/classify`	Set data classification
`DELETE`	`/v1/governance/projects/{id}`	Delete project data (GDPR Art. 17)
`DELETE`	`/v1/governance/tenants/{id}`	Delete tenant data (GDPR Art. 17)
`POST`	`/v1/analytics/rollup`	Trigger ROI rollup computation
`GET`	`/v1/analytics/rollup/{window}`	Fetch ROI snapshot (hourly/daily/weekly)
`GET`	`/v1/analytics/events`	Raw ROI events

Examples

# Learn
curl -X POST http://localhost:8000/v1/learn \
  -H "Content-Type: application/json" \
  -d '{"task": "Parse CSV", "code": "import csv", "eval_score": 8.5}'

# Recall
curl -X POST http://localhost:8000/v1/recall \
  -H "Content-Type: application/json" \
  -d '{"task": "Read CSV and compute averages", "limit": 3}'

# Metrics
curl http://localhost:8000/v1/metrics

# Health
curl http://localhost:8000/v1/health

Authentication

# Set keys
ENGRAMIA_API_KEYS=my-secret-key docker compose up

# Use key
curl -H "Authorization: Bearer my-secret-key" http://localhost:8000/v1/metrics

Security configuration

Variable	Default	Description
`ENGRAMIA_CORS_ORIGINS`	(empty)	Allowed CORS origins (CORS disabled if empty)
`ENGRAMIA_RATE_LIMIT_DEFAULT`	`60`	Max requests/min for standard endpoints
`ENGRAMIA_RATE_LIMIT_EXPENSIVE`	`10`	Max requests/min for LLM-intensive endpoints
`ENGRAMIA_MAX_BODY_SIZE`	`1048576`	Max request body size in bytes (1 MB)

Observability configuration

Variable	Default	Description
`ENGRAMIA_TELEMETRY`	`false`	Enable OpenTelemetry tracing
`ENGRAMIA_OTEL_ENDPOINT`	—	OTLP gRPC endpoint (e.g. `http://otel-collector:4317`)
`ENGRAMIA_METRICS`	`false`	Enable Prometheus `/metrics` endpoint
`ENGRAMIA_JSON_LOGS`	`false`	Emit structured JSON logs (request_id, trace_id, tenant_id)

PostgreSQL storage

Starting with pgvector backend:

# 1. Uncomment pgvector service in docker-compose.yml

# 2. Start
ENGRAMIA_STORAGE=postgres \
ENGRAMIA_DATABASE_URL=postgresql://engramia:engramia@pgvector:5432/engramia \
docker compose up

# 3. Apply migrations (first run)
docker compose exec engramia-api alembic upgrade head

Or without Docker:

pip install "engramia[openai,postgres]"

from engramia.providers.postgres import PostgresStorage
storage = PostgresStorage(database_url="postgresql://...")
mem = Memory(embeddings=OpenAIEmbeddings(), storage=storage, llm=OpenAIProvider())

Provider configuration

OpenAI (recommended)

import os
os.environ["OPENAI_API_KEY"] = "sk-..."

from engramia.providers import OpenAIProvider, OpenAIEmbeddings, JSONStorage

mem = Memory(
    llm=OpenAIProvider(model="gpt-4.1"),
    embeddings=OpenAIEmbeddings(model="text-embedding-3-small"),
    storage=JSONStorage(path="./engramia_data"),
)

Embeddings only, without LLM

mem = Memory(
    embeddings=OpenAIEmbeddings(),
    storage=JSONStorage(path="./engramia_data"),
    llm=None,  # default
)

mem.learn(...)    # ✅ works
mem.recall(...)   # ✅ works
mem.evaluate(...) # ❌ ProviderError: evaluate() requires llm=...

CLI

# Installation
pip install "engramia[cli]"

# Initialize
engramia init --path ./engramia_data

# Start REST API server
engramia serve --host 0.0.0.0 --port 8000

# Metrics and statistics
engramia status --path ./engramia_data

# Semantic search
engramia recall "Parse CSV and compute statistics" --limit 5

# Pattern aging (decay + prune)
engramia aging --path ./engramia_data

MCP Server

Engramia can be run as an MCP server (Model Context Protocol) and connected directly to Claude Desktop, Cursor, Windsurf, or VS Code Copilot.

Installation

pip install "engramia[openai,mcp]"

Starting up

engramia-mcp

The server runs via stdio transport — the MCP client starts it as a subprocess automatically.

Client configuration

Claude Desktop (~/.config/claude/claude_desktop_config.json on Linux/macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows):

{
  "mcpServers": {
    "engramia": {
      "command": "engramia-mcp",
      "env": {
        "ENGRAMIA_DATA_PATH": "/path/to/engramia_data",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Cursor / Windsurf — same JSON format in the MCP servers settings of the respective IDE.

Available MCP tools

Tool	Description
`engramia_learn`	Stores a run result as a success pattern
`engramia_recall`	Finds relevant patterns for a new task (semantic search)
`engramia_evaluate`	N independent LLM evaluations, median + variance
`engramia_compose`	Decomposes a task into a validated multi-agent pipeline (Experimental)
`engramia_feedback`	Returns recurring quality issues for injection into prompts
`engramia_metrics`	Statistics (runs, success rate, pattern count, reuse rate)
`engramia_aging`	Runs time-based decay + prune of outdated patterns

Configuration (env vars)

The MCP server uses the same env vars as the REST API:

Variable	Default	Description
`ENGRAMIA_STORAGE`	`json`	`json` or `postgres`
`ENGRAMIA_DATA_PATH`	`./engramia_data`	Path for JSON storage
`ENGRAMIA_DATABASE_URL`	—	PostgreSQL URL (only for `postgres`)
`ENGRAMIA_LLM_PROVIDER`	`openai`	LLM provider
`ENGRAMIA_LLM_MODEL`	`gpt-4.1`	Model ID
`OPENAI_API_KEY`	—	OpenAI API key

Architecture

engramia/
├── memory.py            # Memory facade — thin delegator (~165 LOC)
├── types.py             # Pydantic models (Scope, AuthContext, Pattern, ...)
├── exceptions.py        # EngramiaError hierarchy (ProviderError, StorageError, ...)
├── _context.py          # Scope contextvar: get_scope / set_scope / reset_scope
├── _util.py             # Helpers (extract_json_from_llm, jaccard, reuse_tier)
├── _factory.py          # Provider factory from env vars
│
├── core/                # Pattern storage + evaluation
│   ├── services/             # Business logic (extracted from Memory god object)
│   │   ├── learning.py       # LearningService — store patterns, embeddings, ROI
│   │   ├── recall.py         # RecallService — semantic search, dedup, eval-weighted
│   │   ├── evaluation.py     # EvaluationService — multi-evaluator scoring + feedback
│   │   └── composition.py    # CompositionService — LLM pipeline decomposition
│   ├── success_patterns.py   # Aging, reuse boost
│   ├── eval_store.py         # Eval history + quality multiplier
│   ├── eval_feedback.py      # Feedback clustering + decay
│   ├── metrics.py            # Run statistics
│   └── skill_registry.py     # Capability-based pattern tagging
│
├── reuse/               # Reuse engine
│   ├── matcher.py            # Semantic search + eval weighting
│   ├── composer.py           # LLM pipeline decomposition (Experimental)
│   └── contracts.py          # Data-flow validation + cycle detection
│
├── eval/
│   └── evaluator.py          # MultiEvaluator (concurrent, median, variance)
│
├── providers/
│   ├── base.py               # ABC: LLMProvider, EmbeddingProvider, StorageBackend
│   ├── openai.py             # OpenAI LLM + embeddings
│   ├── anthropic.py          # Anthropic/Claude LLM provider
│   ├── local_embeddings.py   # sentence-transformers (no API key)
│   ├── json_storage.py       # JSON storage (thread-safe, atomic writes)
│   └── postgres.py           # PostgreSQL + pgvector (scope-aware queries)
│
├── api/                 # REST API
│   ├── app.py                # App factory, lifespan, startup/shutdown
│   ├── routes.py             # Core endpoints (learn, recall, evaluate, ...)
│   ├── auth.py               # Multi-mode auth (auto/env/db/dev)
│   ├── keys.py               # API key management router (/v1/keys)
│   ├── permissions.py        # RBAC: 4 roles, require_permission() factory
│   ├── deps.py               # Dependency injection (Memory singleton, AuthContext)
│   ├── schemas.py            # Request/response Pydantic models
│   ├── audit.py              # Structured audit logging
│   ├── middleware.py         # SecurityHeaders, RateLimit, BodySize, RequestID, Timing
│   └── prom_metrics.py       # Prometheus counter/histogram definitions
│
├── jobs/                # Async job queue
│   ├── service.py            # JobService — submit, poll, cancel, retry/backoff
│   ├── worker.py             # JobWorker — daemon thread, ThreadPoolExecutor
│   └── dispatch.py           # Operation → Memory method dispatcher
│
├── analytics/           # ROI analytics
│   ├── collector.py          # ROICollector — fire-and-ignore event recorder
│   ├── aggregator.py         # ROIAggregator — hourly/daily/weekly rollups
│   └── models.py             # ROIEvent, ROIRollup, RecallOutcome, LearnSummary
│
├── governance/          # Data governance (GDPR)
│   ├── redaction.py          # PII/secrets redaction pipeline
│   ├── retention.py          # RetentionManager — per-tenant/project TTL policies
│   ├── deletion.py           # ScopedDeletion — GDPR Art. 17 right to erasure
│   ├── export.py             # DataExporter — NDJSON export (GDPR Art. 20)
│   └── lifecycle.py          # Async lifecycle jobs (retention_cleanup, compact_audit)
│
├── telemetry/           # Observability (opt-in, zero overhead when disabled)
│   ├── tracing.py            # OTel init + @traced decorator
│   └── metrics.py            # Prometheus histogram/counter definitions
│
├── db/                  # Database
│   ├── models.py             # SQLAlchemy 2.x models (13 migrations applied)
│   └── migrations/           # Alembic (001_initial → 013_cloud_users)
│
├── evolution/           # Prompt evolution + failure clustering (Experimental)
│   ├── prompt_evolver.py
│   └── failure_cluster.py
│
├── sdk/                 # Framework integrations (9 adapters)
│   ├── openai_agents.py      # OpenAI Agents SDK — RunHooks + dynamic instructions
│   ├── anthropic_agents.py   # Anthropic Agent SDK — query wrapper + hooks
│   ├── pydantic_ai.py        # Pydantic AI — Capability (before/after run)
│   ├── autogen.py            # AutoGen — Memory ABC for AssistantAgent
│   ├── langchain.py          # LangChain — EngramiaCallback
│   ├── crewai.py             # CrewAI — EngramiaCrewCallback
│   ├── bridge.py             # EngramiaBridge — drop-in for any agent factory
│   └── webhook.py            # Lightweight HTTP SDK client (stdlib only)
│
├── cli/                 # CLI tool (Typer + Rich)
│
└── mcp/                 # MCP server (stdio transport)
    └── server.py             # 7 MCP tools (learn, recall, evaluate, ...)

Implementation status

Component	Status
`mem.learn()`	✅ Stable
`mem.recall()`	✅ Stable
`mem.evaluate()`	✅ Stable
`mem.compose()`	✅ Experimental
`mem.get_feedback()`	✅ Stable
`mem.run_aging()`	✅ Stable
`mem.delete_pattern()`	✅ Stable
`mem.evolve_prompt()`	✅ Experimental
`mem.analyze_failures()`	✅ Experimental
`mem.register_skills()` / `find_by_skills()`	✅ Stable
`mem.metrics`	✅ Stable
`mem.export()` / `mem.import_data()`	✅ Stable
Custom exception hierarchy	✅ Stable
OpenAI provider	✅ Stable
Anthropic provider	✅ Stable
Local embeddings (sentence-transformers)	✅ Stable
JSON storage (thread-safe, concurrent)	✅ Stable
PostgreSQL + pgvector (scope-aware)	✅ Stable
Docker + docker-compose	✅ Stable
Multi-tenancy + scope isolation (contextvars)	✅ Stable
RBAC (4 roles, DB API key management)	✅ Stable
Async job layer (SKIP LOCKED, retry, backoff)	✅ Stable
Observability (OTel, Prometheus, JSON logs)	✅ Stable
Data governance (PII redaction, retention, GDPR)	✅ Stable
ROI analytics (collector, rollup, REST API)	✅ Stable
Admin dashboard (separate repo: engramia/dashboard)	✅ Stable
Service layer (LearningService, RecallService, ...)	✅ Stable
OpenAI Agents SDK integration (RunHooks + instructions)	✅ Stable
Anthropic Agent SDK integration (query wrapper + hooks)	✅ Stable
Pydantic AI integration (Capability)	✅ Stable
AutoGen integration (Memory ABC)	✅ Stable
LangChain EngramiaCallback	✅ Stable
CrewAI EngramiaCrewCallback	✅ Stable
EngramiaBridge (drop-in agent factory adapter)	✅ Stable
Webhook SDK client	✅ Stable
CLI (Typer + Rich)	✅ Stable
MCP server (7 tools, stdio transport)	✅ Stable

Development and testing

# Install for development
pip install -e ".[dev,openai]"

# Run tests
pytest

# With coverage report
pytest --cov=engramia --cov-report=term-missing

Tests do not require API keys — they use FakeEmbeddings (deterministic vectors from MD5 hash) and a mocked LLM. FastAPI tests use TestClient from httpx.

Origin

Extracted from Agent Factory V2 — a self-improving AI agent factory. Agent Factory V2 remains as an open-source reference implementation that demonstrates Engramia working in a production-grade multi-agent system.

License

Engramia is licensed under Business Source License 1.1 (BUSL-1.1).

✅ Free for: personal use, evaluation, research
❌ Not allowed without a commercial license:
- production use in commercial environments
- SaaS / hosted services
- integration into paid products
- building competing products

On the Change Date specified in LICENSE.txt, the license automatically converts to Apache 2.0.

See LICENSE.txt for full terms.

For commercial licenses: support@engramia.dev

License FAQ

Why does PyPI show "UNKNOWN" or no license classifier? BUSL-1.1 is not OSI-approved, so PyPI lacks a dedicated classifier. Engramia ships the full BUSL-1.1 text in LICENSE.txt and declares it via license = {file = "LICENSE.txt"} in pyproject.toml. Package scanners that flag "Proprietary" for non-OSI licenses are technically correct — BUSL-1.1 is source-available, not open source or proprietary in the classical sense. On the Change Date, the license converts to Apache 2.0 (OSI-approved).

Can I use Engramia in my company's internal tools? Yes — internal, non-competing production use is permitted under the Additional Use Grant in LICENSE.txt. Re-hosting Engramia as a paid service, or embedding it in a competing commercial offering, requires a commercial license.

Can I fork and modify the code? Yes. Modifications and redistribution under BUSL-1.1 terms are allowed. Any derivative work carries the same license until the Change Date.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured

engramia

README

Engramia

What it is

Who this is for

Feature maturity

Installation

Optional extras

Quick start

Python API Reference

mem.learn(task, code, eval_score, output=None) → LearnResult

mem.recall(task, limit=5, deduplicate=True, eval_weighted=True) → list[Match]

mem.evaluate(task, code, output=None, num_evals=3) → EvalResult

mem.compose(task) → Pipeline (Experimental)

mem.get_feedback(task_type=None, limit=5) → list[str]

mem.delete_pattern(pattern_key) → bool

mem.run_aging() → int

mem.metrics → Metrics

mem.evolve_prompt(role, current_prompt) → EvolutionResult (Experimental)

mem.analyze_failures(min_count=1) → list[FailureCluster] (Experimental)

mem.register_skills(pattern_key, skills) / mem.find_by_skills(required)

mem.export() → list[dict] / mem.import_data(records, overwrite=False) → int

Exceptions

Framework integrations

OpenAI Agents SDK

Anthropic Agent SDK

Pydantic AI

AutoGen

LangChain

CrewAI

REST API client (any language)

Generic framework (EngramiaBridge)

REST API

Starting up

Configuration (env vars)

Endpoints

Examples

Authentication

Security configuration

Observability configuration

PostgreSQL storage

Provider configuration

OpenAI (recommended)

Embeddings only, without LLM

CLI

MCP Server

Installation

Starting up

Client configuration

Available MCP tools

Configuration (env vars)

Architecture

Implementation status

Development and testing

Origin

License

License FAQ

Recommended Servers

`mem.learn(task, code, eval_score, output=None) → LearnResult`

`mem.recall(task, limit=5, deduplicate=True, eval_weighted=True) → list[Match]`

`mem.evaluate(task, code, output=None, num_evals=3) → EvalResult`

`mem.compose(task) → Pipeline` (Experimental)

`mem.get_feedback(task_type=None, limit=5) → list[str]`

`mem.delete_pattern(pattern_key) → bool`

`mem.run_aging() → int`

`mem.metrics → Metrics`

`mem.evolve_prompt(role, current_prompt) → EvolutionResult` (Experimental)

`mem.analyze_failures(min_count=1) → list[FailureCluster]` (Experimental)

`mem.register_skills(pattern_key, skills)` / `mem.find_by_skills(required)`

`mem.export() → list[dict]` / `mem.import_data(records, overwrite=False) → int`