Never Ask Twice

Enterprise support memory agent that remembers customer context across sessions, retrieves relevant memories, and forgets stale facts to reduce re-ask rates.

Enterprise Support MemoryAgent on Qwen Cloud

Never Ask Twice is a production-shaped B2B support memory agent. It remembers customer context across sessions, retrieves only the memories relevant to the current request, and safely forgets stale facts. It demonstrates the improvement with a deterministic memory ON/OFF evaluation harness and also provides a live Qwen-backed API path.

The demo agent is Nat. Nat is powered by NATE (the Never Ask Twice Engine), a scoped memory layer that turns support conversations into durable, auditable customer context.

Built for the Qwen Cloud Global AI Hackathon — Track: MemoryAgent.

Status

| Area | Status | Notes |
| --- | --- | --- |
| Public clean-room repo | Done | Synthetic data only; boundary scan included. |
| Local Postgres + pgvector setup | Done | docker compose up -d binds Postgres on localhost:5433. |
| Deterministic eval harness | Done | pnpm eval prints memory ON/OFF re-ask, recall, and hallucination metrics. |
| Memory service | Done | Working, episodic, semantic, forgetting, and budgeted recall paths are implemented. |
| MCP stdio surface | Done | Four memory tools are exposed through pnpm mcp:list-tools. |
| Qwen-backed live path | In progress | Requires DASHSCOPE_API_KEY; local-safe mode runs without it. |
| Alibaba Function Compute deployment | In progress | s.yaml and deployment instructions exist; final live proof is still required. |
| Demo video | Pending | Should use the frozen Acme scenario and the eval output line. |

Judge path

  1. Read the memory model: docs/memory-model.md.
  2. Run the ablation:
    pnpm eval
    
  3. Inspect the forgetting behavior: docs/forgetting-policy.md.
  4. Read the system architecture: docs/architecture.md.
  5. List MCP tools:
    pnpm build
    pnpm mcp:list-tools
    
  6. Review the deployment instructions and proof placeholder: deploy/alibaba-fc.md.

The measurable result

Run:

pnpm eval

Expected deterministic fixture output:

memory-on re-ask rate: 0.00
memory-on recall accuracy: 1.00
memory-on hallucination count: 0
memory-off re-ask rate: 1.00
memory-off recall accuracy: 0.00
memory-off hallucination count: 0
re-ask rate: 0.00 (memory) vs 1.00 (no-memory)

The evaluation path is intentionally deterministic for reproducible scoring. It uses fixed synthetic fixtures and a fake Qwen client. The live API path uses Qwen Cloud when DASHSCOPE_API_KEY is configured.
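As a rough illustration of how the three reported metrics could be computed from a scored run, here is a minimal sketch. The `TurnResult` shape, the `scoreRun` function, and the fact keys are hypothetical names chosen for this example; the actual runner in eval/ may define and count these metrics differently.

```typescript
// Hypothetical shapes for illustration; the real fixture types in eval/ may differ.
interface TurnResult {
  factKey: string;              // fact the agent needed at this turn, e.g. "region"
  reAsked: boolean;             // true if the agent asked for a fact the customer already gave
  recalledValue: string | null; // value the agent used, or null if it had none
}

interface EvalMetrics {
  reAskRate: number;
  recallAccuracy: number;
  hallucinationCount: number;
}

// Scores one run (memory ON or memory OFF) against fixed ground truth.
function scoreRun(
  turns: TurnResult[],
  groundTruth: Record<string, string>,
): EvalMetrics {
  if (turns.length === 0) {
    return { reAskRate: 0, recallAccuracy: 0, hallucinationCount: 0 };
  }
  let reAsks = 0;
  let correct = 0;
  let hallucinations = 0;
  for (const turn of turns) {
    if (turn.reAsked) reAsks += 1;
    if (turn.recalledValue === null) continue; // nothing recalled, nothing to verify
    if (turn.recalledValue === groundTruth[turn.factKey]) {
      correct += 1;
    } else {
      hallucinations += 1; // agent asserted a value that ground truth does not support
    }
  }
  return {
    reAskRate: reAsks / turns.length,
    recallAccuracy: correct / turns.length,
    hallucinationCount: hallucinations,
  };
}

// Formats a rate the way the expected output above prints it.
function formatRate(rate: number): string {
  return rate.toFixed(2);
}
```

Under these assumptions, a memory-off run in which every turn re-asks and recalls nothing scores a re-ask rate of 1.00 with zero hallucinations, which is consistent with the fixture output shown above.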

What makes it a MemoryAgent

Never Ask Twice implements explicit memory tiers:

  • Working memory — current-session context usable before session-close distillation.
  • Episodic memory — raw support events with Qwen embeddings and provenance.
  • Semantic memory — distilled customer facts with confidence, validity windows, and source links.
  • Forgetting policy — TTL expiry, supersession, stale-memory exclusion, and audit-safe provenance.
  • Budgeted recall — relevant memories only, capped to a strict context budget.
  • MCP surface — memory tools exposed for agent interoperability.

This is not transcript logging. It is structured memory with retrieval discipline, provenance, forgetting, and measurable cross-session improvement.
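To make the semantic tier and the forgetting policy more concrete, the sketch below shows one way a distilled fact with a validity window could be filtered at recall time and superseded without losing provenance. The `SemanticFact` fields and both helper functions are assumptions made for illustration; the real schema lives in src/db and the real policy is described in docs/forgetting-policy.md.

```typescript
// Hypothetical record shape; field names are illustrative, not the repo's schema.
interface SemanticFact {
  id: string;
  customerId: string;
  predicate: string;           // e.g. "billing_contact"
  value: string;
  confidence: number;          // 0..1
  validFrom: Date;
  validUntil: Date | null;     // null means no TTL has been set
  supersededBy: string | null; // id of the newer fact, if one replaced this one
  sourceEventIds: string[];    // provenance links back to episodic events
}

// A fact is recallable only if it is active, unexpired, and not superseded.
function isRecallable(fact: SemanticFact, now: Date): boolean {
  if (fact.supersededBy !== null) return false;
  if (fact.validFrom.getTime() > now.getTime()) return false;
  if (fact.validUntil !== null && fact.validUntil.getTime() <= now.getTime()) {
    return false;
  }
  return true;
}

// Marks the old fact as superseded instead of deleting it, so the audit
// trail (value, confidence, source events) stays intact.
function supersede(oldFact: SemanticFact, newFact: SemanticFact): SemanticFact {
  return {
    ...oldFact,
    supersededBy: newFact.id,
    validUntil: newFact.validFrom,
  };
}
```

The design choice illustrated here is that forgetting means exclusion from recall rather than deletion: a superseded or expired row is never returned to the agent, but it remains available for audit.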

Architecture

Customer chat / MCP
        |
        v
Hono API on Alibaba Function Compute
        |
        v
MemoryService
  |-- working memory: current-session facts
  |-- episodic memory: turn events + Qwen embeddings
  |-- semantic memory: distilled durable facts
  |-- forgetting: TTL + supersession + scoped recall
        |
        +--> Qwen Cloud via DashScope-compatible OpenAI API
        +--> Postgres + pgvector
        +--> MCP stdio tools

More detail: docs/architecture.md.

Getting started

1. Clone and install

git clone https://github.com/marcelle-labs/never-ask-twice.git
cd never-ask-twice
pnpm install

2. Configure environment

cp .env.example .env

Edit .env and set DASHSCOPE_API_KEY to your DashScope API key to enable live Qwen-backed embeddings, distillation, and adjudication. The example file is pre-filled for local Postgres on port 5433.

DATABASE_URL=postgresql://neverasktwice:neverasktwice@localhost:5433/neverasktwice
DASHSCOPE_API_KEY=your-key-here
QWEN_BASE_URL=https://dashscope-intl.aliyuncs.com/compatible-mode/v1
QWEN_CHAT_MODEL=qwen-plus
QWEN_EMBEDDING_MODEL=text-embedding-v3
QWEN_EMBEDDING_DIM=1024
MEMORY_TOKEN_BUDGET=1200

Without DASHSCOPE_API_KEY, the API boots in local-safe mode. Local-safe mode uses zero-vector embeddings and empty distillation responses so the server can run without secrets; it does not perform real Qwen work. Use pnpm eval for deterministic local scoring without a key.
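The fallback described above could be wired roughly as follows. This is a sketch under stated assumptions: the `Embedder` interface, `zeroVector`, `createLocalSafeEmbedder`, and `selectEmbedder` are hypothetical names, and the live client is passed in as a factory because the real module in src/qwen is not shown here.

```typescript
// Hypothetical interface; the actual client in src/qwen may look different.
interface Embedder {
  embed(text: string): Promise<number[]>;
}

// Builds an all-zero vector of the configured embedding dimension.
function zeroVector(dim: number): number[] {
  return new Array<number>(dim).fill(0);
}

// Local-safe embedder: never calls the network, always returns zeros.
function createLocalSafeEmbedder(dim: number): Embedder {
  return { embed: async () => zeroVector(dim) };
}

type Mode = "live" | "local-safe";

// Picks the live embedder when a key is present, otherwise the local-safe one.
function selectEmbedder(
  env: Record<string, string | undefined>,
  liveFactory: (apiKey: string) => Embedder,
): { mode: Mode; embedder: Embedder } {
  const apiKey = (env["DASHSCOPE_API_KEY"] ?? "").trim();
  const dim = Number(env["QWEN_EMBEDDING_DIM"] ?? "1024");
  if (!Number.isInteger(dim) || dim <= 0) {
    throw new Error("QWEN_EMBEDDING_DIM must be a positive integer");
  }
  if (apiKey.length > 0) {
    return { mode: "live", embedder: liveFactory(apiKey) };
  }
  return { mode: "local-safe", embedder: createLocalSafeEmbedder(dim) };
}
```

Because every zero vector is identical, similarity search in this mode cannot rank memories meaningfully, which is why the deterministic eval harness, not local-safe mode, is the path for scoring without a key.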

3. Start Postgres

docker compose up -d

The local database binds to localhost:5433 so it does not collide with other Postgres services on 5432.

4. Run migrations

pnpm migrate

5. Run the eval harness

pnpm eval

6. Run the boundary scan

pnpm boundary-scan

7. Start the local API

pnpm dev

The API will be available at http://localhost:3000 with endpoints:

  • GET /health — health and capability status.
  • POST /turn — append a customer/agent turn.
  • POST /sessions/:id/close — close a session and distill episodic memory into semantic memory.
  • POST /recall — recall a bounded memory bundle.
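One plausible way to assemble the bounded bundle that /recall returns is a greedy selection under a token budget such as MEMORY_TOKEN_BUDGET. The sketch below is illustrative only: `ScoredMemory`, `estimateTokens`, and `selectWithinBudget` are hypothetical names, and the four-characters-per-token estimate is a crude stand-in for a real tokenizer.

```typescript
// Hypothetical candidate shape: a memory plus its relevance score.
interface ScoredMemory {
  id: string;
  text: string;
  score: number; // higher means more relevant to the current query
}

// Crude token estimate (about 4 characters per token); an assumption,
// not the tokenizer the project actually uses.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keeps the most relevant memories that fit the token budget.
function selectWithinBudget(
  candidates: ScoredMemory[],
  tokenBudget: number,
  minScore = 0,
): ScoredMemory[] {
  const ranked = candidates
    .filter((memory) => memory.score >= minScore)
    .sort((a, b) => b.score - a.score);
  const selected: ScoredMemory[] = [];
  let used = 0;
  for (const memory of ranked) {
    const cost = estimateTokens(memory.text);
    if (used + cost > tokenBudget) continue; // too large; a smaller one may still fit
    selected.push(memory);
    used += cost;
  }
  return selected;
}
```

Greedy selection by score is simple and predictable, though it is not guaranteed to pack the budget optimally; stopping at the first memory that does not fit would be a stricter alternative.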

8. Run the MCP server

pnpm build
node dist/src/mcp/server.js

The MCP server exposes four tools: recall_memory, write_memory, distill_session, and forget.

Project structure

  • apps/api — Hono API, local server, and Function Compute handler.
  • src/agent — deterministic support-agent policy used by the eval harness.
  • src/contracts.ts — memory predicate enum, Zod contracts, and shared types.
  • src/db — Drizzle schema and SQL migration string.
  • src/memory — memory service, stores, retrieval, supersession, and forgetting behavior.
  • src/mcp — stdio MCP surface over the shared memory service.
  • src/qwen — single Qwen Cloud client module.
  • src/testing — deterministic fake Qwen client for the eval harness.
  • eval — frozen three-session scenario, ground truth, expected output, and runner.
  • scripts — boundary scan, migration, MCP list-tools, and demo script checks.
  • docs — judge-facing architecture, memory model, evaluation, and forgetting documentation.
  • deploy — Alibaba Function Compute deployment instructions and proof placeholder.

Key commands

| Command | Purpose |
| --- | --- |
| pnpm install | Install dependencies |
| pnpm build | Build the project |
| pnpm lint | Run TypeScript type check |
| pnpm test | Run the test suite |
| pnpm eval | Run the deterministic memory ON/OFF eval harness |
| pnpm migrate | Run database migrations |
| pnpm boundary-scan | Run the clean-room boundary scan |
| pnpm mcp:list-tools | List the MCP tools |
| pnpm demo:script-check | Verify demo fixtures are aligned |

Security and clean-room boundary

Never Ask Twice uses synthetic data only. Do not commit real customer data, secrets, .env files, or private platform identifiers. The repository includes a boundary scan to fail on known forbidden tokens and a local-safe mode so judges can run the server without secrets.

See SECURITY.md.

License

Apache-2.0