user-story-context-mcp-server
Enables MCP-capable agents to retrieve relevant slices of a user-story knowledge graph (stored in Supabase + pgvector) using three retrieval verbs: find_related, find_crossover, and query_stories.
README
user-story-context-mcp-server
A remote, streamable-HTTP MCP server that exposes our user-story knowledge graph (Supabase Postgres + pgvector) to any MCP-capable agent as three retrieval verbs. An agent holding a task — a code diff, a feature idea, a competitor blurb, a problem statement — pulls the relevant slice of product knowledge on demand instead of all of it or none of it.
POC status. Implements PRD Phases 1–2 in full and the Phase-3 contracts. Built on the official MCP TypeScript SDK. Storage is real Supabase + pgvector via SQL migrations you run in the Supabase SQL editor.
The three verbs
| Verb | Question shape | Tool |
|---|---|---|
| things like this | free text / code in → ranked areas/stories out | find_related |
| things entangled with this | a known section/story → other areas sharing its footprint | find_crossover |
| things matching attributes | exact, complete, deterministic filter/aggregate | query_stories |
Plus two optional orientation resources: schema://taxonomy and docs://how-to-query.
Tools alone fully work; resources are an enhancement.
find_related(context, mode?, scope?, limit?)
Primary entry point. Embeds context (Supabase/gte-small), KNN-gates a candidate
pool via pgvector, then fuses vector cosine + weighted entity/path overlap.
- Code-vs-prose fork: if the context looks like code (or
mode="structural"), it leans on code-path overlap rather than vector similarity. - Defaults to
blendedso a naive agent gets sensible results. - Returns an empty list when we genuinely lack a matching pattern — it does not
force five matches (gated by an absolute
min_score).
find_crossover(section_key? | story_key?, limit?)
Structural entanglement for a key you already have. Ranks other sections by shared
paths/slugs, weighted by 1/df so hub tags (folder, df≈35) don't drown out
distinctive ones (watermark, df≈8). Every result names the shared paths/slugs (the "why").
query_stories(status?, section_key?, actor?, entity_slug?, code_path?, group_by?, limit?)
A guarded SELECT: whitelisted, AND-combined top-level filters (status,
section_key, actor, entity_slug, code_path), read-only, server-enforced
LIMIT. The SQL is ours; only the values are the agent's. Optional group_by
returns counts. Every response echoes applied_filters so callers can confirm what
the server actually filtered on. (A nested filters={...} object is still accepted
for back-compat; top-level values win.)
Architecture
agent ──MCP/streamable-HTTP──► Express (/mcp, stateless)
│
├─ find_related ─► generate-embedding edge fn (gte-small) ─► pgvector KNN ─► fuse (ranking.ts)
├─ find_crossover ► footprint + 1/df overlap (SQL + app)
└─ query_stories ─► guarded parameterized SELECT
│
└─ Supabase Postgres + pgvector
sections · user_stories(embedding vector(384)) · entities · code_assets · joins
│
└─ Automatic Embeddings: trigger ─► pgmq ─► pg_cron ─► `embed` edge fn (gte-small) ─► writes embedding
- Transport: Streamable HTTP, stateless JSON (
sessionIdGenerator: undefined,enableJsonResponse: true) — a fresh server per request, trivial to scale horizontally on a remote MCP host.stdiois also supported for local dev. - Embeddings run inside Supabase — the server carries no ML dependency.
- Stored rows: Automatic Embeddings — a trigger enqueues the row in
pgmq,pg_crondrains it and calls theembededge function (Supabase.aigte-small, 384-dim), which writes the vector back. Embeddings stay fresh as stories change. - Query text:
find_relatedcalls thegenerate-embeddingedge function (same gte-small model) so the query vector lives in the same space as the corpus. - Provider is
supabase-edge(default) orhash(offline dev/tests only). Noonnxruntime/native binary anywhere.
- Stored rows: Automatic Embeddings — a trigger enqueues the row in
- Ranking (
src/ranking.ts) is pure and DB-free: min-max normalize each signal, saturate overlap sums, 1/df weighting, fuse, group to areas. Unit-tested offline.
Project layout:
migrations/ 0001 extensions · 0002 schema · 0003 functions
0004 automatic-embeddings infra · 0005 story triggers (run in Supabase SQL editor)
supabase/functions/
embed/ Automatic Embeddings queue worker (gte-small) — deploy --no-verify-jwt
generate-embedding/ query-time text→vector endpoint (gte-small)
src/
config.ts env → typed config (weights, tuning, dim)
embeddings.ts supabase-edge (default) | hash providers — query-time only
db.ts read-only postgres client + every SQL statement
ranking.ts pure fusion / 1/df / normalization (unit-tested)
schemas.ts Zod input schemas
tools/ find_related · find_crossover · query_stories
resources.ts schema://taxonomy · docs://how-to-query
http.ts / stdio.ts transports; index.ts entry point
scripts/
ingest.ts staging CSV → normalize → upsert rows + refresh df (no embedding)
test-ranking.ts offline ranking unit tests
smoke.ts in-process end-to-end MCP test (no DB)
integration.ts DB-backed query_stories test (skips without SUPABASE_DB_URL)
data/ bundled staging CSVs (corpus)
evals/ evaluation.xml
Setup
1. Apply the migrations (Supabase SQL editor)
Run in order: 0001_extensions.sql → 0002_schema.sql → 0003_functions.sql →
0004_automatic_embeddings.sql → 0005_story_embedding_triggers.sql.
0001–0003: pgvector + canonical tables (matchingpostgres-import-schema-confirmation.md)embedding vector(384)+ HNSW index +doc_frequency+match_user_stories+refresh_document_frequencies.
0004–0005: the Automatic Embeddings pipeline (pgmq queue,util.*helpers,pg_cronjob) and theuser_storiestriggers that auto-(re)embed on insert/update.
2. Set the Vault secret + deploy the edge functions
Automatic Embeddings needs your project URL in Vault so pg_cron can reach the edge
function. In the SQL editor (replace with Project Settings → API → Project URL):
select vault.create_secret('https://<project-ref>.supabase.co', 'project_url');
Deploy both edge functions (they run Supabase.ai gte-small — no API key):
supabase functions deploy embed --no-verify-jwt # internal worker, called by pg_net
supabase functions deploy generate-embedding # query-time, server sends anon key
3. Configure env
cp .env.example .env
# DB: SUPABASE_DB_URL_INGEST (write role), SUPABASE_DB_URL (read-only role)
# Edge: SUPABASE_URL + SUPABASE_ANON_KEY (server calls generate-embedding)
npm install
What DB credential to provide. This server connects to Postgres over the wire
protocol (via the postgres client), so it needs a Postgres connection string —
not the Supabase anon/service_role API key (those are for the REST/supabase-js
client, which this server doesn't use). Get the string from the Supabase dashboard:
Connect (top bar) or Project Settings → Database → Connection string → choose
the pooler URI and substitute your DB password:
postgresql://postgres.<project-ref>:[YOUR-PASSWORD]@aws-0-<region>.pooler.supabase.com:6543/postgres
Recommended: a read-only role for the running server, the default postgres role
for the one-time ingest. Create a read-only role in the SQL editor:
create role mcp_readonly login password '<pick-one>';
grant connect on database postgres to mcp_readonly;
grant usage on schema public to mcp_readonly;
grant select on all tables in schema public to mcp_readonly;
grant execute on function match_user_stories(vector, int) to mcp_readonly;
alter default privileges in schema public grant select on tables to mcp_readonly;
Then point SUPABASE_DB_URL at that role (swap the user + password in the string above).
4. Ingest the corpus
npm run ingest
# parses data/*.csv (auto-repairs any rows with unescaped commas in `title`),
# normalizes slugs/paths, upserts rows, refreshes df. NO embedding happens here —
# inserting the rows fires the Automatic Embeddings triggers; ingest then kicks
# util.process_embeddings() and pg_cron fills the vectors within ~10-30s.
Ingest prints how many stories are still awaiting an embedding; it should reach 0
shortly. (If it can't reach the edge function, check the Vault secret + embed deploy.)
5. Run
npm run build
npm start # stdio (default) — what an mcp-bridge host spawns
# or
npm run start:http # standalone HTTP: POST http://localhost:3000/mcp (TRANSPORT=http)
Verify quickly:
npm run test:ranking # 20 offline ranking assertions
npm run test:smoke # in-process MCP handshake + tools/resources + error paths (no DB)
SUPABASE_DB_URL=... npm run test:integration # DB-backed: group_by, filters, applied_filters, suggestions
curl localhost:3000/health
Deploying to the remote MCP platform
Two host models are supported:
-
mcp-bridge host (default): the platform spawns
node dist/index.jsand bridges MCP over its stdio. This is the default transport — noTRANSPORTenv needed. Defaulting to stdio here is deliberate: an HTTP default makes each spawned child bind port 3000 and crash-loop onEADDRINUSE, so the transport must not depend on an env var surviving redeploys. -
Standalone HTTP gateway: set
TRANSPORT=http; the gateway routes MCP traffic toPOST /mcpand may useGET /healthfor liveness. -
Env to set on the host:
SUPABASE_DB_URL(read-only role),SUPABASE_URL+SUPABASE_ANON_KEY(for the query-timegenerate-embeddingcall),PORT.EMBEDDING_PROVIDERdefaults tosupabase-edge. -
Auth: none in this server by design — the hosting gateway terminates auth in front of it. Do not expose
/mcpto the public internet directly. -
Scaling: stateless, so any number of instances behind a load balancer is fine. No model in-process, so no cold-start model load;
find_relatedmakes one HTTPS call to the gte-small edge function per request (~100–300ms).
Configuration & tuning
| Env | Default | Purpose |
|---|---|---|
SUPABASE_DB_URL |
— | Postgres connection string (read-only role) for the server |
SUPABASE_URL / SUPABASE_ANON_KEY |
— | call the generate-embedding edge fn |
EMBEDDING_PROVIDER |
supabase-edge |
supabase-edge | hash (dev/tests) |
EMBEDDING_DIM |
384 |
must match the vector(N) column + gte-small |
CANDIDATE_POOL_SIZE |
50 |
KNN pool size (K) before overlap fusion |
FIND_RELATED_MIN_SCORE |
0.15 |
absolute floor; below it → empty result |
PORT / TRANSPORT |
3000 / stdio |
TRANSPORT=http for the standalone HTTP server |
Want OpenAI instead of gte-small? Swap the model in both edge functions to
text-embedding-3-smalland change everyvector(384)→vector(1536)in0002/0003, then re-ingest. The 384-dim gte-small path needs no API key.
Fusion weights per mode live in src/config.ts (semantic = vector only,
structural = path-heavy, blended = 0.5/0.25/0.25). Tune against an eval set.
How this maps to the PRD
| PRD requirement (P0) | Where |
|---|---|
| Remote MCP over HTTP/SSE, hosted by us | src/http.ts (streamable HTTP, stateless) |
| Connects to Supabase Postgres + pgvector | src/db.ts, migrations/ |
| Ingest: normalize slugs/paths into join tables; embeddings via Supabase | scripts/ingest.ts, 0002–0005, supabase/functions/ |
| Embedding = on-write trigger (PRD open question resolved) | Automatic Embeddings, 0004/0005 |
find_related blended default, KNN-gate-before-overlap, code-vs-prose fork |
src/tools/find_related.ts, src/ranking.ts |
find_crossover 1/df weighting + "why" in every result |
src/tools/find_crossover.ts, src/db.ts |
query_stories whitelisted, read-only, AND-combined, enforced LIMIT |
src/tools/query_stories.ts, src/db.ts |
| Self-describing responses (keys + titles + code paths) | all tool payloads |
| pgvector HNSW (not IVFFlat) | 0002_schema.sql |
| Min-max normalize each signal before weighting | src/ranking.ts |
| Two static resources only | src/resources.ts |
Note on the corpus: the live staging data is larger than the PRD's stated figures
and is still being actively backfilled. The bundled snapshot in data/ is
204 stories · 33 sections · 534 entity slugs · 284 code paths (109 shared). The
design holds (e.g. folder is a df≈35 hub tag, exactly the PRD's example; watermark
≈8). Status/actor are near-uniform (production/photographer) — PRD open question
#Data; query_stories filters are implemented and ready for when that distribution
widens. Re-run npm run ingest whenever the CSVs are refreshed.
Deferred (per PRD): interaction-pattern slug facet (P1), candidate_pool_size
exposed as a tool param (P1), result caching (P1), run_retrieval_codemode and
materialized multi-hop graph (P2). Retrieval primitives are factored as reusable
functions so these stay cheap to add.
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.