genai-lab

Exposes tools for semantic search and RAG-based Q&A from a local knowledge base, along with resources and prompt templates.

GenAI Lab

GenAI Lab implements a TypeScript-based GenAI agent workflow with semantic search, Retrieval-Augmented Generation (RAG), MCP tools/resources/prompts, and LangGraph-based orchestration.

The project focuses on the technical mechanics behind controlled GenAI systems: retrieval, grounding, structured planning, graph-based routing, bounded retries, and MCP tool exposure.

Technical Capabilities

  • Embedding-based semantic search over a local knowledge base
  • Retrieval-Augmented Generation with source-grounded responses
  • Source citation support using note IDs
  • LangGraph workflow orchestration with explicit shared state
  • Structured LLM planning with constrained execution steps
  • Retrieval-query extraction before semantic search
  • Conditional graph routing based on planner output and retrieval state
  • Decomposed RAG flow: retrieval node followed by answer/draft generation nodes
  • Bounded retry path when retrieval returns few or no results
  • Safe fallback behavior for unsupported or ungrounded requests
  • MCP server exposing tools, resources, and prompt templates
  • AI SDK MCP client flow for LLM-driven MCP tool selection

System Architecture

Agent workflow

User request
  → planner node
  → conditional routing
  → retrieve context when needed
  → answer question OR draft message
  → retry/fallback when needed
  → final response

MCP flow

LLM client
  → discovers MCP tools
  → chooses search_notes / answer_from_notes
  → MCP server executes backend logic
  → result returns to LLM
  → final response

Setup

Install dependencies:

pnpm install

This project requires an OpenAI API key.

Create .env.local:

OPENAI_API_KEY=your_openai_api_key

Supported Example Flows

| Flow | Path | What it demonstrates | Example command |
| --- | --- | --- | --- |
| Grounded Q&A flow | planner → retrieve_notes → answer_question → final_response | Answers using retrieved context and source citations | pnpm agent "Can you explain why retrieval should happen before generation?" |
| Retrieval-augmented drafting flow | planner → retrieve_notes → draft_message → final_response | Retrieves relevant context before generating a draft | pnpm agent "Use my notes about token-heavy conversations to write a short team update" |
| Direct drafting flow | planner → draft_message → final_response | Generates a draft without retrieval when no knowledge lookup is needed | pnpm agent "Draft a Slack message saying QA signoff is pending" |
| Retrieval-only flow | planner → retrieve_notes → final_response | Searches notes and returns grounded matching context | pnpm agent "Search saved notes for deterministic backend functions" |
| Fallback flow | planner → final_response | Avoids unsupported answers when no tool/knowledge path applies | pnpm agent "Tell me something funny" |
| Standalone semantic search | query → embedding → similarity ranking → notes | Runs vector similarity search directly | pnpm semantic-search "backend-defined tools and input schemas" |
| Standalone RAG flow | question → retrieve context → generate answer → cite source | Runs retrieval and answer generation without LangGraph | pnpm rag "Why do long chats become more expensive?" |
| MCP tool-selection flow | LLM → MCP tools → selected tool → tool result → final answer | Lets the LLM choose MCP-exposed tools | pnpm mcp:llm "Explain how teams reduce LLM cost in long conversations" |

Quick Demo

pnpm agent "Can you explain why retrieval should happen before generation?"

pnpm agent "Use my notes about token-heavy conversations to write a short team update"

pnpm agent "Draft a Slack message saying QA signoff is pending"

pnpm agent "Search saved notes for deterministic backend functions"

pnpm agent "Tell me something funny"

pnpm semantic-search "backend-defined tools and input schemas"

pnpm rag "Why do long chats become more expensive?"

pnpm mcp:llm "Explain how teams reduce LLM cost in long conversations"

Tech Stack

  • TypeScript
  • Vercel AI SDK
  • OpenAI models via @ai-sdk/openai
  • LangGraph
  • Model Context Protocol TypeScript SDK
  • Zod
  • pnpm

Project Structure

lib/
  agent/
    agent-state.ts
    mini-agent.ts
  embedding.ts
  knowledge-base.ts
  retrieve-context.ts
  semantic-search.ts
  rag-answer.ts
  rag-types.ts
  vector-utils.ts

mcp/
  server.ts
  client-test.ts
  llm-client-test.ts

scripts/
  agent.ts
  rag.ts
  semantic-search.ts

Implementation Notes

1. Semantic search

The project embeds notes and queries, then ranks notes by vector similarity.

query
→ embedding
→ similarity search
→ ranked notes
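
The ranking step above can be sketched as follows. This is a minimal illustration, not the code in lib/semantic-search.ts: it assumes cosine similarity as the metric (the README only says "vector similarity"), and the `EmbeddedNote` shape, `rankNotes`, `topK`, and `minScore` are hypothetical names. Tiny hand-written vectors stand in for real embeddings so the example runs without an API key.

```typescript
// Hypothetical note shape; the repo's real types may differ.
interface EmbeddedNote {
  id: string;
  text: string;
  embedding: number[];
}

interface RankedNote {
  id: string;
  text: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every note against the query, drop weak matches, return the best topK.
function rankNotes(
  queryEmbedding: number[],
  notes: EmbeddedNote[],
  topK = 3,
  minScore = 0,
): RankedNote[] {
  return notes
    .map((note) => ({
      id: note.id,
      text: note.text,
      score: cosineSimilarity(queryEmbedding, note.embedding),
    }))
    .filter((note) => note.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional "embeddings" (assumed values, for illustration only).
const toyNotes: EmbeddedNote[] = [
  { id: "note-1", text: "RAG basics", embedding: [1, 0, 0] },
  { id: "note-2", text: "Token cost in chat apps", embedding: [0, 1, 0] },
  { id: "note-3", text: "Workflow routing", embedding: [0.7, 0.7, 0] },
];

const rankedDemo = rankNotes([1, 0.1, 0], toyNotes, 2);
console.log(rankedDemo.map((note) => note.id)); // [ 'note-1', 'note-3' ]
```

In a real run the query vector would come from the same embedding model used for the notes; mixing models makes the scores meaningless.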

2. RAG

The RAG flow retrieves relevant context before generating an answer.

question
→ retrieve context
→ generate grounded answer
→ include source citation
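
One way to picture the grounding and citation steps is a prompt builder that tags each retrieved note with its ID, plus a check that the answer only cites notes that were actually retrieved. This is a sketch under assumptions: the `[note-id]` citation format, the prompt wording, and the names `buildGroundedPrompt` and `citedNoteIds` are hypothetical and not taken from lib/rag-answer.ts.

```typescript
// Hypothetical shape for a retrieved note.
interface ContextNote {
  id: string;
  text: string;
}

// Build a prompt that restricts the model to the retrieved context
// and asks for citations by note ID.
function buildGroundedPrompt(question: string, context: ContextNote[]): string {
  const contextBlock = context.map((note) => `[${note.id}] ${note.text}`).join("\n");
  return [
    "Answer the question using only the context below.",
    "Cite the note ID in square brackets for every claim.",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    contextBlock,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Collect bracketed IDs from the answer, keeping only IDs that exist in the context.
function citedNoteIds(answer: string, context: ContextNote[]): string[] {
  const cited: string[] = [];
  const pattern = /\[([^\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const id = match[1];
    const isKnown = context.some((note) => note.id === id);
    if (isKnown && cited.indexOf(id) === -1) cited.push(id);
  }
  return cited;
}

const demoContext: ContextNote[] = [
  { id: "note-1", text: "Retrieval should happen before generation so answers stay grounded." },
  { id: "note-2", text: "Long chats cost more because every turn resends earlier tokens." },
];

const demoPrompt = buildGroundedPrompt("Why retrieve before generating?", demoContext);
// A made-up model answer that cites one real note and one unknown ID.
const demoAnswer = "Retrieval keeps the answer grounded [note-1]. See also [note-9].";
const demoCitations = citedNoteIds(demoAnswer, demoContext);
console.log(demoCitations); // [ 'note-1' ]
```

Filtering citations against the retrieved set is a cheap guard against the model inventing a source ID.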

3. LangGraph orchestration

The agent workflow uses LangGraph to keep explicit state across nodes.

Example plans:

retrieve_notes → answer_question → final_response
retrieve_notes → draft_message → final_response
draft_message → final_response
final_response
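
The explicit state and constrained plans can be illustrated without the LangGraph dependency. The sketch below is an assumption about shape, not a copy of lib/agent/agent-state.ts: `AgentStateSketch`, `isAllowedPlan`, and `normalizePlan` are hypothetical names, and the allowed plans are simply the paths listed in this README.

```typescript
type PlanStep = "retrieve_notes" | "answer_question" | "draft_message" | "final_response";

// Hypothetical shared state passed between nodes.
interface AgentStateSketch {
  userRequest: string;
  plan: PlanStep[];
  retrievalQuery?: string;
  retrievedNoteIds: string[];
  output?: string;
}

// Plans listed in this README; anything else is rejected.
const ALLOWED_PLANS: PlanStep[][] = [
  ["retrieve_notes", "answer_question", "final_response"],
  ["retrieve_notes", "draft_message", "final_response"],
  ["retrieve_notes", "final_response"],
  ["draft_message", "final_response"],
  ["final_response"],
];

// True only when the planner output matches one of the allowed plans exactly.
function isAllowedPlan(plan: string[]): plan is PlanStep[] {
  return ALLOWED_PLANS.some(
    (allowed) =>
      allowed.length === plan.length && allowed.every((step, i) => step === plan[i]),
  );
}

// Unknown plans collapse to the fallback-only plan instead of being executed.
function normalizePlan(plan: string[]): PlanStep[] {
  return isAllowedPlan(plan) ? plan : ["final_response"];
}

const initialState: AgentStateSketch = {
  userRequest: "Draft a Slack message saying QA signoff is pending",
  plan: normalizePlan(["draft_message", "final_response"]),
  retrievedNoteIds: [],
};

const rejectedPlan = normalizePlan(["delete_everything", "final_response"]);
console.log(initialState.plan, rejectedPlan);
```

Validating the plan against a closed list means a malformed planner response degrades to the fallback path rather than reaching a node that does not exist.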

4. Retrieval-query extraction

The planner extracts a focused retrieval query instead of sending the full user request to semantic search.

Example:

User request:
Use my notes about token-heavy conversations to write a short team update

Retrieval query:
token-heavy conversations
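
Since the planner is an LLM, its output needs validation before the retrieval query is used. The repo lists Zod in its stack; the sketch below uses a hand-rolled check instead so it runs with no dependencies. The field names `intent` and `retrievalQuery`, the intent values, and `parsePlannerOutput` are all hypothetical.

```typescript
// Hypothetical intent values; the real planner schema may differ.
const KNOWN_INTENTS = ["answer", "draft", "search", "unsupported"] as const;
type Intent = (typeof KNOWN_INTENTS)[number];

interface PlannerOutputSketch {
  intent: Intent;
  retrievalQuery: string | null; // null means no retrieval is needed
}

function isIntent(value: unknown): value is Intent {
  return (
    typeof value === "string" &&
    (KNOWN_INTENTS as readonly string[]).indexOf(value) !== -1
  );
}

// Parse raw planner JSON. Returns null when the output is malformed,
// so the caller can fall back instead of searching with garbage.
function parsePlannerOutput(raw: string): PlannerOutputSketch | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;

  const intent = record["intent"];
  if (!isIntent(intent)) return null;

  const query = record["retrievalQuery"];
  if (query !== undefined && query !== null && typeof query !== "string") return null;

  // Blank or missing queries are normalized to null.
  const trimmed = typeof query === "string" ? query.trim() : "";
  return { intent, retrievalQuery: trimmed.length > 0 ? trimmed : null };
}

const plannerDemo = parsePlannerOutput(
  '{"intent":"draft","retrievalQuery":" token-heavy conversations "}',
);
console.log(plannerDemo);
// { intent: 'draft', retrievalQuery: 'token-heavy conversations' }
```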

5. Decomposed RAG

RAG is split into retrieval and generation steps inside the agent workflow.

retrieve context
→ answer or draft from retrieved context
→ final response

This keeps retrieval, generation, routing, and fallback behavior visible and independently controllable.

6. Conditional routing

The graph routes based on the current state.

if context found and answer requested → answer_question
if context found and draft requested → draft_message
if no context → retry or fallback
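
The three rules above map onto a small pure function. This is a sketch, assuming a state with a requested action, a count of retrieved notes, and a retry counter; `RoutingState`, `routeAfterRetrieval`, and the node names `retry_retrieval` and `fallback` are hypothetical. A function of this shape is the kind of router that LangGraph's conditional edges accept, but the wiring is omitted here so the example runs standalone.

```typescript
type RequestedAction = "answer" | "draft";
type NextNode = "answer_question" | "draft_message" | "retry_retrieval" | "fallback";

// Hypothetical slice of the agent state that routing depends on.
interface RoutingState {
  requestedAction: RequestedAction;
  contextCount: number; // number of notes retrieval returned
  retryCount: number; // retries already performed
}

// The README describes a single retry, so the bound is 1.
const MAX_RETRIES = 1;

function routeAfterRetrieval(state: RoutingState): NextNode {
  if (state.contextCount > 0) {
    // Context found: pick the generation node the planner asked for.
    return state.requestedAction === "answer" ? "answer_question" : "draft_message";
  }
  // No context: retry while under the bound, otherwise stop safely.
  return state.retryCount < MAX_RETRIES ? "retry_retrieval" : "fallback";
}

console.log(routeAfterRetrieval({ requestedAction: "answer", contextCount: 2, retryCount: 0 }));
// answer_question
console.log(routeAfterRetrieval({ requestedAction: "draft", contextCount: 0, retryCount: 1 }));
// fallback
```

Keeping the router free of side effects makes every branch testable without calling a model.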

7. Bounded retry

When retrieval returns no usable context, the agent retries exactly once with a broader query and a lower score threshold. If the retry also fails, it stops and falls back.

focused query fails
→ retry with broader query
→ succeed or stop safely
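
The retry path can be sketched as a function that makes at most two search calls. The thresholds (0.5, then 0.3), the `SearchFn` signature, and `retrieveWithRetry` are assumptions for illustration. Search is synchronous here to keep the example short; a real implementation would await an embedding call.

```typescript
interface SearchHit {
  id: string;
  score: number;
}

// Hypothetical search signature: returns hits scoring at or above minScore.
type SearchFn = (query: string, minScore: number) => SearchHit[];

interface RetrievalResult {
  hits: SearchHit[];
  attempts: number;
  usedQuery: string;
}

// Try the focused query first; on an empty result, retry exactly once
// with a broader query and a relaxed threshold. Never loops.
function retrieveWithRetry(
  search: SearchFn,
  focusedQuery: string,
  broaderQuery: string,
  minScore = 0.5,
  relaxedMinScore = 0.3,
): RetrievalResult {
  const first = search(focusedQuery, minScore);
  if (first.length > 0) {
    return { hits: first, attempts: 1, usedQuery: focusedQuery };
  }
  const second = search(broaderQuery, relaxedMinScore);
  return { hits: second, attempts: 2, usedQuery: broaderQuery };
}

// Fake search: any query mentioning "token" matches note-2 with score 0.4.
const fakeSearch: SearchFn = (query, minScore) => {
  const candidates: SearchHit[] =
    query.indexOf("token") !== -1 ? [{ id: "note-2", score: 0.4 }] : [];
  return candidates.filter((hit) => hit.score >= minScore);
};

const retryDemo = retrieveWithRetry(
  fakeSearch,
  "token-heavy conversations",
  "token cost in conversations",
);
console.log(retryDemo.attempts, retryDemo.hits.length); // 2 1

const emptyDemo = retrieveWithRetry(fakeSearch, "jokes", "funny jokes");
console.log(emptyDemo.attempts, emptyDemo.hits.length); // 2 0
```

When the second attempt is also empty, the caller receives an empty `hits` array and can route to the fallback response.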

8. Safe fallback

Unsupported or unrelated requests return a bounded fallback instead of generating unsupported answers from missing context.

Example:

pnpm agent "Tell me something funny"

Expected behavior:

returns a bounded fallback instead of answering from unsupported context

MCP Interface

The MCP server exposes:

| MCP Primitive | Name | Purpose |
| --- | --- | --- |
| Tool | search_notes | Search saved notes |
| Tool | answer_from_notes | Answer questions from notes |
| Resource | notes://all | Read local knowledge base |
| Prompt | rag_answer_prompt | Prompt template for grounded answers |

The LangGraph agent and MCP examples are intentionally kept as separate flows in this repo.

  • The LangGraph flow demonstrates controlled agent orchestration with state, planning, routing, retrieval, drafting, retries, and fallback behavior.
  • The MCP flow demonstrates how the same search/RAG capabilities can be exposed to external MCP clients as tools, resources, and prompts.

In a larger application, these patterns can be combined by having LangGraph nodes call MCP tools for external capabilities such as GitHub, Jira, Slack, or Confluence.

Knowledge Base

The local knowledge base contains notes about:

  • RAG basics
  • token cost in chat apps
  • workflow routing
  • tool calling
  • semantic search

The knowledge base is intentionally small so retrieval, ranking, grounding, and routing behavior are easy to inspect.

Design Scope

Current scope:

  • local knowledge base
  • embedding-based semantic search
  • RAG with source citations
  • CLI scripts instead of a web UI
  • MCP over stdio
  • controlled LangGraph workflow

The focus is on the core mechanics of retrieval, grounding, planning, routing, MCP tool exposure, and bounded agent behavior.

Next Improvements

  • Improve CLI output formatting
  • Add clearer docs for LangGraph state and routing
  • Add sample output snapshots
  • Add basic tests for retrieval and RAG behavior

TODO

Short Term

  • Add more knowledge-base examples
  • Add an eval script for expected retrieval results
  • Add safer handling for low-confidence retrieval
  • Add structured output for final agent responses

Future Extensions

  • Database-backed vector search
  • Chunking for longer documents
  • Richer source metadata
  • External integrations
  • Guardrails and approval workflows
  • Observability logs
  • Eval suite
