sai-roadmap-mcp

An MCP server that exposes certifications, projects, and an AI engineering learning roadmap as callable tools for MCP clients like Claude Desktop.

An MCP (Model Context Protocol) server exposing my certifications, projects, and AI engineering roadmap as callable tools — including a semantic search engine built entirely from scratch: no pretrained models, no external embedding API, no high-level ML framework.

Built on the official @modelcontextprotocol/sdk (v1.29.0) over the stdio transport, with a Python/NumPy subprocess powering the ML tool.

Tools

Tool                Description                                    Optional Input
----------------------------------------------------------------------------------------------
get_profile         Basic profile info
get_certifications  Certifications, filterable by skill            skill: string
get_projects        Portfolio projects, filterable by tech stack   tech: string
get_roadmap         2026 learning roadmap, filterable by quarter   quarter: "Q1"–"Q4"
semantic_search     Real semantic retrieval, not keyword matching  query: string, top_n: number

The semantic search engine

Pipeline (full implementation in src/ml/lsa.py):

  1. Corpus (src/ml/corpus.py) — 51 natural-language sentences generated from certifications, detailed project descriptions, and roadmap entries.
  2. Bigram phrase detection — pointwise mutual information (PMI), the same idea behind word2vec's original word2phrase tool. Merges tightly-bound word pairs (machine_learning, medireach_ai, artificial_intelligence) into single tokens before training.
  3. TF-IDF weighting — term frequency × smoothed inverse document frequency, computed explicitly.
  4. Truncated SVD: numpy.linalg.svd, manually truncated to the top-k singular vectors. This is the actual Latent Semantic Analysis step, done by hand rather than through scikit-learn's TruncatedSVD.fit_transform().
  5. Cosine similarity in the resulting latent space ranks documents against a query.
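
Steps 3 to 5 can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the code in src/ml/lsa.py: the names fit_lsa and search, the whitespace tokenizer, and the exact smoothed-IDF formula are all assumptions made for the example.

```python
import numpy as np

def tokenize(text):
    # Assumption: lowercase whitespace split; the real tokenizer may differ.
    return text.lower().split()

def fit_lsa(docs, k=2):
    """Fit TF-IDF + manually truncated SVD on a list of strings (hypothetical helper)."""
    vocab = sorted({tok for d in docs for tok in tokenize(d)})
    index = {tok: i for i, tok in enumerate(vocab)}

    # Raw term counts: one row per document, one column per term.
    tf = np.zeros((len(docs), len(vocab)))
    for row, d in enumerate(docs):
        for tok in tokenize(d):
            tf[row, index[tok]] += 1.0

    # Assumed smoothing variant: ln((1 + N) / (1 + df)) + 1.
    df = (tf > 0).sum(axis=0)
    idf = np.log((1.0 + len(docs)) / (1.0 + df)) + 1.0
    tfidf = tf * idf

    # Full SVD, then keep only the top-k singular triplets by hand.
    U, S, Vt = np.linalg.svd(tfidf, full_matrices=False)
    k = min(k, len(S))
    doc_vecs = U[:, :k] * S[:k]  # documents in the k-dimensional latent space
    return {"index": index, "idf": idf, "Vt": Vt[:k], "doc_vecs": doc_vecs}

def search(model, docs, query, top_n=3):
    """Rank docs by cosine similarity to the query in the latent space."""
    q = np.zeros(len(model["index"]))
    for tok in tokenize(query):
        if tok in model["index"]:  # out-of-vocabulary tokens carry no signal
            q[model["index"][tok]] += 1.0
    q_vec = (q * model["idf"]) @ model["Vt"].T  # fold the query into k dimensions

    D = model["doc_vecs"]
    denom = np.linalg.norm(D, axis=1) * np.linalg.norm(q_vec)
    # Guard against (near-)zero vectors so they score 0 instead of noise.
    sims = np.divide(D @ q_vec, denom, out=np.zeros(len(docs)), where=denom > 1e-12)
    order = np.argsort(-sims)[:top_n]
    return [(docs[i], float(sims[i])) for i in order]
```

Folding the query in with the same IDF weights and the same truncated Vt keeps it in the coordinate system the documents were projected into, which is what makes the cosine comparison meaningful.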

Two things I got wrong on the first pass, and fixed

Bigram threshold, take one: my first PMI threshold (min_count=2, threshold=3.0) merged 147 bigrams — almost all of them grammatical glue like of_the, by_google, is_a, not real phrases. The bug: with only 51 sentences, raw PMI is noisy, and stopwords weren't excluded from forming pairs. Fixed by excluding stopwords from bigram formation and raising the bar to min_count=3, threshold=15.0 — now produces 19 bigrams, and every single one is a genuine phrase (machine_learning, google_cloud, vibe_coding, iit_bombay).
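
A minimal sketch of the fixed detection step, under stated assumptions: detect_bigrams and STOPWORDS are hypothetical names, the stopword list is a tiny illustrative subset, and the score is the unlogged PMI ratio normalized by total token count. Whether corpus.py logs the ratio or normalizes differently is not stated here.

```python
from collections import Counter

# Assumption: illustrative subset only; the real stopword list is not shown in this README.
STOPWORDS = {"of", "the", "by", "is", "a", "and", "in", "on", "for", "to"}

def detect_bigrams(sentences, min_count=3, threshold=15.0):
    """Return {(w1, w2): score} for adjacent pairs that pass both filters."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total = sum(unigrams.values())
    merges = {}
    for (w1, w2), n12 in bigrams.items():
        # Stopwords may not take part in a pair at all; rare pairs are too noisy.
        if n12 < min_count or w1 in STOPWORDS or w2 in STOPWORDS:
            continue
        # Unlogged PMI ratio: p(w1, w2) / (p(w1) * p(w2)), using token counts.
        score = n12 * total / (unigrams[w1] * unigrams[w2])
        if score >= threshold:
            merges[(w1, w2)] = score
    return merges
```

Excluding stopwords before scoring matters more than the threshold itself: pairs like of_the are frequent enough to clear min_count, so only a hard filter keeps them out on a corpus this small.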

Query/training vocabulary mismatch: after adding bigram merging, queries were still being tokenized with plain word-splitting — so a query like "machine learning" stayed as two tokens while the trained vocabulary only had the merged machine_learning. Silent mismatch, no error thrown, just quietly worse retrieval. Fixed by persisting the learned merge set alongside the model and applying it identically at query time (apply_bigram_merges() in corpus.py).
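
The query-time half of that fix might look like the sketch below. The function name apply_bigram_merges comes from corpus.py, but the signature, the greedy left-to-right strategy, and the representation of the merge set as a set of word pairs are assumptions.

```python
def apply_bigram_merges(tokens, merges):
    """Greedily join adjacent tokens whose pair is in the learned merge set."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in merges:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2  # both halves of the pair are consumed
        else:
            out.append(tokens[i])
            i += 1
    return out

# Hypothetical merge set, as it might be loaded alongside the saved model.
merges = {("machine", "learning"), ("google", "cloud")}
query_tokens = apply_bigram_merges("machine learning on google cloud".split(), merges)
# query_tokens == ["machine_learning", "on", "google_cloud"]
```

Running the same function over both training sentences and queries is what guarantees the two vocabularies line up.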

Quantitative evaluation — precision@k

Most small ML side-projects show a few queries that "look like they work." src/ml/evaluate.py instead defines 10 hand-labeled queries with explicit relevance judgments and reports precision@k:

Query                                         P@1  P@3  P@5
----------------------------------------------------------------------
python certifications                          1.00  1.00  0.80
cloud computing certifications                 1.00  0.67  0.40
multi agent healthcare assistant               1.00  1.00  1.00
edtech platform for students                   1.00  1.00  1.00
frontend animation and design                  0.00  0.00  0.00
hackathon and competition experience           0.00  0.00  0.00
SQL and database skills                        0.00  0.00  0.00
generative AI and large language models        1.00  0.67  0.40
deep learning quarter in the roadmap           0.00  0.67  0.40
agentic IDE development tools                  1.00  1.00  0.60
----------------------------------------------------------------------
MEAN                                           0.60  0.60  0.46

Three queries scored zero — here's exactly why, diagnosed rather than hand-waved:

  • "frontend animation and design" → query contains "animation" (singular); the corpus only ever says "animations" (plural). Out-of-vocabulary, no signal. This is the classic bag-of-words weakness — no stemming, no lemmatization.
  • "hackathon and competition experience" → same issue: "competition" and "experience" never appear in the corpus at all (it says "Hackathon," not "competition").
  • "SQL and database skills" → no OOV words here, but a genuine ranking failure: idf("skills") == idf("database") (both 3.833) because both happen to appear in exactly one document — IDF can't tell "rare and topically specific" apart from "rare by coincidence" at this corpus size. The word "skills" then drags the ranking toward the wrong (but skills-heavy) document.
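
The first and third failure modes are easy to reproduce in isolation. In this sketch the vocabulary is a made-up three-word stand-in and smoothed_idf uses an assumed smoothing formula, so the absolute values will not necessarily match the 3.833 quoted above; the point is only that equal document frequency forces equal weight.

```python
import math

def smoothed_idf(n_docs, doc_freq):
    # Assumed variant: ln((1 + N) / (1 + df)) + 1; the repo's formula may differ.
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1

# Failure mode 1: exact-match vocabulary, no stemming.
vocab = {"animations", "frontend", "design"}  # hypothetical stand-in vocabulary
query_hits = [tok for tok in ["frontend", "animation"] if tok in vocab]
# "animation" is dropped because only the plural form was ever seen.

# Failure mode 3: IDF depends on document frequency alone.
n_docs = 51
idf_skills = smoothed_idf(n_docs, doc_freq=1)
idf_database = smoothed_idf(n_docs, doc_freq=1)
idf_common = smoothed_idf(n_docs, doc_freq=10)
# idf_skills == idf_database, regardless of how topical each word is.
```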

These are real, well-understood limitations of small-corpus bag-of-words retrieval, not implementation bugs — and documenting them precisely is more useful than a misleadingly clean demo.
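
For reference, the metric in the table can be sketched as follows. The function name and the document-ID representation are assumptions; evaluate.py may organize its relevance judgments differently.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are labeled relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k  # divide by k, not by the number of results returned

# Hypothetical ranking and hand-labeled relevance set for one query.
ranked = [7, 2, 9, 4, 1]
relevant = {7, 2, 4, 30}

p_at_1 = precision_at_k(ranked, relevant, 1)  # 1 hit in the top 1
p_at_3 = precision_at_k(ranked, relevant, 3)  # 2 hits in the top 3
p_at_5 = precision_at_k(ranked, relevant, 5)  # 3 hits in the top 5
```

The MEAN row is then the plain average of each column over the 10 queries.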

Revisiting word2vec with the bigger corpus

I expanded the corpus 3× (17 → 51 sentences) partly to test whether word-level skip-gram (still in src/ml/word2vec.py, Mikolov et al. 2013, negative sampling, full from-scratch NumPy training) would become viable. Honest result: no.

'cloud' -> [('computing', 0.746), ('oac', 0.728), ('analytics', 0.698), ('infrastructure', 0.645)]   ← coherent
'python' -> [('generative_ai', 0.684), ('workflow', 0.599), ('sql', 0.595)]                            ← noise
'agents' -> [('good', 0.687), ('intensive', 0.681), ('5', 0.636)]                                       ← noise

"cloud" produces a genuinely sensible neighborhood; "python" and "agents" still don't. 51 sentences is closer to viable than 17 was, but word2vec realistically needs thousands of sentences minimum. LSA remains the correct choice for this corpus size — confirmed by actually re-running the experiment, not just assumed.
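
For readers unfamiliar with the model being tested, a single skip-gram negative-sampling update can be sketched as below. This is a generic illustration of the technique, not the code in src/ml/word2vec.py; sgns_step, the matrix names, and the learning rate are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(W_in, W_out, center, context, negatives, lr=0.05):
    """One skip-gram negative-sampling update; returns the loss before the update."""
    v = W_in[center].copy()
    targets = np.array([context] + list(negatives))
    labels = np.zeros(len(targets))
    labels[0] = 1.0                      # true context word is the only positive
    U = W_out[targets]                   # copy of the rows, shape (1 + n_neg, dim)
    scores = sigmoid(U @ v)
    loss = -np.log(scores[0]) - np.sum(np.log(1.0 - scores[1:]))
    grad = scores - labels               # derivative of the loss w.r.t. each dot product
    W_in[center] -= lr * (grad @ U)
    # subtract.at accumulates correctly if a target index appears more than once.
    np.subtract.at(W_out, targets, lr * np.outer(grad, v))
    return float(loss)
```

Each update only nudges a handful of rows, which is one way to see why the method needs many thousands of training pairs before neighborhoods like those above stabilize.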

Setup

git clone https://github.com/saichintamani/sai-roadmap-mcp.git
cd sai-roadmap-mcp
npm install
pip3 install -r requirements.txt --break-system-packages
npm run train       # builds corpus, detects bigrams, trains LSA model
npm run evaluate    # runs the precision@k evaluation above

Running standalone

npm start

Communicates over stdio via JSON-RPC 2.0. Ready message goes to stderr; stdout is reserved for protocol messages.
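
As a rough illustration of what travels over stdin, the sketch below builds one tools/call request as a single newline-terminated JSON line. make_tool_call is a hypothetical helper, and a real client must complete the MCP initialize handshake before sending it.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Serialize one MCP tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so one message stays on one line.
    return json.dumps(message) + "\n"

line = make_tool_call(2, "semantic_search", {"query": "cloud certifications", "top_n": 3})
```

In practice an MCP client such as Claude Desktop produces these messages for you; writing them by hand is mainly useful for debugging the server in isolation.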

Connecting to Claude Desktop

{
  "mcpServers": {
    "sai-roadmap": {
      "command": "node",
      "args": ["/absolute/path/to/sai-roadmap-mcp/src/index.js"]
    }
  }
}

Connecting to Claude Code

claude mcp add sai-roadmap -- node /absolute/path/to/sai-roadmap-mcp/src/index.js

Repo structure

sai-roadmap-mcp/
├── src/
│   ├── index.js           # MCP server: 5 tools, stdio transport
│   ├── data.json          # Structured portfolio data
│   └── ml/
│       ├── corpus.py      # Sentence generation + PMI bigram detection
│       ├── word2vec.py    # Skip-gram + negative sampling (kept, documented as non-viable here)
│       ├── lsa.py         # TF-IDF + truncated SVD -- the model actually used
│       ├── train.py       # Trains and saves the model + merge set
│       ├── query.py       # CLI query interface, called by index.js
│       ├── evaluate.py    # Precision@k evaluation against hand-labeled queries
│       └── lsa_model.npz  # Pre-trained weights
├── requirements.txt
├── package.json
└── README.md

Why this exists

Most student AI portfolios show someone calling an LLM API. This one shows three different things: an understanding of the MCP layer that production AI tools run on; a working classical NLP/ML retrieval system built from first principles; and, maybe more importantly, the engineering discipline to measure it, find the failure modes, and document them precisely instead of cherry-picking examples that look good.

License

MIT
