sf-docs-mcp
Enables LLM agents to search, read, and query Salesforce documentation across 129 domains. Uses a pre-compiled knowledge graph with 53,000 nodes and 18,000 code snippets for instant retrieval without RAG.
<p align="center"> <img src="assets/logo.svg" alt="SF-Documentation-Knowledge" width="800" /> </p>
<p align="center"> <a href="https://www.npmjs.com/package/@sfdxy/sf-documentation-knowledge"><img src="https://img.shields.io/npm/v/@sfdxy/sf-documentation-knowledge?style=flat-square&color=34d399" alt="npm version" /></a> <a href="https://github.com/Avinava/sf-documentation-knowledge/actions"><img src="https://img.shields.io/github/actions/workflow/status/Avinava/sf-documentation-knowledge/ci.yml?style=flat-square&color=38bdf8" alt="CI" /></a> <a href="https://www.npmjs.com/package/@sfdxy/sf-documentation-knowledge"><img src="https://img.shields.io/npm/dm/@sfdxy/sf-documentation-knowledge?style=flat-square&color=fbbf24" alt="Downloads" /></a> </p>
<p align="center"> <strong>Collect, process, and serve Salesforce documentation for LLM agents — using Context Engineering + MCP, not RAG.</strong> </p>
Overview
This system programmatically collects all Salesforce documentation from developer.salesforce.com (129 domains, 35,000+ pages), processes it into structured, curated knowledge files, and serves them to LLM agents via:
- Context Engineering: Pre-compiled Markdown files with _index.md routing tables
- MCP Server: 12 tools + 4 prompts + 5 resources via Model Context Protocol
- Knowledge Graph — 53,000+ nodes and 450,000+ edges connecting SF concepts, namespaces, services, and cross-references
No embeddings. No vector stores. No blind chunking.
Quick Start
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
"env": {
"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"
}
}
}
}
Remove the env block to search all 129 domains. See Domain Restriction for details.
Restart Claude Desktop.
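If the config file already lists other MCP servers, pasting the block above over it would drop them. The following is a minimal sketch of merging the sf-docs entry into an existing config instead. The helper names (merge_sf_docs, update_config_file) are hypothetical and not part of the package; the entry itself is copied from the config shown above.

```python
import json
from pathlib import Path

# Server entry copied from the Claude Desktop config shown above.
SF_DOCS_ENTRY = {
    "command": "npx",
    "args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
    "env": {"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"},
}


def merge_sf_docs(config: dict, entry: dict = SF_DOCS_ENTRY) -> dict:
    """Return a copy of config with the sf-docs server added or replaced."""
    merged = dict(config)
    # Copy the server map so the caller's dict is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["sf-docs"] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Hypothetical helper: rewrite the config file in place, keeping other servers."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(merge_sf_docs(config), indent=2) + "\n")


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(merge_sf_docs(existing), indent=2))
```

To search all domains, pass an entry without the env key.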
VS Code (GitHub Copilot)
Add to .vscode/mcp.json in your workspace (or globally in VS Code settings):
{
"servers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
"env": {
"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"
}
}
}
}
Then use @sf-docs in Copilot Chat to query Salesforce documentation.
Gemini Code Assist / Gemini CLI
Add to your MCP config (~/.gemini/settings.json or project .gemini/settings.json):
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
"env": {
"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"
}
}
}
}
Cursor
Add in Settings -> MCP Servers -> Add Server:
- Name: sf-docs
- Command: npx -y -p @sfdxy/sf-documentation-knowledge sf-docs-mcp
- Transport: stdio
- Environment: SF_ACTIVE_DOMAINS=apex-guide,apex-reference,lwc
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
"env": {
"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"
}
}
}
}
OpenCode
Add to your OpenCode config (~/.config/opencode/config.json or project .opencode/config.json):
{
"mcpServers": {
"sf-docs": {
"command": "npx",
"args": ["-y", "-p", "@sfdxy/sf-documentation-knowledge", "sf-docs-mcp"],
"env": {
"SF_ACTIVE_DOMAINS": "apex-guide,apex-reference,lwc"
}
}
}
}
Any MCP Client (Generic)
Point your MCP client to:
npx -y -p @sfdxy/sf-documentation-knowledge sf-docs-mcp
The server uses stdio transport and is compatible with any MCP client.
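For readers wiring up a client by hand, here is a rough sketch of the first messages a client sends over the stdio transport: newline-delimited JSON-RPC 2.0, starting with the MCP initialize handshake. The protocol version string and client name are assumptions for illustration; use the revision your client targets. The helper names are hypothetical.

```python
import json


def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message as a single newline-terminated line."""
    # Compact separators and no indent, so the payload never contains a newline.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def handshake_messages(client_name: str = "example-client") -> list[dict]:
    """Messages a client sends to start a session and list the tools."""
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",  # assumption: adjust to your client
                "capabilities": {},
                "clientInfo": {"name": client_name, "version": "0.0.1"},
            },
        },
        # Notifications carry no id; this one follows the initialize response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]


if __name__ == "__main__":
    for message in handshake_messages():
        print(frame(message).decode("utf-8"), end="")
```

A real client writes each framed line to the server's stdin and waits for the initialize response on stdout before sending the remaining messages.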
Why -p + sf-docs-mcp?
The package ships two binaries: sf-knowledge (the data pipeline CLI) and sf-docs-mcp (the MCP server). Using -p installs the package and then explicitly calls the sf-docs-mcp binary, ensuring you get the MCP server and not the CLI.
Use from Source
git clone https://github.com/Avinava/sf-documentation-knowledge.git
cd sf-documentation-knowledge
npm install
npm run build
npm run mcp:start
MCP Server
The MCP server loads the full 53k-node knowledge graph and 18,000+ code snippets into memory on startup (~5s) and serves all queries instantly.
Run directly from source:
npm run mcp:start
Or via npx (no clone required):
npx -y -p @sfdxy/sf-documentation-knowledge sf-docs-mcp
Tools (12)
| Tool | Purpose | Example Usage |
|---|---|---|
| sf_search | Search across all SF documentation domains | "Find docs about Platform Events" |
| sf_semantic_search | AI-powered semantic search with NLP query understanding | "how to process records in bulk" |
| sf_read_topic | Read a specific documentation topic's content | Read the SOQL reference page |
| sf_graph_query | Navigate the knowledge graph: related docs, namespaces, services | "Show all docs in the System namespace" |
| sf_list_domains | List all available domains, filter by service category | "List analytics domains" |
| sf_apex_lookup | Look up an Apex class with full documentation | "Look up the String class" |
| sf_code_examples | Find working code snippets by topic, language, or domain | "Show batch apex code examples" |
| sf_object_reference | Look up Salesforce objects and fields (6,500+ ref pages) | "Look up Account.Industry field" |
| sf_explain_error | Decode error messages with context and resolution steps | "Explain UNABLE_TO_LOCK_ROW" |
| sf_limits | Governor limits lookup: exact numbers for 15 categories | "What are SOQL limits?" |
| sf_set_active_domains | Restrict all tools to specific documentation domains | Focus on revenue-cloud only |
| sf_suggest_domains | Suggest relevant domains for a task description | "contract lifecycle management" |
Prompt Templates (4)
| Prompt | What It Does | Arguments |
|---|---|---|
| explore_api | Walk through a Salesforce API: endpoints, auth, best practices | api: API name (e.g., "REST API") |
| debug_apex | Debug an Apex issue: class lookup, error patterns, examples | topic: class/error (e.g., "System.QueryException") |
| compare_services | Compare Salesforce products by documentation coverage | services: categories (e.g., "analytics vs commerce") |
| write_apex | Write production-ready Apex: gathers limits, patterns, examples first | task: what to build (e.g., "batch job to update Accounts") |
Resources (5)
Agents can read these without making a tool call:
| Resource URI | Content |
|---|---|
| sf://overview | System stats, available tools, quick start guide |
| sf://domains | All documentation domains with descriptions |
| sf://namespaces | All Apex namespaces with doc counts |
| sf://services | All service categories with domain counts |
| sf://config | Current domain restriction state and runtime controls |
Domain Restriction
When working on a specific Salesforce product area (e.g., Revenue Cloud, Apex development), you can restrict all tools to only search within relevant domains. This reduces noise and improves result quality.
How It Works
- At startup: Set SF_ACTIVE_DOMAINS as a comma-separated list of domain IDs in your MCP client config
- At runtime: Use sf_set_active_domains to change the active domains without restarting
- Not set: All 129 domains are searched (default, no breaking change)
- Per-call domain filter outside active set: Returns empty results with a warning (not an error)
- sf_read_topic outside active set: Shows a gentle note but still allows reading
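The rules above can be read as a small decision procedure. The sketch below illustrates those semantics only; it is not the server's implementation. Both function names are hypothetical, and intersecting a partially overlapping per-call filter with the active set is an assumption, since the rules only cover the fully-outside case.

```python
import os


def parse_active_domains(raw):
    """Parse an SF_ACTIVE_DOMAINS value; None means no restriction (all domains)."""
    if raw is None or not raw.strip():
        return None
    # Tolerate stray spaces and trailing commas in the comma-separated list.
    return {part.strip() for part in raw.split(",") if part.strip()}


def resolve_search_domains(requested, active):
    """Return (domains_to_search, warning); a domains value of None means 'all'."""
    if active is None:
        return requested, None  # no restriction configured
    if requested is None:
        return active, None  # no per-call filter: search the active set
    allowed = set(requested) & active  # assumption: partial overlap is intersected
    if not allowed:
        # Outside the active set: empty results plus a warning, not an error.
        return set(), "requested domains are outside the active set"
    return allowed, None


if __name__ == "__main__":
    active = parse_active_domains(os.environ.get("SF_ACTIVE_DOMAINS"))
    print(resolve_search_domains({"lwc"}, active))
```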
Discovering Domains
# Let the AI suggest domains for your task
sf_suggest_domains("building LWC components with Apex backend")
→ Suggests: lwc, apex-guide, apex-reference, lightning
# Set the suggested domains
sf_set_active_domains(domains: ["lwc", "apex-guide", "apex-reference", "lightning"])
# Check current state
sf_set_active_domains()
# Clear restrictions
sf_set_active_domains(clear: true)
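The calls above are shorthand for what an agent does. On the wire, an MCP client sends each one as a JSON-RPC tools/call request. A minimal sketch of building those requests follows; the tool_call helper is hypothetical, and only the domains and clear arguments named above are used.

```python
import json


def tool_call(request_id, name, arguments=None):
    """Build an MCP tools/call request for the given tool name and arguments."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    requests = [
        # Set the active domains.
        tool_call(1, "sf_set_active_domains",
                  {"domains": ["lwc", "apex-guide", "apex-reference", "lightning"]}),
        # Check current state (no arguments).
        tool_call(2, "sf_set_active_domains"),
        # Clear restrictions.
        tool_call(3, "sf_set_active_domains", {"clear": True}),
    ]
    for request in requests:
        print(json.dumps(request))
```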
Behavior by Tool
| Tool | Domain Restriction Behavior |
|---|---|
| sf_search | Filters via Orama where clause + keyword fallback filtering |
| sf_semantic_search | Filters via Orama where clause on expanded + original queries |
| sf_code_examples | Filters via CodeIndex domains[] parameter |
| sf_graph_query | Post-filters related, namespace, context, search results |
| sf_explain_error | Domain-aware search + post-filtered keyword results |
| sf_apex_lookup | Warns if apex-reference/apex-guide not in active set |
| sf_object_reference | Warns if sfFieldRef/object-reference not in active set |
| sf_list_domains | Shows all domains, marks active ones with checkmark |
| sf_read_topic | Gentle warning (still allows reads outside active set) |
| sf_limits | No filtering (hardcoded data, no graph search) |
All 129 Domain IDs
See docs/domains.md for the full list organized by service category, or use sf_list_domains at runtime.
Knowledge Base
The repository comes pre-loaded with 35,000+ curated markdown files and a Knowledge Graph (53,000+ nodes, 450,000+ edges) covering 129 domains of Salesforce documentation.
Option A: Context Engineering (File-based)
Point your AI agent to the _index.md file in any domain folder. The index acts as a routing table telling the AI which files contain which topics:
knowledge/current/<domain-name>/_index.md
Each domain folder also has a SKILL.md in skills/<domain-name>/SKILL.md that teaches AI agents how to navigate the knowledge.
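As a small illustration of the layout described above, this sketch resolves the two entry-point files for a domain and reports which are missing. It assumes the folder name equals the domain ID (for example apex-guide); the function names are hypothetical.

```python
from pathlib import Path


def domain_entry_points(repo_root, domain):
    """Return the routing index and skill file paths for one domain."""
    root = Path(repo_root)
    return {
        "index": root / "knowledge" / "current" / domain / "_index.md",
        "skill": root / "skills" / domain / "SKILL.md",
    }


def missing_entry_points(repo_root, domain):
    """List which entry-point files do not exist on disk."""
    paths = domain_entry_points(repo_root, domain)
    return [name for name, path in paths.items() if not path.is_file()]


if __name__ == "__main__":
    for name, path in domain_entry_points(".", "apex-guide").items():
        print(f"{name}: {path}")
    print("missing:", missing_entry_points(".", "apex-guide"))
```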
Option B: Knowledge Graph
The graph at knowledge/current/graph.json connects all documentation with semantic relationships:
| Edge Type | What It Connects |
|---|---|
| references | Document → Document (52,988 cross-references) |
| belongs_to_namespace | Document → Apex Namespace (143 namespaces) |
| belongs_to_service | Domain → Service Category (16 categories) |
| is_type | Document → DocType (api-reference, developer-guide, concept, etc.) |
| tagged_with | Document → Keyword (22,610 unique keywords) |
| contains | Domain → Document |
Inspect it with:
npm run graph:stats
See Graph Schema Documentation for the full schema with node/edge types, ID conventions, and a visual diagram.
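For a quick look at the graph outside npm, something like the following could tally edges by type. The JSON layout used here (a top-level edges list whose items carry a type field) is an assumption made for illustration; check the Graph Schema Documentation for the real field names before relying on it.

```python
import json
from collections import Counter
from pathlib import Path


def edge_type_counts(graph: dict) -> Counter:
    """Count edges per type; assumes graph['edges'] is a list of dicts with 'type'."""
    return Counter(edge.get("type", "unknown") for edge in graph.get("edges", []))


def load_graph(path="knowledge/current/graph.json") -> dict:
    """Load the graph file from the repository checkout."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Tiny in-memory sample in the assumed layout, so the sketch runs anywhere.
    sample = {
        "nodes": [{"id": "doc-a"}, {"id": "doc-b"}],
        "edges": [
            {"source": "doc-a", "target": "doc-b", "type": "references"},
            {"source": "doc-a", "target": "ns-system", "type": "belongs_to_namespace"},
        ],
    }
    for edge_type, count in edge_type_counts(sample).most_common():
        print(edge_type, count)
```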
Data Pipeline
To update the knowledge base with the latest Salesforce releases, run the pipeline in order:
Step 1: Discover Available Deliverables
npm run discover
Lists all documentation deliverables available from the Salesforce Index API (~127 deliverables).
Step 2: Collect Raw Data
# Collect a specific domain
npm run collect -- --domain cli-commands
# Collect all configured (P0) domains
npm run collect
# Collect ALL deliverables from the SF index API (121 domains, ~31k pages)
npm run collect -- --discover
Step 3: Process HTML to Markdown
# Process a specific domain
npm run process -- --domain cli-commands
# Process ALL collected domains
npm run process -- --discover
Automatically cleans HTML, strips noise, parses tables, formats code blocks, creates clean Markdown, and redacts any Salesforce tokens or secrets.
Step 4: Generate Knowledge Files & Graph
# Generate ALL collected domains and rebuild the full Knowledge Graph
npm run generate -- --discover
Builds the knowledge graph (cross-references, namespaces, service categories, doctype clustering), generates context files, and updates inventory docs.
Step 5: Inspect the Graph
npm run graph:stats
Full Pipeline (One-liner)
npm run collect -- --discover && npm run process -- --discover && npm run generate -- --discover
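Where a shell one-liner is awkward (a scheduler, or a wrapper that needs to log each step), the same sequence can be driven from a script. This is a sketch rather than part of the repo: the function names are hypothetical, it reuses the npm scripts and flags listed above, and it stops at the first failing step just as && does.

```python
import subprocess

# Pipeline steps in the order given above.
STEPS = ("collect", "process", "generate")


def pipeline_commands(domain=None):
    """Build the npm commands for one domain, or for all domains via --discover."""
    scope = ["--domain", domain] if domain else ["--discover"]
    return [["npm", "run", step, "--", *scope] for step in STEPS]


def run_pipeline(domain=None, runner=subprocess.run):
    """Run each step in order; return the first non-zero exit code, else 0."""
    for command in pipeline_commands(domain):
        result = runner(command)
        if result.returncode != 0:
            return result.returncode  # later steps are skipped, like &&
    return 0


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in pipeline_commands():
        print(" ".join(command))
```

Calling run_pipeline("cli-commands") from the repository root would run the three steps for that single domain.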
CLI Reference
| Command | Description |
|---|---|
| npm run discover | List available SF documentation deliverables |
| npm run collect | Download raw HTML documentation |
| npm run process | Convert HTML → Markdown with tagging |
| npm run generate | Generate knowledge files + graph |
| npm run graph:stats | Analyze the knowledge graph |
| npm run mcp:start | Start the MCP server (stdio) |
| npm run build | Compile TypeScript |
| npm run test | Run test suite |
| npm run lint | Run ESLint |
All pipeline commands support --domain <name> for single-domain processing and --discover for all-domain processing.
CI/CD
| Workflow | Trigger | What It Does |
|---|---|---|
| CI | Push / PR to master | Build, test, lint, MCP smoke test |
| Release | Push v* tag | Build, test, publish to npm, create GitHub release |
| Update Knowledge | Weekly (Sunday) | Run full pipeline to refresh docs |
Documentation
| Document | Description |
|---|---|
| Architecture | System design, data flow, 4-layer architecture |
| Graph Schema | Node/edge types, ID conventions, query examples |
| Domain Reference | All 129 domains organized by service category |
| Full Inventory | Complete domain list with file counts |
| Repo Development Skill | How to develop and extend this repo |
License
MIT © Avinava
Inventory
<!-- INVENTORY:START -->
| Domain | Description | Status | Files |
|---|---|---|---|
| Salesforce Field Reference Guide | Use this concise reference to quickly look up details of the standard fields for | ✅ Available | 4817 |
| Apex Reference | Apex class library reference — all system classes and methods | ✅ Available | 4623 |
| Connect REST API Developer Guide | Integrate mobile apps, intranet sites, and third-party web applications with Sal | ✅ Available | 2465 |
| Object Reference for the Salesforce Platform | Get details on standard objects so that you can interface with your Salesforce d | ✅ Available | 1777 |
| Revenue Cloud / Agentforce Revenue Management | Product catalog, pricing, billing, Dynamic Revenue Orchestrator | ✅ Available | 1362 |
| OmniStudio | OmniStudio — OmniScripts, FlexCards, DataRaptors, Integration Procedures | ✅ Available | 1297 |
| Public Sector Solutions Developer Guide | Use Public Sector Solutions API and developer resources to unify public service | ✅ Available | 1003 |
| Salesforce Health Cloud Developer Guide | Use the Health Cloud API to configure the Health Cloud console, which helps care | ✅ Available | 833 |
| Marketing Cloud API | Developer documentation for Marketing Cloud APIs | ✅ Available | 809 |
| Life Sciences Cloud Developer Guide | Use the developer resources of Life Sciences Cloud to automate the operations av | ✅ Available | 714 |
| Metadata API | Metadata API — deployment, retrieval, metadata types | ✅ Available | 693 |
| Insurance Developer Guide | Learn more about the developer sources of Insurance to automate the backend work | ✅ Available | 616 |
| Visualforce Developer Guide | Learn how to develop custom user interfaces and apps with Visualforce, a framewo | ✅ Available | 609 |
| Apex Developer Guide | Apex language guide — syntax, triggers, testing, best practices | ✅ Available | 566 |
| Financial Services Cloud Developer Guide | Extend Financial Services Cloud with other Salesforce products using the API and | ✅ Available | 527 |
| Loyalty Management Developer Guide | Use Loyalty Management API and developer resources to create personalized loyalt | ✅ Available | 526 |
| Consumer Goods Cloud Developer Guide | Use APIs and developer resources to configure, customize, and extend the capabil | ✅ Available | 524 |
| CRM Analytics REST API Developer Guide | Describes how to send queries directly to CRM Analytics, access datasets that ha | ✅ Available | 519 |
| Lightning Aura Components Developer Guide | Create Aura components for Salesforce for Android, iOS, and mobile web and Light | ✅ Available | 491 |
| Mobile SDK Development Guide | Build standalone native, React Native, and hybrid mobile apps that access Salesf | ✅ Available | 409 |
| Data Cloud | Data Cloud developer guide — data models, connectors, identity resolution | ✅ Available | 400 |
| Programmatic Marketing Content | Developer documentation for Marketing Cloud Programmatic Content | ✅ Available | 381 |
| ISVforce Guide | Plan, build, and sell AppExchange solutions and consulting services. | ✅ Available | 356 |
| Service Cloud | Service Cloud — cases, knowledge, omni-channel, entitlements | ✅ Available | 344 |
| Tooling API | Tooling API — code coverage, debug logs, custom fields | ✅ Available | 339 |
| Einstein Discovery REST API Developer Guide | Describes how to create and access Einstein Discovery predictions, discovery mod | ✅ Available | 312 |
| Education Cloud Developer Guide | Education Cloud gives you the tools and developer resources you need to support | ✅ Available | 308 |
| REST API | Salesforce REST API — resources, methods, composite, batch | ✅ Available | 308 |
| Nonprofit Cloud Developer Guide | Use APIs and developer resources to configure, customize, and extend the capabil | ✅ Available | 304 |
| Data Prep Recipe REST API Developer Guide | Describes how to retrieve, update, and schedule Data Prep recipes. | ✅ Available | 296 |
| + 99 more domains | See full inventory | ✅ Available | 6,901 |
129 domains | 35,429 knowledge files <!-- INVENTORY:END -->
<p align="center"><sub>Built with <a href="https://github.com/google-deepmind/antigravity">Antigravity</a></sub></p>