Knowledge Base MCP Server
Provides tools for listing and retrieving content from different knowledge bases using semantic search capabilities.
jeanibarz
Tools
list_knowledge_bases
Lists the available knowledge bases.
retrieve_knowledge
Retrieves similar chunks from the knowledge base based on a query. Optionally, if a knowledge base is specified, only that one is searched; otherwise, all available knowledge bases are considered. By default, at most 10 documents are returned with a score below a threshold of 2. A different threshold can optionally be provided.
This MCP server provides tools for listing and retrieving content from different knowledge bases.
<a href="https://glama.ai/mcp/servers/n0p6v0o0a4"> <img width="380" height="200" src="https://glama.ai/mcp/servers/n0p6v0o0a4/badge" alt="Knowledge Base Server MCP server" /> </a>
Setup Instructions
These instructions assume you have Node.js and npm installed on your system.
Prerequisites
- Clone the repository:
  git clone <repository_url>
  cd knowledge-base-mcp-server
- Install dependencies:
  npm install
- Configure environment variables:
  - HUGGINGFACE_API_KEY (required): the API key for the Hugging Face Inference API, which is used to generate embeddings for the knowledge base content. You can obtain a free API key from the Hugging Face website (https://huggingface.co/).
  - KNOWLEDGE_BASES_ROOT_DIR: the directory that contains the knowledge base subdirectories. If not set, it defaults to $HOME/knowledge_bases, where $HOME is the current user's home directory.
  - FAISS_INDEX_PATH: the path to the FAISS index. If not set, it defaults to $HOME/knowledge_bases/.faiss.
  - HUGGINGFACE_MODEL_NAME: the Hugging Face model used to generate embeddings. If not set, it defaults to sentence-transformers/all-MiniLM-L6-v2.
  - You can set these environment variables in your .bashrc or .zshrc file, or directly in the MCP settings.
- Build the server:
  npm run build
- Add the server to the MCP settings:
  - Edit the cline_mcp_settings.json file located at /home/jean/.vscode-server/data/User/globalStorage/saoudrizwan.claude-dev/settings/ (adjust the home directory to match your own system).
  - Add the following configuration to the mcpServers object:

        "knowledge-base-mcp": {
          "command": "node",
          "args": [
            "/path/to/knowledge-base-mcp-server/build/index.js"
          ],
          "disabled": false,
          "autoApprove": [],
          "env": {
            "KNOWLEDGE_BASES_ROOT_DIR": "/path/to/knowledge_bases",
            "HUGGINGFACE_API_KEY": "YOUR_HUGGINGFACE_API_KEY"
          },
          "description": "Retrieves similar chunks from the knowledge base based on a query."
        }

  - Replace /path/to/knowledge-base-mcp-server with the actual path to the server directory.
  - Replace /path/to/knowledge_bases with the actual path to the knowledge bases directory.
- Create knowledge base directories:
  - Create a subdirectory within KNOWLEDGE_BASES_ROOT_DIR for each knowledge base (e.g., company, it_support, onboarding).
  - Place text files (e.g., .txt, .md) containing the knowledge base content within these subdirectories.
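The environment-variable defaults from the configuration step can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the server's actual code: resolveConfig and the ServerConfig field names are hypothetical, and os.homedir() is assumed to stand in for $HOME.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape; the field names are illustrative, not the server's own.
interface ServerConfig {
  apiKey: string;
  rootDir: string;
  faissIndexPath: string;
  modelName: string;
}

// Resolve settings from an environment-like object, applying the documented
// defaults. Throws if the required API key is missing.
function resolveConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env.HUGGINGFACE_API_KEY;
  if (!apiKey) {
    throw new Error("HUGGINGFACE_API_KEY must be set");
  }
  // Assumption: os.homedir() is equivalent to $HOME on the target system.
  const defaultRoot = path.join(os.homedir(), "knowledge_bases");
  return {
    apiKey,
    rootDir: env.KNOWLEDGE_BASES_ROOT_DIR ?? defaultRoot,
    faissIndexPath: env.FAISS_INDEX_PATH ?? path.join(defaultRoot, ".faiss"),
    modelName:
      env.HUGGINGFACE_MODEL_NAME ?? "sentence-transformers/all-MiniLM-L6-v2",
  };
}
```

Calling resolveConfig(process.env) at startup would surface a missing API key immediately rather than at the first embedding request.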
- The server recursively reads all text files (e.g., .txt, .md) within the specified knowledge base subdirectories.
- The server skips hidden files and directories (those starting with a .).
- For each file, the server calculates the SHA256 hash and stores it in a file with the same name in a hidden .index subdirectory. This hash is used to determine whether the file has been modified since the last indexing.
- The file content is split into chunks using the MarkdownTextSplitter from langchain/text_splitter.
- The content of each chunk is then added to a FAISS index, which is used for similarity search.
- The FAISS index is automatically initialized when the server starts. It checks for changes in the knowledge base files and updates the index accordingly.
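The hash-based change detection described above can be sketched with Node.js built-ins only. This is an illustration, not the server's implementation: the function names are hypothetical, and placing the .index subdirectory next to each file is an assumption, since the text does not say where it lives. Chunking and the FAISS update are left out.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively collect files under `dir`, skipping hidden entries
// (names starting with "."), which also skips the .index directories.
function listFiles(dir: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name.startsWith(".")) continue;
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) files.push(...listFiles(fullPath));
    else if (entry.isFile()) files.push(fullPath);
  }
  return files;
}

// Hex-encoded SHA256 digest of a file's bytes.
function sha256OfFile(filePath: string): string {
  return createHash("sha256").update(fs.readFileSync(filePath)).digest("hex");
}

// Return the files whose hash differs from the recorded one (or that have no
// recorded hash yet), and record the new hash for each of them.
function findChangedFiles(knowledgeBaseDir: string): string[] {
  const changed: string[] = [];
  for (const filePath of listFiles(knowledgeBaseDir)) {
    // Assumed layout: <file's directory>/.index/<file name>
    const hashPath = path.join(
      path.dirname(filePath),
      ".index",
      path.basename(filePath),
    );
    const current = sha256OfFile(filePath);
    const previous = fs.existsSync(hashPath)
      ? fs.readFileSync(hashPath, "utf8")
      : null;
    if (previous !== current) {
      fs.mkdirSync(path.dirname(hashPath), { recursive: true });
      fs.writeFileSync(hashPath, current);
      changed.push(filePath);
    }
  }
  return changed;
}
```

Only the files returned by findChangedFiles would need to be re-chunked and re-embedded, which keeps a restart cheap when the knowledge base is unchanged.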
Usage
The server exposes two tools:
- list_knowledge_bases: Lists the available knowledge bases.
- retrieve_knowledge: Retrieves similar chunks from the knowledge base based on a query. Optionally, if a knowledge base is specified, only that one is searched; otherwise, all available knowledge bases are considered. By default, at most 10 document chunks are returned with a score below a threshold of 2. A different threshold can optionally be provided using the threshold parameter.
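The default limit and threshold can be illustrated with a small filter. This sketch assumes the score is a distance where lower means more similar, which the "below a threshold" wording suggests but does not confirm; ScoredChunk and selectResults are hypothetical names.

```typescript
// Hypothetical result shape: a chunk of text, its source file, and a score.
interface ScoredChunk {
  content: string;
  source: string;
  score: number;
}

// Keep chunks scoring strictly below `threshold`, ordered best (lowest) first,
// and return at most `limit` of them. The input array is not modified.
function selectResults(
  chunks: ScoredChunk[],
  threshold = 2,
  limit = 10,
): ScoredChunk[] {
  return chunks
    .filter((chunk) => chunk.score < threshold)
    .sort((a, b) => a.score - b.score)
    .slice(0, limit);
}
```

Lowering the threshold trades recall for precision: fewer chunks pass, but those that do are closer to the query.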
You can use these tools through the MCP interface.
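As an illustration of what an MCP client sends, the sketch below builds a JSON-RPC tools/call request for retrieve_knowledge. Only the threshold parameter is named in this README; the query and knowledge_base_name argument names, the helper function, and the sample values are assumptions made for the example.

```typescript
// Arguments for retrieve_knowledge. `threshold` is documented above;
// `query` and `knowledge_base_name` are assumed names.
type RetrieveKnowledgeArgs = {
  query: string;
  knowledge_base_name?: string;
  threshold?: number;
};

// Build an MCP "tools/call" request envelope (JSON-RPC 2.0).
function buildRetrieveKnowledgeCall(id: number, args: RetrieveKnowledgeArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "retrieve_knowledge", arguments: args },
  };
}

const request = buildRetrieveKnowledgeCall(1, {
  query: "How do I reset my VPN password?",
  knowledge_base_name: "it_support",
  threshold: 1.5,
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client such as Cline builds this message for you; the sketch only shows which fields carry the tool name and its arguments.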
The retrieve_knowledge tool performs a semantic search using a FAISS index. The index is automatically updated when the server starts or when a file in a knowledge base is modified.
The output of the retrieve_knowledge tool is a markdown formatted string with the following structure:
## Semantic Search Results
**Result 1:**
[Content of the most similar chunk]
**Source:**
```json
{
"source": "[Path to the file containing the chunk]"
}
```
---
**Result 2:**
[Content of the second most similar chunk]
**Source:**
```json
{
"source": "[Path to the file containing the chunk]"
}
```
> **Disclaimer:** The provided results might not all be relevant. Please cross-check the relevance of the information.
Each result includes the content of the matching chunk, its source file, and a similarity score.
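A formatter that produces output in the structure shown above might look like the following sketch. It reproduces only the fields visible in the sample (content and source); the SearchResult type and formatResults function are hypothetical, and the exact whitespace of the real output is not known.

```typescript
// Hypothetical result shape limited to the fields shown in the sample output.
interface SearchResult {
  content: string;
  source: string;
}

// Three backticks, built at runtime so this example does not nest code fences.
const FENCE = "`".repeat(3);

// Render results as markdown: a heading, one section per result separated by
// "---", and the closing disclaimer.
function formatResults(results: SearchResult[]): string {
  const sections = results.map((result, index) =>
    [
      `**Result ${index + 1}:**`,
      result.content,
      "**Source:**",
      `${FENCE}json`,
      JSON.stringify({ source: result.source }, null, 2),
      FENCE,
    ].join("\n"),
  );
  return [
    "## Semantic Search Results",
    sections.join("\n---\n"),
    "> **Disclaimer:** The provided results might not all be relevant. Please cross-check the relevance of the information.",
  ].join("\n");
}
```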