Enterprise Knowledge MCP Server
Enables querying enterprise documents (DOCX, PDF, PPTX) using natural language, with hybrid search and MCP integration for Claude Desktop and other agents.
Unstructured Data Pipeline & Remote MCP Server — a production-ready enterprise document knowledge base. Ingests DOCX / PDF / PPTX, parses with Docling, cleans and chunks with metadata, indexes into a hybrid search store, and exposes a Remote MCP Server for Claude Desktop and other agents.
Status
Built incrementally, one step at a time (see CLAUDE.md for the full plan).
- [x] Step 1 — Project Bootstrap: FastAPI + FastMCP + Docker + Pytest
- [x] Step 2 — Document Upload: POST/GET /documents + persistent catalogue (DOCX/PDF/PPTX); upload now auto-runs the full pipeline and indexes immediately (no restart)
- [x] Step 3 — Docling Parser: Docling -> structured ParsedDocument (headings/text/tables/figures, page & slide provenance)
- [x] Step 4 — Cleaning Pipeline: strip repeated headers/footers, page numbers, empty/symbol-only noise (structure preserved)
- [x] Step 5 — Metadata-aware Chunking: semantic chunks (section/table/figure) with full metadata; no fixed-width cuts
- [x] Step 6 — Chroma Indexing: BGE dense embeddings into embedded persistent Chroma (index/search/get/delete)
- [x] Step 7 — Hybrid Retrieval: dense (BGE) + sparse (BM25) fused with RRF; mixed CN/EN tokenizer
- [x] Step 8 — MCP Tools: search_documents / list_documents / get_document / get_chunk on FastMCP
- [x] Step 9 — MCP Resources: documents://all and documents://{document_id}
- [x] Step 10 — Integration Test: end-to-end MCP protocol test (Client -> server -> tool/resource) + runnable client demo
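Step 7 fuses the dense and sparse rankings with Reciprocal Rank Fusion (RRF). The sketch below shows the general technique only; the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the sample chunk IDs are assumptions for illustration, not the project's actual code.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every ID it contains (rank starts at 1),
    so an ID ranked well by both retrievers rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the order is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n] if top_n is not None else fused


# Hypothetical rankings: one from the dense (BGE) side, one from the sparse (BM25) side.
dense_hits = ["c3", "c1", "c7"]
sparse_hits = ["c1", "c9", "c3"]
fused = rrf_fuse([dense_hits, sparse_hits])
print([chunk_id for chunk_id, _ in fused])
```

RRF only needs ranks, not raw scores, which is why it suits combining cosine similarities with BM25 scores that live on different scales.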
Architecture (target)
DOCX / PDF / PPTX
-> Docling Parser
-> Cleaning Pipeline
-> Metadata-aware Chunking
-> Hybrid Search Index (BGE dense + BM25 sparse, Chroma)
-> Remote MCP Server (FastMCP)
-> Claude Desktop
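The sparse (BM25) side of the index needs a tokenizer that copes with mixed Chinese/English text. The project's own tokenizer is not shown here, so the following is a minimal stand-in under a simple assumption: ASCII word runs become one lowercase token and each CJK ideograph becomes its own token. The name `tokenize_mixed` is hypothetical.

```python
import re

# ASCII letters/digits are grouped into words; each CJK ideograph
# (basic block U+4E00..U+9FFF) is emitted as a single-character token.
_TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")


def tokenize_mixed(text):
    """Split mixed Chinese/English text into tokens suitable for a BM25 index."""
    return [token.lower() for token in _TOKEN_RE.findall(text)]


print(tokenize_mixed("Q4 良率 improvement plan"))
```

Per-character tokens avoid a dependency on a word segmenter; a segmenter-based tokenizer would give coarser Chinese tokens and may rank differently.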
Tech Stack
| Area | Choice |
|---|---|
| Language | Python 3.11 |
| API | FastAPI |
| Parsing | Docling |
| Search | Hybrid Retrieval |
| Dense Retrieval | BGE Embedding |
| Sparse Retrieval | BM25 |
| Vector DB | Chroma (embedded) |
| MCP Framework | FastMCP |
| Deployment | Docker |
| Testing | Pytest |
Quick Start (local)
This repo ships a pre-created virtual environment (kb_mcp_env/, Windows).
# Install dependencies (incl. dev/test extras)
kb_mcp_env\Scripts\python.exe -m pip install -e ".[dev]"
# Run the tests
kb_mcp_env\Scripts\python.exe -m pytest -q
# Run the server
kb_mcp_env\Scripts\python.exe -m uvicorn app.main:app --reload
- Health check: http://localhost:8000/health
- Remote MCP endpoint: http://localhost:8000/mcp
Run with Docker
cp .env.example .env # optional
docker compose up --build
Brings up the app on port 8000. Chroma runs embedded in-process (no separate
service); its data persists in the chroma_storage Docker volume.
Parsing & OCR
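Note: OCR runs on every image at parse time, so a document with 26 images adds tens of seconds to parsing. If a class of documents does not need OCR, set OCR_IMAGES=false in .env to turn it off.
Note: the default embedding model is BAAI/bge-m3 (multilingual, well suited to mixed Chinese/English text, 1024 dimensions, about 2.2 GB, downloaded on first use). If you only need English and want something lighter, set EMBEDDING_MODEL=BAAI/bge-small-en-v1.5 in .env. If the new model has a different embedding dimension, clear chroma_storage and re-index before using it.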
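As an illustration of how the two .env switches above could be read, here is a small sketch. Only the variable names OCR_IMAGES and EMBEDDING_MODEL and the default model come from this README; the `PipelineSettings` class, the `load_settings` helper, the accepted truthy strings, and the assumption that OCR is on by default are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSettings:
    ocr_images: bool       # run OCR on embedded images while parsing
    embedding_model: str   # Hugging Face model ID used for dense embeddings


def _as_bool(value):
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return PipelineSettings(
        ocr_images=_as_bool(env.get("OCR_IMAGES", "true")),  # assumed default: on
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-m3"),
    )


settings = load_settings({"OCR_IMAGES": "false"})
print(settings)
```

Passing a plain dict instead of the real environment keeps the helper easy to test.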
Example Queries (target MCP tools)
What is the yield improvement plan?
Show me the KPI table from Q4 report.
Summarize slide 5.
Add a document (auto-indexed, no restart)
POST /documents saves the file and runs the full pipeline (Docling parse ->
clean -> metadata-aware chunk -> BGE embed -> Chroma index) in the same process,
then refreshes BM25. Because it shares the MCP server's vector-store/retriever
singletons, the document is searchable over MCP immediately — no restart needed.
# server running on :8000
curl.exe -X POST http://127.0.0.1:8000/documents -F "file=@E:\path\to\report.pdf"
# -> 201 {"document_id": "...", "status": "indexed", "num_chunks": 42, ...}
The response carries status and num_chunks. status is indexed on success, or
failed (returned with HTTP 500) if parsing raises an error; the file is still
recorded in the catalogue in that case. The call blocks until indexing finishes
(Docling/OCR/embedding can take tens of seconds for large or image-heavy
files). scripts/ingest_file.py shares the same pipeline for command-line
ingestion.
Verify the MCP Server (client demo + server log)
Start the server, then drive it over the real MCP protocol with the bundled client demo:
# Terminal 1 — run the server
kb_mcp_env\Scripts\python.exe -m uvicorn app.main:app --host 127.0.0.1 --port 8000
# Terminal 2 — connect a remote MCP client and run the example queries
kb_mcp_env\Scripts\python.exe tests\mcp_client_demo.py
# ...or pass your own query:
kb_mcp_env\Scripts\python.exe tests\mcp_client_demo.py "yield improvement plan"
The demo connects (Client -> MCP Server -> search_documents -> Result), lists
the server's tools/resources, reads documents://all, and prints the retrieved
chunks. Meanwhile the server console logs each invocation:
INFO:app.mcp_server:MCP tool invoked: search_documents | query='...' top_k=3
INFO:app.mcp_server:search_documents retrieved 3 chunk(s): [...]
The hermetic equivalent (no running server, isolated temp index) is the pytest integration test:
kb_mcp_env\Scripts\python.exe -m pytest tests\test_mcp_integration.py -q
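For readers who want to write their own client rather than use the bundled demo, the flow described above (connect, list tools, read documents://all, call search_documents) might look roughly like this. It assumes the standalone `fastmcp` package and its `Client` class; the argument names query and top_k are taken from the server log line shown earlier, while `build_search_args` and `run_query` are hypothetical helpers. The bundled tests\mcp_client_demo.py may be structured differently.

```python
import asyncio

MCP_URL = "http://localhost:8000/mcp"  # Remote MCP endpoint from the Quick Start


def build_search_args(query, top_k=3):
    """Arguments for the search_documents tool (names as seen in the server log)."""
    return {"query": query, "top_k": top_k}


async def run_query(query, top_k=3, url=MCP_URL):
    """Connect to the remote MCP server, inspect it, and run one search."""
    # Imported lazily so that the helper above works without fastmcp installed.
    from fastmcp import Client

    async with Client(url) as client:
        tools = await client.list_tools()
        print("tools:", [tool.name for tool in tools])

        catalogue = await client.read_resource("documents://all")
        print("documents://all ->", catalogue)

        result = await client.call_tool("search_documents", build_search_args(query, top_k))
        print("search_documents ->", result)
```

With the server running, `asyncio.run(run_query("yield improvement plan"))` should produce the two server log lines shown above for each call.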
AI Workflow
This project is developed with an AI-only workflow (Claude Code + MCP). Each
development step follows: plan -> implement -> review -> test, with a dedicated
commit per step. See CLAUDE.md for the step-by-step record.