knowledge_mgmt

Enables uploading, organizing, and semantically searching documents with support for various file types and embedding providers.

AskMyDoc – Knowledge Management Add-on

AskMyDoc makes it simple to upload, organise, and search your documents. Just add your files, and AskMyDoc instantly turns them into a searchable knowledge base so you can ask questions and get clear answers in seconds. It’s fast, easy to set up, and helps you find the information you need without digging through folders.

Features

  • 📄 Works with Many File Types: Upload PDFs, Word docs, text files, spreadsheets, and more
  • 🔍 Smart Search: Quickly find answers based on meaning, not just keywords
  • 🧠 Flexible Options: Choose between built-in, cloud, or local AI for powering your search
  • 📊 Breaks Down Large Documents: Splits files into easy-to-understand sections for better results
  • 🏷️ Organise with Tags: Add labels or notes to keep documents easy to find
  • 🔄 Upload in Bulk: Bring in entire folders of files at once
  • 🚀 Quick to Start: Run instantly without complicated setup

🚀 Quick Start (For Non-Technical Users)

Don't worry if you're not technical! This guide will walk you through everything step-by-step using your computer's normal file manager and text editor.

What You'll Need:

  • ✅ A computer (Mac or Windows)
  • ✅ Claude Desktop installed
  • ✅ About 10 minutes to set up

What We'll Do:

  1. 📁 Create two special folders on your computer
  2. ⚙️ Tell Claude Desktop where to find these folders
  3. 🔄 Restart Claude Desktop
  4. 🎉 Start asking questions about your documents!

Ready? Let's start! 👇

Support

  • Help with installation: email hello@biznezstack.com

1. Create Required Folders

You MUST create these folders before starting Claude Desktop. Think of them as special folders where AskMyDoc will store your documents and search data.

🖱️ Easy Way: Using Your Computer's File Manager

On Mac:

  1. Open Finder (the folder icon in your dock)
  2. Click on your username in the sidebar (it's usually at the top)
  3. Right-click in an empty space and select "New Folder"
  4. Name it exactly: knowledge-storage
  5. Create another folder and name it exactly: knowledge-chroma

On Windows:

  1. Open File Explorer (the folder icon on your taskbar)
  2. Click on "This PC" in the sidebar
  3. Double-click on "Users" then your username folder
  4. Right-click in an empty space and select "New" → "Folder"
  5. Name it exactly: knowledge-storage
  6. Create another folder and name it exactly: knowledge-chroma

How to Find Your Folder Paths

On Mac:

  1. Open Finder
  2. Click on your username folder
  3. Right-click on the knowledge-storage folder
  4. Hold the Option key and select "Copy [folder name] as Pathname"
  5. Paste it somewhere to see the full path (it will look like /Users/YourName/knowledge-storage)

On Windows:

  1. Open File Explorer
  2. Navigate to your username folder (usually C:\Users\YourName\)
  3. Right-click on the knowledge-storage folder
  4. Select "Copy as path"
  5. Paste it somewhere to see the full path (it will look like C:\Users\YourName\knowledge-storage)

💻 Alternative: Using Terminal/Command Prompt (For Advanced Users)

If you're comfortable with command line tools:

Mac/Linux Terminal:

# Create the folders
mkdir -p ~/knowledge-storage
mkdir -p ~/knowledge-chroma

# Get the full paths
echo "$HOME/knowledge-storage"
echo "$HOME/knowledge-chroma"

Windows Command Prompt:

REM Create the folders
mkdir "%USERPROFILE%\knowledge-storage"
mkdir "%USERPROFILE%\knowledge-chroma"

REM Get the full paths
echo %USERPROFILE%\knowledge-storage
echo %USERPROFILE%\knowledge-chroma

2. Configure Claude Desktop

This step tells Claude Desktop where to find AskMyDoc and where to store your documents.

Step 1: Find Your Configuration File

On Mac:

  1. Open Finder
  2. Press Cmd + Shift + G (Go to Folder)
  3. Type: ~/Library/Application Support/Claude/
  4. Press Enter
  5. Look for a file called claude_desktop_config.json

On Windows:

  1. Open File Explorer
  2. In the address bar, type: %APPDATA%\Claude\
  3. Press Enter
  4. Look for a file called claude_desktop_config.json

✏️ Step 2: Edit the Configuration File

  1. Right-click on claude_desktop_config.json
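  2. Select "Open with" → "TextEdit" (Mac) or "Notepad" (Windows)
  3. Replace everything in the file with this code (if the file already lists other servers under "mcpServers", add the "knowledge_mgmt" entry alongside them instead of replacing the whole file):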
{
  "mcpServers": {
    "knowledge_mgmt": {
      "command": "npx",
      "args": ["-y", "knowledge-mgmt-mcp"],
      "env": {
        "STORAGE_DIR": "REPLACE_WITH_YOUR_STORAGE_PATH",
        "CHROMA_DB_DIR": "REPLACE_WITH_YOUR_CHROMA_PATH",
        "EMBEDDING_PROVIDER": "transformers",
        "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
        "CHUNK_SIZE": "1000",
        "CHUNK_OVERLAP": "200",
        "CHUNKING_STRATEGY": "sentence"
      }
    }
  }
}

🔄 Step 3: Replace the Paths

You need to replace the placeholder text with your actual folder paths:

  1. Find REPLACE_WITH_YOUR_STORAGE_PATH in the file
  2. Replace it with the path you copied earlier (like /Users/YourName/knowledge-storage)
  3. Find REPLACE_WITH_YOUR_CHROMA_PATH in the file
  4. Replace it with the second path you copied (like /Users/YourName/knowledge-chroma)

Real Examples

If your name is "Sarah" on Mac:

  • Replace REPLACE_WITH_YOUR_STORAGE_PATH with: /Users/Sarah/knowledge-storage
  • Replace REPLACE_WITH_YOUR_CHROMA_PATH with: /Users/Sarah/knowledge-chroma

If your name is "Mike" on Windows (JSON needs every backslash doubled, so type \\ wherever the copied path has \):

  • Replace REPLACE_WITH_YOUR_STORAGE_PATH with: C:\\Users\\Mike\\knowledge-storage
  • Replace REPLACE_WITH_YOUR_CHROMA_PATH with: C:\\Users\\Mike\\knowledge-chroma
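
Inside a JSON file a backslash has to be written as \\, so a Windows path pasted exactly as copied makes the configuration invalid. The Python sketch below is not part of AskMyDoc; the function names are made up for illustration. It builds the same configuration as Step 2 and lets the json module handle the escaping.

```python
import json


def build_server_entry(storage_dir: str, chroma_dir: str) -> dict:
    """Return the knowledge_mgmt entry from Step 2 with both folder paths filled in."""
    return {
        "command": "npx",
        "args": ["-y", "knowledge-mgmt-mcp"],
        "env": {
            "STORAGE_DIR": storage_dir,
            "CHROMA_DB_DIR": chroma_dir,
            "EMBEDDING_PROVIDER": "transformers",
            "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
            "CHUNK_SIZE": "1000",
            "CHUNK_OVERLAP": "200",
            "CHUNKING_STRATEGY": "sentence",
        },
    }


def render_config(storage_dir: str, chroma_dir: str) -> str:
    """Serialize the full config; json.dumps doubles every backslash for us."""
    entry = build_server_entry(storage_dir, chroma_dir)
    return json.dumps({"mcpServers": {"knowledge_mgmt": entry}}, indent=2)


# Raw strings keep the Windows paths exactly as copied from File Explorer.
text = render_config(
    r"C:\Users\Mike\knowledge-storage",
    r"C:\Users\Mike\knowledge-chroma",
)
print(text)
```

The printed text can be pasted into claude_desktop_config.json. Mac paths such as /Users/Sarah/knowledge-storage contain no backslashes and pass through unchanged.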

💾 Step 4: Save the File

  1. Press Cmd + S (Mac) or Ctrl + S (Windows)
  2. Close the text editor

📋 What These Folders Do

  • STORAGE_DIR: This is where AskMyDoc keeps copies of your documents and their text
  • CHROMA_DB_DIR: This is where AskMyDoc stores the search index (like a library catalog)

3. Restart Claude Desktop

Quit Claude Desktop completely, then open it again. The server is downloaded and started automatically the first time Claude needs it.

4. Start Using It

Ask Claude to help you:

  • "Ingest this PDF document for me"
  • "Search my documents for information about X"
  • "List all my ingested documents"
  • "What are the statistics of my knowledge base?"

Configuration Options

Environment Variables

| Variable | Description | Default | Options |
|---|---|---|---|
| STORAGE_DIR | Directory for document storage | ~/.knowledge-mgmt-mcp/storage | Any valid path |
| CHROMA_DB_DIR | ChromaDB database directory | ~/.knowledge-mgmt-mcp/chroma_db | Any valid path |
| EMBEDDING_PROVIDER | Embedding provider to use | transformers | transformers, openai, cohere |
| EMBEDDING_MODEL | Model to use for embeddings | Xenova/all-MiniLM-L6-v2 | See below |
| OPENAI_API_KEY | OpenAI API key (if using OpenAI) | - | Your API key |
| COHERE_API_KEY | Cohere API key (if using Cohere) | - | Your API key |
| CHUNK_SIZE | Maximum chunk size in characters | 1000 | 100-10000 |
| CHUNK_OVERLAP | Overlap between chunks | 200 | 0-500 |
| CHUNKING_STRATEGY | Chunking method | sentence | sentence, paragraph, fixed |
| MAX_FILE_SIZE | Max file size in bytes | 104857600 (100MB) | Any number |
| ALLOWED_FILE_TYPES | Comma-separated file types | pdf,docx,txt,md,csv,json,html | Any subset |
| LOG_LEVEL | Logging verbosity | INFO | DEBUG, INFO, WARN, ERROR |
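
The ranges and allowed values listed above can be checked before restarting Claude Desktop. The sketch below is a hypothetical helper, not the server's own validation; the name check_env and the error wording are assumptions, and only the limits themselves come from the list above.

```python
ALLOWED_PROVIDERS = {"transformers", "openai", "cohere"}
ALLOWED_STRATEGIES = {"sentence", "paragraph", "fixed"}
API_KEY_FOR = {"openai": "OPENAI_API_KEY", "cohere": "COHERE_API_KEY"}


def check_env(env: dict) -> list:
    """Return a list of problems; an empty list means the values look fine."""
    problems = []

    # Values are strings in claude_desktop_config.json, so convert them first.
    try:
        chunk_size = int(env.get("CHUNK_SIZE", "1000"))
        overlap = int(env.get("CHUNK_OVERLAP", "200"))
    except ValueError:
        return ["CHUNK_SIZE and CHUNK_OVERLAP must be whole numbers"]

    if not 100 <= chunk_size <= 10000:
        problems.append("CHUNK_SIZE must be between 100 and 10000")
    if not 0 <= overlap <= 500:
        problems.append("CHUNK_OVERLAP must be between 0 and 500")

    provider = env.get("EMBEDDING_PROVIDER", "transformers")
    if provider not in ALLOWED_PROVIDERS:
        problems.append(f"Unknown EMBEDDING_PROVIDER: {provider}")
    key_name = API_KEY_FOR.get(provider)
    if key_name and not env.get(key_name):
        problems.append(f"{key_name} is required when EMBEDDING_PROVIDER is {provider}")

    strategy = env.get("CHUNKING_STRATEGY", "sentence")
    if strategy not in ALLOWED_STRATEGIES:
        problems.append(f"Unknown CHUNKING_STRATEGY: {strategy}")

    return problems


print(check_env({"EMBEDDING_PROVIDER": "openai", "CHUNK_SIZE": "50"}))
```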

Embedding Provider Options

Transformers.js (Local - No API Key Required)

{
  "EMBEDDING_PROVIDER": "transformers",
  "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
}

Available Models:

  • Xenova/all-MiniLM-L6-v2 (384 dimensions, fast)
  • Xenova/all-mpnet-base-v2 (768 dimensions, more accurate)

OpenAI

{
  "EMBEDDING_PROVIDER": "openai",
  "EMBEDDING_MODEL": "text-embedding-3-small",
  "OPENAI_API_KEY": "sk-..."
}

Available Models:

  • text-embedding-3-small (1536 dimensions)
  • text-embedding-3-large (3072 dimensions)
  • text-embedding-ada-002 (1536 dimensions)

Cohere

{
  "EMBEDDING_PROVIDER": "cohere",
  "EMBEDDING_MODEL": "embed-english-v3.0",
  "COHERE_API_KEY": "..."
}

Available Models:

  • embed-english-v3.0 (1024 dimensions)
  • embed-multilingual-v3.0 (1024 dimensions)

Available Tools

1. ingest_document

Ingest a document from file path or raw text.

Parameters:

  • file_path (string, optional): Path to file (mutually exclusive with text_content)
  • text_content (string, optional): Raw text content (mutually exclusive with file_path)
  • metadata (object, optional): Custom metadata key-value pairs
  • tags (array, optional): Tags for categorization

Example:

{
  "file_path": "/path/to/document.pdf",
  "tags": ["research", "2024"],
  "metadata": {
    "author": "John Doe",
    "project": "AI Research"
  }
}

Returns:

{
  "documentId": "doc_1234567890_abc123",
  "chunksCreated": 15,
  "status": "success",
  "message": "Document ingested successfully with 15 chunks"
}
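
Because file_path and text_content are mutually exclusive, a caller may want to reject bad argument combinations before sending the request. The following is a minimal client-side sketch; build_ingest_request is a hypothetical helper, and treating "neither given" as an error is an assumption, since the parameter list only states that the two cannot be combined.

```python
def build_ingest_request(file_path=None, text_content=None, tags=None, metadata=None) -> dict:
    """Build the arguments for ingest_document, enforcing exactly one content source."""
    # True when both are missing or both are present; either case is rejected.
    if (file_path is None) == (text_content is None):
        raise ValueError("Pass exactly one of file_path or text_content")

    if file_path is not None:
        request = {"file_path": file_path}
    else:
        request = {"text_content": text_content}

    # Optional fields are only included when the caller supplied them.
    if tags:
        request["tags"] = list(tags)
    if metadata:
        request["metadata"] = dict(metadata)
    return request


print(build_ingest_request(file_path="/path/to/document.pdf", tags=["research", "2024"]))
```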

2. search_knowledge

Search the knowledge base semantically.

Parameters:

  • query (string, required): Search query
  • max_results (number, optional): Maximum results (default: 10)
  • similarity_threshold (number, optional): Minimum similarity 0-1 (default: 0.0)
  • filter_metadata (object, optional): Filter by metadata
  • filter_tags (array, optional): Filter by tags

Example:

{
  "query": "What are the key findings about machine learning?",
  "max_results": 5,
  "similarity_threshold": 0.7,
  "filter_tags": ["research"]
}

Returns:

{
  "results": [
    {
      "content": "Machine learning models showed 95% accuracy...",
      "source": "research_paper.pdf",
      "score": 0.89,
      "metadata": {
        "document_id": "doc_1234567890_abc123",
        "file_type": "pdf",
        "tags": ["research", "2024"]
      },
      "chunk_index": 3
    }
  ],
  "total_results": 5
}
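
To show how max_results and similarity_threshold interact, here is a small sketch that ranks chunks by cosine similarity. It is an illustration only: how the server derives its score from ChromaDB is not documented here, and the function names and tiny two-dimensional vectors are invented for the example.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, max_results=10, similarity_threshold=0.0):
    """Score (text, vector) pairs, drop those below the threshold, return the best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:max_results]


chunks = [
    ("same direction", [1.0, 0.0]),
    ("unrelated", [0.0, 1.0]),
    ("partly related", [1.0, 1.0]),
]
for score, text in rank_chunks([1.0, 0.0], chunks, max_results=5, similarity_threshold=0.7):
    print(f"{score:.2f}  {text}")
```

The threshold is applied before the result limit, so a high threshold can return fewer than max_results entries.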

3. list_documents

List all ingested documents.

Parameters:

  • tags (array, optional): Filter by tags
  • limit (number, optional): Max documents (default: 50)
  • offset (number, optional): Skip documents (default: 0)

Returns:

{
  "documents": [
    {
      "id": "doc_1234567890_abc123",
      "filename": "research_paper.pdf",
      "file_type": "pdf",
      "tags": ["research", "2024"],
      "chunks_count": 15,
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-01-15T10:30:00Z"
    }
  ],
  "total": 1
}
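
With limit and offset, a large knowledge base can be read one page at a time. The sketch below assumes a caller-supplied fetch_page(limit, offset) function that wraps the list_documents tool and returns the "documents" array; that wrapper is hypothetical and is stubbed with an in-memory list here.

```python
from typing import Callable, Iterator, List


def iter_all_documents(fetch_page: Callable[[int, int], List[dict]], limit: int = 50) -> Iterator[dict]:
    """Yield every document by requesting pages until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # fewer than `limit` items means there is nothing left to fetch
        offset += limit


# Stand-in for the real tool call: 120 fake documents served from memory.
fake_store = [{"id": f"doc_{i}"} for i in range(120)]


def fake_fetch(limit: int, offset: int) -> List[dict]:
    return fake_store[offset:offset + limit]


print(sum(1 for _ in iter_all_documents(fake_fetch)))
```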

4. get_document

Retrieve full document content and chunks.

Parameters:

  • document_id (string, required): Document ID

Returns:

{
  "id": "doc_1234567890_abc123",
  "content": "Full document text...",
  "filename": "research_paper.pdf",
  "file_type": "pdf",
  "tags": ["research"],
  "chunks": [
    {
      "index": 0,
      "content": "First chunk content...",
      "start_char": 0,
      "end_char": 1000
    }
  ],
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z"
}

5. delete_document

Delete a document and its chunks.

Parameters:

  • document_id (string, required): Document ID

Returns:

{
  "success": true,
  "message": "Document doc_1234567890_abc123 deleted successfully"
}

6. update_document_metadata

Update document metadata and tags.

Parameters:

  • document_id (string, required): Document ID
  • metadata (object, optional): Custom metadata to update
  • tags (array, optional): Replace existing tags

Returns:

{
  "success": true,
  "message": "Document metadata updated successfully"
}

7. get_collection_stats

Get knowledge base statistics.

Returns:

{
  "total_documents": 10,
  "total_chunks": 150,
  "collection_size": 150,
  "average_chunks_per_document": 15,
  "file_types": {
    "pdf": 5,
    "docx": 3,
    "txt": 2
  }
}

8. batch_ingest

Ingest multiple documents from a directory.

Parameters:

  • directory_path (string, required): Directory path
  • file_patterns (array, optional): Patterns like ["*.pdf", "*.txt"] (default: ["*"])
  • recursive (boolean, optional): Search subdirectories (default: true)

Example:

{
  "directory_path": "/path/to/documents",
  "file_patterns": ["*.pdf", "*.docx"],
  "recursive": true
}

Returns:

{
  "totalFiles": 10,
  "successCount": 9,
  "failedCount": 1,
  "documents": [...],
  "errors": [
    {
      "file": "/path/to/corrupt.pdf",
      "error": "PDF processing failed"
    }
  ]
}
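
The way file_patterns and recursive narrow down a directory can be sketched with Python's standard glob-style matching. This illustrates the idea and is not the server's implementation; select_files is a made-up name, and matching patterns against the file name only (not the full path) is an assumption.

```python
import fnmatch
from pathlib import Path


def select_files(directory_path, file_patterns=("*",), recursive=True):
    """Return files under directory_path whose names match any of the patterns."""
    root = Path(directory_path)
    # rglob walks subdirectories too; glob stays in the top-level folder.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        path
        for path in candidates
        if path.is_file() and any(fnmatch.fnmatch(path.name, pattern) for pattern in file_patterns)
    )


for path in select_files(".", ["*.pdf", "*.docx"], recursive=False):
    print(path)
```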

Chunking Strategies

Sentence-Aware (Default)

Splits text at sentence boundaries, respecting natural language structure.

Best for: General documents, articles, research papers

Pros: Maintains semantic coherence
Cons: Variable chunk sizes

Paragraph-Aware

Splits text at paragraph boundaries (double newlines).

Best for: Documents with clear paragraph structure

Pros: Preserves document structure
Cons: May create very large or small chunks

Fixed-Size

Splits text into fixed-size chunks with overlap.

Best for: Uniform processing, technical documents

Pros: Predictable chunk sizes
Cons: May split mid-sentence
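
The difference between the fixed-size and sentence-aware strategies is easiest to see in code. Both functions below are simplified sketches written for this guide, not the server's actual chunkers: the sentence splitter is a naive regular expression, and the sentence-aware version leaves out overlap to stay short.

```python
import re


def chunk_fixed(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Cut text every (chunk_size - overlap) characters, so neighbours share `overlap` chars."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def chunk_sentences(text: str, chunk_size: int = 1000) -> list:
    """Pack whole sentences into chunks of at most chunk_size characters where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        # A single sentence longer than chunk_size is kept whole rather than split.
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_fixed("abcdefghij", chunk_size=4, overlap=1))
print(chunk_sentences("One. Two. Three.", chunk_size=9))
```

The fixed-size version produces predictable lengths but cuts wherever the character count lands, while the sentence-aware version produces uneven lengths but never ends a chunk in the middle of a sentence.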

Supported File Formats

| Format | Extension | Description |
|---|---|---|
| PDF | .pdf | Portable Document Format |
| Word | .docx | Microsoft Word documents |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown formatted text |
| CSV | .csv | Comma-separated values |
| JSON | .json | JSON data files |
| HTML | .html, .htm | Web pages |

Troubleshooting

📁 Folder Not Found Errors

Problem: You see an error like "no such file or directory" or "folder not found"

Solution: The folders don't exist yet. Go back to Step 1 and create them using your computer's file manager:

On Mac:

  1. Open Finder
  2. Click on your username in the sidebar
  3. Right-click and create a new folder called knowledge-storage
  4. Create another folder called knowledge-chroma

On Windows:

  1. Open File Explorer
  2. Go to This PC → Users → your username
  3. Right-click and create a new folder called knowledge-storage
  4. Create another folder called knowledge-chroma

🔒 Permission Denied Errors

Problem: You see "Permission denied" or "Access denied" errors

Solution: The folders might be locked. Try this:

On Mac:

  1. Right-click on the knowledge-storage folder
  2. Select "Get Info"
  3. Make sure "Read & Write" is selected for your user
  4. Do the same for knowledge-chroma

On Windows:

  1. Right-click on the knowledge-storage folder
  2. Select "Properties"
  3. Go to "Security" tab
  4. Make sure your user has "Full control"
  5. Do the same for knowledge-chroma

Server Not Starting

  1. Check logs: Set LOG_LEVEL=DEBUG to see detailed logs
  2. Verify paths: Ensure STORAGE_DIR and CHROMA_DB_DIR are writable
  3. Check Node version: Requires Node.js 18+
node --version  # Should be 18.0.0 or higher

Embedding Generation Slow

Using Transformers.js? First run downloads the model (~100MB). Subsequent runs are fast.

Solution: Use a smaller model like Xenova/all-MiniLM-L6-v2 or switch to OpenAI/Cohere for faster processing.

Out of Memory

Large documents? Reduce CHUNK_SIZE or process files individually instead of batch ingestion.

{
  "CHUNK_SIZE": "500",
  "MAX_FILE_SIZE": "10485760"
}

ChromaDB Connection Error

Solution: Ensure CHROMA_DB_DIR exists and is writable:

mkdir -p ~/knowledge-chroma
chmod 755 ~/knowledge-chroma

File Processing Failures

PDF extraction issues? Some PDFs are scanned images. Use OCR preprocessing before ingestion.

DOCX errors? Ensure file isn't corrupted. Try opening in Word first.

Development

Local Setup

# Clone repository
git clone https://github.com/yourusername/knowledge-mgmt-mcp.git
cd knowledge-mgmt-mcp

# Install dependencies
npm install

# Build
npm run build

# Test locally
npm link

Testing with Claude Desktop

{
  "mcpServers": {
    "knowledge_mgmt_dev": {
      "command": "node",
      "args": ["/absolute/path/to/knowledge-mgmt-mcp/dist/index.js"],
      "env": {
        "LOG_LEVEL": "DEBUG",
        ...
      }
    }
  }
}

Performance Tips

  1. Choose the right embedding provider:

    • Local (Transformers.js): Free, private, slower first run
    • OpenAI: Fast, costs about $0.0001/1K tokens (varies by model; check current pricing)
    • Cohere: Fast, costs about $0.0001/1K tokens (varies by model; check current pricing)
  2. Optimize chunk size:

    • Smaller chunks (500-800): Better for specific searches
    • Larger chunks (1000-1500): Better for context
  3. Use appropriate chunking strategy:

    • Sentence: Most documents
    • Paragraph: Long-form content
    • Fixed: Technical/structured data
  4. Batch operations:

    • Use batch_ingest for multiple files
    • Process directories rather than individual files
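
For the paid providers, a rough cost estimate follows from the per-1K-token price. The sketch below is a back-of-the-envelope helper, not an official calculator: the function name is invented, the rate is passed in by the caller, and the default of about 4 characters per token is a common rule of thumb for English text rather than a figure from this README.

```python
def estimate_embedding_cost(total_chars: int, price_per_1k_tokens: float, chars_per_token: float = 4.0) -> float:
    """Estimate the cost in dollars of embedding `total_chars` characters of text."""
    tokens = total_chars / chars_per_token  # rough token count (assumption: ~4 chars per token)
    return tokens / 1000 * price_per_1k_tokens


# Example: 4 million characters at the rate quoted in the list above.
cost = estimate_embedding_cost(4_000_000, price_per_1k_tokens=0.0001)
print(f"${cost:.2f}")
```

Local embedding with Transformers.js has no per-token charge, so this estimate only matters when EMBEDDING_PROVIDER is set to openai or cohere.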

Security Considerations

  • File path validation: Prevents directory traversal attacks
  • Input sanitization: All user inputs are validated
  • API key security: Never log or expose API keys
  • File size limits: Configurable max file size prevents abuse
  • Rate limiting: Consider implementing rate limiting for production

License

MIT License - see LICENSE file for details

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Support

Help with configuration: email hello@biznezstack.com

Changelog

v1.0.0 (Initial Release)

  • Multi-format document ingestion
  • Semantic search with ChromaDB
  • Multiple embedding providers
  • Intelligent chunking strategies
  • Complete MCP tool implementation
  • Comprehensive error handling
  • Full documentation
