
PocketMCP
A lightweight, local-first MCP server that automatically watches folders, chunks and embeds files using Transformers.js, and exposes semantic search capabilities to VS Code and Cursor. Runs completely offline with SQLite vector storage, designed for resource-constrained environments.
README
PocketMCP
PocketMCP is a lightweight, local-first MCP (Model Context Protocol) server that automatically watches folders, chunks and embeds files locally using Transformers.js with MiniLM, stores vectors in SQLite + sqlite-vec, and exposes semantic search capabilities to VS Code and Cursor. Designed for small machines (I'm running on an Intel N100 with 16GB RAM) with zero external dependencies after initial model download.
🌟 Features
- 🔍 Semantic Search: Find content by meaning, not just keywords
- 📁 Auto-Ingestion: Watches folders and automatically processes new/changed files
- 📄 Multi-Format Support: PDF, DOCX, Markdown, and plain text files
- ⚡ Local-First: Runs completely offline after initial model download
- 🗄️ SQLite Storage: Fast, reliable vector storage with sqlite-vec extension
- 🔧 MCP Integration: Native support for VS Code and Cursor via MCP protocol
- 🌐 Web Interface: Built-in web tester for validation and manual testing
- 💾 Efficient: Designed for resource-constrained environments
- 🔄 Real-time: Debounced file watching with smart concurrency limits
- 📊 Smart Segmentation: Page-aware PDF processing and section-aware DOCX handling
- 🛡️ Robust Error Handling: Graceful handling of encrypted, corrupted, or oversized files
Screenshots
📋 Table of Contents
- Quick Start
- Web Tester
- MCP Client Integration
- API Reference
- Configuration
- Development
- Architecture
- Troubleshooting
🚀 Quick Start
1. Installation
# Clone or download the project
cd PocketMCP
# Install dependencies
pnpm install
# Setup environment
pnpm setup
# Or manually: cp .env.sample .env
2. Configuration
Edit .env
file:
# SQLite database path
SQLITE_PATH=./data/index.db
# Directory to watch for file changes (optional)
WATCH_DIR=./kb
# Embedding model (default is recommended)
MODEL_ID=Xenova/all-MiniLM-L6-v2
# Chunking configuration
CHUNK_SIZE=1000
CHUNK_OVERLAP=120
3. Create Content Directory
# Create directory for your documents
mkdir -p kb
# Add some markdown or text files
echo "# My First Document" > kb/test.md
echo "This is a sample document for testing PocketMCP." >> kb/test.md
4. Start the Server
Option A: MCP Server Only
PocketMCP now supports multiple transport modes:
# Development - MCP server with both transports + file watching
pnpm dev:mcp
# Production - MCP server with both transports + file watching
pnpm build && pnpm start
Transport Modes:
stdio
: Standard MCP protocol over stdin/stdout (for VS Code, Cursor)http
: Streamable HTTP transport with CORS support (for web clients, LAN access)both
: Run both transports simultaneously (recommended for production)
HTTP Transport Endpoints:
- MCP:
http://0.0.0.0:8001/mcp
(Streamable HTTP MCP protocol) - Health:
http://0.0.0.0:8001/health
(JSON health check)
Environment Variables:
TRANSPORT
:stdio
|http
|both
(default:both
)HTTP_HOST
: HTTP bind address (default:0.0.0.0
)HTTP_PORT
: HTTP port (default:8001
)LOG_LEVEL
:debug
|info
|warn
|error
(default:info
)
Option B: Web Interface + API Server
# Start web interface and API server for testing
pnpm dev
On first run, the server will download the MiniLM model (~100MB) and then process any files in your watch directory.
🌐 Web Tester
PocketMCP includes a comprehensive web interface for testing and validation.
Access Points
- Web Interface: http://127.0.0.1:5173
- API Server: http://127.0.0.1:5174
- Health Check: http://127.0.0.1:5174/health
Features
📊 Database Diagnostics Panel
- Real-time database status monitoring
- Table counts and vector dimensions
- SQLite WAL mode verification
- Error detection and reporting
- One-click smoke testing
🔍 Search Panel
- Interactive semantic search testing
- LIKE vs Vector search modes
- Configurable result count (top-K)
- Detailed result inspection
- Performance metrics (response time)
📄 Documents Panel
- Browse all indexed documents
- Pagination support
- Document metadata display
- Creation and update timestamps
🔎 Chunk Viewer
- Detailed chunk inspection modal
- Full text content display
- Metadata and offset information
- Copy-to-clipboard functionality
API Endpoints
Endpoint | Method | Description |
---|---|---|
/health |
GET | Server health check |
/api/db/diag |
GET | Database diagnostics |
/api/search |
POST | Semantic search |
/api/chunk/:id |
GET | Get specific chunk |
/api/docs |
GET | List documents |
Example API Usage
Search Documents:
curl -X POST http://127.0.0.1:5174/api/search \
-H "Content-Type: application/json" \
-d '{"query": "machine learning", "top_k": 5, "mode": "like"}'
Get Diagnostics:
curl http://127.0.0.1:5174/api/db/diag | jq .
🔧 MCP Client Integration
PocketMCP supports both stdio and HTTP transports for maximum compatibility.
Option A: Stdio Transport (Recommended for Desktop Clients)
Cursor Integration:
- Open Cursor Settings → MCP
- Add a new server with these settings:
{
"command": "pnpm",
"args": ["dev:mcp"],
"cwd": "/path/to/PocketMCP",
"env": {
"TRANSPORT": "stdio",
"SQLITE_PATH": "./data/index.db",
"WATCH_DIR": "./kb",
"MODEL_ID": "Xenova/all-MiniLM-L6-v2"
}
}
VS Code Integration:
For VS Code clients that support MCP, add to your settings:
{
"mcpServers": {
"pocketmcp": {
"command": "pnpm",
"args": ["dev:mcp"],
"cwd": "/path/to/PocketMCP",
"env": {
"TRANSPORT": "stdio",
"SQLITE_PATH": "./data/index.db",
"WATCH_DIR": "./kb",
"MODEL_ID": "Xenova/all-MiniLM-L6-v2"
}
}
}
}
Production: Direct Node Execution
{
"command": "node",
"args": ["dist/cli.js"],
"cwd": "/path/to/PocketMCP",
"env": {
"TRANSPORT": "stdio",
"SQLITE_PATH": "./data/index.db",
"WATCH_DIR": "./kb"
}
}
Option B: HTTP Transport (For Web Clients & Remote Access)
Start PocketMCP Server:
First, start PocketMCP with HTTP transport enabled:
# Development
pnpm dev:mcp
# Or production
pnpm build && pnpm start
# Or HTTP only
TRANSPORT=http pnpm dev:mcp
MCP Client Configuration (HTTP):
For MCP clients that support HTTP transport, configure the connection:
{
"mcpServers": {
"pocketmcp": {
"transport": "http",
"url": "http://localhost:8001/mcp",
"headers": {
"Content-Type": "application/json"
}
}
}
}
Web Client Integration:
For web applications using MCP over HTTP:
// Example: Connect to PocketMCP via HTTP
const mcpClient = new MCPClient({
transport: 'http',
url: 'http://localhost:8001/mcp',
headers: {
'Content-Type': 'application/json'
}
});
// Initialize connection
await mcpClient.connect();
// Use MCP tools
const searchResults = await mcpClient.callTool('search', {
query: 'machine learning',
top_k: 5
});
Remote/LAN Access:
To access PocketMCP from other machines on your network:
# Start with network binding
HTTP_HOST=0.0.0.0 HTTP_PORT=8001 pnpm dev:mcp
# Then connect from other machines using your server's IP
# http://192.168.1.100:8001/mcp
Health Check:
Test the HTTP transport:
# Health check
curl http://localhost:8001/health
# Expected response
{"status":"ok","timestamp":"2024-01-01T00:00:00.000Z"}
📚 API Reference
MCP Tools
search
Search for similar content using semantic search.
{
"query": "machine learning algorithms",
"top_k": 5,
"filter": {
"doc_ids": ["doc_123", "doc_456"]
}
}
upsert_documents
Insert or update documents programmatically.
{
"docs": [
{
"text": "Your document content here...",
"external_id": "my_doc_1",
"title": "Important Notes",
"metadata": {}
}
]
}
delete_documents
Delete documents by ID.
{
"doc_ids": ["doc_123"],
"external_ids": ["my_doc_1"]
}
list_documents
List all documents with pagination.
{
"page": {
"limit": 20
}
}
MCP Resources
PocketMCP provides resource URIs for accessing specific chunks:
- Format:
mcp+doc://<doc_id>#<chunk_id>
- Returns: Complete chunk data including text, offsets, and metadata
⚙️ Configuration
Environment Variables
Variable | Default | Description |
---|---|---|
SQLITE_PATH |
./data/index.db |
Path to SQLite database file |
WATCH_DIR |
(none) | Directory to watch for file changes |
MODEL_ID |
Xenova/all-MiniLM-L6-v2 |
Hugging Face model for embeddings |
CHUNK_SIZE |
1000 |
Target chunk size in characters |
CHUNK_OVERLAP |
120 |
Overlap between chunks in characters |
PDF_MAX_PAGES |
300 |
Maximum pages to process in PDF files |
PDF_MIN_TEXT_CHARS |
500 |
Minimum text characters required in PDFs |
DOC_MAX_BYTES |
10000000 |
Maximum file size for DOCX files (10MB) |
DOCX_SPLIT_ON_HEADINGS |
false |
Split DOCX documents on headings (h1/h2) |
NODE_ENV |
development |
Environment mode |
VERBOSE_LOGGING |
false |
Enable detailed logs |
DEBUG_DOTENV |
false |
Enable dotenv debug output |
API_PORT |
5174 |
Web API server port |
API_BIND |
127.0.0.1 |
API server bind address |
TRANSPORT |
both |
MCP transport mode (stdio/http/both) |
HTTP_HOST |
0.0.0.0 |
HTTP server bind address |
HTTP_PORT |
8001 |
MCP server port |
LOG_LEVEL |
info |
Logging level (debug/info/warn/error) |
Available Scripts
Script | Description |
---|---|
pnpm dev |
Start web interface + API server for testing |
pnpm dev:mcp |
Start MCP server (both transports + file watching) |
pnpm build |
Build all components |
pnpm start |
Start production MCP server (both transports + file watching) |
pnpm setup |
Create .env from template |
pnpm clean |
Clean build artifacts and database |
Watch Directory Notes
WATCH_DIR
is optional - if not set, only manual document upserts work- Choose any directory -
./kb
is just a convention, use whatever makes sense - Supported files:
.md
,.txt
,.pdf
,.docx
(configurable via FileIngestManager options) - File filtering: Automatically ignores temp files,
.DS_Store
,node_modules
, etc. - Nested directories: Recursively watches all subdirectories
Document Processing Pipeline
PocketMCP uses a three-tier processing model:
Documents → Segments → Chunks
- Documents: Top-level files with metadata (type, size, status, etc.)
- Segments: Logical divisions within documents:
- PDF: One segment per page
- DOCX: One segment per document (or per heading section if enabled)
- Text/Markdown: Single segment per document
- Chunks: Text pieces optimized for embedding and search
Processing Status Tracking:
ok
: Successfully processed and indexedskipped
: File was skipped (encrypted, unsupported format)needs_ocr
: PDF requires OCR processing (not implemented)too_large
: File exceeds size/page limitserror
: Processing failed due to an error
Supported File Types
Currently supports:
- Markdown (
.md
) - Plain text (
.txt
) - PDF (
.pdf
) - Text-based PDFs only, no OCR support - DOCX (
.docx
) - Microsoft Word documents
PDF Processing Notes:
- Only text-based PDFs are supported (no OCR for scanned documents)
- PDFs with insufficient text content will be marked as
needs_ocr
and skipped - Encrypted or password-protected PDFs will be marked as
skipped
- Large PDFs exceeding the page limit will be marked as
too_large
DOCX Processing Notes:
- Supports modern Microsoft Word documents (.docx format only, not legacy .doc)
- Can optionally split documents by headings (configurable via
DOCX_SPLIT_ON_HEADINGS
) - Large files exceeding the size limit will be marked as
too_large
To add more file types, modify the supportedExtensions
in the FileIngestManager
configuration.
🛠️ Development
Project Structure
PocketMCP/ # Monorepo root
├── package.json # Workspace configuration
├── pnpm-workspace.yaml # pnpm workspace setup
├── .env # Environment variables
├── .env.sample # Environment template
├── apps/
│ ├── api/ # Express API server
│ │ ├── src/
│ │ │ ├── server.ts # Main API server
│ │ │ └── db.ts # Database manager
│ │ └── package.json
│ └── web/ # React + Vite frontend
│ ├── src/
│ │ ├── App.tsx # Main app component
│ │ ├── store.ts # Zustand state management
│ │ ├── api.ts # API client
│ │ └── components/ # UI components
│ └── package.json
├── src/ # Original MCP server
│ ├── server.ts # MCP server and main entry point
│ ├── db.ts # SQLite database with sqlite-vec
│ ├── embeddings.ts # Transformers.js embedding pipeline
│ ├── chunker.ts # Text chunking with sentence awareness
│ ├── ingest.ts # Generic document ingestion
│ ├── file-ingest.ts # File-specific ingestion logic
│ └── watcher.ts # File system watcher with debouncing
├── data/ # SQLite database storage
├── kb/ # Default watch directory (configurable)
└── README.md
Development Commands
# Install dependencies
pnpm install
# Run MCP server in development mode (hot reload)
pnpm dev:mcp
# Run web tester in development mode
pnpm dev
# Build for production
pnpm build
# Run production build
pnpm start
# Run with custom environment
WATCH_DIR=./my-docs CHUNK_SIZE=500 pnpm dev:mcp
Testing
# Test web tester functionality
./test-web-tester.sh
# Manual API testing
curl http://127.0.0.1:5174/health
curl http://127.0.0.1:5174/api/db/diag
🚀 Production Deployment
Docker Deployment (Recommended)
PocketMCP is containerized and ready for production deployment with Docker and Portainer. The Docker setup runs all components together in a single container:
🏗️ Multi-Service Architecture:
- MCP Server (port 8001): HTTP transport for MCP protocol
- API Server (port 5174): Database operations and diagnostics
- Web UI (port 5173): Interactive web interface for testing and management
- Combined Health Check: Monitors all services via
/health
endpoint
Quick Start with Docker
# Pull the latest image
docker pull ghcr.io/kailash-sankar/pocketmcp:latest
# Run with all services (MCP + API + Web UI)
docker run -d \
--name pocketmcp \
--restart unless-stopped \
-p 8001:8001 \
-p 5174:5174 \
-p 5173:5173 \
-v pocketmcp_data:/app/data \
-v pocketmcp_kb:/app/kb \
-v pocketmcp_cache:/app/.cache \
ghcr.io/kailash-sankar/pocketmcp:latest
Access Points:
- 🔧 MCP Server:
http://localhost:8001
(HTTP transport + health check) - 📊 API Server:
http://localhost:5174
(Database API + diagnostics) - 🌐 Web UI:
http://localhost:5173
(Interactive web interface) - ❤️ Health Check:
http://localhost:5173/health
(Combined service status)
Docker Compose
# Clone the repository
git clone https://github.com/kailash-sankar/PocketMCP.git
cd PocketMCP
# Copy and customize environment file
cp .env.sample .env
# Start with Docker Compose
docker-compose up -d
# View logs
docker-compose logs -f pocketmcp
# Stop
docker-compose down
Docker Operations
Building Images
Local Build:
# Build for current platform
docker build -t pocketmcp:local .
# Build multi-arch (requires buildx)
docker buildx build --platform linux/amd64,linux/arm64 -t pocketmcp:multi-arch .
GitHub Actions Build:
# Tag and push to trigger automated build
git tag v1.0.0
git push origin v1.0.0
# Manual trigger via GitHub Actions UI
# Go to Actions → Build and Release Docker Images → Run workflow
Image Registry
GitHub Container Registry (GHCR):
# Login to GHCR
echo $GITHUB_TOKEN | docker login ghcr.io -u kailash-sankar --password-stdin
# Tag for GHCR
docker tag pocketmcp:local ghcr.io/kailash-sankar/pocketmcp:v1.0.0
# Push to GHCR
docker push ghcr.io/kailash-sankar/pocketmcp:v1.0.0
Tag Strategy
Tag Pattern | Purpose | Example | Recommended For |
---|---|---|---|
vX.Y.Z |
Exact version | v1.2.3 |
Production pinning |
vX.Y |
Minor stream | v1.2 |
Auto-updates within minor |
vX |
Major stream | v1 |
Auto-updates within major |
latest |
Latest release | latest |
Development/testing |
main-SHA |
Commit-based | main-a1b2c3d |
CI/CD pipelines |
Portainer Tag Selection:
- Stable Production: Use exact version tags (
v1.2.3
) - Auto-Updates: Use minor version tags (
v1.2
) for automatic patch updates - Development: Use
latest
for newest features
Volume Management
Critical Volumes:
# Database persistence (CRITICAL - contains all your data)
/app/data → SQLite database, must be backed up
# Knowledge base (your documents)
/app/kb → Source documents, can be repopulated
# Model cache (performance optimization)
/app/.cache → Downloaded models, can be recreated
Backup Strategy:
# Backup database
docker cp pocketmcp:/app/data/index.db ./backup-$(date +%Y%m%d).db
# Backup with Docker Compose
docker-compose exec pocketmcp cp /app/data/index.db /app/data/backup-$(date +%Y%m%d).db
Portainer Setup
Container Creation
Option 1: Portainer Stacks (Recommended)
- Go to Stacks → Add stack
- Name:
pocketmcp
- Paste this docker-compose content:
version: '3.8'
services:
pocketmcp:
image: ghcr.io/kailash-sankar/pocketmcp:v1.0
container_name: pocketmcp
restart: unless-stopped
ports:
- "8001:8001" # MCP Server
- "5174:5174" # API Server
- "5173:5173" # Web UI
volumes:
- pocketmcp_data:/app/data
- pocketmcp_kb:/app/kb
- pocketmcp_cache:/app/.cache
environment:
- NODE_ENV=production
- TRANSPORT=both
- HTTP_HOST=0.0.0.0
- HTTP_PORT=8001
- API_PORT=5174
- WEB_PORT=5173
- LOG_LEVEL=info
- SQLITE_PATH=/app/data/index.db
- WATCH_DIR=/app/kb
- MODEL_ID=Xenova/all-MiniLM-L6-v2
- CHUNK_SIZE=1000
- CHUNK_OVERLAP=120
- VERBOSE_LOGGING=false
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:5173/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
volumes:
pocketmcp_data:
pocketmcp_kb:
pocketmcp_cache:
Option 2: Individual Container
- Go to Containers → Add container
- Fill in the configuration:
Setting | Value |
---|---|
Name | pocketmcp |
Image | ghcr.io/kailash-sankar/pocketmcp:v1.0 |
Port mapping | 8001:8001, 5174:5174, 5173:5173 |
Restart policy | Unless stopped |
Environment Variables
Variable | Default | Description |
---|---|---|
NODE_ENV |
production |
Runtime environment |
TRANSPORT |
both |
MCP transport mode (stdio /http /both ) |
HTTP_HOST |
0.0.0.0 |
HTTP server bind address |
HTTP_PORT |
8001 |
MCP server port |
API_PORT |
5174 |
Web API server port |
WEB_PORT |
5173 |
Web UI server port |
LOG_LEVEL |
info |
Logging level (debug /info /warn /error ) |
SQLITE_PATH |
/app/data/index.db |
Database file path |
WATCH_DIR |
/app/kb |
Directory to watch for changes |
MODEL_ID |
Xenova/all-MiniLM-L6-v2 |
Hugging Face embedding model |
CHUNK_SIZE |
1000 |
Text chunk size in characters |
CHUNK_OVERLAP |
120 |
Overlap between chunks |
MAX_CONCURRENT_FILES |
5 |
Max files processed simultaneously |
VERBOSE_LOGGING |
false |
Enable detailed logging |
HF_TOKEN |
(optional) | Hugging Face API token |
HF_CACHE_DIR |
/app/.cache |
Model cache directory |
Volume Mappings
Container Path | Purpose | Host Path Example | Required |
---|---|---|---|
/app/data |
Database storage | /opt/pocketmcp/data |
Yes |
/app/kb |
Knowledge base | /opt/pocketmcp/kb |
Yes |
/app/.cache |
Model cache | /opt/pocketmcp/cache |
Recommended |
Volume Setup in Portainer:
- Volumes tab → Add volume
- Create three volumes:
pocketmcp_data
(database)pocketmcp_kb
(documents)pocketmcp_cache
(models)
Port Configuration
Host Port | Container Port | Protocol | Purpose |
---|---|---|---|
8001 |
8001 |
TCP | MCP Server (HTTP transport) |
5174 |
5174 |
TCP | Web API Server (database operations) |
5173 |
5173 |
TCP | Web UI Server (interactive interface) |
Health Check Configuration
The container includes a built-in health check that monitors all services via the combined /health
endpoint:
- Test Command:
curl -f http://localhost:5173/health
- Monitors: MCP Server, API Server, and Web UI
- Interval: 30 seconds
- Timeout: 10 seconds
- Start Period: 60 seconds (allows model download)
- Retries: 3
Health Response Format:
{
"status": "ok",
"timestamp": "2024-01-01T00:00:00.000Z",
"services": {
"mcp": "ok",
"api": "ok",
"web": "ok"
}
}
Resource Limits (Recommended)
Resource | Limit | Reservation |
---|---|---|
Memory | 2GB | 512MB |
CPU | 1.0 cores | 0.25 cores |
Upgrading
Rolling Updates:
- Stacks: Edit stack → Change image tag → Update the stack
- Containers: Recreate container with new image tag
Version Pinning vs Auto-Updates:
- Pin to exact version:
ghcr.io/kailash-sankar/pocketmcp:v1.2.3
- Auto-update patches:
ghcr.io/kailash-sankar/pocketmcp:v1.2
- Auto-update minor:
ghcr.io/kailash-sankar/pocketmcp:v1
Troubleshooting
Container won't start:
- Check logs in Portainer: Containers → pocketmcp → Logs
- Verify volume permissions
- Ensure ports aren't conflicting
Health check failing:
- Wait 60+ seconds for initial model download
- Check individual services:
- MCP Server:
curl http://HOST_IP:8001/health
- API Server:
curl http://HOST_IP:5174/health
- Web UI:
curl http://HOST_IP:5173/health
- MCP Server:
- Review container logs for specific service errors
Performance issues:
- Increase memory limit if model loading fails
- Reduce
CHUNK_SIZE
for lower memory usage - Check disk space for volumes
systemd Service (Linux)
Create /etc/systemd/system/pocketmcp.service
:
[Unit]
Description=PocketMCP Server
After=network.target
[Service]
Type=simple
User=pocketmcp
WorkingDirectory=/opt/pocketmcp
Environment=NODE_ENV=production
Environment=TRANSPORT=both
Environment=HTTP_HOST=0.0.0.0
Environment=HTTP_PORT=8001
Environment=SQLITE_PATH=/opt/pocketmcp/data/index.db
Environment=WATCH_DIR=/opt/pocketmcp/kb
ExecStart=/usr/bin/node dist/cli.js
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable pocketmcp
sudo systemctl start pocketmcp
sudo systemctl status pocketmcp
Docker Deployment
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --production
COPY dist/ ./dist/
COPY data/ ./data/
COPY kb/ ./kb/
EXPOSE 8000
ENV NODE_ENV=production
ENV TRANSPORT=both
ENV HTTP_HOST=0.0.0.0
ENV HTTP_PORT=8001
CMD ["node", "dist/cli.js"]
Health Monitoring
# Health check endpoint
curl http://localhost:8001/health
# Expected response
{"status":"ok","timestamp":"2024-01-01T00:00:00.000Z"}
# Log monitoring
journalctl -u pocketmcp -f # systemd
pm2 logs pocketmcp # PM2
🏗️ Architecture
flowchart TD
subgraph "MCP Clients"
A[VS Code]
B[Cursor]
end
subgraph "Web Interface"
W1[React Frontend<br/>:5173]
W2[Express API<br/>:5174]
end
subgraph "PocketMCP Server"
C[MCP Server<br/>stdio transport]
D[File Watcher<br/>chokidar]
E[Text Chunker<br/>~1000 chars]
F[Embeddings<br/>Transformers.js<br/>MiniLM-L6-v2]
G[SQLite + sqlite-vec<br/>Vector Database]
end
subgraph "File System"
H[Watch Directory<br/>./kb/]
I[Data Directory<br/>./data/]
end
A -.->|MCP Tools| C
B -.->|MCP Tools| C
W1 -->|HTTP API| W2
W2 -->|Database Access| G
C --> D
D -->|File Changes| E
E -->|Text Chunks| F
F -->|384-dim Vectors| G
G -.->|Search Results| C
D -.->|Monitors| H
G -.->|Stores in| I
classDef mcpClient fill:#e1f5fe
classDef webInterface fill:#fff3e0
classDef server fill:#f3e5f5
classDef storage fill:#e8f5e8
class A,B mcpClient
class W1,W2 webInterface
class C,D,E,F,G server
class H,I storage
📊 Performance & Limits
- Sweet spot: 10K-100K chunks on modest hardware
- Query latency: Sub-100ms for
top_k <= 10
on typical corpora - Memory usage: ~100MB for model + minimal overhead per document
- Concurrency: Limited to 3 simultaneous file operations by default
- File size limit: 50MB per file (configurable)
🔧 Troubleshooting
Model Download Issues
If the embedding model fails to download:
- Check internet connection for initial download
- Model cache location:
~/.cache/huggingface/transformers/
- Clear cache and retry if needed
SQLite Extension Issues
If sqlite-vec
fails to load:
- Ensure
sqlite-vec
npm package is installed - Check that your system supports the required SQLite version
- The system automatically falls back to regular SQLite tables if vec0 virtual tables fail
File Watching Issues
- Files not being detected: Check file extensions and ignore patterns
- High CPU usage: Increase debounce time with larger
debounceMs
values - Permission errors: Ensure read/write access to watch and data directories
Web Interface Issues
- API not accessible: Ensure API server is running on port 5174
- Database not found: Check
SQLITE_PATH
environment variable - CORS errors: API server includes CORS headers for local development
Memory Issues
- Reduce
CHUNK_SIZE
for lower memory usage - Process fewer files simultaneously by reducing
maxConcurrency
- Consider using a smaller embedding model (though this requires code changes)
Common Error Messages
"Too many parameter values were provided"
- This was a known issue with sqlite-vec virtual tables, now fixed with automatic fallback
"Failed to load sqlite-vec extension"
- System automatically falls back to regular SQLite tables with JSON embeddings
"Database file does not exist"
- Run the MCP server first to create the database, or check the
SQLITE_PATH
✅ Docker Deployment Verification
Use this checklist to verify your Docker deployment is working correctly:
Pre-Deployment Checklist
- [ ] Docker and Docker Compose installed
- [ ] GitHub Container Registry access configured (if using GHCR)
- [ ] Sufficient disk space for volumes (minimum 2GB recommended)
- [ ] Port 8001 available on host system
Build Verification
- [ ] Multi-arch image builds successfully for both linux/amd64 and linux/arm64
- [ ] GitHub Actions workflow completes without errors
- [ ] Image is published to container registry
Runtime Verification
- [ ] Container starts successfully
- [ ] Health check endpoint returns
{"status":"ok"}
athttp://HOST:8001/health
- [ ] Model files download and cache in
/app/.cache
volume - [ ] Database initializes in
/app/data
volume - [ ] File watching works when documents added to
/app/kb
volume
Data Persistence Verification
- [ ] Database persists across container restarts
- [ ] Knowledge base files persist across container restarts
- [ ] Model cache persists across container restarts (improves startup time)
Portainer Integration Verification
- [ ] Container shows "healthy" status in Portainer
- [ ] Logs are accessible through Portainer interface
- [ ] Volume management works through Portainer
- [ ] Container can be upgraded by changing image tag
Upgrade Verification
- [ ] Upgrading from
vX.Y.Z
tovX.Y.(Z+1)
preserves all data - [ ] Upgrading preserves knowledge base documents
- [ ] Health check passes after upgrade
- [ ] MCP functionality works after upgrade
Performance Verification
- [ ] Memory usage stays within configured limits (default: 2GB max)
- [ ] CPU usage is reasonable during file processing
- [ ] Search queries respond within acceptable time (typically <100ms)
- [ ] File ingestion completes without timeout errors
📄 License
MIT License - see LICENSE file for details.
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
🙏 Acknowledgments
- sqlite-vec for fast vector similarity search
- Transformers.js for local embedding generation
- Model Context Protocol for standardized tool integration
- Hugging Face for the MiniLM model
- React + Vite for the modern web interface
- TailwindCSS for beautiful, responsive styling
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.