AnyDocs MCP Server

Transforms any website's documentation into an MCP-compatible interactive knowledge base with universal scraping, advanced search, and AI-powered tools. Supports GitBook, Notion, Confluence, and custom documentation platforms with real-time synchronization.

Transform any website's documentation section into an MCP-compatible server using the Python MCP SDK.

🚀 Overview

AnyDocs MCP Server turns any website's documentation into an interactive, AI-accessible knowledge base through the Model Context Protocol (MCP). It scrapes, indexes, and serves documentation from any website, from modern API docs to legacy documentation portals.

Key Features

  • 🌐 Universal Website Scraping: Turn ANY website's documentation into an interactive knowledge base
  • 🔌 Universal Adapter System: Support for GitBook, Notion, Confluence, and custom documentation platforms
  • 🔍 Advanced Search: Full-text search with SQLite FTS and semantic search capabilities
  • 🔐 Robust Authentication: API Key, OAuth2, and JWT-based authentication
  • ⚡ High Performance: Async/await architecture with caching and rate limiting
  • 🎛️ Web Management Interface: FastAPI-based admin panel for configuration and monitoring
  • 📊 Real-time Monitoring: Health checks, metrics, and logging
  • 🐳 Docker Ready: Complete containerization with development and production configurations
  • 🔄 Auto-sync: Automatic content synchronization with source documentation

📋 Requirements

  • Python 3.11+ (recommended: 3.11 or 3.12)
  • SQLite 3.35+ (for FTS5 support)
  • Optional: Redis (for caching)
  • Optional: PostgreSQL/MySQL (for production)
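A quick preflight check for the requirements above can be sketched in Python. The helper names below are illustrative and not part of AnyDocs; the script only reports what the local interpreter and its bundled SQLite library provide.

```python
import sqlite3
import sys

MIN_PYTHON = (3, 11)  # minimum from the requirements list above
MIN_SQLITE = (3, 35)  # minimum from the requirements list above


def meets_minimum(version: tuple, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum`, compared component-wise."""
    return tuple(version[: len(minimum)]) >= tuple(minimum)


def has_fts5() -> bool:
    """Probe FTS5 support by creating a throwaway virtual table in memory."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("Python OK:", meets_minimum(sys.version_info, MIN_PYTHON))
    print("SQLite OK:", meets_minimum(sqlite3.sqlite_version_info, MIN_SQLITE))
    print("FTS5 available:", has_fts5())
```

Note that the SQLite version reported here is the one linked into Python, which may differ from the `sqlite3` command-line tool on the same machine.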

🛠️ Installation

Using uvx (Recommended)

The easiest way to run AnyDocs MCP Server is using uvx, which automatically manages dependencies and virtual environments:

# Run directly with uvx (no installation needed)
uvx anydocs-mcp-server

# Run with custom configuration
uvx anydocs-mcp-server --config config.yaml

# Run in debug mode
uvx anydocs-mcp-server --debug

# Install as a persistent tool for repeated use
uv tool install anydocs-mcp-server

# Then run anytime with:
anydocs-mcp-server --config config.yaml

Quick Start

# Clone the repository
git clone https://github.com/funky1688/anydocs-mcp.git
cd anydocs-mcp

# Install dependencies using uv (recommended)
pip install uv
uv pip install -e .

# Copy environment configuration
cp .env.example .env

# Copy and customize configuration
cp config.yaml my-config.yaml

# Edit configuration (add your API keys and settings)
nano .env

# Start the server (hybrid mode - MCP + Web interface)
uv run python start.py

Manual Installation

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
uv pip install -e .  # for production
# OR for development:
uv pip install -e ".[dev]"  # quoted so shells such as zsh do not expand the brackets

# Copy and configure environment
cp .env.example .env

# Edit configuration files
nano .env

# Initialize and start
uv run python start.py --mode hybrid --debug

Running as a Python Module

After installation, you can also run the server as a Python module:

# Run as module
python -m anydocs_mcp

# With configuration
python -m anydocs_mcp --config config.yaml

# Debug mode
python -m anydocs_mcp --debug

Docker Installation

# Development environment
docker-compose -f docker-compose.dev.yml up -d

# Production environment
docker-compose up -d

🚀 Usage

Starting the Server

AnyDocs MCP Server supports three startup modes:

1. Hybrid Mode (Default - Recommended)

Starts both MCP server and web management interface simultaneously:

uv run python start.py
# or explicitly:
uv run python start.py --mode hybrid

  • MCP Server: Available at http://localhost:8000 (handles MCP protocol communication)
  • Web Interface: Available at http://localhost:8080 for management
  • Best for: Most users who want both MCP functionality and web management

Important: Always use uv run to ensure the correct virtual environment is used.

2. MCP Server Only

Starts only the MCP server without web interface:

uv run python start.py --mode mcp

  • Use case: Production deployments where only MCP protocol is needed
  • Lighter resource usage: No web interface overhead
  • Best for: Headless servers, CI/CD environments

3. Web Interface Only

Starts only the web management interface:

uv run python start.py --mode web

  • Use case: Administrative tasks, configuration management
  • Web Interface: Available at http://localhost:8080
  • Best for: Configuration, monitoring, and testing without MCP protocol

Additional Options

# Debug mode with auto-reload
uv run python start.py --debug

# Custom configuration file
uv run python start.py --config custom-config.yaml

# Skip dependency check (faster startup)
uv run python start.py --no-deps-check

# Kill occupied ports before starting
uv run python start.py --kill-ports

# Skip database initialization
uv run python start.py --no-db-init

Command Line Options

uv run python start.py --help

# Available options:
#   --mode {mcp,web,hybrid}     Startup mode (default: hybrid)
#   --config CONFIG             Configuration file path (default: config.yaml)
#   --debug                     Enable debug mode
#   --no-deps-check             Skip dependency check
#   --no-db-init                Skip database initialization
#   --kill-ports                Kill occupied ports before starting
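
The option list above maps naturally onto an argparse definition. The following is a sketch reconstructed from the help text only; how start.py actually parses its arguments is not shown in this README, and `services_for` is a hypothetical helper added to illustrate what each mode starts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented start.py options (reconstruction, not the real parser)."""
    parser = argparse.ArgumentParser(prog="start.py")
    parser.add_argument("--mode", choices=["mcp", "web", "hybrid"], default="hybrid",
                        help="Startup mode (default: hybrid)")
    parser.add_argument("--config", default="config.yaml",
                        help="Configuration file path (default: config.yaml)")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    parser.add_argument("--no-deps-check", action="store_true",
                        help="Skip dependency check")
    parser.add_argument("--no-db-init", action="store_true",
                        help="Skip database initialization")
    parser.add_argument("--kill-ports", action="store_true",
                        help="Kill occupied ports before starting")
    return parser


def services_for(mode: str) -> list[str]:
    """Hypothetical helper: which services a given mode launches."""
    return {"mcp": ["mcp"], "web": ["web"], "hybrid": ["mcp", "web"]}[mode]


args = build_parser().parse_args(["--mode", "mcp", "--kill-ports"])
print(args.mode, args.kill_ports, services_for(args.mode))
```

Flags with dashes become attributes with underscores, so `--kill-ports` is read as `args.kill_ports`.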

Web Management Interface

Access the web interface at http://localhost:8080 to:

  • Configure document sources
  • Manage users and API keys
  • Monitor system health
  • View logs and metrics
  • Test MCP endpoints

MCP Client Integration

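import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch AnyDocs MCP Server as a subprocess and talk to it over stdio
    server = StdioServerParameters(command="python", args=["main.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # List available tools
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Search documents
            result = await session.call_tool(
                "search_documents",
                arguments={"query": "authentication", "limit": 10},
            )
            print(result.content)


# "async with" is only valid inside a coroutine, so run everything through asyncio
asyncio.run(main())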

📚 Documentation Adapters

Supported Platforms

Platform      Status  Features
Any Website           Universal scraper for any documentation site
GitBook               Full API integration, real-time sync
Notion                Database and page content, webhooks
Confluence            Space and page management, attachments
GitHub                Repository documentation, wikis
GitLab                Project documentation, wikis
SharePoint            Document libraries, lists
Slack                 Channel messages, knowledge base
File System           Local markdown files, watch mode
Custom        🔧      Extensible adapter framework

Adding a New Adapter

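from __future__ import annotations  # defer evaluation of type annotations

from typing import List

from anydocs_mcp.adapters.base import BaseDocumentAdapter

# Document is the project's document model; import it from the module that
# defines it in your checkout (the path is not shown in this README).


class CustomAdapter(BaseDocumentAdapter):
    """Custom documentation adapter implementation."""

    async def fetch_documents(self) -> List[Document]:
        """Fetch documents from your platform."""
        raise NotImplementedError  # Implementation here

    async def get_document_content(self, doc_id: str) -> str:
        """Get specific document content."""
        raise NotImplementedError  # Implementation here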
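
To make the adapter contract concrete, here is a self-contained sketch of an adapter that serves local Markdown files. `Document` and `BaseDocumentAdapter` below are stand-ins defined only for this example; the real classes in anydocs_mcp may have different fields, constructor arguments, and required methods.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Document:  # stand-in, not the real anydocs_mcp model
    doc_id: str
    title: str


class BaseDocumentAdapter(ABC):  # stand-in mirroring the two methods above
    @abstractmethod
    async def fetch_documents(self) -> list[Document]: ...

    @abstractmethod
    async def get_document_content(self, doc_id: str) -> str: ...


class MarkdownDirAdapter(BaseDocumentAdapter):
    """Hypothetical adapter that serves every *.md file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    async def fetch_documents(self) -> list[Document]:
        # The directory walk blocks, so it runs in a worker thread.
        paths = await asyncio.to_thread(lambda: sorted(self.root.rglob("*.md")))
        return [
            Document(doc_id=p.relative_to(self.root).as_posix(), title=p.stem)
            for p in paths
        ]

    async def get_document_content(self, doc_id: str) -> str:
        path = (self.root / doc_id).resolve()
        # Refuse IDs that escape the root directory, such as "../secret".
        if not path.is_relative_to(self.root.resolve()):
            raise ValueError(f"invalid document id: {doc_id}")
        return await asyncio.to_thread(path.read_text, encoding="utf-8")
```

Using the relative path as the document ID keeps IDs stable across runs, and file I/O is pushed to threads so the event loop stays responsive.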

🔧 Configuration

Troubleshooting

Common Installation Issues

Python-Jose Import Error: If you encounter a "No module named 'jose'" error:

# Always use uv run to ensure correct virtual environment
uv run python start.py

# If the issue persists, reinstall python-jose with cryptography extras
uv pip uninstall python-jose
uv pip install "python-jose[cryptography]"

Dependency Check Failures: If you see errors about missing pyyaml or beautifulsoup4:

# The dependency check has been fixed to use correct import names
# Ensure you're using the latest version
git pull origin main
uv pip install -e .

Configuration Attribute Errors: If you see 'AppConfig' object has no attribute 'server_host':

# This is fixed in the current version - ensure you have the latest code
git pull origin main

Virtual Environment Issues: If packages seem installed but imports fail:

# Always use 'uv run' to ensure correct environment
uv run python start.py

# Check if you're in the right environment
which python  # Should point to your project's Python
uv pip list    # Should show installed packages

Setup.py Conflicts: If you encounter conflicts with multiple setup files:

# The redundant root setup.py has been removed
# Use pyproject.toml for package management
uv pip install -e .

Environment Setup Best Practices

  1. Always run Python scripts with uv run so that the project's virtual environment is used
  2. Use uv pip install (not uv install) for package installation
  3. If imports fail, inspect the virtual environment with uv pip list
  4. Pull the latest changes if you encounter configuration issues

Port Conflicts

If ports 8000 or 8080 are occupied:

# Kill processes on required ports (Windows)
netstat -ano | findstr :8000
taskkill /F /PID <PID>

# Or use the built-in option
uv run python start.py --kill-ports
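
For a cross-platform way to see whether the default ports are taken before reaching for --kill-ports, a small socket probe is enough. This is a sketch; `port_in_use` is an illustrative helper, and what --kill-ports does internally is not shown in this README.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8000, 8080):  # default MCP and web ports from this README
        state = "occupied" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```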

Environment Variables

# Server Configuration
ANYDOCS_HOST=localhost
ANYDOCS_PORT=8000
ANYDOCS_WEB_PORT=8080
ANYDOCS_DEBUG=false

# Database
DATABASE_URL=sqlite:///data/anydocs.db
# DATABASE_URL=postgresql://user:pass@localhost/anydocs

# Authentication
JWT_SECRET_KEY=your-secret-key
API_KEY_PREFIX=anydocs_

# Document Adapters
GITBOOK_API_TOKEN=your-gitbook-token
NOTION_API_TOKEN=your-notion-token
CONFLUENCE_API_TOKEN=your-confluence-token

# Cache (Optional)
REDIS_URL=redis://localhost:6379/0

# Monitoring
ENABLE_METRICS=true
LOG_LEVEL=INFO
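
As an illustration of how these variables can be consumed, the sketch below reads a few of them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical, and treating the values listed above as defaults is an assumption; the project's own configuration loader may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:  # hypothetical container, not the project's AppConfig
    host: str
    port: int
    web_port: int
    debug: bool
    database_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to the values above."""
    return Settings(
        host=env.get("ANYDOCS_HOST", "localhost"),
        port=int(env.get("ANYDOCS_PORT", "8000")),
        web_port=int(env.get("ANYDOCS_WEB_PORT", "8080")),
        debug=_as_bool(env.get("ANYDOCS_DEBUG", "false")),
        database_url=env.get("DATABASE_URL", "sqlite:///data/anydocs.db"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a mapping makes the loader easy to test without touching the real process environment.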

YAML Configuration

# config.yaml
server:
  host: localhost
  port: 8000
  web_port: 8080
  debug: false

database:
  url: sqlite:///data/anydocs.db
  pool_size: 10
  echo: false

auth:
  jwt_secret: ${JWT_SECRET_KEY}
  token_expire_minutes: 1440
  
  api_key:
    prefix: anydocs_
    length: 32

adapters:
  gitbook:
    api_token: ${GITBOOK_API_TOKEN}
    base_url: https://api.gitbook.com
    rate_limit: 100
  
  notion:
    api_token: ${NOTION_API_TOKEN}
    version: "2022-06-28"
    rate_limit: 3
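
The `${JWT_SECRET_KEY}` style placeholders in the YAML above imply that values are filled in from environment variables. How AnyDocs resolves them is not shown in this README; the sketch below is one way to do it on an already parsed configuration dictionary, with `expand_env` as a hypothetical helper.

```python
import os
import re
from typing import Any, Mapping

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${NAME} in strings; raise KeyError if NAME is unset."""
    if isinstance(value, str):
        def _sub(match: re.Match) -> str:
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_sub, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


config = {
    "auth": {"jwt_secret": "${JWT_SECRET_KEY}", "token_expire_minutes": 1440},
    "adapters": {"notion": {"api_token": "${NOTION_API_TOKEN}", "rate_limit": 3}},
}
resolved = expand_env(config, {"JWT_SECRET_KEY": "s3cret", "NOTION_API_TOKEN": "tok"})
print(resolved["auth"]["jwt_secret"])  # → s3cret
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a silently empty secret is harder to diagnose than a startup error.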

🔍 MCP Tools

AnyDocs MCP Server provides the following tools:

Core Tools

  • search_documents - Search documents with full-text and semantic search
  • get_document - Retrieve a specific document by ID
  • list_sources - List all configured document sources
  • summarize_content - Summarize document content
  • ask_question - Ask questions about document content
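
For readers unfamiliar with SQLite FTS, the sketch below shows the kind of query a full-text tool such as search_documents could run. The table name, columns, and functions are hypothetical; the actual AnyDocs schema and ranking are not shown in this README.

```python
import sqlite3


def build_index(rows: list[tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load (title, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", rows)
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 10) -> list[str]:
    """Return matching titles, best match first (FTS5 orders by bm25 via rank)."""
    cursor = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in cursor]


conn = build_index([
    ("Auth guide", "Configure API key authentication and JWT tokens."),
    ("Docker setup", "Run the server with docker-compose."),
])
print(search(conn, "authentication"))  # → ['Auth guide']
```

The query string is passed as a bound parameter, but it is still interpreted as FTS5 query syntax, so user input containing operators or punctuation may need quoting before it reaches MATCH.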

AI-Powered Tools

  • generate_documentation - AI-assisted documentation generation
  • translate_content - Multi-language content translation
  • extract_insights - Extract insights and analytics from documentation
  • suggest_improvements - AI-powered content enhancement suggestions

🧪 Development

Setup Development Environment

# Install development dependencies
uv pip install -e ".[dev]"

# Setup pre-commit hooks
uv run pre-commit install

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=src/anydocs_mcp

# Code formatting
uv run black src/ tests/
uv run isort src/ tests/

# Type checking
uv run mypy src/

# Security checks
uv run bandit -r src/

Project Structure

anydocs-mcp/
├── src/anydocs_mcp/          # Main package
│   ├── adapters/             # Document adapters
│   ├── auth/                 # Authentication
│   ├── config/               # Configuration management
│   ├── content/              # Content processing
│   ├── database/             # Database models and operations
│   ├── utils/                # Utilities and helpers
│   ├── web/                  # Web interface
│   └── server.py             # MCP server implementation
├── tests/                    # Test suite
├── docs/                     # Documentation
├── scripts/                  # Utility scripts
│   └── setup.py              # Development environment setup
├── pyproject.toml            # Package configuration (modern Python packaging)
├── start.py                  # Main startup script
└── main.py                   # MCP server entry point

Note: The project uses pyproject.toml for package configuration following modern Python packaging standards. The redundant root setup.py has been removed to avoid conflicts.

Running Tests

# All tests
uv run pytest

# Unit tests only
uv run pytest tests/unit/

# Integration tests only
uv run pytest tests/integration/

# With coverage
uv run pytest --cov=src/anydocs_mcp --cov-report=html

# Performance tests
uv run pytest tests/performance/

📊 Monitoring & Observability

Health Checks

# Check service health
curl http://localhost:8080/health

# Detailed health check
curl http://localhost:8080/health/detailed
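
The same check can be scripted, for example in a deployment step that waits for the server to come up. This sketch uses only the /health path and port 8080 given above; the response body format is not documented here, so only the status code is inspected, and `is_healthy` is an illustrative helper that handles plain http URLs.

```python
import http.client
from urllib.parse import urlsplit


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response all count as unhealthy
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("healthy:", is_healthy("http://localhost:8080/health"))
```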

Metrics

Metrics are available at the /metrics endpoint in Prometheus format:

  • Request count and duration
  • Database connection pool status
  • Document sync statistics
  • Error rates and types
  • Cache hit/miss ratios

Logging

Structured logging with configurable levels:

# Application logs
tail -f anydocs_mcp.log

# Check logs in real-time
uv run python start.py --debug

🐳 Docker Deployment

Development

# Start development environment
docker-compose -f docker-compose.dev.yml up -d

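# View logs (pass the same compose file used to start the stack)
docker-compose -f docker-compose.dev.yml logs -f anydocs-mcp-dev

# Access shell
docker-compose -f docker-compose.dev.yml exec anydocs-mcp-dev bash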

Production

# Build and start production environment
docker-compose up -d

# Scale services
docker-compose up -d --scale anydocs-mcp=3

# Update services
docker-compose pull && docker-compose up -d

Monitoring Stack

# Start with monitoring
docker-compose --profile monitoring up -d

# Access services
# Grafana: http://localhost:3001 (admin/admin)
# Prometheus: http://localhost:9090

🔒 Security

Authentication Methods

  1. API Keys: Simple token-based authentication
  2. JWT Tokens: Stateless authentication with expiration
  3. OAuth2: Integration with external providers

Security Best Practices

  • All API endpoints require authentication
  • Rate limiting on all endpoints
  • Input validation and sanitization
  • SQL injection prevention
  • CORS configuration
  • Security headers
  • Audit logging

Security Scanning

# Static analysis (SAST) of the source tree
uv run bandit -r src/

# Dependency vulnerability scan
uv run safety check

🚀 Performance

Optimization Features

  • Async/Await: Non-blocking I/O operations
  • Connection Pooling: Efficient database connections
  • Caching: Redis-based caching with TTL
  • Rate Limiting: Prevent API abuse
  • Batch Processing: Efficient bulk operations
  • Lazy Loading: On-demand content loading
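
Rate limiting is commonly implemented with a token bucket. The sketch below is a generic in-memory version, not the AnyDocs implementation; the class name is hypothetical, and expressing the rate per second is an assumption, since this README does not state the unit of the rate_limit settings.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate` operations per second, with bursts up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the call may proceed."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=3, capacity=3)
    print([bucket.allow() for _ in range(5)])
```

The clock is injectable so the limiter can be tested deterministically without sleeping.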

Performance Monitoring

# Performance testing
uv run pytest tests/performance/

# Memory profiling
uv run python -m memory_profiler start.py

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Workflow

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Run the test suite
  6. Submit a pull request

Code Standards

  • Follow PEP 8 style guide
  • Use type hints
  • Write comprehensive tests
  • Document public APIs
  • Use meaningful commit messages

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ by funky1688
