TradeMCP

An MCP server that makes trade documents machine readable using Docling for local, deterministic document extraction and search, with no AI required.

<div align="center"> <img src="logo.svg" alt="TradeMCP Logo" width="200" />

TradeMCP

Make trade documents machine readable.

Open-source MCP server for document workflow simplification. </div>

---

Built on IBM's Docling for powerful document extraction.

🚀 Works Out of the Box

No AI needed for document processing. No API keys. No cloud dependencies. Works with any AI platform.

This MCP server runs 100% locally using Docling for document extraction - no AI required for the actual processing. Connect it to any MCP-compatible AI assistant:

  • Claude Desktop by Anthropic
  • Microsoft Copilot via Copilot Studio
  • ChatGPT with MCP support
  • Any MCP-compatible client (growing ecosystem)

🔌 Vendor & Model Agnostic

One engine, any AI platform.

The Model Context Protocol (MCP) is an open standard. This means:

  • ✅ Not locked to Anthropic or Claude
  • ✅ Works with Microsoft Copilot Studio
  • ✅ Compatible with any MCP implementation
  • ✅ Future-proof as more platforms adopt MCP
  • ✅ Use with GPT, Gemini, Llama, or any model

Your document infrastructure shouldn't depend on a single AI vendor. TradeMCP ensures it doesn't.

🏗️ The Engine, Not the Brain

TradeMCP is the engine that enables document operations:

  • Powered by Docling: IBM Research's document parser (no AI needed)
  • Deterministic Processing: Same document = same output every time
  • MCP Native: Works with any MCP-compatible client
  • Zero Configuration: Install and run — no setup required
  • 100% Local: Your documents never leave your machine

The brain (workflow intelligence, trade expertise, compliance logic) can come from any AI model or commercial solution - but the engine runs without any AI.

🏗️ Modular Architecture

All components are modular and replaceable. Docling can be replaced with domain-specific tools or services tailored to your exact document processing needs.

📦 Installation & Setup

Prerequisites

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Download the local AI model (first time only):

    python download_models.py
    

    This downloads a small, efficient AI model that runs 100% locally on your machine.

🧠 About the Local AI Model

What is it?

  • Model: sentence-transformers/all-MiniLM-L6-v2
  • Size: ~87MB (small and efficient)
  • Type: Local embedding model - runs entirely on your CPU/GPU
  • Privacy: 100% local - no data sent to external servers
  • No API keys: No OpenAI, Anthropic, or cloud service needed

What does it do?

This local model provides intelligent document understanding:

  • Semantic Search: Find documents by meaning, not just keywords
  • Document Similarity: Identify related trade documents automatically
  • Smart Categorization: Automatically group similar documents
  • Context Understanding: Understand relationships between different parts of documents

Why a local model?

  • ✅ Complete Privacy: Your sensitive trade documents never leave your machine
  • ✅ No API Costs: No usage fees or rate limits
  • ✅ Offline Operation: Works without internet connection
  • ✅ Fast Processing: No network latency, instant results
  • ✅ Predictable Performance: Same results every time, no service degradation

Note on Model Storage

The model files are stored in model_cache/ (excluded from git to keep the repository lightweight). They persist between sessions - you only download once.

📚 For technical details about the model, see MODEL_INFO.md

Docling by IBM Research (Default Parser)

This project leverages Docling, IBM's advanced document conversion technology:

  • Rule-based extraction - No AI/ML required
  • Extracts text, tables, and structure from PDFs, DOCX, XLSX, PPTX, and more
  • Maintains document layout and formatting intelligence
  • Handles complex multi-column layouts and embedded tables
  • Open-source (MIT licensed) and actively maintained
  • Easily replaceable with custom parsers for specific document types

Model Context Protocol (MCP)

Open standard for AI-tool communication:

  • Works with Claude Desktop (Anthropic)
  • Compatible with Copilot Studio (Microsoft)
  • Supports ChatGPT with MCP integration
  • Supports any MCP client implementation
  • Vendor-neutral protocol specification

🎯 What This Is (And Isn't)

This IS:

  • ✅ A vendor-agnostic MCP server with 14 document tools
  • ✅ Deterministic document extraction via Docling (no AI)
  • ✅ Full-text and semantic search capabilities
  • ✅ Production-ready document processing engine
  • ✅ 100% local, offline-capable, no external dependencies

This IS NOT:

  • ❌ An AI-powered document processor (it's deterministic)
  • ❌ Tied to any specific AI vendor
  • ❌ An AI model (it's infrastructure for any AI)
  • ❌ A complete workflow automation solution
  • ❌ The commercial Kansofy product

⚡ Quick Start

With Claude Desktop

Add the following to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "trademcp": {
      "command": "python",
      "args": ["/path/to/trademcp/mcp_server.py"]
    }
  }
}
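
If the Claude Desktop config file already lists other servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library. The helper names `add_trademcp_entry` and `update_config_file` are illustrative and not part of TradeMCP, and the script path is the same placeholder used above.

```python
import json
from pathlib import Path


def add_trademcp_entry(config: dict, server_script: str) -> dict:
    """Return a copy of a Claude Desktop config with a "trademcp" entry added.

    Existing entries under "mcpServers" are kept; the input is not mutated.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["trademcp"] = {"command": "python", "args": [server_script]}
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_path: Path, server_script: str) -> None:
    """Read the config file if it exists, add the entry, and write it back."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    merged = add_trademcp_entry(config, server_script)
    config_path.write_text(json.dumps(merged, indent=2))
```

Calling `update_config_file(Path(...), "/path/to/trademcp/mcp_server.py")` with the config path shown above would write the merged result; restart Claude Desktop afterwards so it picks up the change.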

With Microsoft Copilot Studio

# Configure in Copilot Studio as external tool
# Point to the MCP server endpoint
# Use the standard MCP protocol

With ChatGPT (MCP Support)

# Connect via MCP-compatible ChatGPT clients
# Point to the same MCP server endpoint
# Standard MCP protocol compatibility

With Any MCP Client

# Start the MCP server
python mcp_server.py

# Connect any MCP-compatible client
# Server speaks standard MCP protocol

🛠️ What You Get

Document Processing (No AI Required)

# Docling extracts everything deterministically
upload_document("complex_invoice.pdf")
# ✓ Text extracted (rule-based)
# ✓ Tables preserved (pattern matching)
# ✓ Structure maintained (document parsing)
# ✓ Same input = same output every time

Intelligent Search (Still No AI)

# Full-text search with SQLite FTS5
search_documents("payment terms net 30")

# Semantic similarity with pre-computed embeddings
vector_search("documents about shipping delays")

# Find duplicates using hashing
find_duplicates()
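
The README does not say which hashing scheme `find_duplicates` uses. As an illustration of the general idea only, here is a minimal sketch that groups byte-identical files by their SHA-256 digest. The function names and the in-memory `documents` mapping are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def find_duplicate_groups(documents: dict[str, bytes]) -> list[list[str]]:
    """Group document names whose contents hash to the same digest.

    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, data in documents.items():
        groups[content_hash(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

A content hash of this kind catches exact copies only. Two scans of the same invoice would hash differently, which is where the embedding-based similarity search is the better fit.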

MCP Tools for Any AI Assistant

All 14 tools work instantly with any MCP client:

  • upload_document - Process any document format (no AI)
  • search_documents - Lightning-fast full-text search (SQL)
  • vector_search - Find similar documents (embeddings)
  • get_document_tables - Extract tables from PDFs (Docling)
  • [... and 10 more tools]
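
MCP clients invoke tools with JSON-RPC 2.0 messages, using the `tools/call` method with a tool name and an arguments object. The sketch below builds such a request for `search_documents`. The argument name `query` is an assumption, since this README does not list the tools' parameter schemas, and `make_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# "query" is an assumed parameter name, used for illustration only.
message = make_tool_call("search_documents", {"query": "payment terms net 30"})
```

In practice an MCP client library handles this framing, along with the initialization handshake that precedes any tool call.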

🧠 The Brain Lives Elsewhere

This engine provides the infrastructure (no AI). The intelligence comes from:

Your AI Assistant (Claude, Copilot, etc.)

The AI provides the intelligence to:

  • Understand your intent
  • Orchestrate document operations
  • Make decisions based on content
  • Generate insights and summaries

Your Own Implementation

Build your own workflows on top:

  • Custom document classification
  • Business rule validation
  • Workflow orchestration
  • Integration patterns

Commercial Solutions

Production-ready intelligence:

  • Kansofy Trade Cloud: Full SaaS with trade workflows
  • Kansofy Enterprise: Self-hosted with compliance engine
  • Professional Services: Custom workflow development

📊 How It Works Without AI

Document Processing Pipeline

PDF/DOCX → Docling (rule-based) → Structured Data → SQLite
         ↓
    No AI needed
    Deterministic
    100% reproducible
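
The last stage of the pipeline above stores extracted text in SQLite, where FTS5 can index it. Here is a minimal sketch of that stage using the standard-library `sqlite3` module. The table name `docs`, its columns, and both helper functions are illustrative assumptions, not TradeMCP's actual schema. It also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3


def build_index(documents: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load (name, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(name, body)")
    conn.executemany(
        "INSERT INTO docs (name, body) VALUES (?, ?)", documents.items()
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return names of documents matching every term in the query, best first."""
    rows = conn.execute(
        "SELECT name FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [name for (name,) in rows]
```

Because the query is plain SQL over an index, the same query against the same data returns the same rows. Queries containing FTS5 operator characters such as quotes or colons would need escaping before being passed to `MATCH`.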

Search Pipeline

Query → FTS5 (SQL) → Results
      → Embeddings (pre-computed) → Similarity
      
No AI inference at search time
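
The similarity branch of the search pipeline amounts to comparing vectors. A minimal sketch of ranking stored document embeddings by cosine similarity, in pure Python, follows. The three-dimensional vectors and function names are made up for the example; real all-MiniLM-L6-v2 embeddings have 384 dimensions. The query vector is supplied by hand here, because how TradeMCP obtains it is not described in this README.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: list[float], doc_vecs: dict[str, list[float]], top_k: int = 3
) -> list[tuple[str, float]]:
    """Score every stored embedding against the query and keep the top_k."""
    scored = [
        (name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

A linear scan like this is fine for small collections. Larger ones typically move to a vector index, which this sketch does not attempt.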

🌐 Platform Compatibility

| Platform | Status | Configuration |
| --- | --- | --- |
| Claude Desktop | ✅ Tested | Native support |
| Microsoft Copilot | ✅ Compatible | Via Copilot Studio |
| ChatGPT | ✅ Compatible | MCP integration |
| OpenAI GPTs | 🔄 Planned | MCP bridge needed |
| Google Gemini | 🔄 Planned | MCP adapter |
| Open Source LLMs | ✅ Ready | Any MCP client |

🤝 Why This Architecture Matters

No AI in the Engine Means:

  • Deterministic results (same input = same output)
  • No API costs for document processing
  • Works offline completely
  • No rate limits or quotas
  • Full data privacy (nothing leaves your machine)
  • Predictable performance

Any AI for the Brain Means:

  • Choose your preferred AI assistant
  • Switch providers without changing infrastructure
  • Use multiple AIs for different tasks
  • Future-proof as AI landscape evolves

📈 When You Need More

You'll know it's time for commercial solutions when:

  • [ ] Processing >100 documents daily
  • [ ] Need trade-specific workflows
  • [ ] Require compliance validation
  • [ ] Want pre-built intelligence
  • [ ] ROI justifies enterprise features

🔗 Technical Foundation

  • Document Processing: Docling by IBM Research (no AI)
  • Protocol: Model Context Protocol (open standard)
  • Search: SQLite with FTS5 extension (deterministic SQL)
  • Embeddings: Sentence-transformers (pre-computed, no inference)
  • Server: FastAPI + Python 3.9+ (standard web framework)

📚 Documentation

| Guide | Description |
| --- | --- |
| Installation | Complete setup guide |
| MCP Tools | All 14 tools documented |
| Architecture | System design & components |
| Usage Guide | Examples and workflows |
| Platform Compatibility | Multi-platform setup |
| Troubleshooting | Common issues and solutions |
| Contributing | How to contribute |

🙏 Acknowledgements


Built with ❤️ for the humans running global trade.
Making trade documents machine readable. The foundation for intelligent trade workflows.
