mcp-ai-detection

Enables multi-tier AI-detection screening on academic papers by extracting text from .tex and .docx files, splitting into standard sections, and running a pipeline of statistical and LLM-based analysis.

Open-source, MIT-licensed MCP server for multi-tier AI-detection screening of academic papers. It accepts .tex and .docx files, extracts clean text, splits the text into standard paper sections, and runs a three-tier risk pipeline.

AI detection is screening, not proof. Reports include limits, threats to validity, and a final recommendation framed as decision support.

Features

MCP tools: extract_text, split_sections, full_pipeline
Input: LaTeX .tex and Word .docx
Text extraction: Pandoc for LaTeX when installed, robust fallback cleaner, python-docx for Word
Narrative/structured split: tables, formulas, captions, references, keyword lines, markdown tables, and dense math lines are excluded from the main authorship score
Section splitting: Abstract, Introduction, Methods, Results, Discussion, Conclusion
Tier 1 offline: burstiness, lexical diversity, AI-like connectives, n-gram repetition, sentence-length variance, repeated patterns, hedging, example density
Optional Tier 1 local LLM through Ollama with gemma4:e4b by default
Tier 2 local Gemma adjudicator through Ollama: rubric-based JSON screening calibrated with Tier 1 metrics, no paid API keys
Tier 3 open-source ensemble hooks: DetectGPT, Fast-DetectGPT, NPR command adapters plus built-in proxy analysis for repetition, lexical diversity, and semantic coherence
JSON and Markdown reports with executive summary, section breakdown, section x tier score table, narrative score, structured-content diagnostic, limits, and recommendation
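
To make two of the Tier 1 offline signals concrete, here is a minimal sketch of burstiness and lexical diversity. The definitions used (coefficient of variation of sentence lengths, and type-token ratio) are common choices and are assumptions here, not necessarily the formulas this server implements; the function names are hypothetical.

```python
import re
import statistics


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths, measured in words.

    Assumed definition: population standard deviation divided by the mean.
    """
    lengths = [len(sentence.split()) for sentence in split_sentences(text)]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


def type_token_ratio(text: str) -> float:
    """Unique words divided by total words, case-insensitive."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


sample = "The cat sat. The cat sat on the mat today."
print(burstiness(sample), type_token_ratio(sample))
```

Under these definitions, very uniform sentence lengths give a burstiness near zero. How such values map to a risk score is left to the server.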

Install

python -m pip install -e .

Pandoc is optional but recommended for LaTeX:

# macOS
brew install pandoc

# Ubuntu/Debian
sudo apt-get install pandoc

MCP server

Run with stdio transport:

python -m mcp_ai_detection.server

Example MCP client config:

{
  "mcpServers": {
    "ai-detection": {
      "command": "python",
      "args": ["-m", "mcp_ai_detection.server"],
      "env": {
        "LOCAL_LLM_MODEL": "gemma4:e4b"
      }
    }
  }
}

Tools

`extract_text`

{
  "file_path": "paper.tex",
  "prefer_pandoc": true
}

Returns clean text, word count, extractor used, and warnings.

`split_sections`

{
  "text": "Abstract\n...\nIntroduction\n..."
}

Returns detected standard sections with line ranges and word counts.
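
As an illustration of what heading-based section detection with line ranges and word counts can look like, here is a small sketch. The heading regex and the result field names (name, start_line, end_line, word_count) are assumptions made for this example and may differ from the tool's actual output schema.

```python
import re

SECTION_NAMES = [
    "Abstract", "Introduction", "Methods", "Results", "Discussion", "Conclusion",
]
# A heading is a line holding only a section name, optionally numbered ("2. Methods").
HEADING_RE = re.compile(
    r"^\s*(?:\d+\.?\s+)?(" + "|".join(SECTION_NAMES) + r")\s*$",
    re.IGNORECASE,
)


def split_sections_sketch(text: str) -> list:
    """Return sections with 1-based inclusive line ranges and body word counts."""
    lines = text.splitlines()
    sections = []
    for number, line in enumerate(lines, start=1):
        match = HEADING_RE.match(line)
        if match:
            sections.append({"name": match.group(1).title(), "start_line": number})
    for position, section in enumerate(sections):
        if position + 1 < len(sections):
            end_line = sections[position + 1]["start_line"] - 1
        else:
            end_line = len(lines)
        section["end_line"] = end_line
        # Slice starts after the heading line; the heading is not counted.
        body = lines[section["start_line"]:end_line]
        section["word_count"] = sum(len(body_line.split()) for body_line in body)
    return sections


print(split_sections_sketch("Abstract\nWe study things.\nIntroduction\nPrior work exists here."))
```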

`full_pipeline`

{
  "file_path": "paper.docx",
  "use_llm": false,
  "tier2_provider": "gemma-local",
  "early_stop": true
}

Runs extraction, sectioning, Tier 1 statistics, conditional Tier 2 Gemma/Ollama, conditional Tier 3, then returns report_json and report_markdown.
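
For readers wiring the server up without a client library, the arguments above travel inside a JSON-RPC 2.0 tools/call request, as defined by the Model Context Protocol. The sketch below only builds that message; a real client must also perform the MCP initialize handshake first and frame messages for the stdio transport. The helper name build_tool_call is hypothetical.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = build_tool_call(
    1,
    "full_pipeline",
    {
        "file_path": "paper.docx",
        "use_llm": False,
        "tier2_provider": "gemma-local",
        "early_stop": True,
    },
)
print(request)
```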

CLI

python -m mcp_ai_detection.cli paper.tex --markdown report.md --json report.json

Configuration

Environment variables:

LOCAL_LLM_MODEL=gemma4:e4b
OLLAMA_HOST=http://localhost:11434
OLLAMA_KEEP_ALIVE=30m
HTTP_TIMEOUT_SECONDS=120
TIER1_LLM_WEIGHT=0.6
TIER1_STATS_WEIGHT=0.4

DETECTGPT_CMD=
FAST_DETECTGPT_CMD=
NPR_CMD=
METHODS_WEIGHT_REDUCTION=0.75
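
One plausible reading of TIER1_LLM_WEIGHT and TIER1_STATS_WEIGHT is a weighted average of the two Tier 1 scores. The sketch below shows that reading; the combination formula, the fallback when no LLM score is available, and the helper names are assumptions, not the server's confirmed behavior.

```python
import os
from typing import Optional


def env_float(name: str, default: float) -> float:
    """Read a float from the environment, falling back when unset or empty."""
    raw = os.environ.get(name, "").strip()
    return float(raw) if raw else default


def combine_tier1(stats_score: float, llm_score: Optional[float]) -> float:
    """Weighted average of statistical and LLM scores (assumed formula)."""
    if llm_score is None:
        # No local LLM result: rely on the statistical score alone.
        return stats_score
    llm_weight = env_float("TIER1_LLM_WEIGHT", 0.6)
    stats_weight = env_float("TIER1_STATS_WEIGHT", 0.4)
    total = llm_weight + stats_weight
    if total == 0:
        return stats_score
    return (llm_weight * llm_score + stats_weight * stats_score) / total
```

Dividing by the sum of the weights keeps the result in the same range as the inputs even if the two weights do not add up to 1.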

Tier 2 uses the local Ollama model named by LOCAL_LLM_MODEL. Recommended:

ollama pull gemma4:e4b
ollama serve

Check that Ollama is using the GPU:

ollama ps

The PROCESSOR column should show 100% GPU for loaded models.

External Tier 3 commands receive section text on stdin and should return JSON:

{
  "score": 0.72,
  "confidence": 0.64,
  "details": {
    "model": "your-detector"
  }
}

If commands are not configured, built-in proxy scorers keep the pipeline fully offline and deterministic.
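
A minimal sketch of an external Tier 3 adapter following the contract above: it reads section text from stdin and emits JSON with score, confidence, and details. Writing the JSON to stdout is an assumption, and the scoring logic is a toy repetition ratio standing in for a real detector such as DetectGPT.

```python
import json
import sys


def score_text(text: str) -> dict:
    """Toy scorer: share of repeated words stands in for a real detector."""
    words = text.lower().split()
    repetition = 1 - len(set(words)) / len(words) if words else 0.0
    return {
        "score": round(repetition, 4),
        "confidence": 0.5 if words else 0.0,  # fixed placeholder confidence
        "details": {"model": "toy-repetition-proxy", "words": len(words)},
    }


def main() -> None:
    # Section text arrives on stdin; the JSON result goes to stdout (assumed).
    json.dump(score_text(sys.stdin.read()), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as my_detector.py, the script could be referenced as DETECTGPT_CMD="python my_detector.py". The exact way the server invokes the command is not specified here.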

Thresholds

< 0.3: low
0.3 to < 0.6: medium
>= 0.6: high
Tier 2 early stop: probability < 0.4
Sections with fewer than 80 narrative words are marked insufficient_evidence and excluded from the document-level narrative score

Methods sections get reduced Tier 3 weight by default to lower false positives from formulaic scientific prose.
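
The threshold rules in this section can be sketched as follows. The risk bands, the 0.4 early-stop cutoff, and the 80-word minimum come from the list above. The word-weighted mean used for the document-level score and the field names narrative_words and score are assumptions for illustration.

```python
from typing import Optional

MIN_NARRATIVE_WORDS = 80
TIER2_EARLY_STOP = 0.4


def risk_level(score: float) -> str:
    """Map a score in [0, 1] to a risk band."""
    if score < 0.3:
        return "low"
    if score < 0.6:
        return "medium"
    return "high"


def stop_after_tier2(probability: float) -> bool:
    """Early stop: skip Tier 3 when the Tier 2 probability is below 0.4."""
    return probability < TIER2_EARLY_STOP


def document_narrative_score(sections: list) -> Optional[float]:
    """Word-weighted mean over sections with enough narrative text.

    The weighting scheme is an assumption; short sections are excluded.
    """
    usable = [s for s in sections if s["narrative_words"] >= MIN_NARRATIVE_WORDS]
    total_words = sum(s["narrative_words"] for s in usable)
    if total_words == 0:
        return None  # every section was insufficient_evidence
    weighted = sum(s["score"] * s["narrative_words"] for s in usable)
    return weighted / total_words
```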

Development

Run offline tests:

python -m unittest discover -s tests

Run lint if dev extras are installed:

ruff check .

gemma3:4b is a smaller fallback for slower machines:

LOCAL_LLM_MODEL=gemma3:4b
