LeadsClean MCP Server

An open-source MCP server that extracts structured B2B lead intelligence from company websites. Point it at any URL — get back a clean JSON object with company summary, buying signals, inferred needs, and personalised icebreaker lines.

Built as a reference implementation for MCP tool development. Demonstrates multi-provider LLM routing, dual-transport MCP serving, GDPR compliance patterns, and API key management — patterns you can reuse in your own MCP servers.

Works with Claude Desktop, Cursor, and any MCP-compatible client.


Tools

| Tool | Description |
| --- | --- |
| `extract_lead_intelligence` | Analyse a single company URL and return structured lead intel |
| `batch_extract_leads` | Analyse up to 20 URLs in parallel; designed for agent list-processing |
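
Because `batch_extract_leads` accepts at most 20 URLs per call, an agent holding a longer list has to split it first. The following is a minimal sketch of that splitting step, not part of LeadsClean itself. The helper names and the argument names `urls` and `seller_context` are assumptions; check the tool's input schema as reported by your MCP client.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 20  # limit stated in the tool description above


def chunk_urls(urls: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_batch_arguments(urls: List[str], seller_context: str) -> List[dict]:
    """Build one argument dict per batch_extract_leads call.

    The keys "urls" and "seller_context" are hypothetical argument names.
    """
    return [
        {"urls": batch, "seller_context": seller_context}
        for batch in chunk_urls(urls)
    ]
```

Each dict in the returned list would then be passed as the arguments of one `batch_extract_leads` call.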

Output schema

{
  "company_name": "Acme Hotels Group",
  "core_business_summary": "Boutique hotel chain with 12 properties across Europe.",
  "product_category_match": "Strong match — hotel groups purchase furniture in bulk for room refits.",
  "recent_company_trigger": "Announced expansion to 3 new cities in Q1 2026, adding 400+ rooms.",
  "inferred_business_need": "Bulk furnishing for new hotel rooms on tight fit-out timelines.",
  "icebreaker_hook_business": "Running 12 properties across Europe is impressive — furnishing them at scale is where we help.",
  "icebreaker_hook_news": "Saw the Q1 expansion news — we help hotel groups source wholesale beds and sofas fast.",
  "data_provenance": {
    "source_url": "https://acmehotels.com",
    "source_type": "public_website",
    "collection_method": "jina_reader_public_fetch",
    "contains_pii": false,
    "gdpr_basis": "legitimate_interest",
    "gdpr_notes": "Extracted solely from publicly available company web pages. No personal data collected. Compliant with GDPR Art. 6(1)(f)."
  }
}

Every response includes data_provenance — a machine-readable GDPR metadata block indicating data source, PII status, and legal basis.
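
A consumer can use the provenance block to decide what to keep. The sketch below parses a response, checks for the fields shown in the example above, and applies a storage policy based on `data_provenance`. It assumes every field in the example is always present, and the policy in `is_safe_to_store` is a hypothetical one for illustration.

```python
import json
from typing import Any, Dict

# Field names taken from the example response above.
REQUIRED_FIELDS = (
    "company_name",
    "core_business_summary",
    "product_category_match",
    "recent_company_trigger",
    "inferred_business_need",
    "icebreaker_hook_business",
    "icebreaker_hook_news",
    "data_provenance",
)


def parse_lead(raw: str) -> Dict[str, Any]:
    """Parse a tool response and verify that the expected fields exist."""
    lead = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in lead]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return lead


def is_safe_to_store(lead: Dict[str, Any]) -> bool:
    """Hypothetical policy: keep only leads flagged as PII-free public data."""
    provenance = lead["data_provenance"]
    return (
        provenance.get("contains_pii") is False
        and provenance.get("source_type") == "public_website"
    )
```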


Quick start

Prerequisites

Install

pip install mcp-leadsclean

Or clone and install from source:

git clone https://github.com/edition/leadsclean
cd leadsclean
pip install -e .

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "leadsclean": {
      "command": "mcp-leadsclean",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Set the key for whichever provider(s) you use (see Environment variables).

Cursor

Add to your Cursor MCP config (~/.cursor/mcp.json):

{
  "mcpServers": {
    "leadsclean": {
      "command": "mcp-leadsclean",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

HTTP transport (production agent pipelines)

For remote agents or multi-tenant deployments, run with Streamable HTTP transport:

OPENAI_API_KEY=sk-... mcp-leadsclean --transport http --port 8001

The server exposes a single MCP endpoint at http://localhost:8001/mcp.
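
For orientation, this is roughly what a tool invocation looks like on the wire: MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch only builds the request and does not send it. A real session starts with an `initialize` exchange, which an MCP client library normally handles, so treat this as an illustration of the message shape. The argument name `url` in the usage line is an assumption.

```python
import json
import urllib.request

MCP_ENDPOINT = "http://localhost:8001/mcp"  # endpoint from the section above


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 `tools/call` request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept either a JSON or an SSE response.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# "url" is a hypothetical argument name; consult the tool's input schema.
request = build_tool_call("extract_lead_intelligence", {"url": "https://acmehotels.com"})
```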


Demo mode

Try the server without an API key — useful for testing your agent pipeline or reviewing the output schema:

LEADSCLEAN_DEMO=1 mcp-leadsclean

All tool calls return a sanitised fixture response when LEADSCLEAN_DEMO=1 is set. The response includes "_demo": true so agents can detect and discard it.
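
On the consuming side, the `_demo` marker can be checked before a result enters a real pipeline. A minimal sketch, with hypothetical helper names:

```python
from typing import Any, Dict, Iterable, List


def is_demo(response: Dict[str, Any]) -> bool:
    """Return True when the response carries the "_demo": true marker."""
    return response.get("_demo") is True


def drop_demo_responses(responses: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only responses that were not produced in demo mode."""
    return [response for response in responses if not is_demo(response)]
```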


Environment variables

The model parameter of a tool call controls which provider is used. The provider is inferred from the model-name prefix, so set the API key that matches:

| Variable | Required when | Model prefix | Description |
| --- | --- | --- | --- |
| `OPENAI_API_KEY` | Using OpenAI (default) | `gpt-*`, `o1-*`, `o3-*` | OpenAI API key |
| `ANTHROPIC_API_KEY` | Using Claude | `claude-*` | Anthropic API key |
| `DASHSCOPE_API_KEY` | Using Alibaba Qwen | `qwen-*` | Alibaba DashScope API key |
| `MINIMAX_API_KEY` | Using MiniMax | `abab*`, `minimax-*` | MiniMax API key |
| `LEADSCLEAN_DEMO` | | | Set to 1 to return fixture data without any LLM call |

The default model is gpt-4o-mini (OpenAI). To switch provider, pass the desired model ID in the tool call — e.g. claude-3-5-haiku-20241022 for Anthropic, qwen-turbo for Alibaba.
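
The prefix rule can be pictured as a small lookup. The sketch below is not the code in core.py; it only reproduces the table above as glob patterns. The provider labels and function names are hypothetical, and matching is case-sensitive here, which the actual implementation may or may not be.

```python
import os
from fnmatch import fnmatchcase
from typing import Mapping, Tuple

# (glob pattern, provider label, environment variable), copied from the table above.
ROUTES: Tuple[Tuple[str, str, str], ...] = (
    ("gpt-*", "openai", "OPENAI_API_KEY"),
    ("o1-*", "openai", "OPENAI_API_KEY"),
    ("o3-*", "openai", "OPENAI_API_KEY"),
    ("claude-*", "anthropic", "ANTHROPIC_API_KEY"),
    ("qwen-*", "qwen", "DASHSCOPE_API_KEY"),
    ("abab*", "minimax", "MINIMAX_API_KEY"),
    ("minimax-*", "minimax", "MINIMAX_API_KEY"),
)

DEFAULT_MODEL = "gpt-4o-mini"


def resolve_provider(model: str = DEFAULT_MODEL) -> Tuple[str, str]:
    """Return (provider label, required environment variable) for a model ID."""
    for pattern, provider, env_var in ROUTES:
        if fnmatchcase(model, pattern):
            return provider, env_var
    raise ValueError(f"no provider is registered for model {model!r}")


def require_api_key(model: str, env: Mapping[str, str] = os.environ) -> str:
    """Fetch the key for the model's provider, failing early if it is unset."""
    provider, env_var = resolve_provider(model)
    key = env.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} must be set to use {provider} model {model!r}")
    return key
```

Failing early on a missing key gives a clearer error than letting the provider reject the request.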


REST API

A standard FastAPI REST endpoint is also available for non-MCP integrations:

uvicorn main:app --reload
curl -X POST http://localhost:8000/extract-leads \
  -H "Content-Type: application/json" \
  -d '{
    "target_url": "https://acmecorp.com",
    "seller_context": "We provide cloud HR software to mid-size logistics companies."
  }'
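
The same call can be made from Python with only the standard library. This sketch mirrors the curl command above, so the endpoint and the field names `target_url` and `seller_context` come from it. The function names are hypothetical, and a deployment that enforces API keys may need headers that are not shown here.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/extract-leads"  # same endpoint as the curl call


def build_extract_request(target_url: str, seller_context: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = {"target_url": target_url, "seller_context": seller_context}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_leads(target_url: str, seller_context: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_extract_request(target_url, seller_context)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```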

Reusable patterns

This project demonstrates several patterns worth extracting for your own MCP servers:

| Pattern | Where | What it does |
| --- | --- | --- |
| Multi-provider LLM routing | `core.py` | Dispatches to OpenAI / Anthropic / Qwen / MiniMax based on model name prefix |
| Dual-transport MCP serving | `mcp_server.py` | Same tool logic served over stdio (local) and HTTP (remote) |
| SSRF protection | `core.py` | Validates URLs against private IP ranges before external fetch |
| Prompt injection mitigation | `core.py` | XML boundary tags around user-controlled content in LLM prompts |
| API key hashing | `db.py` | SHA-256 hashing with prefix display; keys are never stored in plain text |
| Usage metering | `db.py` + `auth.py` | Per-key monthly quotas with auto-reset and atomic increment |
| GDPR provenance | `core.py` | Machine-readable compliance metadata on every response |
| Demo mode | `core.py` + `auth.py` | Full bypass of external services for pipeline testing |
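
To make three of these patterns concrete (SSRF protection, API key hashing, and XML boundary tags), here is a minimal standard-library sketch. It illustrates the techniques and is not the code in core.py or db.py. All function names, the `page_content` tag, and the 8-character display prefix are assumptions.

```python
import hashlib
import ipaddress
import socket
from typing import Tuple
from urllib.parse import urlparse


def is_public_url(url: str) -> bool:
    """Return True only if every address the host resolves to is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        # Drop any IPv6 scope suffix ("%eth0") before parsing the address.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        if (address.is_private or address.is_loopback or address.is_link_local
                or address.is_multicast or address.is_reserved
                or address.is_unspecified):
            return False
    return bool(infos)


def hash_api_key(key: str, prefix_length: int = 8) -> Tuple[str, str]:
    """Return (SHA-256 hex digest to store, short prefix to display)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return digest, key[:prefix_length]


def wrap_untrusted(content: str, tag: str = "page_content") -> str:
    """Fence user-controlled text in XML-style tags before prompting."""
    closing = f"</{tag}>"
    # Remove embedded closing tags so the content cannot end the boundary early.
    cleaned = content.replace(closing, "")
    return f"<{tag}>\n{cleaned}\n{closing}"
```

Note that checking the resolved addresses before the fetch does not by itself protect against a hostname that resolves differently at fetch time; pinning the validated address for the actual request is one way to close that gap.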

Development

# Install dependencies
pip install -r requirements.txt

# Run MCP server (stdio)
python mcp_server.py

# Run MCP server (HTTP, port 8001)
python mcp_server.py --transport http

# Run REST API
uvicorn main:app --reload

How it works

  1. Fetch — retrieves clean Markdown from the target URL via Jina Reader
  2. Extract — passes the content to your chosen LLM (OpenAI, Anthropic Claude, Alibaba Qwen, or MiniMax) with a structured prompt
  3. Return — outputs a JSON object matching the schema above
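
The three steps above can be sketched as one function. This is a simplified illustration and not the code in core.py: the fetcher and the LLM call are passed in as plain callables so the flow can be exercised offline, the prompt text is a placeholder, and all names are hypothetical. The only external fact relied on is that Jina Reader serves a page as Markdown when the page URL is appended to `https://r.jina.ai/`.

```python
import json
from typing import Callable, Dict

JINA_READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Jina Reader URL for a target page."""
    return JINA_READER_PREFIX + target_url


def extract_lead(
    target_url: str,
    seller_context: str,
    fetch: Callable[[str], str],
    complete: Callable[[str], str],
) -> Dict[str, object]:
    """Run fetch -> extract -> return with injected I/O functions."""
    # 1. Fetch: get the page as Markdown through Jina Reader.
    markdown = fetch(reader_url(target_url))
    # 2. Extract: ask the LLM for JSON (placeholder prompt wording).
    prompt = (
        "Return a JSON object describing this company as a sales lead.\n"
        f"Seller context: {seller_context}\n"
        f"<page_content>\n{markdown}\n</page_content>"
    )
    raw = complete(prompt)
    # 3. Return: hand the parsed object back to the caller.
    return json.loads(raw)
```

Injecting `fetch` and `complete` keeps the pipeline testable with stubs and lets the provider-specific client be chosen elsewhere.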

LeadsClean does not store page content. The fetched Markdown is passed to the selected LLM provider, and the result is returned within the same request.


Built with Claude

This project was developed with the assistance of Claude by Anthropic — an AI assistant used for code generation, architecture design, and documentation.


License

MIT
