Scraper Maintenance MCP
A Model Context Protocol (MCP) server for automating web scraper maintenance through browser inspection, selector generation, and code updates.
Project Structure
mcp/
├── src/                            # TypeScript source files
│   ├── server.ts                   # Main MCP server implementation
│   ├── browser-manager.ts          # Browser automation and management
│   ├── selector-generator.ts       # Selector generation and scoring
│   └── types.ts                    # Type definitions
├── dist/                           # Compiled JavaScript files
│   ├── server.js                   # Main MCP server (executable)
│   ├── browser-manager.js          # Browser automation
│   ├── selector-generator.js       # Selector intelligence
│   └── types.js                    # Type definitions
├── config/                         # Configuration files
│   ├── test-config.json            # Test configuration
│   ├── claude-desktop-config.json  # Claude Desktop setup
│   └── *.json                      # Various scraper configurations
├── examples/                       # Usage examples and documentation
├── docs/                           # Documentation files
├── scripts/                        # Build and utility scripts
├── package.json                    # Project configuration
└── tsconfig.json                   # TypeScript configuration
Quick Start
1. Install Dependencies
cd mcp
npm install
2. Build the Project
npm run build
3. Run the Server
npm start
Available MCP Tools
Configuration Management
load_scraper_config - Load scraper configuration files
update_config - Update configurations with new selector mappings
Browser Operations
initialize_browser - Launch browser (headless/visible mode)
navigate_to_page - Navigate to target URLs
take_screenshot - Capture debugging screenshots
close_browser - Clean up browser resources
Element Inspection
inspect_field_manually - Interactive visual element selection
auto_detect_field - AI-powered automatic element detection
validate_selectors - Test selector reliability and performance
generate_selectors - Create multiple selector variations with scoring
test_extraction - Test data extraction using current selectors
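To make "selector variations with scoring" more concrete, here is a minimal sketch of how candidate selectors could be ranked. It is an illustration only and does not reflect the logic in selector-generator.ts: the SelectorCandidate shape, the scoreSelector and rankSelectors names, and every weight are assumptions chosen for the example.

```typescript
// Hypothetical sketch: the candidate shape and all weights are assumptions,
// not the scoring used by this server.
interface SelectorCandidate {
  selector: string;
  matchCount: number; // number of elements the selector matched on the page
}

// Returns a score in [0, 1]. Unique matches and stable-looking attributes
// raise the score; positional and deeply nested selectors lower it.
function scoreSelector(c: SelectorCandidate): number {
  if (c.matchCount === 0) return 0; // selector no longer matches anything
  let score = c.matchCount === 1 ? 0.5 : 0.2;
  if (/^#[\w-]+$/.test(c.selector)) score += 0.4; // bare id
  else if (/\[data-[\w-]+/.test(c.selector)) score += 0.3; // data-* attribute
  else if (/\.[\w-]+/.test(c.selector)) score += 0.2; // class name
  if (/:nth-(child|of-type)\(/.test(c.selector)) score -= 0.2; // positional
  const depth = c.selector.split(/\s*>\s*|\s+/).length; // combinator steps
  if (depth > 3) score -= 0.1;
  return Math.min(1, Math.max(0, score));
}

// Sorts a copy of the candidates from highest to lowest score.
function rankSelectors(cs: SelectorCandidate[]): SelectorCandidate[] {
  return [...cs].sort((a, b) => scoreSelector(b) - scoreSelector(a));
}
```

With these assumed weights, a unique id selector such as "#price" ranks above a long positional chain, and a selector that matches nothing scores zero, which is the kind of signal a maintenance check could use to flag broken fields.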
Maintenance & Code Generation
run_maintenance_check - Comprehensive scraper health analysis
generate_extractor_code - Multi-language code generation
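MCP clients invoke tools such as these by sending JSON-RPC 2.0 requests with the method "tools/call". The sketch below builds such requests for a short browser session. The tool names come from the list above, but the argument names (headless, url) are hypothetical, since this README does not document each tool's input schema. A real client would also complete the MCP initialize handshake before sending any of these.

```typescript
// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names below are assumptions for illustration only.
const session: ToolCallRequest[] = [
  buildToolCall(1, "initialize_browser", { headless: true }),
  buildToolCall(2, "navigate_to_page", { url: "https://example.com/products" }),
  buildToolCall(3, "test_extraction"),
  buildToolCall(4, "close_browser"),
];

// Print one request per line, as a client might write them to the server.
for (const request of session) {
  console.log(JSON.stringify(request));
}
```

In practice Claude Desktop or Cursor constructs these requests for you; the sketch is only meant to show the order in which the browser tools would typically be used.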
Usage
For Claude Desktop
Add to your Claude Desktop configuration:
{
  "mcpServers": {
    "scraper-maintenance": {
      "command": "node",
      "args": ["/path/to/mcp/dist/server.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}
For Cursor
Add to your Cursor MCP configuration:
{
  "mcpServers": {
    "scraper-maintenance": {
      "command": "node",
      "args": ["/path/to/mcp/dist/server.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}
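Both clients take the same configuration entry, so it can be generated once. The sketch below builds that entry and rejects relative paths, on the assumption that a client starts the server from its own working directory and so may not resolve a relative path to dist/server.js. The buildClientConfig and isAbsolutePath helpers are hypothetical and not part of this project.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Accepts POSIX ("/home/...") and Windows ("C:\\...") absolute paths.
function isAbsolutePath(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

// Builds the "mcpServers" block shown above for a given server.js location.
function buildClientConfig(
  serverPath: string,
): { mcpServers: Record<string, McpServerEntry> } {
  if (!isAbsolutePath(serverPath)) {
    throw new Error(`Expected an absolute path to dist/server.js, got: ${serverPath}`);
  }
  return {
    mcpServers: {
      "scraper-maintenance": {
        command: "node",
        args: [serverPath],
        env: { NODE_ENV: "production" },
      },
    },
  };
}

// Replace the placeholder with the real location of your checkout.
console.log(JSON.stringify(buildClientConfig("/path/to/mcp/dist/server.js"), null, 2));
```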
Development
Build
npm run build
Development Mode
npm run dev
Test
npm test
Documentation