Discover Awesome MCP Servers

Extend your agent with 15,422 capabilities via MCP servers.

Fetch MCP Server

Okay, here's a breakdown of how you can fetch URLs from a webpage using Playwright, integrate it with an SSE (Server-Sent Events) MCP (Management Control Protocol) server, and use Node.js with Express.js to manage the whole process. I'll provide code snippets and explanations to guide you.

**Conceptual Overview**

1. **Playwright (Web Scraping):** Playwright will be used to launch a browser, navigate to the target webpage, and extract the URLs you need.
2. **Node.js with Express.js (Server):** Express.js will create a web server that handles requests to start the scraping process and stream the results back to the client using Server-Sent Events (SSE).
3. **SSE (Server-Sent Events):** SSE is a one-way communication protocol where the server pushes updates to the client in real time. This is ideal for streaming the URLs as they are discovered.
4. **MCP (Management Control Protocol):** The MCP part is a bit vague without more context. I'll assume it means you want to control and monitor the scraping process, so I'll include basic mechanisms for starting/stopping the scraping and reporting status. If you have specific MCP requirements (e.g., a particular protocol or data format), please provide more details.

**Code Structure**

The code is broken into three main parts:

* **`server.js` (Node.js/Express.js Server):** Handles requests, starts Playwright, and streams URLs via SSE.
* **`playwright_scraper.js` (Playwright Script):** The core logic for scraping URLs from a webpage.
* **`client.html` (Simple Client):** A basic HTML page to connect to the SSE stream and display the URLs.

**1. `server.js` (Node.js/Express.js Server)**

```javascript
const express = require('express');
const { spawn } = require('child_process'); // For running the Playwright script
const cors = require('cors'); // Import the cors middleware

const app = express();
const port = 3000;

app.use(cors()); // Enable CORS for all routes

let scraperProcess = null; // Store the Playwright process
let sseClients = [];       // Array to hold connected SSE clients

// SSE setup
app.get('/events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  const clientId = Date.now();
  sseClients.push({ id: clientId, res });
  console.log(`Client ${clientId} connected`);

  // Send a ping every 30 seconds to keep the connection alive
  const pingInterval = setInterval(() => {
    sseClients.find(client => client.id === clientId)?.res.write(`: ping\n\n`);
  }, 30000);

  req.on('close', () => {
    console.log(`Client ${clientId} disconnected`);
    sseClients = sseClients.filter(client => client.id !== clientId);
    clearInterval(pingInterval);
  });
});

function sendSSE(data) {
  sseClients.forEach(client => {
    client.res.write(`data: ${JSON.stringify(data)}\n\n`);
  });
}

// Start scraping endpoint
app.post('/start-scraping', (req, res) => {
  if (scraperProcess) {
    return res.status(400).send('Scraping already in progress.');
  }

  scraperProcess = spawn('node', ['playwright_scraper.js']);

  scraperProcess.stdout.on('data', (data) => {
    // A single chunk may contain several newline-separated JSON messages.
    const lines = data.toString().split('\n').map(line => line.trim()).filter(Boolean);
    for (const message of lines) {
      try {
        sendSSE(JSON.parse(message));
      } catch (error) {
        console.error('Error parsing JSON:', error);
        console.log('Received data:', message); // Log the raw message
        sendSSE({ type: 'log', message });      // Forward as a log message
      }
    }
  });

  scraperProcess.stderr.on('data', (data) => {
    console.error(`Scraper error: ${data}`);
    sendSSE({ type: 'error', message: data.toString() });
  });

  scraperProcess.on('close', (code) => {
    console.log(`Scraper process exited with code ${code}`);
    scraperProcess = null;
    sendSSE({ type: 'status', message: `Scraping finished with code ${code}` });
  });

  res.status(202).send('Scraping started.'); // Accepted
});

// Stop scraping endpoint
app.post('/stop-scraping', (req, res) => {
  if (!scraperProcess) {
    return res.status(400).send('Scraping not in progress.');
  }

  scraperProcess.kill('SIGINT'); // Or 'SIGTERM'
  scraperProcess = null;
  res.send('Scraping stopped.');
});

app.listen(port, () => {
  console.log(`Server listening at http://localhost:${port}`);
});
```

**Explanation of `server.js`:**

* **Dependencies:** Requires `express` and `cors` (`child_process` is a built-in Node module and does not need to be installed). Install them: `npm install express cors`
* **CORS:** `app.use(cors());` enables Cross-Origin Resource Sharing, allowing your client (e.g., a webpage on a different domain) to access the server. Important for web applications.
* **`scraperProcess`:** Stores a reference to the Playwright process. This is crucial for stopping the process later.
* **`/events` (SSE endpoint):**
  * Sets the correct headers for SSE.
  * Creates a new client object and adds it to the `sseClients` array.
  * `req.on('close')`: Handles client disconnections, removing the client from the array and clearing the ping interval.
  * `sendSSE(data)`: Sends data to all connected SSE clients, stringified as JSON.
* **Ping:** Sends a ping every 30 seconds to keep the connection alive. Some browsers or proxies might close idle SSE connections.
* **`/start-scraping` (POST):**
  * Checks if scraping is already in progress.
  * Uses `child_process.spawn` to run the `playwright_scraper.js` script in a separate process.
  * **`stdout.on('data')`:** Captures the output from the Playwright script (URLs, logs, etc.), splits each chunk into lines, parses each line as JSON, and sends it to the SSE clients. Lines that are not valid JSON (e.g., stray log output) are forwarded as `log` events.
  * **`stderr.on('data')`:** Captures any errors from the Playwright script and sends them to the SSE clients.
  * **`on('close')`:** Handles the Playwright process exiting. Clears the `scraperProcess` variable and sends a status message to the clients.
  * Returns a 202 Accepted status code to indicate that the scraping process has been started.
* **`/stop-scraping` (POST):**
  * Checks if scraping is in progress.
  * Uses `scraperProcess.kill('SIGINT')` to stop the Playwright process. `SIGINT` tells the process to terminate gracefully; you can also use `SIGTERM`.
* **Error Handling:** The `stdout.on('data')` handler includes a `try...catch` block to handle potential errors when parsing the JSON output from the Playwright script. This is important because the script might output log messages or other non-JSON data.
* **Logging:** The code includes `console.log` statements to help you debug the server.
**2. `playwright_scraper.js` (Playwright Script)**

```javascript
const { chromium } = require('playwright');

async function scrapeUrls(url) {
  const browser = await chromium.launch({ headless: true }); // Set headless to false to see the browser
  const page = await browser.newPage();

  try {
    await page.goto(url);

    // Wait for the page to load completely (adjust the timeout if needed)
    await page.waitForLoadState('networkidle', { timeout: 60000 });

    // Extract all links (href attributes)
    const links = await page.evaluate(() => {
      const anchors = Array.from(document.querySelectorAll('a'));
      return anchors.map(anchor => anchor.href).filter(href => href.startsWith('http')); // Filter out relative links
    });

    // Remove duplicate links
    const uniqueLinks = [...new Set(links)];

    // Send each unique link to the server
    for (const link of uniqueLinks) {
      console.log(JSON.stringify({ type: 'url', url: link })); // Output as JSON
    }

    console.log(JSON.stringify({ type: 'status', message: 'Scraping completed successfully' }));
  } catch (error) {
    console.error('Scraping failed:', error);
    console.log(JSON.stringify({ type: 'error', message: error.message })); // Send error to server
  } finally {
    await browser.close();
  }
}

// Get the target URL from an environment variable or use a default
const targetUrl = process.env.TARGET_URL || 'https://www.example.com'; // Replace with your target URL

scrapeUrls(targetUrl);
```

**Explanation of `playwright_scraper.js`:**

* **Dependencies:** Requires `playwright`. Install it: `npm install playwright`
* **`scrapeUrls(url)`:** The main scraping function.
  * Launches a Chromium browser in headless mode (no GUI). Set `headless: false` to see the browser window.
  * Creates a new page and navigates to the specified URL using `page.goto(url)`.
  * **`page.waitForLoadState('networkidle')`:** Waits for the page to load completely. `networkidle` means there have been no network connections for at least 500 ms. Adjust the `timeout` if needed.
  * **`page.evaluate()`:** Executes JavaScript code in the context of the browser page. This is how you access the DOM.
    * It selects all `<a>` elements.
    * It extracts the `href` attribute of each link.
    * It filters out relative links (links that don't start with `http`).
  * **`new Set(links)`:** Removes duplicate links.
  * **`console.log(JSON.stringify({ type: 'url', url: link }))`:** This is the crucial part for SSE. It outputs each URL to the console as a JSON string. The server captures this output and sends it to the client. The `type` field identifies the kind of data being sent (URL, status, error, etc.).
* **Error Handling:** The `try...catch` block handles any errors that occur during scraping and sends an error message to the server.
* **`finally`:** Ensures that the browser is closed, even if an error occurs.
* **`targetUrl`:** Gets the target URL from the `TARGET_URL` environment variable, falling back to a default (`https://www.example.com`). This makes the script more configurable. To set the environment variable, you can run the server like this: `TARGET_URL=https://www.example.com node server.js`
**3. `client.html` (Simple Client)**

```html
<!DOCTYPE html>
<html>
<head>
  <title>SSE URL Stream</title>
</head>
<body>
  <h1>SSE URL Stream</h1>
  <ul id="urlList"></ul>

  <script>
    const urlList = document.getElementById('urlList');
    const eventSource = new EventSource('http://localhost:3000/events'); // Replace with your server URL

    eventSource.onmessage = (event) => {
      const data = JSON.parse(event.data);
      if (data.type === 'url') {
        const listItem = document.createElement('li');
        listItem.textContent = data.url;
        urlList.appendChild(listItem);
      } else if (data.type === 'status') {
        const listItem = document.createElement('li');
        listItem.textContent = `Status: ${data.message}`;
        urlList.appendChild(listItem);
      } else if (data.type === 'error') {
        const listItem = document.createElement('li');
        listItem.textContent = `Error: ${data.message}`;
        urlList.appendChild(listItem);
      } else if (data.type === 'log') {
        const listItem = document.createElement('li');
        listItem.textContent = `Log: ${data.message}`;
        urlList.appendChild(listItem);
      }
    };

    eventSource.onerror = (error) => {
      console.error("SSE error:", error);
      const listItem = document.createElement('li');
      listItem.textContent = `SSE Error: ${error}`;
      urlList.appendChild(listItem);
      eventSource.close();
    };
  </script>

  <button id="startButton">Start Scraping</button>
  <button id="stopButton">Stop Scraping</button>

  <script>
    const startButton = document.getElementById('startButton');
    const stopButton = document.getElementById('stopButton');

    startButton.addEventListener('click', () => {
      fetch('http://localhost:3000/start-scraping', { method: 'POST' })
        .then(response => {
          if (response.ok) {
            console.log('Scraping started');
          } else {
            console.error('Failed to start scraping:', response.status);
          }
        });
    });

    stopButton.addEventListener('click', () => {
      fetch('http://localhost:3000/stop-scraping', { method: 'POST' })
        .then(response => {
          if (response.ok) {
            console.log('Scraping stopped');
          } else {
            console.error('Failed to stop scraping:', response.status);
          }
        });
    });
  </script>
</body>
</html>
```

**Explanation of `client.html`:**

* **SSE Connection:** Creates an `EventSource` object to connect to the `/events` endpoint on the server.
* **`eventSource.onmessage`:** Handles incoming SSE messages.
  * Parses the JSON data from the event.
  * Checks the `type` field to determine the type of data (URL, status, error, log).
  * Creates a new list item and adds it to the `urlList` element.
* **`eventSource.onerror`:** Handles SSE errors.
* **Start/Stop Buttons:** The page includes buttons to start and stop the scraping process. They send POST requests to the `/start-scraping` and `/stop-scraping` endpoints on the server.
* **Error Handling:** The client includes basic error handling for SSE connections and HTTP requests.

**How to Run the Code**

1. **Install Node.js:** Make sure you have Node.js installed.
2. **Create a Project Directory:** Create a new directory for your project.
3. **Initialize the Project:** Run `npm init -y` in the project directory to create a `package.json` file.
4. **Install Dependencies:** Run `npm install express playwright cors`
5. **Create Files:** Create the `server.js`, `playwright_scraper.js`, and `client.html` files in the project directory.
6. **Run the Server:** Run `node server.js` in the project directory.
7. **Open `client.html`:** Open the `client.html` file in your web browser.
8. **Click "Start Scraping":** Click the "Start Scraping" button to start the scraping process.
9. **See the URLs:** The URLs will be displayed in the list as they are discovered.
10. **Click "Stop Scraping":** Click the "Stop Scraping" button to stop the scraping process.

**Important Considerations and Improvements**

* **Error Handling:** Implement more robust error handling in both the server and the Playwright script. Log errors to a file or a monitoring service.
* **Configuration:** Use environment variables or a configuration file to store settings such as the target URL, the scraping interval, and the log file path.
* **Scalability:** For large-scale scraping, consider using a message queue (e.g., RabbitMQ or Kafka) to distribute the scraping tasks across multiple workers.
* **Rate Limiting:** Implement rate limiting to avoid overloading the target website. Use `page.waitForTimeout()` to pause between requests (see the sketch after this list).
* **User Agent:** Set a realistic user agent to avoid being blocked by the target website (also shown in the sketch below).
* **Proxies:** Use proxies to avoid being blocked by the target website.
* **Data Storage:** Store the scraped data in a database or a file.
* **MCP (Management Control Protocol):** If you have specific MCP requirements, you'll need to implement the appropriate protocol and data format. This might involve adding more endpoints to the server to handle MCP commands.
* **Headless Mode:** Running in headless mode is generally recommended for performance. However, some websites may require a full browser to render correctly. You can set `headless: false` in the Playwright script to see the browser window.
* **Resource Management:** Be mindful of resource usage, especially memory. Close the browser and pages when they are no longer needed.
* **Legal and Ethical Considerations:** Always respect the target website's terms of service and robots.txt file. Avoid scraping data that you are not authorized to access.

This comprehensive example provides a solid foundation for building a web scraping application with Playwright, Node.js, Express.js, and SSE. Remember to adapt the code to your specific needs and requirements. Good luck!
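To make the rate-limiting, user-agent, and proxy points above concrete, here is a minimal sketch of how the scraper's browser setup could be hardened. It uses only standard Playwright options; the user-agent string, the 2-second pause, and the commented-out proxy address are illustrative placeholders, not values required by Playwright or by any particular site.

```javascript
// Sketch: a "polite" variant of the scraper reflecting the considerations above.
const { chromium } = require('playwright');

async function politeScrape(urls) {
  const browser = await chromium.launch({
    headless: true,
    // Optional: route traffic through a proxy (hypothetical address; replace with a real one).
    // proxy: { server: 'http://my-proxy.example.com:8080' },
  });

  // A realistic user agent reduces the chance of being served a blocked or degraded page.
  const context = await browser.newContext({
    userAgent:
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
  });
  const page = await context.newPage();

  for (const url of urls) {
    await page.goto(url, { waitUntil: 'networkidle' });
    console.log(JSON.stringify({ type: 'url', url: page.url() }));
    // Crude rate limiting: pause between requests so the target site is not hammered.
    await page.waitForTimeout(2000);
  }

  await browser.close();
}

politeScrape(['https://www.example.com']).catch(err => {
  console.log(JSON.stringify({ type: 'error', message: err.message }));
});
```

Reusing one browser context for every URL keeps per-request overhead low; if you need stricter isolation between targets, create a fresh context inside the loop instead.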

PC-MCP

This project targets an MCP server running on a personal PC and is currently used mainly as a demo of integration with the smart-pet-with-mcp project.

mcp_server

An MCP server for LLM integration.

shortcuts-mcp-server

SuperMemory MCP

A tool that makes memories stored in ChatGPT accessible across various language models without requiring logins or paywalls.

DDG MCP2

A basic MCP server template built with FastMCP framework that provides example tools for echoing messages and retrieving server information. Serves as a starting point for developing custom MCP servers with Docker support and CI/CD integration.

Azure DevOps MCP Proxy

Enables interaction with Azure DevOps through Personal Access Token authentication. Supports work item management, wiki operations, project/repository listing, and build pipeline access through natural language.

Neo4j MCP Clients & Servers

Model Context Protocol with Neo4j

XML Documents MCP Server by CData

EPICS MCP Server

A Python-based server that interacts with EPICS process variables, allowing users to retrieve PV values, set PV values, and fetch detailed information about PVs through a standardized interface.

MCP Server

A Python implementation of the Model Context Protocol (MCP) that connects client applications with AI models, primarily Anthropic's models, with setup instructions for local development and deployment.

🚀 Go-Tapd-SDK

The Go Tapd SDK is a Go client library for accessing the Tapd API, and it also supports the latest MCP server.

VNDB MCP Server

A Model Context Protocol (MCP) server for accessing the Visual Novel Database (VNDB) API, enabling Claude AI to search for and retrieve information about visual novels.

WAHA MCP Server

Enables AI assistants to interact with WhatsApp through the WAHA (WhatsApp HTTP API) platform. Supports chat management, message operations including sending/receiving messages, and marking chats as read.

MCP REST API Server

A server implementation of the Model Context Protocol (MCP) that provides REST API endpoints for managing and interacting with MCP resources.

Trino MCP Server

Enables database schema analysis and management for Trino servers through dynamic connections. Supports DDL validation, dependency analysis, schema documentation generation, and safe SQL execution with multiple concurrent connections.

Cursor Talk to Figma MCP

Enables Cursor AI to communicate with Figma for reading designs and modifying them programmatically, allowing users to automate design tasks through natural language.

sl-test

Letter Counter MCP Server

An MCP server created as a learning example for the Model Context Protocol (MCP), enabling an LLM to count how many times a specific letter appears in a word.

Rongcloud Native

RongCloud Native MCP is a lightweight RongCloud IM service wrapper based on the MCP (Model Control Protocol). By directly wrapping the high-performance Rust IM SDK, it provides a simple and efficient instant messaging solution for client-side or local applications.

MCP-Kit Developer Task Assignment System

Enables intelligent task assignment to developers using hybrid AI algorithms that match tasks based on past experience, skill sets, workload balance, and project alignment. Features enterprise-grade security with AES-256 encryption and 75% performance optimization through smart caching.

ERPNext MCP Server

A production-ready server that enables AI assistants like Claude Desktop to seamlessly integrate with ERPNext for document operations, reporting, and custom workflows through natural language interaction.

MCP AgentRun Server

Enables safe Python code execution in isolated Docker containers through the AgentRun framework. Provides automatic container lifecycle management and comprehensive error handling for secure and reproducible code execution.

MCP Dust Server

鏡 (Kagami)

GitHub-Jira MCP Server

Enables secure integration between GitHub and Jira with permission controls, allowing users to manage repositories, create issues and pull requests, and handle Jira project workflows through natural language. Supports OAuth authentication and comprehensive security enforcement for both platforms.

Polymarket MCP Tool

A Model Context Protocol server that enables interaction with Polymarket prediction markets through Claude Desktop.

MCP Demo Server

A demonstration server based on Model Context Protocol (MCP) that showcases how to build custom tools for AI assistants, providing mathematical calculation and multilingual greeting capabilities.

Databricks MCP Server Template

Enables AI assistants like Claude to interact with Databricks workspaces through a secure, authenticated interface. Supports custom prompts and tools that leverage the Databricks SDK for workspace management, job execution, and SQL operations.

Multi-AI MCP Server for Claude Code

Connects Claude Code with multiple AI models (Gemini, Grok-3, ChatGPT, DeepSeek) simultaneously, allowing users to get diverse AI perspectives, conduct AI debates, and leverage each model's unique strengths.

Gemini Function Calling + Model Context Protocol(MCP) Flight Search

A Model Context Protocol (MCP) implementation using Gemini 2.5 Pro. It converts conversational queries into flight searches using Gemini's function calling and an MCP flight search tool.