lilFetch
Scrapes webpages and converts them to clean Markdown using browser automation, ideal for creating READMEs, documentation, or processing dynamic web content.
UNDER CONSTRUCTION
An MCP (Model Context Protocol) server that scrapes webpages with crawl4ai and Playwright for more robust scraping. Tested with GitHub Copilot in VS Code, but it may work with other MCP clients.
Features
- Enables HTML and/or text scraping of one or more URLs directly in your chat prompt. Use the scraped response in follow-up queries for market research summarization, context for new file creation, etc.
- Leverages Playwright and a headless instance of Chromium to load JS-heavy sites and web apps where basic #fetch and curl commands fall short.
- Strong focus on minimal commands and configuration to install and get scraping.
Prerequisites
Before installing, ensure:
- Node.js 14+: Download from nodejs.org or via Homebrew (brew install node on macOS).
- Python 3.10+: Auto-detected during setup. Install from python.org or Homebrew (brew install python@3.12 on macOS). If using pyenv, set a 3.10+ version active (pyenv global 3.12.0).
- First Run Time: Setup downloads ~200MB (Playwright browsers) and takes 1-2 minutes.
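The prerequisites above can be checked before running the installer. This is a minimal sketch that assumes only the version floors stated in this README (Node.js 14+, Python 3.10+). The helper names are hypothetical and are not part of lilFetch.

```python
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)  # floor stated in Prerequisites
MIN_NODE_MAJOR = 14   # floor stated in Prerequisites


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def parse_node_major(version_text):
    """Extract the major version from `node --version` output such as 'v18.17.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def node_ok(minimum=MIN_NODE_MAJOR):
    """Return True if a `node` binary is on PATH and its major version is high enough."""
    node = shutil.which("node")
    if node is None:
        return False
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return False
    try:
        return parse_node_major(result.stdout) >= minimum
    except ValueError:
        return False


if __name__ == "__main__":
    print("Python 3.10+:", "ok" if python_ok() else "missing or too old")
    print("Node.js 14+:", "ok" if node_ok() else "missing or too old")
```

Run it with the interpreter you expect setup to pick up (for example python3), since the Python check reports on whichever interpreter executes the script.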
Installation
Install globally for use across workspaces, or restrict to local installation if you just want to test in this repo or enhance it further.
Option 1: Global Install
For running npx lilfetch from any directory (portable CLI).
- Clone the Repo
      git clone https://github.com/jphdevsf/lilfetch-mcp.git lilfetch-mcp
      cd lilfetch-mcp
- Install Globally
      npm run global-install
  - Sets up Python venv in ~/.lilfetch-venv (user-wide).
- Configure in Any VS Code Workspace (add to .vscode/mcp.json or global MCP settings):
      {
        "servers": {
          "lilFetch": {
            "type": "stdio",
            "command": "npx",
            "args": ["lilfetch"]
          }
        }
      }
- Test It
  - In new terminal window, run npx lilfetch to start MCP server.
  - In VS Code, prompt with something like...
    Use lilFetch to scrape top news headlines from www.cnn.com and write to a markdown file in root of my repo.
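If a workspace already has a .vscode/mcp.json with other servers, the lilFetch entry has to be merged in rather than pasted over the file. The following is a rough sketch of that merge, not a tool shipped with lilFetch. It assumes the file is plain JSON without comments, and the function names are hypothetical.

```python
import json
from pathlib import Path


def add_server(config, name, command, args):
    """Return a copy of an mcp.json config with a stdio server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("servers", {}))
    servers[name] = {"type": "stdio", "command": command, "args": list(args)}
    merged["servers"] = servers
    return merged


def update_mcp_json(path, name, command, args):
    """Read the file if it exists, merge the entry, and write the result back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = add_server(config, name, command, args)
    path.write_text(json.dumps(updated, indent=2) + "\n")


if __name__ == "__main__":
    # Entry for the global install; printed here rather than written to disk.
    print(json.dumps(add_server({}, "lilFetch", "npx", ["lilfetch"]), indent=2))
```

To write the file instead of printing, call update_mcp_json(".vscode/mcp.json", "lilFetch", "npx", ["lilfetch"]) from the workspace root. For the local install in Option 2, the same call would take "node" and ["bin/lilfetch.js"].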
Option 2: Local Install
For testing/extending in the repo.
- Clone the Repo
      git clone https://github.com/jphdevsf/lilfetch-mcp.git lilfetch-mcp
      cd lilfetch-mcp
- Install Locally
      npm install
  - Sets up ./node_modules/lilfetch/ and .bin/lilfetch.
  - Python venv in repo .venv (local to this project).
- MCP.json Workspace Configuration
  Navigate to .vscode/mcp.json (create if missing) and add:
      {
        "servers": {
          "lilFetch": {
            "type": "stdio",
            "command": "node",
            "args": ["bin/lilfetch.js"]
          }
        }
      }
  Note: Ensure bin/lilfetch.js is executable: Run chmod +x bin/lilfetch.js in the terminal.
- Test It
  - In new terminal window, navigate to this repo and run npm run dev or ./node_modules/.bin/lilfetch.
  - In VS Code, prompt with something like...
    Use lilFetch to scrape top news headlines from www.cnn.com and write to a markdown file in root of my repo.
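The executable-bit note in the configuration step can also be checked from a script. This is a small sketch using only the Python standard library. It assumes a POSIX system and that it is run from the repo root, and the helper names are hypothetical.

```python
import os
import stat
from pathlib import Path


def is_executable(path):
    """Return True if the current user may execute the file."""
    return os.access(path, os.X_OK)


def make_executable(path):
    """Add execute permission for user, group, and others (roughly `chmod +x`)."""
    path = Path(path)
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    wrapper = Path("bin/lilfetch.js")  # path used in the mcp.json entry
    if not wrapper.exists():
        print(f"{wrapper} not found; run this from the repo root")
    elif is_executable(wrapper):
        print(f"{wrapper} is already executable")
    else:
        make_executable(wrapper)
        print(f"made {wrapper} executable")
```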
Uninstallation
To fully remove a global installation (including the npm package and Python virtual environment):
Global Uninstall
From the repo directory (or anywhere):
npm run global-uninstall
- This runs npm uninstall -g lilfetch to remove the global npm package and binary.
- Followed by rm -rf ~/.lilfetch-venv to delete the user-wide Python venv (including installed deps and Playwright browsers).
- Warning: The rm -rf command is irreversible. It only affects the .lilfetch-venv directory in your home folder. Back up if needed (unlikely).
Manual Uninstall (Alternative)
- Remove npm package: npm uninstall -g lilfetch
- Remove Python venv: rm -rf ~/.lilfetch-venv
  - On Windows: rmdir /s /q %USERPROFILE%\.lilfetch-venv
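The venv removal step uses a different command on each platform. The sketch below does the same thing with one cross-platform standard-library call (shutil.rmtree). It is an illustration rather than part of lilFetch: the function name and the --yes flag are hypothetical, and the guard on the directory name is an added safety assumption.

```python
import shutil
import sys
from pathlib import Path

VENV_NAME = ".lilfetch-venv"  # directory name used by the global install


def remove_venv(path):
    """Delete the venv directory if it exists; refuse any path not named .lilfetch-venv."""
    path = Path(path).expanduser()
    if path.name != VENV_NAME:
        raise ValueError(f"refusing to delete {path}: expected a directory named {VENV_NAME}")
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True


if __name__ == "__main__":
    target = Path.home() / VENV_NAME
    if "--yes" in sys.argv[1:]:
        print("removed" if remove_venv(target) else "nothing to remove", target)
    else:
        # Dry run by default, because the deletion is irreversible.
        print(f"would remove {target}; re-run with --yes to delete it")
```

The npm package still has to be removed separately with npm uninstall -g lilfetch.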
Local Uninstall
For local installs (e.g., after npm install), run from the repo directory:
    rm -rf node_modules .venv
- This removes the local Node modules and repo-specific Python venv.
Verification
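- npm list -g --depth=0 (lilfetch should not be listed).
- which lilfetch (output should be empty).
- ls ~/.lilfetch-venv (should report no such file or directory).
For local: run rm -rf node_modules .venv and verify that ./node_modules/.bin/lilfetch no longer exists.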
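The global verification checks can be bundled into one script. This is a sketch under the assumption that a clean uninstall means no lilfetch binary on PATH and no ~/.lilfetch-venv directory. It does not query npm list, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def leftovers(paths, binary="lilfetch", which=shutil.which):
    """Return paths that still exist, plus the binary's location if it is still on PATH."""
    found = [str(p) for p in paths if Path(p).expanduser().exists()]
    location = which(binary)
    if location:
        found.append(location)
    return found


if __name__ == "__main__":
    remaining = leftovers(["~/.lilfetch-venv"])
    if remaining:
        print("still present:")
        for item in remaining:
            print("  " + item)
    else:
        print("global uninstall looks clean")
```

For a local install, passing ["node_modules/.bin/lilfetch", ".venv"] from the repo root performs the equivalent check.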
Development
- Edit mcp_server.py for Python logic.
- Update bin/lilfetch.js for wrapper changes.
- Bump version in package.json, then npm run pack.
- For global testing: npm install -g . then npx lilfetch.
Troubleshooting
- Permission Errors (Global Install): See Prerequisites for user-owned NPM setup. Avoid sudo; use the config steps.
- Python Not Found/Version Error: Ensure Python 3.10+ is in PATH. For pyenv: pyenv install 3.12.0 && pyenv global 3.12.0, then re-run install. Check: python3 --version.
- Venv/Deps Fail: For local: Delete .venv and re-run npm install. For global: Delete ~/.lilfetch-venv and re-run npm install -g . (from the repo directory).
  - Manual fix (local): python3 -m venv .venv && .venv/bin/pip install -r requirements.txt && .venv/bin/python -m playwright install
  - Manual fix (global): python3 -m venv ~/.lilfetch-venv && ~/.lilfetch-venv/bin/pip install -r requirements.txt && ~/.lilfetch-venv/bin/python -m playwright install
- Playwright Browsers Missing: Run python -m playwright install in the venv (or manually as logged).
- MCP Not Detected in VS Code: Restart VS Code after config; ensure workspace is open correctly.
- Uninstall:
  - Global: npm uninstall -g lilfetch + rm -rf ~/.lilfetch-venv.
  - Local: rm -rf node_modules package-lock.json .venv.
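The manual venv fixes above follow the same three steps for both install types, differing only in the venv directory. A sketch that builds those commands for a given directory might look like the following. It only prints the commands, uses python -m pip in place of the pip script (an equivalent swap), and assumes the standard venv layout (bin/python on macOS and Linux, Scripts\python.exe on Windows). The function names and the base_python parameter are hypothetical.

```python
import shlex
from pathlib import Path


def venv_python(venv_dir, windows=False):
    """Return the interpreter path inside a venv for the given platform layout."""
    venv = Path(venv_dir)
    if windows:
        return venv / "Scripts" / "python.exe"
    return venv / "bin" / "python"


def repair_commands(venv_dir, requirements="requirements.txt",
                    base_python="python3", windows=False):
    """Return argv lists for the manual fix: create venv, install deps, install browsers."""
    python = str(venv_python(venv_dir, windows))
    return [
        [base_python, "-m", "venv", str(venv_dir)],
        [python, "-m", "pip", "install", "-r", requirements],
        [python, "-m", "playwright", "install"],
    ]


if __name__ == "__main__":
    # Local install shown; pass "~/.lilfetch-venv" for the global one.
    for command in repair_commands(".venv"):
        print(shlex.join(command))
```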
License: MIT