web-search-cli

Enables web search using Bing and DuckDuckGo combined, and fetching readable content from web pages, with no API key required.


<div align="center">

🔍 web-search-cli

A dual-engine (Bing + DuckDuckGo) web search and page-text extraction tool

It works both as an MCP server for AI agents and as a standalone command-line tool.

No API key · No rate limits · Optional proxy

License: MIT · Node · MCP · PRs Welcome

</div>




<div align="center">

🔍 web-search-cli  (English)

</div>

🌟 Introduction

A dual-engine (Bing + DuckDuckGo) web search and page-fetch tool. It scrapes the public HTML search pages of Bing and DuckDuckGo, decodes their redirect URLs, deduplicates across engines, and returns clean structured results. Two ways to use it:

  • 🧩 As an MCP server: plug it into Claude, Cline, or any MCP-compatible client to give your agent web search without an API key or quota;
  • 💻 As a standalone CLI: run searches and extract page text right from your terminal.

No API keys required.

🎯 Features

| Feature | Description |
| --- | --- |
| 🔀 Two engines, one call | Query Bing, DuckDuckGo, or both. `both` merges and deduplicates by domain + title |
| 🚀 No rate-limit gates | Hits the public HTML endpoints directly instead of a paid API |
| 📄 Page extraction | `fetch_page` pulls the main readable text out of any URL (strips scripts, nav, footers) |
| 🀄 CJK-aware | Chinese queries automatically switch Bing to the `zh-CN` market |
| 🌐 Optional proxy | Set `HTTPS_PROXY` / `HTTP_PROXY` to use a proxy; unset for direct connections. No proxy is hardcoded |
| 🔑 Zero keys | Works out of the box, no signup |
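
The CJK-aware row can be pictured as a small check on the query text. The sketch below is an assumption about how such a switch might look, not the project's actual code: `pickBingMarket` is a hypothetical helper, and detecting Chinese through the Unicode Han script property is one possible heuristic among several.

```javascript
// Hypothetical helper: choose the Bing market for a query.
// Any Han character in the query forces zh-CN; otherwise the
// caller's lang value is passed through unchanged.
function pickBingMarket(query, lang = "en") {
  const hasHan = /\p{Script=Han}/u.test(query);
  return hasHan ? "zh-CN" : lang;
}

console.log(pickBingMarket("开源大模型推理框架")); // zh-CN
console.log(pickBingMarket("rust async runtime comparison")); // en
```

A script-property test also matches Japanese kanji, so a real implementation might need a finer rule.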

📦 Requirements

  • Node.js >= 18 (uses the global fetch and undici).

🛠️ Install

git clone https://github.com/AusertDream/web-search-cli.git
cd web-search-cli
npm install

To use the web-search command anywhere, link it globally:

npm link

💻 CLI usage

# Search (defaults: 10 results, lang=en, engine=both)
web-search search "rust async runtime comparison"

# More results, Chinese, Bing only
web-search search "开源大模型推理框架" 15 zh-CN bing

# Fetch and extract a page's main text (default max 8000 chars)
web-search fetch "https://example.com/article" 5000

Without npm link, call the script directly:

node cli.mjs search "your query"

Argument order:

| Command | Positional args |
| --- | --- |
| `search` | `<query> [num_results] [lang] [engine]` |
| `fetch` | `<url> [max_length]` |

`engine` is one of `bing` / `duckduckgo` / `both` (default `both`).
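
To make the positional order concrete, here is a minimal sketch of how the `search` arguments could be mapped onto named options with the documented defaults. `parseSearchArgs` and its error messages are hypothetical; the real `cli.mjs` may validate differently.

```javascript
// Hypothetical parser for: search <query> [num_results] [lang] [engine]
const ENGINES = ["bing", "duckduckgo", "both"];

function parseSearchArgs(args) {
  // Defaults follow the README: 10 results, lang=en, engine=both.
  const [query, numResults = "10", lang = "en", engine = "both"] = args;
  if (!query) throw new Error("search requires a query");

  const n = Number.parseInt(numResults, 10);
  if (!Number.isInteger(n) || n < 1) {
    throw new Error(`invalid result count: ${numResults}`);
  }
  if (!ENGINES.includes(engine)) {
    throw new Error(`unknown engine: ${engine}`);
  }
  return { query, numResults: n, lang, engine };
}

console.log(parseSearchArgs(["开源大模型推理框架", "15", "zh-CN", "bing"]));
console.log(parseSearchArgs(["rust async runtime comparison"]));
```

Because the arguments are positional, setting `engine` on the command line means also supplying `num_results` and `lang` before it.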

🧩 MCP server usage

The server speaks MCP over stdio. Add it to your client's MCP config. Example for Claude / Cline (mcp.json):

{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/absolute/path/to/web-search-cli/server.js"]
    }
  }
}

To route through a proxy, add an env block:

{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/absolute/path/to/web-search-cli/server.js"],
      "env": {
        "HTTPS_PROXY": "http://127.0.0.1:7890",
        "HTTP_PROXY": "http://127.0.0.1:7890"
      }
    }
  }
}

🧰 Exposed tools

web_search — search the web.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| `query` | string | | Search query (English or Chinese). Required |
| `engine` | string | `both` | `bing` / `duckduckgo` / `both` |
| `num_results` | number | `10` | Max results to return (1–20) |
| `lang` | string | `en` | Search language, e.g. `en`, `zh-CN` |
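
MCP clients normally build tool calls for you, but it can help to see what one looks like on the wire. The sketch below assembles a JSON-RPC 2.0 `tools/call` request for `web_search` using the parameters from the table. `buildWebSearchCall` is a hypothetical helper, and a real session must complete the MCP `initialize` handshake before sending it.

```javascript
// Builds a JSON-RPC 2.0 "tools/call" request for the web_search tool.
function buildWebSearchCall(id, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "web_search", arguments: args },
  };
}

const request = buildWebSearchCall(1, {
  query: "rust async runtime comparison",
  engine: "both",
  num_results: 5,
  lang: "en",
});

// The stdio transport carries one JSON message per line.
const line = JSON.stringify(request) + "\n";
process.stdout.write(line);
```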

fetch_page — fetch and extract readable text from a URL.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | string | | URL to fetch. Required |
| `max_length` | number | `8000` | Max characters of returned content |
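
As a rough illustration of what "extract readable text" involves, the sketch below strips non-content elements, removes the remaining tags, and truncates to `max_length`. It is a simplified, regex-based stand-in written for this README, not the project's extractor: `extractReadableText`, the list of stripped elements, and the minimal entity handling are all assumptions.

```javascript
// Hypothetical, simplified extractor: regexes are not a full HTML parser.
function extractReadableText(html, maxLength = 8000) {
  const text = html
    // Drop whole elements that rarely carry article text.
    .replace(/<(script|style|nav|footer|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Remove every remaining tag, keeping its inner text.
    .replace(/<[^>]+>/g, " ")
    // Decode only two common entities (a real extractor needs more).
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace.
    .replace(/\s+/g, " ")
    .trim();
  return text.length > maxLength ? text.slice(0, maxLength) : text;
}

const sampleHtml =
  "<html><body><nav>Menu</nav><p>Hello <b>world</b></p>" +
  "<script>var x = 1;</script><footer>Footer</footer></body></html>";

console.log(extractReadableText(sampleHtml)); // Hello world
console.log(extractReadableText(sampleHtml, 5)); // Hello
```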

⚙️ How it works

  • Bing: parses li.b_algo blocks from bing.com/search and decodes the base64 ck/a?u= redirect wrappers back to real URLs.
  • DuckDuckGo: parses .result blocks from html.duckduckgo.com/html/ and unwraps the uddg= redirect params, skipping ad results.
  • both: concatenates results from both engines, then deduplicates by domain + title.
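
The redirect unwrapping described in the first two bullets can be sketched as follows. Both helpers are hypothetical and make assumptions about the link layout: for Bing, that the `u` parameter is an `a1` marker followed by URL-safe base64; for DuckDuckGo, that `uddg` holds the percent-encoded target. The project's parser may handle more variants.

```javascript
// Bing: /ck/a?...&u=a1<base64url>. The "a1" prefix is an assumption
// based on how these links commonly look.
function decodeBingRedirect(href) {
  const url = new URL(href, "https://www.bing.com");
  const u = url.searchParams.get("u");
  if (!url.pathname.startsWith("/ck/a") || !u) return href;
  const payload = u.startsWith("a1") ? u.slice(2) : u;
  const decoded = Buffer.from(payload, "base64url").toString("utf8");
  // Fall back to the original link if the payload is not a URL.
  return /^https?:\/\//.test(decoded) ? decoded : href;
}

// DuckDuckGo: //duckduckgo.com/l/?uddg=<percent-encoded URL>
function decodeDdgRedirect(href) {
  const url = new URL(href, "https://duckduckgo.com");
  // searchParams.get() already percent-decodes the value.
  return url.searchParams.get("uddg") ?? href;
}

// Build sample links by encoding a known target, then round-trip them.
const target = "https://example.com/article?id=1";
const bingHref =
  "https://www.bing.com/ck/a?p=abc&u=a1" +
  Buffer.from(target).toString("base64url") +
  "&ntb=1";
const ddgHref =
  "//duckduckgo.com/l/?uddg=" + encodeURIComponent(target) + "&rut=abc";

console.log(decodeBingRedirect(bingHref) === target);
console.log(decodeDdgRedirect(ddgHref) === target);
```

Both helpers return the input unchanged when the expected parameter is missing, so direct links pass through untouched.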

⚠️ Because this relies on the engines' public HTML markup, result parsing may need updating if they change their page structure.
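
The `both` merge step listed above (concatenate, then deduplicate by domain + title) might look like the sketch below. The result shape `{ title, url, engine }`, the helper names, and the normalization rules (dropping a leading `www.`, lowercasing, collapsing whitespace) are assumptions for illustration.

```javascript
// Hypothetical dedup key: normalized hostname plus normalized title.
function dedupeKey(result) {
  const host = new URL(result.url).hostname.replace(/^www\./, "").toLowerCase();
  const title = result.title.trim().toLowerCase().replace(/\s+/g, " ");
  return `${host}|${title}`;
}

function mergeResults(bingResults, ddgResults, limit = 10) {
  const seen = new Set();
  const merged = [];
  for (const result of [...bingResults, ...ddgResults]) {
    const key = dedupeKey(result);
    if (seen.has(key)) continue; // same domain + title already kept
    seen.add(key);
    merged.push(result);
  }
  return merged.slice(0, limit);
}

const bing = [
  { title: "Tokio - async runtime", url: "https://tokio.rs/", engine: "bing" },
  { title: "async-std", url: "https://async.rs/", engine: "bing" },
];
const ddg = [
  { title: "Tokio -  Async Runtime", url: "https://www.tokio.rs/learn", engine: "duckduckgo" },
  { title: "smol", url: "https://github.com/smol-rs/smol", engine: "duckduckgo" },
];

console.log(mergeResults(bing, ddg).map((r) => r.title));
```

Keying on domain rather than full URL means two different pages on one site with the same title collapse into one result, which is the trade-off the documented rule implies.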

🌐 Proxy

Proxy is controlled entirely by environment variables. Set HTTPS_PROXY or HTTP_PROXY to enable it; leave them unset for direct connections. No proxy address is hardcoded.
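
One way to implement this kind of environment-driven switch is shown below. It is a sketch under stated assumptions, not the project's code: `pickProxy` and `enableProxy` are hypothetical names, preferring `HTTPS_PROXY` over `HTTP_PROXY` is assumed, and the wiring relies on the `undici` package's `ProxyAgent` and `setGlobalDispatcher`, which should also route Node's global `fetch`.

```javascript
// Returns the proxy URL to use, or null for a direct connection.
// Preferring HTTPS_PROXY over HTTP_PROXY is an assumption.
function pickProxy(env = process.env) {
  return env.HTTPS_PROXY || env.HTTP_PROXY || null;
}

// Routes subsequent requests through the proxy. undici is loaded
// lazily so direct connections never touch it.
async function enableProxy(proxyUrl) {
  const { ProxyAgent, setGlobalDispatcher } = await import("undici");
  setGlobalDispatcher(new ProxyAgent(proxyUrl));
}

console.log(pickProxy({ HTTPS_PROXY: "http://127.0.0.1:7890" })); // http://127.0.0.1:7890
console.log(pickProxy({})); // null
```

At startup an ES module could run `const proxy = pickProxy(); if (proxy) await enableProxy(proxy);` before issuing any request.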

🇨🇳 Users in mainland China: a proxy is required. Bing and DuckDuckGo are generally not reachable from within mainland China (DuckDuckGo is blocked, and international Bing is unstable), so without a proxy you'll hit connection timeouts and get no results. Set a proxy first:

export HTTPS_PROXY=http://127.0.0.1:7890   # replace with your own proxy host:port
export HTTP_PROXY=http://127.0.0.1:7890

When running as an MCP server, put the proxy in the env block of mcp.json (see the MCP config example above).

🤝 Contributing

Issues and PRs welcome — especially parser fixes when an engine changes its markup.

📄 License

MIT © AusertDream
