Apeiron

Unlimited web access MCP server for AI agents, enabling search, fetch, and learning across multiple sources while bypassing anti-bot protections.

<p align="center"> <img src="images/header.jpg" width="100%" alt="Apeiron"> </p>

<h1 align="center">Apeiron</h1>

<p align="center"> <b>Local-first web search, fetch, and extraction tools for AI agents.</b> </p>

<p align="center"> <img src="https://img.shields.io/badge/license-MIT-000?style=flat-square" alt="license"> <img src="https://img.shields.io/badge/Python-3.10+-000?style=flat-square&logo=python" alt="python"> <img src="https://img.shields.io/badge/MCP-server-000?style=flat-square" alt="mcp"> </p>

Apeiron gives MCP-compatible agents and Python apps three practical tools:

  • apeiron_search: search sources such as arXiv, Wikipedia, GitHub, and an optional local SearXNG instance.
  • apeiron_fetch: fetch a URL and return LLM-ready content plus tier/verdict diagnostics.
  • apeiron_learn: remember the best working fetch strategy for a domain.

It is designed for Claude Code, OpenCode, Cursor, Cline, Windsurf, and local agent workflows where you want a free, inspectable web-access layer before reaching for paid scraping APIs.

Status

Works today:

  • CLI, Python API, and MCP server surfaces.
  • Fast HTTP fetch with curl_cffi when the fetch extra is installed.
  • arXiv, Wikipedia, and GitHub search.
  • Jina Reader fallback.
  • Local response cache and per-domain strategy cache.
  • Structured JSON output for CLI and MCP fetch/learn calls.
  • apeiron doctor diagnostics for optional dependencies and local services.

Experimental:

  • Browser tiers: Patchright, CloakBrowser, Camoufox, FlareSolverr, browser-use.
  • PDF/DOCX/PPTX/XLSX extraction through Markitdown.
  • YouTube transcript extraction through yt-dlp subtitle metadata.
  • Reddit search; it is not enabled by default because Reddit requires OAuth for reliable automated use.
  • Git-based shared learning; local git commits are opt-in with APEIRON_GIT_COMMIT=true.

Install

The PyPI name apeiron belongs to a different package. Until this project is published as apeiron-agent, use the GitHub install path.

Recommended

pipx install "git+https://github.com/insomnia-me/apeiron.git"

From source

git clone https://github.com/insomnia-me/apeiron.git
cd apeiron
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[fetch,mcp,documents,media]"

One-command local install

curl -fsSL https://raw.githubusercontent.com/insomnia-me/apeiron/main/install.sh | bash

Set APEIRON_INSTALL_PROFILE=all before running the script if you also want browser automation dependencies.

Quickstart

apeiron doctor
apeiron fetch "https://example.com" --json
apeiron search "python web scraping" --sources wikipedia github arxiv --json
apeiron learn "https://example.com" --json
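
For scripted use, the `--json` flag shown above makes the CLI straightforward to drive from Python. The following is a minimal sketch, assuming `apeiron` is on `PATH` and that `--json` prints a single JSON document to stdout. `build_fetch_command` and `run_json` are hypothetical helper names for illustration, not part of Apeiron.

```python
import json
import subprocess
from typing import Any, List


def build_fetch_command(url: str) -> List[str]:
    """Build the argv for `apeiron fetch <url> --json` (flags taken from the Quickstart)."""
    return ["apeiron", "fetch", url, "--json"]


def run_json(argv: List[str], timeout: float = 60.0) -> Any:
    """Run a command and parse its stdout as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    json.JSONDecodeError if stdout is not valid JSON.
    Assumption: the command writes exactly one JSON document to stdout.
    """
    completed = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=timeout
    )
    return json.loads(completed.stdout)
```

With the CLI installed, `run_json(build_fetch_command("https://example.com"))` would return the parsed output of the fetch call. Passing the arguments as a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.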

MCP server

Example OpenCode config:

{
  "mcp": {
    "servers": {
      "apeiron": {
        "command": "python",
        "args": ["-m", "apeiron.api.mcp_server"],
        "cwd": "/path/to/apeiron"
      }
    }
  }
}

MCP tools:

| Tool | What it returns |
| --- | --- |
| apeiron_search("query") | JSON array of search hits |
| apeiron_fetch("url") | JSON object with content, tier, verdict, content type, title, elapsed time, and error |
| apeiron_learn("url") | JSON object with learned tier/verdict diagnostics |
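
The fields listed above for `apeiron_fetch` are described in words, not as exact JSON key names. The sketch below shows one way a client might reduce such a payload to a short summary. The keys `content`, `tier`, `verdict`, `title`, and `error` are assumptions inferred from that description, and `FetchSummary` and `summarize_fetch` are hypothetical names; check the actual output before relying on them.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchSummary:
    ok: bool                 # True when content is present and no error was reported
    tier: Optional[str]      # which fetch tier produced the result, if reported
    verdict: Optional[str]   # diagnostic verdict string, if reported
    title: Optional[str]
    preview: str             # first few characters of the content


def summarize_fetch(payload: str, preview_chars: int = 200) -> FetchSummary:
    """Parse a fetch result (JSON text) into a compact summary.

    Assumption: key names mirror the field names in the tool description.
    Missing or null keys are tolerated and map to None or an empty preview.
    """
    data = json.loads(payload)
    content = data.get("content") or ""
    error = data.get("error")
    return FetchSummary(
        ok=bool(content) and not error,
        tier=data.get("tier"),
        verdict=data.get("verdict"),
        title=data.get("title"),
        preview=content[:preview_chars],
    )
```

Using `.get()` throughout keeps the sketch tolerant of keys that turn out to be named differently or omitted on failure.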

Python API

from apeiron import fetch_sync, search_sync

result = fetch_sync("https://example.com", cache_ttl=0)
print(result.verdict.value, result.tier.value)
print(result.content[:500])

hits = search_sync("agent web access", max_results=5)
for hit in hits:
    print(hit.source.value, hit.title, hit.url)

Architecture

APEIRON
  search
    arXiv, Wikipedia, GitHub, optional SearXNG
  fetch
    fast HTTP -> browser tiers -> reader fallback
  extract
    Trafilatura, Readability, Markitdown
  learn
    strategies.json, challenge heuristics, opt-in git commits
  api
    CLI, Python API, MCP server
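
The fetch and learn rows describe an escalation chain (fast HTTP, then browser tiers, then reader fallback) combined with a per-domain memory of what worked. The sketch below illustrates that general pattern only; it is not Apeiron's implementation. `fetch_with_tiers`, the tier names, and the in-memory `strategies` dict (standing in for `strategies.json`) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import urlparse

# A tier is a name plus a callable that returns page content, or None on failure.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_tiers(
    url: str,
    tiers: List[Tuple[str, Fetcher]],
    strategies: Dict[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each tier in order and remember the first one that works per domain.

    Returns (tier_name, content), or (None, None) if every tier fails.
    """
    domain = urlparse(url).netloc
    preferred = strategies.get(domain)

    # Stable sort: the remembered tier (key False) moves to the front,
    # and the remaining tiers keep their original escalation order.
    ordered = sorted(tiers, key=lambda tier: tier[0] != preferred)

    for name, fetcher in ordered:
        try:
            content = fetcher(url)
        except Exception:
            # Treat any tier error as a miss and escalate to the next tier.
            content = None
        if content:
            strategies[domain] = name  # learn the working strategy
            return name, content
    return None, None
```

On a first request to a domain the tiers run cheapest-first; once a tier succeeds, later requests to the same domain start with it, which avoids repeating attempts that are known to fail.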

Optional infrastructure

SearXNG and FlareSolverr run through Docker Compose:

bash scripts/start-infra.sh
bash scripts/stop-infra.sh

Docker is optional. The CLI, Python API, MCP server, fast fetch, and direct API search all run without local Docker services.

Safety boundary

Apeiron is for fetching public URLs and converting public content into agent-friendly text. It does not authorize credential bypass, private data access, or ignoring site policies. See SECURITY.md.

Roadmap

  • Green, deterministic CI on every pull request.
  • More tests around fetch tier selection and extraction.
  • Benchmark table with dated results and reproducible commands.
  • Better browser-tier diagnostics.
  • Explicit Reddit OAuth integration, or removal of Reddit from the public source list.
