Amazon Q Web Documentation Reader
Enables intelligent navigation and extraction of documentation from websites, allowing Amazon Q to automatically discover relevant pages, extract clean content, and retrieve code examples from web documentation.
<div align="center">
Amazon Q Web Documentation Reader
MCP Server for Intelligent Web Content Extraction
<p align="center"> <strong>A Model Context Protocol (MCP) server that enables Amazon Q to intelligently navigate and extract documentation from websites.</strong> <br> <em>Amazon Q uses Claude 4.5 to make smart decisions about which pages to visit and what content to extract.</em> </p>
Features • Installation • Setup • Usage • Tools
</div>
Features
- Intelligent Navigation - Amazon Q (Claude 4.5) decides which documentation pages to visit
- Clean Content Extraction - Removes navigation, ads, scripts, and other non-content elements
- Multiple Output Formats - Supports both Markdown and plain text output
- Code Block Extraction - Specifically extracts code examples from documentation
- Page Structure Analysis - Extracts heading hierarchy and table of contents
- Link Discovery - Finds and filters documentation links
- Batch Processing - Read multiple documentation pages at once
How It Works
User: "I'm having issues with Razorpay routes"
Documentation: https://razorpay.com/docs
Amazon Q (Claude 4.5):
1. Reads main docs page
2. Sees links: ["Payments", "Routes", "Webhooks", ...]
3. Intelligently decides: "Routes link is relevant!"
4. Navigates to Routes documentation
5. Extracts content and solves your problem
All navigation decisions = Amazon Q's Claude brain
MCP Server = Clean content extraction tool
Installation
Prerequisites
- Python 3.12 or higher
- uv (recommended) or pip
- Amazon Q CLI
Step 1: Clone the Repository
git clone https://github.com/yourusername/amazon-q-web_search.git
cd amazon-q-web_search
Step 2: Install Dependencies
Using uv (Recommended):
uv sync
Using pip:
pip install -e .
Setup with Amazon Q
Step 1: Locate Your MCP Configuration File
Amazon Q looks for MCP server configuration in:
- Linux/WSL: ~/.aws/amazonq/mcp.json
- macOS: ~/.aws/amazonq/mcp.json
- Windows: %USERPROFILE%\.aws\amazonq\mcp.json
Step 2: Create/Edit the Configuration File
Create the directory if it doesn't exist:
mkdir -p ~/.aws/amazonq
Edit or create ~/.aws/amazonq/mcp.json:
For Linux/WSL:
{
  "mcpServers": {
    "doc_reader": {
      "command": "/full/path/to/amazon-q-web_search/.venv/bin/python",
      "args": ["/full/path/to/amazon-q-web_search/main.py"]
    }
  }
}
For macOS:
{
  "mcpServers": {
    "doc_reader": {
      "command": "/full/path/to/amazon-q-web_search/.venv/bin/python",
      "args": ["/full/path/to/amazon-q-web_search/main.py"]
    }
  }
}
For Windows:
{
  "mcpServers": {
    "doc_reader": {
      "command": "C:\\full\\path\\to\\amazon-q-web_search\\.venv\\Scripts\\python.exe",
      "args": ["C:\\full\\path\\to\\amazon-q-web_search\\main.py"]
    }
  }
}
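Tip: Replace /full/path/to/ (or C:\\full\\path\\to\\ on Windows) with the absolute path of the directory where you cloned the repository. On Windows, keep the backslashes doubled, as JSON requires.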
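The platform-specific entries above differ only in where the virtual environment keeps its Python interpreter. As a minimal sketch, the helper below builds the same structure from the clone directory. `build_mcp_config` is a hypothetical name, not part of this repository, and the sketch assumes the virtual environment lives in `.venv` as in the examples.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_mcp_config(repo_dir: str, windows: bool = False) -> dict:
    """Build the doc_reader entry for mcp.json from the clone directory.

    Hypothetical helper for illustration; assumes the virtual environment
    is in .venv, as in the configuration examples.
    """
    if windows:
        root = PureWindowsPath(repo_dir)
        python = root / ".venv" / "Scripts" / "python.exe"
    else:
        root = PurePosixPath(repo_dir)
        python = root / ".venv" / "bin" / "python"
    return {
        "mcpServers": {
            "doc_reader": {
                "command": str(python),
                "args": [str(root / "main.py")],
            }
        }
    }


if __name__ == "__main__":
    # json.dumps escapes Windows backslashes, so the output is valid JSON.
    config = build_mcp_config("/full/path/to/amazon-q-web_search")
    print(json.dumps(config, indent=2))
```

Generating the file this way avoids hand-escaping backslashes on Windows, which is an easy place to introduce a JSON syntax error.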
Step 3: Verify Installation
1. Start Amazon Q CLI:
   q chat
2. Check if MCP server is loaded:
   /mcp
   You should see:
   doc_reader
   - read_web_documentation
   - get_documentation_links
   - get_page_structure
   - extract_code_examples
   - read_multiple_docs
3. If not loaded:
   - Check the file path in mcp.json is correct
   - Restart Amazon Q CLI
   - Check logs:
     q chat logdump
Usage
Basic Example
In Amazon Q CLI, simply ask about documentation:
I'm having issues with Razorpay routes. Can you help me understand how they work?
Documentation: https://razorpay.com/docs/
Amazon Q will:
- Read the main documentation page
- Extract all available links
- Intelligently identify the "Routes" link
- Navigate to the Routes documentation
- Provide you with accurate information
More Examples
Python Documentation:
Can you explain Python asyncio event loops?
Documentation: https://docs.python.org/3/library/asyncio.html
FastAPI Tutorial:
How do I create a basic FastAPI application?
Documentation: https://fastapi.tiangolo.com/
AWS Lambda:
How do I create a Lambda function with Python?
Documentation: https://docs.aws.amazon.com/lambda/
Available Tools
Amazon Q intelligently chains these tools to navigate documentation:
1. read_web_documentation
Fetches and extracts clean documentation content from a web page.
Parameters:
- url (required): The URL of the documentation page
- output_format (optional): "markdown" (default) or "text"
Returns: Extracted documentation content with title and metadata
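To illustrate what "clean" extraction means here, the sketch below drops text inside non-content tags and keeps the title and remaining visible text. It is not the project's implementation: the Dependencies table lists beautifulsoup4 and markdownify, while this sketch uses only the standard library so that it runs on its own. The `SKIP_TAGS` set, `TextExtractor`, and `extract_text` are assumptions for illustration; the project's actual tag list is `REMOVE_TAGS` in src/config.py.

```python
from html.parser import HTMLParser

# Assumed skip list for this sketch; the real list is REMOVE_TAGS in src/config.py.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collects the page title and visible text outside of skipped tags."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.chunks: list[str] = []
        self._skip_depth = 0  # > 0 while inside any skipped tag
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title = text
        elif self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> dict:
    """Return the title and plain-text content of an HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "content": "\n".join(parser.chunks)}
```

This corresponds roughly to the "text" output format; producing Markdown would additionally require mapping headings, lists, and links to Markdown syntax.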
2. get_documentation_links
Extracts all links from a documentation page with optional filtering.
Parameters:
- url (required): The URL of the documentation page
- filter_pattern (optional): Pattern to filter links (e.g., "api", "guide")
Returns: List of links found on the page
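A rough sketch of link discovery with an optional filter follows. It resolves relative links against the page URL and treats `filter_pattern` as a case-insensitive substring match on the link URL or text. That matching rule is an assumption; the actual tool may interpret the pattern differently. `LinkCollector` and `get_links` are hypothetical names, and the sketch uses only the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs and link text from <a href="..."> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # list of {"text": ..., "url": ...}
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Skip in-page anchors and non-HTTP schemes.
            if href and not href.startswith(("#", "javascript:", "mailto:")):
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append({"text": text, "url": self._href})
            self._href = None


def get_links(html, base_url, filter_pattern=None):
    """Return links on the page, optionally filtered by a substring."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    links = collector.links
    if filter_pattern:
        needle = filter_pattern.lower()
        links = [
            link for link in links
            if needle in link["url"].lower() or needle in link["text"].lower()
        ]
    return links
```

For example, `get_links(html, "https://example.com/docs/", "route")` would keep only links whose URL or text mentions "route".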
3. get_page_structure
Extracts the heading structure and table of contents from a documentation page.
Parameters:
- url (required): The URL of the documentation page
Returns: Hierarchical structure of headings on the page
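One way to picture the heading hierarchy is as an indented outline, with each heading indented by its level. The sketch below builds that outline with the standard library only; `HeadingCollector` and `page_outline` are hypothetical names, and the real tool's output format may differ.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Join text from nested inline tags and collapse whitespace.
            text = " ".join("".join(self._text).split())
            if text:
                self.headings.append((self._level, text))
            self._level = None


def page_outline(html):
    """Render headings as an indented bullet outline, two spaces per level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(
        "  " * (level - 1) + "- " + text for level, text in collector.headings
    )
```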
4. extract_code_examples
Extracts all code blocks from a documentation page.
Parameters:
- url (required): The URL of the documentation page
Returns: All code blocks found with their detected languages
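Language detection for code blocks is often based on a `language-xxx` or `lang-xxx` class on the `<pre>` or `<code>` element, a convention used by many documentation generators. The sketch below assumes that convention; whether this project detects languages the same way is not stated in this README. `CodeBlockCollector` and `extract_code_blocks` are hypothetical names.

```python
from html.parser import HTMLParser


class CodeBlockCollector(HTMLParser):
    """Collects the text of each <pre> block and a language hint, if any."""

    def __init__(self):
        super().__init__()
        self.blocks = []  # list of {"language": ..., "code": ...}
        self._in_pre = False
        self._language = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._in_pre, self._language, self._text = True, None, []
        if self._in_pre and tag in ("pre", "code") and self._language is None:
            # Look for a class such as "language-python" or "lang-bash".
            for name in (dict(attrs).get("class") or "").split():
                if name.startswith(("language-", "lang-")):
                    self._language = name.split("-", 1)[1]
                    break

    def handle_data(self, data):
        if self._in_pre:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "pre" and self._in_pre:
            code = "".join(self._text).strip("\n")
            self.blocks.append({"language": self._language or "unknown", "code": code})
            self._in_pre = False


def extract_code_blocks(html):
    """Return every <pre> block on the page with its detected language."""
    collector = CodeBlockCollector()
    collector.feed(html)
    collector.close()
    return collector.blocks
```

Blocks without a recognizable class are reported as "unknown" rather than guessed.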
5. read_multiple_docs
Reads multiple documentation pages and combines their content.
Parameters:
- urls (required): List of documentation URLs (max 10)
Returns: Combined content from all pages
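The batch behavior can be sketched as a concurrent fetch with the 10-URL limit enforced up front. In this sketch `read_many` is a hypothetical name and `fetch_one` is a caller-supplied coroutine function standing in for the single-page reader; how the real tool combines pages and reports per-page failures is an assumption.

```python
import asyncio

MAX_URLS = 10  # batch limit stated in this README


async def read_many(urls, fetch_one):
    """Fetch pages concurrently and join them under per-URL headers.

    fetch_one is a caller-supplied coroutine function (hypothetical here)
    that returns the extracted content for one URL.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    if len(urls) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs per batch, got {len(urls)}")
    # return_exceptions=True keeps one failed page from discarding the rest.
    results = await asyncio.gather(
        *(fetch_one(url) for url in urls), return_exceptions=True
    )
    sections = []
    for url, result in zip(urls, results):
        body = f"Error: {result}" if isinstance(result, Exception) else result
        sections.append(f"## {url}\n\n{body}")
    return "\n\n---\n\n".join(sections)
```

Reporting a failure inline, instead of raising, lets the caller still use the pages that did load.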
Project Structure
amazon-q-web_search/
├── main.py              # Entry point
├── pyproject.toml       # Project configuration
├── README.md            # This file
├── run_mcp.sh           # Startup script (Linux/macOS)
└── src/
    ├── __init__.py      # Package initialization
    ├── server.py        # MCP server initialization
    ├── config.py        # Configuration constants
    ├── fetcher.py       # HTTP fetching logic
    ├── extractor.py     # HTML content extraction
    ├── formatters.py    # Output formatting
    └── tools.py         # MCP tool definitions
Configuration
Edit src/config.py to customize behavior:
| Setting | Default | Description |
|---|---|---|
| HTTP_TIMEOUT | 30.0s | Request timeout in seconds |
| MAX_CONTENT_LENGTH | 10MB | Maximum content size in bytes |
| USER_AGENT | Custom | HTTP User-Agent string |
| REMOVE_TAGS | Various | HTML tags to remove during extraction |
| CONTENT_SELECTORS | Various | Selectors for finding main content |
Troubleshooting
MCP Server Not Loading
Check configuration:
cat ~/.aws/amazonq/mcp.json
Verify paths are correct:
- Use absolute paths, not relative
- Check that Python executable exists
- Check that main.py exists
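The path checks above can be automated. The following sketch reads mcp.json and reports any `command` or `args` value that is not an absolute path or does not exist. `check_mcp_config` is a hypothetical helper, not part of this project, and it assumes every entry in `args` is a file path, which holds for the configuration shown in this README but not for MCP servers in general.

```python
import json
from pathlib import Path


def check_mcp_config(config_path, server_name="doc_reader"):
    """Return a list of problems found in one server entry (empty if none)."""
    path = Path(config_path)
    if not path.is_file():
        return [f"config file not found: {path}"]
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"no '{server_name}' entry under mcpServers"]

    # Assumption: every args entry is a file path, as in this README's config.
    candidates = [("command", server.get("command", ""))]
    candidates += [("args", arg) for arg in server.get("args", [])]

    problems = []
    for label, value in candidates:
        candidate = Path(value)
        if not candidate.is_absolute():
            problems.append(f"{label} is not an absolute path: {value}")
        elif not candidate.exists():
            problems.append(f"{label} does not exist: {value}")
    return problems


if __name__ == "__main__":
    for problem in check_mcp_config(Path.home() / ".aws" / "amazonq" / "mcp.json"):
        print(problem)
```

An empty result means the entry points at files that exist; it does not prove that the server starts, so the manual test is still worth running.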
Test server manually:
cd /path/to/amazon-q-web_search
.venv/bin/python main.py
Check Amazon Q logs:
q chat logdump
Server Starts But Tools Don't Work
Verify dependencies are installed:
cd /path/to/amazon-q-web_search
.venv/bin/python -c "import httpx, bs4, markdownify; print('OK')"
Reinstall dependencies:
uv sync --reinstall
Connection Timeout
Increase timeout in settings:
q settings mcp.initTimeout 60000
Dependencies
| Package | Purpose |
|---|---|
| httpx | Async HTTP client for fetching web pages |
| beautifulsoup4 | HTML parsing and navigation |
| lxml | Fast XML/HTML parser |
| markdownify | HTML to Markdown conversion |
| mcp | Model Context Protocol SDK |
Limitations
| Limit | Value |
|---|---|
| Maximum content size | 10MB per page |
| Maximum URLs per batch | 10 |
| Request timeout | 30 seconds |
| Content type | HTML only |
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Open an Issue for bug reports or feature requests
- Star this repo if you find it useful!
<div align="center"> <sub>Built with ❤️ for Amazon Q Developer</sub> </div>