Bookmark Geni MCP Server
Enables semantic search across browser bookmarks from Chrome, Firefox, Edge, Opera, and other browsers using natural language queries. Extracts and indexes bookmark content and metadata into a vector database for intelligent retrieval.
Standalone MCP (Model Context Protocol) server that processes local browser bookmarks and makes their URLs easily searchable using natural language.
Connect to Claude Desktop or Gemini CLI:
<img width="832" height="595" alt="Screenshot 2025-11-28 at 10 16 08 PM" src="https://github.com/user-attachments/assets/99c368bd-127e-4f08-8e04-0fbfd7322146" />
<img width="835" height="614" alt="Screenshot 2025-11-28 at 10 19 21 PM" src="https://github.com/user-attachments/assets/978e519a-4327-44d7-a8a6-31d250ebcec9" />
<img width="842" height="606" alt="Screenshot 2025-11-28 at 10 19 49 PM" src="https://github.com/user-attachments/assets/8be42616-c138-4d98-81ee-0765c0ddfc73" />
Features
- 🔍 Multi-Browser Support: Reads bookmarks from Chrome, Edge, Firefox, Opera, ChatGPT Atlas, and Perplexity Comet
- 📄 Content Extraction: Fetches HTML content from URLs and extracts text for semantic search
- 🏷️ Metadata Extraction: Extracts descriptions from HTML metadata tags (Open Graph, meta description, title)
- 📊 Vector Storage: Stores bookmark embeddings in ChromaDB using a sentence transformer model (all-MiniLM-L6-v2)
- 🔎 RAG Search: Query bookmarks using natural language with metadata filtering
- 📦 Portability: Export and import embeddings to/from pickle files for easy transfer
- ⚡ Performance: Batch processing with concurrency and caching
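As an illustration of the Metadata Extraction feature, here is a minimal sketch of picking a description from a page's metadata tags. It uses only the standard library. The class and function names are hypothetical, and the priority order (Open Graph, then meta description, then title) is an assumption rather than the project's confirmed behavior.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Collects candidate descriptions from a page's metadata tags."""

    def __init__(self):
        super().__init__()
        self.og_description = None
        self.meta_description = None
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            content = attrs.get("content")
            if attrs.get("property") == "og:description":
                self.og_description = content
            elif (attrs.get("name") or "").lower() == "description":
                self.meta_description = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title = (self.title or "") + data


def extract_description(html):
    """Return the first non-empty candidate, in the assumed priority order."""
    parser = DescriptionParser()
    parser.feed(html)
    for candidate in (parser.og_description, parser.meta_description, parser.title):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```

Falling back to the title means a bookmark still gets some searchable text when a page declares no description at all.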
Installation
1. Install dependencies:
   pip install -r requirements.txt
2. Configure the server by editing config.yaml (optional; defaults are provided).
3. Make the start script executable:
   chmod +x scripts/start_mcp_server.sh
Usage
Once the server is running, it can be used with any MCP client. To index browser bookmarks and generate metadata (here, for Chrome), send the client a prompt such as:
"Generate metadata for chrome bookmarks"
Once the bookmarks are indexed, you can query them with a prompt such as:
"Query bookmarks for 'python'"
Standalone MCP Server
The server can be used independently with any MCP client. Start it with the bash script and use mcp.json as a reference for the client configuration:
# Start the server using the bash script
./scripts/start_mcp_server.sh
With Gemini CLI
Connect the server to the Gemini CLI by editing ~/.gemini/settings.json and adding the following entry to the "mcpServers" section:
{
"mcpServers": {
"bookmarkGeni": {
"command": "bash",
"args": ["/path/to/bookmark_geni_mcp/scripts/start_mcp_server.sh"],
"env": {
"PYTHON_PATH": "/path/to/your/python3"
}
}
}
}
Alternatively, use the provided mcp.json as a reference for the configuration.
With Claude Desktop
Connect the server to Claude Desktop by adding the following to the Claude Desktop settings:
{
"mcpServers": {
"bookmarkGeni": {
"command": "bash",
"args": ["/path/to/bookmark_geni_mcp/scripts/start_mcp_server.sh"],
"env": {
"PYTHON_PATH": "/usr/bin/python3"
}
}
}
}
Note: Replace /path/to/bookmark_geni_mcp with the actual path to this repository and /path/to/your/python3 with your Python interpreter path.
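If you prefer not to edit the settings file by hand, the entry can be merged in programmatically. The following is a minimal sketch, not part of this repository: the helper name add_mcp_server is hypothetical, and it assumes the settings file is plain JSON with an optional top-level "mcpServers" object, as in the snippets above.

```python
import json
from pathlib import Path


def add_mcp_server(settings_path, name, script_path, python_path):
    """Insert or update one "mcpServers" entry, keeping all other settings."""
    path = Path(settings_path)
    text = path.read_text() if path.exists() else ""
    settings = json.loads(text) if text.strip() else {}

    servers = settings.setdefault("mcpServers", {})
    servers[name] = {
        "command": "bash",
        "args": [script_path],
        "env": {"PYTHON_PATH": python_path},
    }

    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

For the Gemini CLI this could be called as add_mcp_server(Path.home() / ".gemini" / "settings.json", "bookmarkGeni", "/path/to/bookmark_geni_mcp/scripts/start_mcp_server.sh", "/path/to/your/python3"). Back up the settings file first, since the sketch rewrites it in place.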
Configuration
The server reads configuration from config.yaml in the MCP server root directory. This includes:
- Browser enable/disable settings
- ChromaDB path (relative to MCP server root or absolute path)
- Metadata JSONL path (relative to MCP server root or absolute path)
- URL processing limit (default: -1, meaning process all URLs)
- Debug mode
Example config.yaml:
debug: false
browsers:
  Chrome:
    enabled: true
  Edge:
    enabled: true
    # Optional: Override default path detection
    # paths:
    #   - "/path/to/custom/Bookmarks"
chromaDbPath: ".chromadb"
metadataJsonlPath: "data/bookmarks_metadata.jsonl"
urlLimit: -1  # -1 means process all, set to positive number to limit
The server is now completely independent and does not require the Gemini CLI extension folder.
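The path and limit rules described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, the default values are copied from the example config.yaml above (only the urlLimit default of -1 is stated explicitly), and the real logic lives in the project's own config module.

```python
from pathlib import Path

# Assumed defaults, taken from the example config.yaml.
DEFAULTS = {
    "debug": False,
    "chromaDbPath": ".chromadb",
    "metadataJsonlPath": "data/bookmarks_metadata.jsonl",
    "urlLimit": -1,
}


def resolve_path(server_root, value):
    """Absolute paths are kept; relative ones are anchored at the server root."""
    path = Path(value).expanduser()
    return path if path.is_absolute() else Path(server_root) / path


def apply_url_limit(urls, url_limit):
    """A negative limit means "process all"; otherwise keep the first N URLs."""
    urls = list(urls)
    return urls if url_limit < 0 else urls[:url_limit]


def effective_config(server_root, overrides=None):
    """Merge user overrides onto the defaults and resolve both storage paths."""
    config = {**DEFAULTS, **(overrides or {})}
    for key in ("chromaDbPath", "metadataJsonlPath"):
        config[key] = resolve_path(server_root, config[key])
    return config
```

With a server root of /srv/bookmark_geni_mcp and no overrides, the metadata file would resolve to /srv/bookmark_geni_mcp/data/bookmarks_metadata.jsonl.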
Browser Support
The server supports the following browsers:
- Chrome: Windows, macOS, Linux
- Edge: Windows, macOS, Linux
- Firefox: Windows, macOS, Linux
- Opera: Windows, macOS, Linux
- ChatGPT Atlas: macOS (Chromium-based)
- Perplexity Comet: Windows, macOS, Linux (Chromium-based)
Note: Safari is not supported because reading Bookmarks.plist requires special macOS permissions that are not granted by default. To use Safari bookmarks, you would need to grant Full Disk Access permissions to the Python interpreter, which is not recommended for security reasons.
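For the Chromium-based browsers in this list, the bookmarks file is typically a JSON document with a top-level "roots" object whose nodes have a "type" of either "folder" (with "children") or "url". The sketch below walks that structure with the standard library. The function names are hypothetical and this is not the project's bookmark_parser.py. Firefox stores bookmarks in a SQLite database instead and needs a different reader.

```python
import json


def iter_bookmarks(node, folder=""):
    """Yield one dict per bookmark, carrying the folder path it was found in."""
    if node.get("type") == "url":
        yield {"folder": folder, "name": node.get("name", ""), "url": node.get("url", "")}
        return
    name = node.get("name", "")
    path = f"{folder}/{name}" if folder and name else (folder or name)
    for child in node.get("children", []):
        yield from iter_bookmarks(child, path)


def parse_chromium_bookmarks(text):
    """Flatten every root (bookmark bar, other, synced, ...) into one list."""
    data = json.loads(text)
    results = []
    for root in data.get("roots", {}).values():
        if isinstance(root, dict):  # skip any non-node values under "roots"
            results.extend(iter_bookmarks(root))
    return results
```

Keeping the folder path on each record is what makes a metadata filter such as {"folder": "Work"} possible later, although the exact folder value the server stores is not specified here.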
Tools
The server provides the following MCP tools:
- generate_bookmarks_metadata
  - Scans selected browsers for bookmarks
  - Fetches HTML content and generates metadata
  - Creates embeddings and stores them in ChromaDB
  - Parameters: browsers (e.g., "Chrome,Firefox" or "All")
- query_bookmarks
  - Performs semantic search on stored bookmarks
  - Supports metadata filtering
  - Parameters:
    - query: Search text
    - limit: Max results (default 10)
    - where: Filter dict (e.g., {"folder": "Work"})
- list_browsers
  - Lists installed browsers and their detected bookmark file paths
  - Parameters: None
- get_stats
  - Returns database statistics (total count, collection info)
  - Parameters: None
- export_embeddings
  - Exports all data to a pickle file for backup or transfer
  - Parameters: pickle_path (optional)
- import_embeddings
  - Imports data from a pickle file
  - Parameters: pickle_path (required)
See mcp.json for detailed schema definitions.
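MCP clients normally build tool calls for you, but it can help to see what one looks like on the wire. The sketch below builds a JSON-RPC 2.0 "tools/call" request for query_bookmarks using the parameters listed above. The helper name and the argument values are illustrative assumptions, and a real session also requires the MCP initialization handshake before any tool call.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example values only; "where" narrows results by stored metadata.
request = tool_call_request(
    1,
    "query_bookmarks",
    {"query": "python packaging guides", "limit": 5, "where": {"folder": "Work"}},
)
print(json.dumps(request, indent=2))
```

Tools that take no parameters, such as list_browsers and get_stats, would use an empty arguments object.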
Workflow
flowchart TD
%% Nodes
User([User / CLI])
Server[MCP Server]
Detector[Browser Detector]
Parser[Bookmark Parser]
Generator[Metadata Generator]
URLTracker[URL Tracker]
VectorStore[Bookmark Vector Store]
SearchModule[Semantic Search Module]
ChromaDB[(ChromaDB)]
JSONL[(JSONL Storage)]
%% Flow
User -->|generate| Server
Server -->|Get Paths| Detector
Detector -->|Browser Paths| Server
subgraph Processing [Processing Loop]
direction TB
Server -->|Parse File| Parser
Parser -->|Raw Bookmarks| Server
Server -->|Check Processed| URLTracker
URLTracker -->|Filter New| Server
Server -->|Batch Process| Generator
Generator -->|Fetch HTML| Generator
Generator -->|Extract Metadata| Generator
Generator -->|Enriched Bookmarks| Server
Server -->|Store| VectorStore
VectorStore -->|Generate Embeddings| SearchModule
SearchModule -->|Store Vectors| ChromaDB
Server -->|Write Metadata| JSONL
Server -->|Track URLs| URLTracker
end
Server -->|JSON Result| User
User -->|query| Server
Server -->|Search| VectorStore
VectorStore -->|Semantic Search| SearchModule
SearchModule -->|Query Vectors| ChromaDB
ChromaDB -->|Results| SearchModule
SearchModule -->|Ranked Results| VectorStore
VectorStore -->|Bookmarks| Server
Server -->|JSON Results| User
%% Styling
style User fill:#ff9999,stroke:#333,stroke-width:2px
style Server fill:#99ccff,stroke:#333,stroke-width:2px
style Detector fill:#99ff99,stroke:#333,stroke-width:2px
style Parser fill:#ffff99,stroke:#333,stroke-width:2px
style Generator fill:#ffcc99,stroke:#333,stroke-width:2px
style VectorStore fill:#cc99ff,stroke:#333,stroke-width:2px
style SearchModule fill:#ff99cc,stroke:#333,stroke-width:2px
style URLTracker fill:#99ffcc,stroke:#333,stroke-width:2px
style ChromaDB fill:#9999ff,stroke:#333,stroke-width:2px
style JSONL fill:#cccc99,stroke:#333,stroke-width:2px
style Processing fill:#f9f9f9,stroke:#666,stroke-dasharray: 5 5
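The processing loop in the flowchart (check processed URLs, filter new ones, process in batches, store, then track) can be sketched in a few lines. Everything here is hypothetical: the function names, the batch size, and the enrich and store callables stand in for the Metadata Generator and Bookmark Vector Store, and the sketch runs sequentially rather than concurrently.

```python
def filter_new(bookmarks, processed_urls):
    """Drop bookmarks already processed, plus duplicates within this run."""
    seen = set(processed_urls)
    fresh = []
    for bookmark in bookmarks:
        url = bookmark["url"]
        if url not in seen:
            seen.add(url)
            fresh.append(bookmark)
    return fresh


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_pipeline(bookmarks, processed_urls, enrich, store, batch_size=2):
    """Enrich and store new bookmarks batch by batch; return how many were new."""
    fresh = filter_new(bookmarks, processed_urls)
    for batch in batched(fresh, batch_size):
        enriched = [enrich(bookmark) for bookmark in batch]
        store(enriched)
        # Track URLs only after the batch is stored, so a failed batch is retried.
        processed_urls.update(bookmark["url"] for bookmark in batch)
    return len(fresh)
```

Here processed_urls is a mutable set owned by the caller, which plays the role of the URL Tracker in the diagram.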
Structure
bookmark_geni_mcp/
├── config.yaml                      # Server configuration file
├── mcp.json                         # MCP server configuration
├── pyproject.toml                   # Project configuration
├── requirements.txt                 # Python dependencies
├── servers/
│   └── bookmark_server.py           # MCP server implementation
├── scripts/
│   └── start_mcp_server.sh          # Bash start script
└── src/
    ├── browser_detector.py          # Browser path detection
    ├── bookmark_parser.py           # Bookmark file parsing
    ├── metadata_generator.py        # HTML content and metadata extraction
    ├── bookmark_vector_store.py     # Bookmark-specific vector store wrapper
    ├── metadata_storage.py          # JSONL file storage
    ├── config.py                    # Configuration management
    └── search/                      # Semantic search module
        ├── __init__.py
        ├── semantic_search.py
        ├── vector_store.py
        ├── embeddings.py
        └── config.py