roshan-baaz-mcp
Enables semantic search and retrieval for Persian content through Roshan AI's Baaz service, wrapped as MCP tools with support for multi-instance self-hosted deployments.
<div align="center">
<img src="assets/banner.svg" alt="roshan-baaz-mcp" width="100%" />
<img src="assets/icons/baaz.svg" height="100" alt="Baaz icon"/>
roshan-baaz-mcp
A self-hostable Model Context Protocol server for Roshan AI's Baaz (باز) Persian semantic-search service.
<sub>Built from the public API documentation at <a href="https://docs.roshan-ai.ir">docs.roshan-ai.ir</a>. Unofficial community integration.</sub>
</div>
What is this?
Baaz (باز) is Roshan AI's Persian-native semantic search and retrieval service. You give it documents; it chunks them, computes embeddings, and stores them across an Elasticsearch + Weaviate backend. You can then run semantic queries, find similar documents, check document status, fetch full documents, and manage indices — all tuned for Persian (فارسی) content.
roshan-baaz-mcp wraps that HTTP API as MCP tools so that Claude, or any
MCP-compatible client, can call Baaz endpoints as first-class tools.
Baaz is self-hostable, and a single deployment is rarely enough: teams run
many independent Baaz instances (per region, per tenant, per environment).
This server treats named instances as a core concept — one MCP process can
route every tool call to the right Baaz deployment via an optional instance
argument.
Highlights
- 100% endpoint coverage: every documented Baaz endpoint is a tool.
- Multi-instance / self-hosting first: route by `instance`; tokens are never exposed by `list_instances`.
- Guardrails: http(s) URL validation, ~10MB body cap, pagination clamps, and token redaction in every error path.
- Offline docs tool: `roshan_baaz_docs` describes the service and tools without any network call.
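The guardrails listed above can be pictured with a small standard-library sketch. The function names, the exact byte limit, the page-size bound, and the `***` redaction marker are assumptions made for illustration; they are not taken from the server's source.

```python
from typing import Optional
from urllib.parse import urlparse

MAX_BODY_BYTES = 10 * 1024 * 1024  # assumed exact value for the "~10MB" cap
MAX_PAGE_SIZE = 100                # assumed upper bound for pagination clamps


def validate_base_url(url: str) -> str:
    """Accept only absolute http(s) URLs that name a host."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"base URL must be http(s) with a host: {url!r}")
    return url.rstrip("/")


def check_body_size(body: bytes) -> None:
    """Reject an oversized request body before any network call is made."""
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes; cap is {MAX_BODY_BYTES}")


def clamp_page_size(requested: int) -> int:
    """Clamp a requested page size into the range [1, MAX_PAGE_SIZE]."""
    return max(1, min(requested, MAX_PAGE_SIZE))


def redact(message: str, token: Optional[str]) -> str:
    """Replace every occurrence of the token in an error message."""
    if not token:
        return message
    return message.replace(token, "***")
```

Running the checks before the HTTP call means a bad URL or an oversized batch fails locally, and passing every error string through `redact` keeps tokens out of whatever the MCP client displays.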
Tool reference
Every tool accepts an optional instance: str | None selecting which configured
Baaz deployment to call (defaults to default_instance).
| Tool | Method & endpoint | Purpose |
|---|---|---|
| `baaz_index` | `POST /{index}/index` | Index/update a batch of documents (chunked + embedded). |
| `baaz_semantic_search` | `POST /{index}/document/semantic_query` | Semantic query → ranked documents with matching chunks. |
| `baaz_document_status` | `POST /{index}/document_status` | Check whether documents (by URL) already exist. |
| `baaz_delete_index` | `DELETE /{index}/delete_index` | Delete an entire index and all its documents. |
| `baaz_delete_documents` | `DELETE /{index}/delete_index` (body `{type}`) | Delete only documents of one type. |
| `baaz_stats` | `GET /{index}/stats` | Index statistics keyed by `{index}_{type}`. |
| `baaz_similar_documents` | `POST /{index}/similar/documents` | Find documents similar to a given `document_id`. |
| `baaz_view_document` | `GET /{index}/document/view?id=` | Fetch a full document by id. |
| `healthcheck` | `GET /healthcheck` | Check that an instance is up and ready. |
| `list_instances` | (local) | List configured instances: names + base URLs only, never tokens. |
| `roshan_baaz_docs` | (local) | Offline docs about Baaz and these tools. |
See the Baaz API guide for full request/response field semantics.
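To make the mapping from tool to endpoint concrete, the sketch below builds (without sending) the kind of request that `baaz_semantic_search` corresponds to, using only the standard library. The path and the `Authorization: Token <token>` header scheme come from this README; the helper name, the index name `news`, and the payload field `query` are hypothetical, so the Baaz API guide remains the reference for the real body fields.

```python
import json
from urllib.request import Request


def build_semantic_query_request(
    base_url: str, index: str, token: str, payload: dict
) -> Request:
    """Build a POST /{index}/document/semantic_query request (not sent)."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/{index}/document/semantic_query",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical index name and payload field, for illustration only.
req = build_semantic_query_request(
    "https://baaz.roshan-ai.ir",
    "news",
    "your-secret-token",
    {"query": "هوش مصنوعی"},
)
print(req.get_method(), req.full_url)
```

The other HTTP-backed tools in the table would differ only in method, path, and body.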
Install
Requires Python 3.10+ (developed and tested on 3.11).
git clone https://github.com/roshan-research/roshan-baaz-mcp.git
cd roshan-baaz-mcp
python -m venv .venv && source .venv/bin/activate
pip install -e . # add ".[dev]" for the test/lint toolchain
Run it (stdio transport by default):
python -m roshan_baaz_mcp # or: roshan-baaz-mcp
python -m roshan_baaz_mcp --help # transports, host/port, log level
Configuration (multi-instance)
Configuration is read from environment variables via pydantic-settings.
Shorthand (single instance)
The fastest way to point at one Baaz deployment. This synthesizes an instance
named default:
| Variable | Default | Description |
|---|---|---|
| `ROSHAN_BAAZ_BASE_URL` | `https://baaz.roshan-ai.ir` | Base URL of the Baaz deployment. |
| `ROSHAN_BAAZ_TOKEN` | (none) | Token sent as `Authorization: Token <token>`. |
export ROSHAN_BAAZ_BASE_URL=https://baaz.roshan-ai.ir
export ROSHAN_BAAZ_TOKEN=your-secret-token
Nested (one or many named instances)
Nested variables use the prefix `ROSHAN_BAAZ__` and the nested delimiter `__`.
Each instance is configured under `INSTANCES__<NAME>__`:
| Variable | Default | Description |
|---|---|---|
| `ROSHAN_BAAZ__INSTANCES__<NAME>__BASE_URL` | `https://baaz.roshan-ai.ir` | Base URL of that instance. |
| `ROSHAN_BAAZ__INSTANCES__<NAME>__TOKEN` | (none) | Auth token for that instance. |
| `ROSHAN_BAAZ__INSTANCES__<NAME>__VERIFY_SSL` | `true` | Verify the TLS certificate. |
| `ROSHAN_BAAZ__INSTANCES__<NAME>__TIMEOUT` | `60` | Per-request timeout (seconds). |
| `ROSHAN_BAAZ__DEFAULT_INSTANCE` | `default` | Instance used when a tool omits `instance`. |
| `ROSHAN_BAAZ__LOG_LEVEL` | `INFO` | Logging level. |
# Two self-hosted regions behind one MCP server
export ROSHAN_BAAZ__INSTANCES__TEHRAN__BASE_URL=https://baaz.tehran.example.ir
export ROSHAN_BAAZ__INSTANCES__TEHRAN__TOKEN=tehran-token
export ROSHAN_BAAZ__INSTANCES__MASHHAD__BASE_URL=https://baaz.mashhad.example.ir
export ROSHAN_BAAZ__INSTANCES__MASHHAD__TOKEN=mashhad-token
export ROSHAN_BAAZ__DEFAULT_INSTANCE=tehran
Tools then target a deployment with instance="mashhad"; omit it to use the
default. Call list_instances to discover configured names (tokens are never
returned).
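The nested naming scheme can be pictured with a standard-library sketch that groups such variables into named instances and resolves an `instance` argument. The server itself reads its configuration through pydantic-settings, so the function names and the lower-casing of instance names below are assumptions, not its actual implementation.

```python
from typing import Dict, Mapping, Optional

PREFIX = "ROSHAN_BAAZ__"
DEFAULT_BASE_URL = "https://baaz.roshan-ai.ir"


def parse_instances(env: Mapping[str, str]) -> Dict[str, Dict[str, str]]:
    """Group ROSHAN_BAAZ__INSTANCES__<NAME>__<FIELD> variables by name."""
    instances: Dict[str, Dict[str, str]] = {}
    for key, value in env.items():
        if not key.startswith(PREFIX + "INSTANCES__"):
            continue
        parts = key[len(PREFIX):].split("__")
        if len(parts) != 3:
            continue  # not of the form INSTANCES__<NAME>__<FIELD>
        _, name, field = parts
        settings = instances.setdefault(
            name.lower(), {"base_url": DEFAULT_BASE_URL}
        )
        settings[field.lower()] = value
    return instances


def resolve_instance(
    instances: Mapping[str, Dict[str, str]],
    env: Mapping[str, str],
    requested: Optional[str] = None,
) -> Dict[str, str]:
    """Pick the requested instance, or the configured default if omitted."""
    name = (requested or env.get(PREFIX + "DEFAULT_INSTANCE", "default")).lower()
    if name not in instances:
        # Only names are reported here, never tokens.
        raise KeyError(f"unknown instance {name!r}; configured: {sorted(instances)}")
    return instances[name]


# The two-region example from above, passed as a plain mapping.
env = {
    "ROSHAN_BAAZ__INSTANCES__TEHRAN__BASE_URL": "https://baaz.tehran.example.ir",
    "ROSHAN_BAAZ__INSTANCES__TEHRAN__TOKEN": "tehran-token",
    "ROSHAN_BAAZ__INSTANCES__MASHHAD__BASE_URL": "https://baaz.mashhad.example.ir",
    "ROSHAN_BAAZ__INSTANCES__MASHHAD__TOKEN": "mashhad-token",
    "ROSHAN_BAAZ__DEFAULT_INSTANCE": "tehran",
}
instances = parse_instances(env)
print(resolve_instance(instances, env)["base_url"])
print(resolve_instance(instances, env, "mashhad")["base_url"])
```

In real use, `os.environ` would be passed in place of the `env` dictionary.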
Use with an MCP client
Add the server to your MCP client (e.g. Claude Desktop) config:
{
  "mcpServers": {
    "roshan-baaz": {
      "command": "python",
      "args": ["-m", "roshan_baaz_mcp"],
      "env": {
        "ROSHAN_BAAZ_BASE_URL": "https://baaz.roshan-ai.ir",
        "ROSHAN_BAAZ_TOKEN": "your-secret-token"
      }
    }
  }
}
Architecture
The MCP client talks to roshan-baaz-mcp over a transport (stdio by default).
The server resolves the requested instance, applies guardrails, and calls the
Baaz HTTP API.

Request flow
A typical retrieval session: index documents, run a semantic query, read the ranked results, then optionally expand with similar documents.

Self-hosting & scaling
Because Baaz is self-hosted, one MCP process can front many Baaz deployments,
routing each tool call by its instance argument (region / tenant / environment).

Scaling notes:
- Stateless: the server holds no per-request state; run as many replicas as you need behind your scheduler. Configuration comes entirely from the environment.
- One process, many instances: add instances by setting more `ROSHAN_BAAZ__INSTANCES__<NAME>__*` variables; no code changes.
- Rate limits: Baaz limits to ~100 requests/minute per deployment; spread heavy load across instances and back off on `429`.
- Body cap: requests are capped at ~10MB (enforced client-side before the call); batch large indexing jobs accordingly.
- Horizontal autoscaling: a Kubernetes HPA manifest and Helm chart ship in `deploy/`.
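The advice to back off on `429` can be sketched as a retry wrapper with exponential backoff. The `send` callable, its `(status, body)` return shape, and the retry parameters are all hypothetical; this is one possible client-side approach, not the server's actual retry logic.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send(); on HTTP 429 wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate-limited after the final attempt


# Simulated responses: two rate-limit replies, then success.
responses = iter([(429, b""), (429, b""), (200, b"ok")])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the function testable without real waiting; in production the default `time.sleep` applies.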
Deployment
Ready-to-use manifests live in deploy/:
- Docker: `Dockerfile` and a two-instance `docker-compose.yml`.
- Kubernetes: raw manifests in `deploy/kubernetes/` (namespace, configmap, secret, deployment, service, ingress, hpa, kustomization).
- Helm: chart in `deploy/helm/roshan-baaz-mcp/`.
- Terraform: example in `deploy/terraform/`.
See deploy/README.md for details.
Testing
pip install -e ".[dev]"
python -m pytest -q # all HTTP is mocked with respx; no network
ruff check src tests
Optional live tests run against a real Baaz deployment when you opt in:
export ROSHAN_BAAZ_LIVE=1
export ROSHAN_BAAZ_BASE_URL=https://baaz.roshan-ai.ir
export ROSHAN_BAAZ_TOKEN=your-secret-token
python -m pytest tests/live -q
There is also a no-network smoke test that asserts every tool has a non-empty
description and an instance parameter:
python scripts/smoke_test.py
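The kind of check the smoke test describes can be sketched with `inspect`. This assumes tools are plain Python functions whose description is their docstring; the real script inspects the registered MCP tools and may work differently. `check_tools` and `broken_tool` are hypothetical names.

```python
import inspect
from typing import Callable, Iterable, List, Optional


def check_tools(tools: Iterable[Callable]) -> List[str]:
    """Return one problem string per missing description or parameter."""
    problems = []
    for tool in tools:
        if not (inspect.getdoc(tool) or "").strip():
            problems.append(f"{tool.__name__}: missing description")
        if "instance" not in inspect.signature(tool).parameters:
            problems.append(f"{tool.__name__}: missing instance parameter")
    return problems


def healthcheck(instance: Optional[str] = None) -> dict:
    """Check that an instance is up and ready."""
    return {}


def broken_tool(index: str):  # no docstring, no instance parameter
    return None


print(check_tools([healthcheck, broken_tool]))
```

An empty list means every tool passed both checks.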
Examples
examples/inspect_server.py prints the
registered tools and their schemas without any network call. See
examples/README.md.
License
MIT. Baaz (باز) and Roshan AI are products/trademarks of Roshan AI; this is an unofficial, community-maintained integration built from public docs.
<div align="center">
<img src="assets/icons/roshan.svg" alt="roshan-logo" width="40%"/>
</div>