A-Modular-Kingdom

Production-ready MCP server providing RAG, hierarchical memory, and 8+ tools for AI agents via the Model Context Protocol.


🏰 A-Modular-Kingdom

Production-ready MCP server with RAG, memory, and tools

Stop rebuilding the same infrastructure. Connect any AI agent to long-term memory, document retrieval, and 8+ powerful tools through the Model Context Protocol.

The Problem

Building AI agents? You keep reinventing:

  • Long-term memory that persists across sessions
  • Document retrieval (RAG) for knowledge access
  • Tool integration (web search, vision, code execution, browser automation)

Every project starts from scratch. Every agent rebuilds the wheel.

The Solution

A-Modular-Kingdom is the infrastructure layer you're missing:

# Start the MCP server
python src/agent/host.py

Now any agent (Claude Desktop, custom chatbots, multi-agent systems) gets instant access to:

  • ✅ Hierarchical memory (global rules, project context)
  • ✅ 3 RAG implementations (v1/v2/v3) for document search
  • ✅ 8+ production-ready tools via MCP

One foundation. Infinite applications.


✨ Core Features

  • MCP Protocol - Standard interface for AI tool access
  • 3 RAG Versions - Choose your retrieval strategy (FAISS, Qdrant, custom)
  • Scoped Memory - Global rules, preferences, project-specific context
  • 8+ Tools - Vision, code exec, browser, web search, TTS/STT, and more
  • No Vendor Lock-in - Local Ollama models, open-source stack
  • Production Ready - Smart reindexing, Unicode support, error handling

🚀 Quick Start

Prerequisites

# Required
Python 3.10+
Ollama (for embeddings: ollama pull embeddinggemma)

# Optional
UV package manager (faster than pip)

Installation

# Clone the repository
git clone https://github.com/MasihMoafi/A-Modular-Kingdom.git
cd A-Modular-Kingdom

# Install dependencies
uv sync
# or: pip install -r requirements.txt

# Pull required Ollama model
ollama pull embeddinggemma

Start the MCP Server

# Start host.py MCP server
python src/agent/host.py

Connect Your Agent

Option 1: Claude Desktop

Add this entry to claude_desktop_config.json (JSON does not allow comments, so paste only the block itself):
{
  "mcpServers": {
    "a-modular-kingdom": {
      "command": "python",
      "args": ["/full/path/to/A-Modular-Kingdom/src/agent/host.py"]
    }
  }
}
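
Claude Desktop starts the server from its own working directory, which is why the entry above uses a full path. As a minimal sketch, the helper below builds that entry from a checkout location and prints it as JSON. `build_server_entry` and the example path are illustrative names, not part of the project; only the JSON layout comes from this README.

```python
import json
from pathlib import Path


def build_server_entry(repo_root: str, name: str = "a-modular-kingdom") -> dict:
    """Return an mcpServers block pointing at host.py inside repo_root.

    Hypothetical helper: only the JSON layout is taken from the README.
    """
    host = Path(repo_root).expanduser().resolve() / "src" / "agent" / "host.py"
    return {"mcpServers": {name: {"command": "python", "args": [str(host)]}}}


if __name__ == "__main__":
    # Paste the printed block into claude_desktop_config.json
    print(json.dumps(build_server_entry("~/A-Modular-Kingdom"), indent=2))
```

Resolving the path up front avoids the most common setup failure, a relative path that only works from the repository root.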

Option 2: Interactive Client

# Use the included chat interface
python src/agent/main.py

Option 3: Custom Integration

# Connect via MCP in your own agent
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="python",
    args=["/path/to/host.py"]
)
# Pass server_params to smolagents' ToolCollection.from_mcp
# (see "Custom Agent" under Integration Examples)

🛠️ Available Tools

The MCP server exposes these tools:

| Tool | Description | Use Case |
|------|-------------|----------|
| query_knowledge_base | RAG search (v1/v2/v3) | "How does auth work in this codebase?" |
| save_memory | Scoped memory storage | Save global rules or project context |
| search_memories | Semantic memory search | Retrieve past decisions/preferences |
| web_search | DuckDuckGo search | Current events, latest docs |
| browser_automation | Playwright web scraping | Extract text/screenshot from URLs |
| code_execute | Safe Python sandbox | Run code in isolated environment |
| analyze_media | Vision with Ollama | Analyze images/videos |
| text_to_speech | TTS (pyttsx3/kokoro) | Generate audio from text |
| speech_to_text | Whisper STT | Transcribe audio files |

📚 RAG System

Three implementations with different trade-offs:

V1 - Simple & Fast

  • Stack: FAISS + BM25
  • Speed: <1s
  • Use Case: Small projects, quick prototypes

V2 - Production (Recommended)

  • Stack: Qdrant + BM25 + CrossEncoder reranking
  • Speed: <1s with smart caching
  • Use Case: Production apps, large codebases
  • Features: Smart reindexing, cloud-ready

V3 - Advanced

  • Stack: Custom vector index + BM25 + RRF fusion + LLM reranking
  • Speed: 2-3s (LLM reranking overhead)
  • Use Case: Research, maximum accuracy
  • Features: Contextual retrieval, custom distance metrics
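
V3 merges its vector and BM25 rankings with RRF (reciprocal rank fusion). The sketch below shows the standard RRF formula, where each document scores the sum of 1 / (k + rank) over the lists it appears in, using the commonly cited constant k = 60. The function name, the constant, and the sample file names are assumptions for illustration and are not taken from the project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector_hits = ["auth.py", "session.py", "utils.py"]  # hypothetical results
    bm25_hits = ["session.py", "auth.py", "README.md"]   # hypothetical results
    for doc_id, score in reciprocal_rank_fusion([vector_hits, bm25_hits]):
        print(f"{doc_id}: {score:.4f}")
```

Because RRF works on ranks rather than raw scores, it needs no normalization between the vector and keyword retrievers.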

Usage:

# Via MCP tool
query_knowledge_base(
    query="How does authentication work?",
    version="v2",  # or "v1", "v3"
    doc_path="./src"  # optional
)
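
On the wire, MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. As a rough sketch, the call above would be encoded along these lines. The argument names mirror the usage example, while the request `id` and the helper name are arbitrary choices made here.

```python
import json


def tools_call_request(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(tools_call_request(
        "query_knowledge_base",
        {"query": "How does authentication work?", "version": "v2", "doc_path": "./src"},
    ))
```

In practice an MCP client library builds and sends this message for you; the sketch only shows what travels between agent and server.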

Supported Files: .py, .md, .txt, .pdf, .ipynb, .js, .ts
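
V2 lists smart reindexing among its features. The project's actual mechanism is not described in this README. One common approach, sketched below under that assumption, is to fingerprint every supported file and re-embed only those whose hash changed since the last run. The function names and the `previous` mapping are hypothetical.

```python
import hashlib
from pathlib import Path

# Extensions listed under "Supported Files"
SUPPORTED = {".py", ".md", ".txt", ".pdf", ".ipynb", ".js", ".ts"}


def fingerprint(root: str) -> dict[str, str]:
    """Map each supported file under root to the SHA-256 of its bytes."""
    hashes = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            hashes[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def files_to_reindex(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return files that are new or whose content hash changed."""
    return sorted(p for p, digest in current.items() if previous.get(p) != digest)


if __name__ == "__main__":
    before = {}  # would normally be loaded from the last indexing run
    after = fingerprint("./src")
    print(files_to_reindex(before, after))
```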


🧠 Memory System

Hierarchical scoped memory with automatic categorization:

Memory Scopes

| Scope | Persistence | Use Case |
|-------|-------------|----------|
| Global Rules | Forever, all projects | "Always use type hints" |
| Global Preferences | Forever, all projects | "Prefer dark mode" |
| Global Personas | Forever, all projects | Reusable agent personalities |
| Project Context | Current project | Architecture decisions, tech stack |
| Project Sessions | Temporary | Current task, recent changes |

Usage

# Save with explicit scope
save_memory(content="Always validate user input", scope="global_rules")

# Or use prefix shortcuts
save_memory(content="#global:rule:Never use eval()")
save_memory(content="#project:context:Uses FastAPI backend")

# Auto-inference from keywords
save_memory(content="User prefers Python 3.12")  # → global_preferences

# Search with priority (global → project)
search_memories(query="coding standards", top_k=5)

Storage: ~/.modular_kingdom/memories/ (global) + project-specific folders
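
To make the prefix shortcuts concrete, here is a minimal sketch of how a `#level:kind:` prefix could be split from the content and mapped to a scope name. Only `global_rules` and `global_preferences` appear as scope identifiers in the examples above. The other identifiers, the mapping table, and the fallback scope are assumptions, and the real parser may differ.

```python
# Hypothetical prefix -> scope table, modeled on the five scopes listed above
PREFIX_TO_SCOPE = {
    ("global", "rule"): "global_rules",
    ("global", "preference"): "global_preferences",
    ("global", "persona"): "global_personas",
    ("project", "context"): "project_context",
    ("project", "session"): "project_sessions",
}


def parse_memory(content: str, default_scope: str = "project_context") -> tuple[str, str]:
    """Split '#level:kind:text' into (scope, text); fall back to default_scope."""
    if content.startswith("#"):
        parts = content[1:].split(":", 2)
        if len(parts) == 3 and (parts[0], parts[1]) in PREFIX_TO_SCOPE:
            return PREFIX_TO_SCOPE[(parts[0], parts[1])], parts[2].strip()
    # No recognized prefix: keep the text untouched
    return default_scope, content


if __name__ == "__main__":
    print(parse_memory("#global:rule:Never use eval()"))
    print(parse_memory("#project:context:Uses FastAPI backend"))
```

Splitting at most twice keeps any later colons inside the memory text intact.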


📦 Package Installation

The MCP server can also be installed as a standalone package:

# Install with sentence-transformers (no Ollama required)
pip install "rag-mem[local]"  # quotes keep zsh from expanding the brackets

# Set embedding provider (add to your shell profile or script)
export MEMORY_MCP_EMBED_PROVIDER=sentence-transformers
export MEMORY_MCP_EMBED_MODEL=all-MiniLM-L6-v2

Python API:

from memory_mcp.config import Settings
from memory_mcp.rag import RAGPipeline
from memory_mcp.memory import MemoryStore

# RAG - index and search any codebase
pipeline = RAGPipeline(Settings(), document_paths=["./src"])
pipeline.index()
results = pipeline.search("how does authentication work")

# Memory - persistent storage across sessions
store = MemoryStore(Settings())
store.add("User prefers dark mode")
results = store.search("preferences")

CLI Usage:

memory-mcp init                      # Initialize config
memory-mcp serve --docs ./documents  # Start MCP server
memory-mcp index ./path/to/files     # Index documents

Alternative: Use Ollama (local, private)

pip install rag-mem
ollama pull nomic-embed-text
# No env vars needed - Ollama is the default
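
The two environment variables select the embedding backend, with Ollama as the fallback. A minimal sketch of that resolution logic follows. The variable names come from the instructions above, while the `EmbedConfig` class, the `"ollama"` provider string, and the assumption that `nomic-embed-text` is the default model are illustrative guesses, not the package's actual internals.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbedConfig:
    provider: str
    model: str


def load_embed_config(env=None) -> EmbedConfig:
    """Read provider and model from the environment, defaulting to Ollama."""
    env = os.environ if env is None else env
    return EmbedConfig(
        # Default values below are assumptions, not confirmed package defaults
        provider=env.get("MEMORY_MCP_EMBED_PROVIDER", "ollama"),
        model=env.get("MEMORY_MCP_EMBED_MODEL", "nomic-embed-text"),
    )


if __name__ == "__main__":
    print(load_embed_config())
```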

Package Size: 58KB of code (note: dependencies add roughly 2GB when PyTorch is installed)


🎯 Integration Examples

Claude Desktop

Already using Claude Desktop? Add the A-Modular-Kingdom tools to claude_desktop_config.json:

{
  "mcpServers": {
    "a-modular-kingdom": {
      "command": "python",
      "args": ["/path/to/src/agent/host.py"]
    }
  }
}

Now Claude has access to your codebase RAG, persistent memory, and all tools.

Gemini CLI

Add this entry to gemini-extension.json:
{
  "mcpServers": {
    "unified_knowledge_agent": {
      "command": "python",
      "args": ["/path/to/src/agent/host.py"]
    }
  }
}

Custom Agent

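from smolagents import LiteLLMModel, ToolCallingAgent, ToolCollection
from mcp import StdioServerParameters

# Launch host.py as a subprocess and talk to it over stdio
params = StdioServerParameters(
    command="python",
    args=["/path/to/host.py"]
)

# ToolCallingAgent needs a model; this Ollama model ID is only an example
model = LiteLLMModel(model_id="ollama_chat/llama3.2")

# Recent smolagents releases require trust_remote_code=True for MCP tools;
# drop the argument if your installed version does not accept it
with ToolCollection.from_mcp(params, trust_remote_code=True) as tools:
    agent = ToolCallingAgent(tools=list(tools.tools), model=model)
    result = agent.run("Search the codebase for auth logic")
    print(result)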

🤖 Example Applications

This repository includes example multi-agent systems built on the foundation:

Council Chamber (Hierarchical)

  • 3-tier agent hierarchy (Queen → Teacher → Code Agent)
  • Validation loops and task delegation
  • Uses ACP SDK + smolagents
  • Location: multiagents/council_chamber/

Note: These are demonstration applications, not the core product. The foundation (host.py) is the main offering.


🏗️ Architecture

┌─────────────────────────────────────┐
│     Your AI Application             │
│  (Agents, Chatbots, Workflows)      │
└────────────┬────────────────────────┘
             │ MCP Protocol
┌────────────▼────────────────────────┐
│      A-Modular-Kingdom              │
│  ┌─────────┐ ┌─────────┐ ┌────────┐ │
│  │   RAG   │ │ Memory  │ │ Tools  │ │
│  │ V1/V2/V3│ │ Scoped  │ │ 8+     │ │
│  └─────────┘ └─────────┘ └────────┘ │
│           host.py (MCP Server)      │
└─────────────────────────────────────┘

🧪 Testing & Performance

Run Tests

# Run all tests
pytest tests/ -v

# Run specific test suites
pytest tests/test_rag_v2.py -v
pytest tests/test_rag_v3.py -v
pytest tests/test_memory_global.py -v

# Run benchmarks
python tests/benchmark_rag.py

Performance

Benchmark Results (GPU/CUDA):

| Version | Docs | Cold Start | Warm Query |
|---------|------|------------|------------|
| V2 | 100 | 26.8s | 0.31s |
| V3 | 100 | 13.9s | 0.02s (15x faster!) |
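
The "15x faster" figure follows from the warm-query column: 0.31 s / 0.02 s ≈ 15.5. The short sketch below redoes that arithmetic for both columns using only the numbers reported above. The dictionary layout and function name are illustrative choices.

```python
# Timings in seconds, copied from the benchmark results (100 documents, GPU/CUDA)
RESULTS = {
    "V2": {"cold_start": 26.8, "warm_query": 0.31},
    "V3": {"cold_start": 13.9, "warm_query": 0.02},
}


def speedup(metric: str, baseline: str = "V2", candidate: str = "V3") -> float:
    """Return how many times faster candidate is than baseline on metric."""
    return RESULTS[baseline][metric] / RESULTS[candidate][metric]


if __name__ == "__main__":
    for metric in ("cold_start", "warm_query"):
        print(f"{metric}: {speedup(metric):.1f}x")
```

By the same calculation, V3's cold start is a little under twice as fast as V2's, so the larger gain is in warm queries.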

Key Features:

  • ✅ GPU acceleration (CUDA) for embeddings and reranking
  • ✅ Smart caching (warm queries <0.5s)
  • ✅ Tested with .py, .md, .txt, .ipynb files
  • ✅ Global memory access from any directory

See detailed benchmarks: docs/RAG_PERFORMANCE.md

Docker Testing

Package verified to work in isolation:

docker build -f Dockerfile.test -t rag-mem-test .
docker run --rm rag-mem-test

🤝 Contributing

Contributions welcome! Focus areas:

  1. Additional RAG strategies - New retrieval techniques
  2. New tool integrations - Expand MCP tool offerings
  3. Performance optimizations - Speed improvements
  4. Documentation improvements - Tutorials, examples

Development Setup

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/<your-username>/A-Modular-Kingdom.git
cd A-Modular-Kingdom

# Create branch
git checkout -b feature/your-feature

# Install dev dependencies
uv sync

# Make changes and test
pytest tests/

# Stage and commit with a descriptive message
git add -A
git commit -m "feat: add new tool"

# Push and create PR
git push origin feature/your-feature

📜 License

MIT License - See LICENSE for details


Links

  • Medium Article: https://medium.com/@masihmoafi12/a-modular-kingdom-fcaa69a6c1f0
  • Demo Video: https://www.youtube.com/watch?v=hWoQnAr6R_E
  • PyPI Package: rag-mem

A-Modular-Kingdom: The infrastructure layer AI agents deserve 🏰
