MKMChat

AI assistant and MCP server for Mortal Kombat Mobile that provides intelligent team suggestions, mechanic explanations, and context-aware chat via local LLMs and RAG.

MKMChat (Mortal Kombat Mobile Assistant)

MKMChat is a local-first AI assistant and Model Context Protocol (MCP) server for Mortal Kombat Mobile. It combines a Python search backend, hybrid RAG retrieval, local LLM execution through Ollama, and a Laravel + Livewire web interface.

🌟 Main Features

⚔️ Intelligent Team Suggestion: Assemble custom 3-character team compositions complete with character class analyses, passive synergy ratings, and specific equipment cards recommended for every slot.
📖 Mechanic Explanation Flow: Explain complex gameplay mechanics (e.g., Snare, Power Drain, Oblivion) by generating a clear definition and practical combat recommendations in a structured format.
💬 AI-Powered Conversation Chat: Enjoy natural, context-aware Q&A about game mechanics, strategy, tier rankings, and character matchups with chat history persistence.
🤖 Reasoning Model Support: Optimized dynamically for reasoning models like DeepSeek-R1 and OpenAI o1/o3, adapting parameters (context limits, temperature) for deep analytical outputs.
🛠️ MCP-Compatible Tool Server: Exposes rich game tools directly to LLM clients (like Claude Desktop, Cursor, or AI agents) via Model Context Protocol.
🎨 Harmonious Light & Dark Modes: Responsive, visual-first Laravel Livewire web interface featuring glassmorphic designs, vibrant color schemes, and seamless dark/light theme toggles.

🧠 Advanced Hybrid RAG System

The retrieval pipeline combines the following stages to improve relevance and precision:

flowchart TD
    A[Game Data<br>TSV + TXT] --> B[Set-Aware Indexer]
    B --> C[Precise Chunking]
    C --> D[Text Normalization]
    D -- "sentence-transformers<br>all-MiniLM-L6-v2" --> E[Embeddings Cache<br>.rag_cache/]
    E --> F[Hybrid Retrieval<br>Semantic + Keyword Boost]
    F --> G[Ollama Assistant<br>Prompt Assembly]

Set-Aware Indexing: The indexer automatically scans for character set affiliations (e.g., {{Friendship}} or {{Brutality}} tags) and injects mutual cross-references. Retrieving one item naturally surfaces details and names of its set partners (e.g., retrieving Baraka's Horde Chef's Delight also surfaces Horde Chef's Paraphernalia).
True Cosine Similarity: Query and document embeddings are $L_2$-normalized when they are created, so the dot product of two vectors is exactly their cosine similarity, with scores bounded within $[-1.0, 1.0]$.
Hybrid Retrieval (Lexical Keyword Boosting): Vector embeddings (all-MiniLM-L6-v2) are combined with a keyword-matching reranker (_apply_keyword_boost()). Exact matches on character names, rarity tiers, and major gameplay terms receive a score boost, which keeps semantic recall high without losing keyword precision.
Typography Resiliency: Input queries are automatically normalized (e.g., converting curly quotes ’, “, ” to straight quotes ', ") to prevent matching failures caused by different keyboard inputs.
Granular Chunking: Glossary definitions are chunked term-by-term, and gameplay data is chunked line-by-line. This avoids oversized search spaces ("fat chunks") and provides highly targeted context snippets.
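
The scoring described in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's implementation: `BOOST_TERMS`, `BOOST_PER_MATCH`, and all helper names are hypothetical, the real `_apply_keyword_boost()` may weight matches differently, and toy vectors stand in for all-MiniLM-L6-v2 embeddings.

```python
import math

# Hypothetical boost vocabulary and weight, chosen only for illustration.
BOOST_TERMS = {"baraka", "diamond", "snare", "power drain"}
BOOST_PER_MATCH = 0.05

# Map curly quotes to their straight equivalents.
_QUOTE_TABLE = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize_text(text: str) -> str:
    """Lowercase the text and replace curly quotes with straight quotes."""
    return text.translate(_QUOTE_TABLE).lower()


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine(a_unit: list[float], b_unit: list[float]) -> float:
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a_unit, b_unit))


def keyword_boost(query: str, passage: str) -> float:
    """Add a fixed bonus for each boost term present in both query and passage."""
    q, p = normalize_text(query), normalize_text(passage)
    return BOOST_PER_MATCH * sum(1 for term in BOOST_TERMS if term in q and term in p)


def hybrid_score(query: str, query_vec: list[float],
                 passage: str, passage_vec: list[float]) -> float:
    """Semantic similarity plus the lexical boost."""
    semantic = cosine(l2_normalize(query_vec), l2_normalize(passage_vec))
    return semantic + keyword_boost(query, passage)
```

In this sketch only the cosine component stays within $[-1.0, 1.0]$; once the boost is added, the combined score can exceed 1.0, which is harmless as long as scores are used only for ranking.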

🛠️ Architecture

mkmchat/: Python core package (Asynchronous FastAPI + Uvicorn HTTP server, MCP server implementation, and local vector RAG system).
webapp/: Laravel + Livewire web UI consuming the Python API through the secure MkmApiService wrapper.
docker-compose.yml: Full-stack container orchestration linking ollama, python-api (internal network), and webapp (host exposed).

🚀 High-Performance Asynchronous Architecture

The Python API has been completely migrated to a fully asynchronous runtime stack:

Asynchronous Web Core: Replaced the custom BaseHTTPRequestHandler with FastAPI running under Uvicorn. All API routes (/suggest-team, /ask-question, /explain-mechanic, /chat, /health, /) are defined with async def and run without blocking the event loop.
Concurrency Verification: Concurrent requests are interleaved on Uvicorn's event loop. In load tests, 5 concurrent API requests completed in about the same wall-clock time as a single request (~4.75 seconds), roughly 80% less than handling them one after another in a blocking synchronous design (5 × ~4.75 s ≈ 23.75 s).
Security & Reliability: FastAPI exception handlers return errors in a consistent {"error": "details"} format, a custom dependency applies sliding rate limiting per client IP, and requests are authenticated with an API key header.
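
The concurrency figure above follows from simple arithmetic, which the sketch below reproduces with simulated requests. It is an illustration under stated assumptions: `fake_request`, `timed_batch`, and `latency_savings` are hypothetical names, and `asyncio.sleep` stands in for awaiting the LLM backend rather than calling the real API.

```python
import asyncio
import time


async def fake_request(latency_s: float) -> str:
    """Simulate one API call that spends its time waiting on I/O."""
    await asyncio.sleep(latency_s)  # stand-in for awaiting the LLM backend
    return "ok"


async def timed_batch(n: int, latency_s: float) -> float:
    """Run n simulated requests concurrently and return the wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(latency_s) for _ in range(n)))
    return time.perf_counter() - start


def latency_savings(n: int, single_latency_s: float, concurrent_wall_s: float) -> float:
    """Fraction of time saved versus handling n requests one after another."""
    sequential_s = n * single_latency_s
    return 1.0 - concurrent_wall_s / sequential_s
```

With the numbers quoted above, `latency_savings(5, 4.75, 4.75)` evaluates to about 0.8. Calling `asyncio.run(timed_batch(5, 0.1))` should return a value close to 0.1 rather than 0.5, because the five waits overlap on the event loop.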

🐳 Running with Docker (Recommended)

1) Prepare Environment Files

From the project root:

cp .env.docker.example .env.docker
cp webapp/.env.docker.example webapp/.env.docker

2) Configure Your Environment

At a minimum, ensure these match where required:

MKM_API_KEY: A strong random string shared by the backend and Laravel services for authentication.
OLLAMA_MODEL: The default model tag to use (e.g., llama3.2:3b or deepseek-r1:14b-fit).
MKM_DEBUG_PROMPTS: Set to true to write detailed, fully redacted LLM prompts and responses to debug_llm.log for easy tuning.
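
One way to produce a strong random value for MKM_API_KEY is Python's standard `secrets` module. This is only a suggestion; the helper name `generate_api_key` is hypothetical, and any sufficiently long random string should work as long as both environment files contain the same value.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)
```

Print the result once, for example with `print(f"MKM_API_KEY={generate_api_key()}")`, and paste the same line into both `.env.docker` and `webapp/.env.docker`.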

3) Start the Stack

docker compose up -d --build

Dedicated GPU / High VRAM Tuning (WSL & Linux)

To deploy a high-performance DeepSeek-R1 (14B) model optimized for local consumption on GPU-enabled environments:

docker compose exec ollama sh -lc "cat > /tmp/Modelfile.deepseek14b-fit << 'EOF'
FROM deepseek-r1:14b
PARAMETER num_ctx 512
PARAMETER num_batch 32
PARAMETER num_predict 800
PARAMETER use_mmap true
PARAMETER temperature 0.2
EOF
ollama create deepseek-r1:14b-fit -f /tmp/Modelfile.deepseek14b-fit"

Once generated, select deepseek-r1:14b-fit in the web application model selector dropdown!

4) Endpoints

Web Application: http://localhost:8000
Ollama Interface: http://localhost:11434

Note: The Python API is internal to the Docker network (http://python-api:8080) and is not published directly to the host for maximum container-level security.

5) Operations & Logs

docker compose ps               # Check service status
docker compose logs -f python-api  # Live Python server logs
docker compose logs -f webapp      # Live Web UI logs
docker compose logs -f ollama      # Live Ollama inference logs
docker compose down             # Tear down all containers

💻 Running Without Docker

1. Python API & RAG Backend

Ensure you have Python 3.10+ installed:

python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

pip install -e .
ollama pull llama3.2:3b
python -m mkmchat http

2. Laravel Web Application

Ensure PHP 8.2+ and Composer are installed:

cd webapp
cp .env.example .env
composer install
npm install
npm run build
php artisan key:generate
php artisan migrate
php artisan serve

⚙️ Environment Variables

Core Variables (`.env.docker`)

| Variable | Description | Default |
| --- | --- | --- |
| `OLLAMA_BASE_URL` | Base URL of the Ollama server | `http://ollama:11434` |
| `OLLAMA_MODEL` | Default LLM model tag | `llama3.2:3b` |
| `MKM_HTTP_HOST` | Host binding for the Python API | `0.0.0.0` |
| `MKM_API_KEY` | API key shared by the Python API and the web app (sent in the `X-API-Key` header) | `change-me-in-production` |
| `MKM_DEBUG_PROMPTS` | Write redacted prompts and responses to the debug log file | `false` |
| `MKM_MECHANIC_RAG_TOP_K` | Number of passages retrieved for mechanic explanations | `16` |
| `MKM_MECHANIC_RAG_MAX_PASSAGES` | Maximum number of context passages included in the prompt | `8` |
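
As a rough sketch of how a backend might read these variables, the snippet below loads them with the defaults from the table. The `Settings` class, `load_settings`, and the accepted boolean spellings are assumptions made for illustration; the actual mkmchat package may parse its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the core variables listed in the table."""
    ollama_base_url: str
    ollama_model: str
    http_host: str
    api_key: str
    debug_prompts: bool
    mechanic_rag_top_k: int
    mechanic_rag_max_passages: int


def _as_bool(value: str) -> bool:
    """Treat common truthy spellings as True (an assumption, not a documented rule)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), using the table defaults."""
    env = os.environ if env is None else env
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://ollama:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2:3b"),
        http_host=env.get("MKM_HTTP_HOST", "0.0.0.0"),
        api_key=env.get("MKM_API_KEY", "change-me-in-production"),
        debug_prompts=_as_bool(env.get("MKM_DEBUG_PROMPTS", "false")),
        mechanic_rag_top_k=int(env.get("MKM_MECHANIC_RAG_TOP_K", "16")),
        mechanic_rag_max_passages=int(env.get("MKM_MECHANIC_RAG_MAX_PASSAGES", "8")),
    )
```

Passing an explicit mapping, such as `load_settings({"MKM_DEBUG_PROMPTS": "true"})`, makes the loader easy to exercise in tests without touching the real environment.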

🧪 Testing

Run RAG and model connection checks directly inside the containers:

Quick Smoke Test

docker compose exec webapp curl -X POST http://python-api:8080/explain-mechanic \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <your-mkm-api-key>" \
  -d '{"mechanic":"power drain","model":"llama3.2:3b"}'
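
The same smoke test can be scripted in Python with only the standard library. This sketch mirrors the curl command above (same path, headers, and JSON body); the function names are hypothetical, and because the response schema is not documented here, the body is returned as raw text.

```python
import json
import urllib.request


def build_explain_request(base_url: str, api_key: str,
                          mechanic: str, model: str) -> urllib.request.Request:
    """Build the POST /explain-mechanic request without sending it."""
    body = json.dumps({"mechanic": mechanic, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/explain-mechanic",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send_request(req: urllib.request.Request, timeout_s: float = 120.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")
```

Because the Python API is only reachable on the Docker network, a call such as `send_request(build_explain_request("http://python-api:8080", "<your-mkm-api-key>", "power drain", "llama3.2:3b"))` needs to run from a container attached to that network.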

Python RAG Verification

To manually test semantic search quality, cache validation, and keyword boosting:

# Inside the virtual environment
python tests/test_rag.py

📜 License

This project is licensed under the GNU GPL v3. See the LICENSE file for details.
