MKMChat

AI assistant and MCP server for Mortal Kombat Mobile that provides intelligent team suggestions, mechanic explanations, and context-aware chat via local LLMs and RAG.

MKMChat (Mortal Kombat Mobile Assistant)

MKMChat is a premium, local-first AI assistant and Model Context Protocol (MCP) server for Mortal Kombat Mobile. It combines a high-performance Python search backend, advanced RAG retrieval, local Ollama LLM execution, and a stunning Laravel + Livewire web interface.


🌟 Main Features

  • ⚔️ Intelligent Team Suggestion: Assemble custom 3-character team compositions complete with character class analyses, passive synergy ratings, and specific equipment cards recommended for every slot.
  • 📖 Mechanic Explanation Flow: Explain complex gameplay mechanics (e.g., Snare, Power Drain, Oblivion) by generating a clear definition and practical combat recommendations in a structured format.
  • 💬 AI-Powered Conversation Chat: Enjoy natural, context-aware Q&A about game mechanics, strategy, tier rankings, and character matchups with chat history persistence.
  • 🤖 Reasoning Model Support: Optimized dynamically for reasoning models like DeepSeek-R1 and OpenAI o1/o3, adapting parameters (context limits, temperature) for deep analytical outputs.
  • 🛠️ MCP-Compatible Tool Server: Exposes rich game tools directly to LLM clients (like Claude Desktop, Cursor, or AI agents) via Model Context Protocol.
  • 🎨 Harmonious Light & Dark Modes: Responsive, visual-first Laravel Livewire web interface featuring glassmorphic designs, vibrant color schemes, and seamless dark/light theme toggles.

🧠 Advanced Hybrid RAG System

The retrieval pipeline combines set-aware indexing, normalized embeddings, keyword boosting, and fine-grained chunking to improve relevance and precision:

flowchart TD
    A[Game Data<br>TSV + TXT] --> B[Set-Aware Indexer]
    B --> C[Precise Chunking]
    C --> D[Text Normalization]
    D -- "sentence-transformers<br>all-MiniLM-L6-v2" --> E[Embeddings Cache<br>.rag_cache/]
    E --> F[Hybrid Retrieval<br>Semantic + Keyword Boost]
    F --> G[Ollama Assistant<br>Prompt Assembly]
  1. Set-Aware Indexing: The indexer automatically scans for character set affiliations (e.g., {{Friendship}} or {{Brutality}} tags) and injects mutual cross-references. Retrieving one item naturally surfaces details and names of its set partners (e.g., retrieving Baraka's Horde Chef's Delight also surfaces Horde Chef's Paraphernalia).
  2. True Cosine Similarity: Document embeddings are $L_2$-normalized when they are created and query embeddings when a query arrives. The dot product of two normalized vectors is exactly their cosine similarity, so scores are bounded within $[-1.0, 1.0]$.
  3. Hybrid Retrieval (Lexical Keyword Boosting): Vector embeddings (all-MiniLM-L6-v2) are combined with a specialized keyword-matching reranker (_apply_keyword_boost()). Exact matches on character names, rarity tiers, and major gameplay terms receive an intelligent boost, ensuring high semantic recall without losing keyword precision.
  4. Typography Resiliency: Input queries are automatically normalized (e.g., converting curly quotes ’, “, ” to straight quotes ', ") to prevent matching failures caused by different keyboard inputs.
  5. Granular Chunking: Glossary definitions are chunked term-by-term, and gameplay data is chunked line-by-line. This avoids oversized search spaces ("fat chunks") and provides highly targeted context snippets.
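
The scoring steps above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the boost terms, the boost weight, and every function name here are hypothetical, and the project's own _apply_keyword_boost() may work differently. The sketch shows quote normalization, cosine similarity computed as the dot product of $L_2$-normalized vectors, and an additive keyword boost.

```python
import math

# Hypothetical boost vocabulary and weight, chosen only for illustration.
BOOST_TERMS = {"baraka", "diamond", "snare", "power drain"}
BOOST_PER_TERM = 0.05

# Map curly quotes to their straight equivalents.
QUOTE_MAP = str.maketrans({"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'})


def normalize_text(text: str) -> str:
    """Replace curly quotes with straight ones and lowercase the text."""
    return text.translate(QUOTE_MAP).lower()


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of the two unit vectors, which equals cosine similarity."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def keyword_boost(query: str, passage: str, score: float) -> float:
    """Add a fixed bonus for each boost term present in both query and passage."""
    q, p = normalize_text(query), normalize_text(passage)
    hits = sum(1 for term in BOOST_TERMS if term in q and term in p)
    return score + BOOST_PER_TERM * hits
```

Because the boost is additive, a boosted score can exceed 1.0; a real reranker might clamp or rescale it before comparing passages.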

🛠️ Architecture

  • mkmchat/: Python core package (Asynchronous FastAPI + Uvicorn HTTP server, MCP server implementation, and local vector RAG system).
  • webapp/: Laravel + Livewire web UI consuming the Python API through the secure MkmApiService wrapper.
  • docker-compose.yml: Full-stack container orchestration linking ollama, python-api (internal network), and webapp (host exposed).

🚀 High-Performance Asynchronous Architecture

The Python API has been completely migrated to a fully asynchronous runtime stack:

  • Asynchronous Web Core: Replaced the custom BaseHTTPRequestHandler with FastAPI running under Uvicorn. All API routes (/suggest-team, /ask-question, /explain-mechanic, /chat, /health, /) are declared with async def and do not block the event loop.
  • Concurrency Verification: Concurrent requests are interleaved on Uvicorn's event loop. In load tests, 5 concurrent API requests completed in roughly the wall-clock time of a single request (~4.75 seconds), about 80% less than the ~23.75 seconds (5 × 4.75 s) a blocking synchronous design would need to serve them one after another.
  • Security & Reliability: Native FastAPI exception handlers enforce a consistent error contract ({"error": "details"} output format), a custom dependency applies sliding-window rate limiting per client IP, and requests are authenticated with an API key header.
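
As a rough illustration of per-client sliding rate limiting, the sketch below keeps a queue of recent request timestamps for each IP address. It is not the project's implementation: the class name, the limit and window values, and the injectable clock are all assumptions made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

In a FastAPI application, a dependency could call allow() with the client's address and raise an HTTPException with status 429 when it returns False.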

🐳 Running with Docker (Recommended)

1) Prepare Environment Files

From the project root:

cp .env.docker.example .env.docker
cp webapp/.env.docker.example webapp/.env.docker

2) Configure Your Environment

At a minimum, ensure these match where required:

  • MKM_API_KEY: A strong random string shared by the backend and Laravel services for authentication.
  • OLLAMA_MODEL: The default model tag to use (e.g., llama3.2:3b or deepseek-r1:14b-fit).
  • MKM_DEBUG_PROMPTS: Set to true to write detailed, fully redacted LLM prompts and responses to debug_llm.log for easy tuning.
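
One way to produce a strong random value for MKM_API_KEY is Python's standard secrets module. The helper name below is hypothetical, and any other cryptographically secure generator works just as well.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from `num_bytes` of secure randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into both .env.docker files as MKM_API_KEY.
    print(generate_api_key())
```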

3) Start the Stack

docker compose up -d --build

Dedicated GPU / High VRAM Tuning (WSL & Linux)

To deploy a high-performance DeepSeek-R1 (14B) model optimized for local consumption on GPU-enabled environments:

docker compose exec ollama sh -lc "cat > /tmp/Modelfile.deepseek14b-fit << 'EOF'
FROM deepseek-r1:14b
PARAMETER num_ctx 512
PARAMETER num_batch 32
PARAMETER num_predict 800
PARAMETER use_mmap true
PARAMETER temperature 0.2
EOF
ollama create deepseek-r1:14b-fit -f /tmp/Modelfile.deepseek14b-fit"

Once generated, select deepseek-r1:14b-fit in the web application model selector dropdown!

4) Endpoints

Note: The Python API is internal to the Docker network (http://python-api:8080) and is not published directly to the host for maximum container-level security.

5) Operations & Logs

docker compose ps               # Check service status
docker compose logs -f python-api  # Live Python server logs
docker compose logs -f webapp      # Live Web UI logs
docker compose logs -f ollama      # Live Ollama inference logs
docker compose down             # Tear down all containers

💻 Running Without Docker

1. Python API & RAG Backend

Ensure you have Python 3.10+ installed:

python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

pip install -e .
ollama pull llama3.2:3b
python -m mkmchat http

2. Laravel Web Application

Ensure PHP 8.2+ and Composer are installed:

cd webapp
cp .env.example .env
composer install
npm install
npm run build
php artisan key:generate
php artisan migrate
php artisan serve

⚙️ Environment Variables

Core Variables (.env.docker)

| Variable | Description | Default |
|---|---|---|
| OLLAMA_BASE_URL | Base URL of the Ollama server | http://ollama:11434 |
| OLLAMA_MODEL | Default LLM model tag | llama3.2:3b |
| MKM_HTTP_HOST | Host binding for Python API | 0.0.0.0 |
| MKM_API_KEY | Strong bearer authorization key | change-me-in-production |
| MKM_DEBUG_PROMPTS | Write fully redacted prompt history to log file | false |
| MKM_MECHANIC_RAG_TOP_K | Number of passages to search for explanations | 16 |
| MKM_MECHANIC_RAG_MAX_PASSAGES | Max context passages to include | 8 |
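
The two MKM_MECHANIC_RAG_* settings could interact as in the sketch below: a wider candidate pool is retrieved first, then trimmed to the passages that reach the prompt. The helper names and the de-duplication step between the two limits are assumptions for illustration; only the variable names and defaults come from the table above.

```python
import os


def read_int_env(name: str, default: int, environ=None) -> int:
    """Read a positive integer setting, falling back to `default` if unset or invalid."""
    env = os.environ if environ is None else environ
    try:
        value = int(env.get(name, "").strip())
    except ValueError:
        return default
    return value if value > 0 else default


def select_passages(ranked: list[str], environ=None) -> list[str]:
    """Take the top-k ranked candidates, drop duplicates, and keep the allowed maximum."""
    top_k = read_int_env("MKM_MECHANIC_RAG_TOP_K", 16, environ)
    max_passages = read_int_env("MKM_MECHANIC_RAG_MAX_PASSAGES", 8, environ)
    candidates = ranked[:top_k]
    unique = list(dict.fromkeys(candidates))  # hypothetical filter; preserves rank order
    return unique[:max_passages]
```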

🧪 Testing

Run RAG and model connection checks directly inside the containers:

Quick Smoke Test

docker compose exec webapp curl -X POST http://python-api:8080/explain-mechanic \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <your-mkm-api-key>" \
  -d '{"mechanic":"power drain","model":"llama3.2:3b"}'
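
The same request can be built from Python with only the standard library, which is handy for scripted checks. The function name is hypothetical; the URL, header, and JSON fields mirror the curl command above, and the python-api host resolves only inside the Docker network.

```python
import json
import urllib.request


def build_explain_request(mechanic: str, model: str, api_key: str,
                          base_url: str = "http://python-api:8080") -> urllib.request.Request:
    """Build the POST /explain-mechanic request without sending it."""
    body = json.dumps({"mechanic": mechanic, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/explain-mechanic",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen() with a generous timeout, since local model inference can take several seconds.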

Python RAG Verification

To manually test semantic search quality, cache validation, and keyword boosting:

# Inside the virtual environment
python tests/test_rag.py

📜 License

This project is licensed under the GNU GPL v3. See the LICENSE file for details.
