
# MCP Waifu Queue
This project implements an MCP (Model Context Protocol) server for a conversational AI "waifu" character, leveraging a text generation service with a Redis queue and GPU acceleration. It utilizes the `FastMCP` library for simplified server setup and management.
## Table of Contents

- [Features](#features)
- [Architecture](#architecture)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running the Service](#running-the-service)
- [MCP API](#mcp-api)
- [Testing](#testing)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)
## Features

- Text generation using the `distilgpt2` language model.
- Request queuing using Redis for handling concurrent requests.
- GPU support for faster inference.
- MCP-compliant API using `FastMCP`.
- Job status tracking.
## Architecture

The project consists of several key components:

- `main.py`: The main entry point, initializing the `FastMCP` application.
- `respond.py`: The core text generation service, loading the `distilgpt2` model and generating responses (see the sketch after this list).
- `queue.py`: Handles interactions with the Redis queue, enqueuing requests and managing job IDs.
- `worker.py`: A Redis worker that processes jobs from the queue, utilizing `respond.py` for text generation.
- `config.py`: Manages configuration via environment variables.
- `models.py`: Defines Pydantic models for request and response validation.
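As a rough illustration of the generation component, the core of `respond.py` might look like the following sketch, assuming the Hugging Face `transformers` text-generation pipeline; the `predict` name and parameters are assumptions, not the project's actual code:

```python
# Illustrative sketch of a distilgpt2 generation function; the
# project's actual loading code and sampling settings may differ.
from transformers import pipeline

# device=0 would select the first CUDA GPU; -1 runs on CPU.
generator = pipeline("text-generation", model="distilgpt2", device=-1)

def predict(prompt: str, max_new_tokens: int = 20) -> str:
    """Generate a continuation of `prompt` with distilgpt2."""
    outputs = generator(prompt, max_new_tokens=max_new_tokens, num_return_sequences=1)
    return outputs[0]["generated_text"]
```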
The flow of a request is as follows:

1. A client sends a request to the `generate_text` MCP tool (defined in `main.py`).
2. The tool enqueues the request to a Redis queue (handled by `queue.py`).
3. A `worker.py` process picks up the request from the queue.
4. The worker calls the `call_predict_response` function (in `utils.py`), which interacts with `respond.py` to generate the text.
5. The generated text is stored, and the job status is updated.
6. The client retrieves the result using the `get_job_status` resource (defined in `main.py`); a sketch of this surface follows below.
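To make the flow concrete, here is a minimal sketch of how `main.py` might wire this surface together, assuming the queue is backed by the RQ library; the queue name, the function path `mcp_waifu_queue.respond.predict`, and the status mapping are illustrative assumptions, not the project's actual code:

```python
# Illustrative sketch only -- assumes an RQ-backed queue; names and
# paths are hypothetical, not taken from the repository.
import redis
from rq import Queue
from rq.job import Job
from fastmcp import FastMCP

redis_conn = redis.Redis.from_url("redis://localhost:6379")
queue = Queue(connection=redis_conn)
mcp = FastMCP("mcp-waifu-queue")

@mcp.tool()
def generate_text(prompt: str) -> str:
    """Enqueue a generation request and return its job ID."""
    # The string is a dotted path to the worker-side function
    # (hypothetical module path).
    job = queue.enqueue("mcp_waifu_queue.respond.predict", prompt)
    return job.id

@mcp.resource("job://{job_id}")
def get_job_status(job_id: str) -> dict:
    """Report a job's status and, once finished, its generated text."""
    job = Job.fetch(job_id, connection=redis_conn)
    # RQ's native statuses are queued/started/finished/failed; the
    # "processing"/"completed" labels below would be a mapping over these.
    return {
        "status": job.get_status(),
        "result": job.result if job.is_finished else None,
    }

if __name__ == "__main__":
    mcp.run()
```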
## Prerequisites
- Python 3.7+
- pip
- Redis server (installed and running)
- A CUDA-enabled GPU (optional, but recommended for performance)
You can find instructions for installing Redis on your system on the official Redis website: https://redis.io/docs/getting-started/
## Installation

1. Clone the repository:

   ```bash
   git clone <YOUR_REPOSITORY_URL>
   cd mcp-waifu-queue
   ```

2. Create and activate a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Linux/macOS
   venv\Scripts\activate     # On Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt  # if requirements.txt exists
   # Or, if using pyproject.toml:
   pip install -e .
   ```
## Configuration

1. Copy the `.env.example` file to `.env`:

   ```bash
   cp .env.example .env
   ```

2. Modify the `.env` file to set the appropriate values for your environment. The following environment variables are available:

   - `MODEL_PATH`: The path to the pre-trained language model (default: `distilgpt2`).
   - `GPU_SERVICE_URL`: The URL of the GPU service (default: `http://localhost:5001`). This is used internally by the worker.
   - `REDIS_URL`: The URL of the Redis server (default: `redis://localhost:6379`).
   - `QUEUE_PORT`: The port for the queue service (default: `5000`). This is no longer used for external access, since clients talk to the server over MCP.
   - `RESPOND_PORT`: The port for the response service (default: `5001`). This is used internally by the worker.
   - `MAX_NEW_TOKENS`: The maximum number of new tokens to generate (default: `20`).

Note: The `.env` file should not be committed to the repository for security reasons.
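For illustration, `config.py` might map these variables onto a settings object along the following lines, assuming `pydantic-settings`; the field names and defaults mirror the list above, but the actual implementation may differ:

```python
# Illustrative settings sketch using pydantic-settings; matching of
# environment variable names (MODEL_PATH, etc.) is case-insensitive.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # protected_namespaces=() silences pydantic's warning about the
    # "model_" field prefix.
    model_config = SettingsConfigDict(env_file=".env", protected_namespaces=())

    model_path: str = "distilgpt2"
    gpu_service_url: str = "http://localhost:5001"
    redis_url: str = "redis://localhost:6379"
    queue_port: int = 5000
    respond_port: int = 5001
    max_new_tokens: int = 20

settings = Settings()
```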
## Running the Service

Start the services using the `scripts/start-services.sh` script:

```bash
./scripts/start-services.sh
```

This script starts the Redis server (if it is not already running), the worker, the queue service, and the response service; all of them run in the background.
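If you ever need to run the worker by hand (e.g., for debugging), and the queue is RQ-backed as assumed in the sketches above, a worker can also be started programmatically; the queue name `default` is an assumption:

```python
# Manual worker start-up sketch (assumes an RQ-backed queue).
import redis
from rq import Queue, Worker

conn = redis.Redis.from_url("redis://localhost:6379")  # match REDIS_URL
worker = Worker([Queue("default", connection=conn)], connection=conn)
worker.work()  # blocks and processes jobs until interrupted
```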
## MCP API

The server provides the following MCP-compliant endpoints:

### Tools

- `generate_text(prompt: str)`: Sends a text generation request and returns a job ID.

### Resources

- `job://{job_id}`: Retrieves the status of a job. The response includes a `status` field (e.g., "queued", "processing", "completed", "failed") and, if the job is completed, a `result` field containing the generated text.
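As a usage illustration, a client built on the official `mcp` Python SDK could exercise both endpoints roughly as follows; the server launch command and module path are assumptions:

```python
# Hypothetical end-to-end client call over stdio using the official
# `mcp` Python SDK; adjust the launch command to your setup.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(
        command="python", args=["-m", "mcp_waifu_queue.main"]  # assumed entry point
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Enqueue a generation request; the tool returns a job ID.
            result = await session.call_tool("generate_text", {"prompt": "Hi there!"})
            job_id = result.content[0].text
            # Later, poll the job resource for status/result.
            status = await session.read_resource(f"job://{job_id}")
            print(status)

asyncio.run(main())
```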
## Testing

The project includes tests. You can run all of them with `pytest`:

```bash
pytest tests
```
## Troubleshooting

- Error: "Missing 'prompt' parameter": Make sure you are sending a prompt string to the `generate_text` tool.
- Error: "Error calling GPU service": Ensure that the `respond.py` service is running and accessible at the configured `GPU_SERVICE_URL`.
- Error: "Service unavailable": Check that the Redis server, worker, queue, and respond services are all running.
- CUDA errors: Ensure your CUDA drivers and toolkit are correctly installed and compatible with your PyTorch version.
## Contributing

1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Commit your changes.
4. Push your branch to your forked repository.
5. Create a pull request.
Please adhere to the project's code of conduct.
## License
This project is licensed under the MIT-0 License - see the LICENSE file for details.