MCP Prompt Tester

An MCP server that allows agents to test and compare LLM prompts across OpenAI and Anthropic models, supporting single tests, side-by-side comparisons, and multi-turn conversations.

rt96-hub

A simple MCP server that allows agents to test LLM prompts with different providers.

Features

Test prompts with OpenAI and Anthropic models
Configure system prompts, user prompts, and other parameters
Get formatted responses or error messages
Easy environment setup with .env file support

Installation

# Install with pip
pip install -e .

# Or with uv
uv pip install -e .

API Key Setup

The server requires API keys for the providers you want to use. You can set these up in two ways:

Option 1: Environment Variables

Set the following environment variables:

OPENAI_API_KEY - Your OpenAI API key
ANTHROPIC_API_KEY - Your Anthropic API key

Option 2: .env File (Recommended)

1. Create a file named .env in your project directory or home directory.
2. Add your API keys in the following format:

OPENAI_API_KEY=your-openai-api-key-here
ANTHROPIC_API_KEY=your-anthropic-api-key-here

3. The server will automatically detect and load these keys.

For convenience, a sample template is included as .env.example.
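
Before starting the server, it can help to confirm which provider keys are actually visible. The following is a minimal sketch using only the standard library. It is not the server's own loader (this README does not describe how the server parses .env), and the helper names parse_env_text and configured_providers, as well as the rule that real environment variables take precedence over the file, are assumptions made for illustration.

```python
import os

# Key names come from this README; the provider-to-key mapping is illustrative.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env_text="", environ=None):
    """Return providers that have a non-empty key.

    Assumption: non-empty environment variables override values from the file.
    """
    environ = os.environ if environ is None else environ
    merged = {**parse_env_text(env_text), **{k: v for k, v in environ.items() if v}}
    return sorted(p for p, key in PROVIDER_KEYS.items() if merged.get(key))


sample = "# local keys\nOPENAI_API_KEY=your-openai-api-key-here\nANTHROPIC_API_KEY=\n"
print(configured_providers(sample, environ={}))  # → ['openai']
```

An empty value, as with ANTHROPIC_API_KEY in the sample, is treated the same as a missing key.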

Usage

Start the server using stdio (default) or SSE transport:

# Using stdio transport (default)
prompt-tester

# Using SSE transport on custom port
prompt-tester --transport sse --port 8000
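
When the server runs with the SSE transport, a client connects over HTTP instead of spawning the process. The sketch below uses sse_client from the MCP Python SDK. The host, the /sse endpoint path, and the helper names sse_url and list_tool_names are assumptions; this README does not state the endpoint path, so check the server's startup output if the connection fails.

```python
import asyncio


def sse_url(port, host="localhost", path="/sse"):
    """Build the SSE endpoint URL. The /sse path is an assumption, not from the README."""
    return f"http://{host}:{port}{path}"


async def list_tool_names(url):
    """Connect over SSE and return the names of the tools the server exposes."""
    # Imported lazily so sse_url() works even without the mcp package installed.
    from mcp.client.session import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(url) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


print(sse_url(8000))  # → http://localhost:8000/sse
```

With `prompt-tester --transport sse --port 8000` running in another terminal, `asyncio.run(list_tool_names(sse_url(8000)))` performs the actual connection.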

Available Tools

The server exposes the following tools to MCP-enabled agents:

1. list_providers

Retrieves the available LLM providers and their default models, along with each model's cost fields and description.

Parameters:

None required

Example Response:

{
  "providers": {
    "openai": [
      {
        "type": "gpt-4",
        "name": "gpt-4",
        "input_cost": 0.03,
        "output_cost": 0.06,
        "description": "Most capable GPT-4 model"
      },
      // ... other models ...
    ],
    "anthropic": [
      // ... models ...
    ]
  }
}
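
As a rough illustration of consuming this response, the sketch below flattens it into rows sorted by input cost. It uses a trimmed copy of the example with the elided "// ..." entries dropped so that it parses as strict JSON. The helper name flatten_models is hypothetical, and the cost values are compared only relative to each other because this README does not state their units.

```python
import json

# The example response above, minus the elided entries, so it is valid JSON.
EXAMPLE = """
{
  "providers": {
    "openai": [
      {"type": "gpt-4", "name": "gpt-4", "input_cost": 0.03,
       "output_cost": 0.06, "description": "Most capable GPT-4 model"}
    ],
    "anthropic": []
  }
}
"""


def flatten_models(response):
    """Return (provider, model name, input_cost, output_cost) rows, cheapest input first."""
    rows = [
        (provider, model["name"], model["input_cost"], model["output_cost"])
        for provider, models in response.get("providers", {}).items()
        for model in models
    ]
    return sorted(rows, key=lambda row: row[2])


for provider, name, cost_in, cost_out in flatten_models(json.loads(EXAMPLE)):
    print(f"{provider}/{name}: input={cost_in} output={cost_out}")
```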

2. test_comparison

Compares multiple prompts side-by-side, allowing you to test different providers, models, and parameters simultaneously.

Parameters:

comparisons (array): A list of 1-4 comparison configurations, each containing:
- provider (string): The LLM provider to use ("openai" or "anthropic")
- model (string): The model name
- system_prompt (string): The system prompt (instructions for the model)
- user_prompt (string): The user's message/prompt
- temperature (number, optional): Controls randomness
- max_tokens (integer, optional): Maximum number of tokens to generate
- top_p (number, optional): Controls diversity via nucleus sampling

Example Usage:

{
  "comparisons": [
    {
      "provider": "openai",
      "model": "gpt-4",
      "system_prompt": "You are a helpful assistant.",
      "user_prompt": "Explain quantum computing in simple terms.",
      "temperature": 0.7
    },
    {
      "provider": "anthropic",
      "model": "claude-3-opus-20240229",
      "system_prompt": "You are a helpful assistant.",
      "user_prompt": "Explain quantum computing in simple terms.",
      "temperature": 0.7
    }
  ]
}
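
The constraints in the parameter list (1-4 configurations, four unmarked fields, three optional ones) can be checked on the client before the call is made. This is a sketch under stated assumptions, not the server's own validation: it assumes the fields not marked optional are required, and the name build_comparison_args is hypothetical.

```python
# Field names are taken from the parameter list above.
REQUIRED = ("provider", "model", "system_prompt", "user_prompt")
OPTIONAL = ("temperature", "max_tokens", "top_p")
PROVIDERS = ("openai", "anthropic")


def build_comparison_args(configs):
    """Validate 1-4 comparison configs and return the test_comparison arguments."""
    if not 1 <= len(configs) <= 4:
        raise ValueError("comparisons must contain between 1 and 4 configurations")
    for i, cfg in enumerate(configs):
        missing = [key for key in REQUIRED if key not in cfg]
        if missing:
            raise ValueError(f"comparison {i} is missing: {', '.join(missing)}")
        if cfg["provider"] not in PROVIDERS:
            raise ValueError(f"comparison {i} has unknown provider {cfg['provider']!r}")
        unknown = set(cfg) - set(REQUIRED) - set(OPTIONAL)
        if unknown:
            raise ValueError(f"comparison {i} has unknown keys: {sorted(unknown)}")
    # Copy each config so later edits by the caller do not change the payload.
    return {"comparisons": [dict(cfg) for cfg in configs]}


args = build_comparison_args([
    {
        "provider": "openai",
        "model": "gpt-4",
        "system_prompt": "You are a helpful assistant.",
        "user_prompt": "Explain quantum computing in simple terms.",
        "temperature": 0.7,
    }
])
print(len(args["comparisons"]))  # → 1
```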

3. test_multiturn_conversation

Manages multi-turn conversations with LLM providers, allowing you to create and maintain stateful conversations.

Modes:

start: Begins a new conversation
continue: Continues an existing conversation
get: Retrieves conversation history
list: Lists all active conversations
close: Closes a conversation

Parameters:

mode (string): Operation mode ("start", "continue", "get", "list", or "close")
conversation_id (string): Unique ID for the conversation (required for continue, get, close modes)
provider (string): The LLM provider (required for start mode)
model (string): The model name (required for start mode)
system_prompt (string): The system prompt (required for start mode)
user_prompt (string): The user message (used in start and continue modes)
temperature (number, optional): Temperature parameter for the model
max_tokens (integer, optional): Maximum tokens to generate
top_p (number, optional): Top-p sampling parameter

Example Usage (Starting a Conversation):

{
  "mode": "start",
  "provider": "openai",
  "model": "gpt-4",
  "system_prompt": "You are a helpful assistant specializing in physics.",
  "user_prompt": "Can you explain what dark matter is?"
}

Example Usage (Continuing a Conversation):

{
  "mode": "continue",
  "conversation_id": "conv_12345",
  "user_prompt": "How does that relate to dark energy?"
}
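
Because the required fields differ by mode, a small lookup table can catch incomplete requests before they reach the server. The following sketch encodes the rules from the parameter list. Treating user_prompt as required for start and continue is an assumption (the README only says it is "used" in those modes), and missing_fields is a hypothetical helper name.

```python
# Fields each mode needs, per the parameter list above.
# Assumption: user_prompt is required wherever the README says it is "used".
REQUIRED_BY_MODE = {
    "start": ("provider", "model", "system_prompt", "user_prompt"),
    "continue": ("conversation_id", "user_prompt"),
    "get": ("conversation_id",),
    "list": (),
    "close": ("conversation_id",),
}


def missing_fields(args):
    """Return the required fields absent from a test_multiturn_conversation request."""
    mode = args.get("mode")
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return [field for field in REQUIRED_BY_MODE[mode] if field not in args]


request = {"mode": "continue", "user_prompt": "How does that relate to dark energy?"}
print(missing_fields(request))  # → ['conversation_id']
```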

Example Usage for Agents

Using the MCP client, an agent can use the tools like this:

import asyncio
import json
from mcp.client.session import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client

async def main():
    async with stdio_client(
        StdioServerParameters(command="prompt-tester")
    ) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            
            # 1. List available providers and models
            providers_result = await session.call_tool("list_providers", {})
            print("Available providers and models:", providers_result)
            
            # 2. Run a basic test with a single model and prompt
            comparison_result = await session.call_tool("test_comparison", {
                "comparisons": [
                    {
                        "provider": "openai",
                        "model": "gpt-4",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7,
                        "max_tokens": 500
                    }
                ]
            })
            print("Single model test result:", comparison_result)
            
            # 3. Compare multiple prompts/models side by side
            comparison_result = await session.call_tool("test_comparison", {
                "comparisons": [
                    {
                        "provider": "openai",
                        "model": "gpt-4",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7
                    },
                    {
                        "provider": "anthropic",
                        "model": "claude-3-opus-20240229",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7
                    }
                ]
            })
            print("Comparison result:", comparison_result)
            
            # 4. Start a multi-turn conversation
            conversation_start = await session.call_tool("test_multiturn_conversation", {
                "mode": "start",
                "provider": "openai",
                "model": "gpt-4",
                "system_prompt": "You are a helpful assistant specializing in physics.",
                "user_prompt": "Can you explain what dark matter is?"
            })
            print("Conversation started:", conversation_start)
            
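            # Get the conversation ID from the response. call_tool returns a
            # CallToolResult, so the text lives on its content items; this assumes
            # the first item is text holding the server's JSON payload.
            response_data = json.loads(conversation_start.content[0].text)
            conversation_id = response_data.get("conversation_id")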
            
            # Continue the conversation
            if conversation_id:
                conversation_continue = await session.call_tool("test_multiturn_conversation", {
                    "mode": "continue",
                    "conversation_id": conversation_id,
                    "user_prompt": "How does that relate to dark energy?"
                })
                print("Conversation continued:", conversation_continue)
                
                # Get the conversation history
                conversation_history = await session.call_tool("test_multiturn_conversation", {
                    "mode": "get",
                    "conversation_id": conversation_id
                })
                print("Conversation history:", conversation_history)

asyncio.run(main())
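
Tool results from the MCP Python client arrive as a list of content items rather than a single string, so reading a field such as conversation_id takes a small unwrapping step. The helper below is one defensive way to do that, assuming the server returns its payload as JSON text, as the example above treats it. The name first_json_payload is hypothetical, and the SimpleNamespace object only stands in for a real result so the helper can be tried without a running server.

```python
import json
from types import SimpleNamespace


def first_json_payload(result):
    """Return the first text content item that parses as JSON, or None."""
    for item in getattr(result, "content", None) or []:
        text = getattr(item, "text", None)
        if text is None:
            continue  # Skip non-text items such as images.
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue  # Skip text that is not JSON, e.g. a plain error message.
    return None


# Stand-in for a tool result; the conversation ID is the one used in the examples above.
fake_result = SimpleNamespace(
    content=[SimpleNamespace(type="text", text='{"conversation_id": "conv_12345"}')]
)
print(first_json_payload(fake_result)["conversation_id"])  # → conv_12345
```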

MCP Agent Integration

For MCP-enabled agents, integration is straightforward. When your agent needs to test LLM prompts, it can use the tools as follows:

Discovery: Use list_providers to discover available models and their capabilities.
Simple Testing: For quick tests, use test_comparison with a single configuration.
Comparison: To evaluate different prompts or models, use test_comparison with multiple configurations (up to four per call).
Stateful Interactions: For multi-turn conversations, use test_multiturn_conversation to start, continue, and close a conversation.

This allows agents to:

Test prompt variants to find the most effective phrasing
Compare different models for specific tasks
Maintain context in multi-turn conversations
Optimize parameters like temperature and max_tokens
Track token usage and costs during development
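
Parameter optimization usually means sweeping one setting while holding the rest fixed. Since test_comparison accepts at most four configurations per call, a sweep over more values has to be split into batches. The sketch below shows one way to do that; temperature_sweep is a hypothetical helper, and the temperature values are arbitrary.

```python
def temperature_sweep(base, temperatures, batch_size=4):
    """Yield test_comparison argument dicts, each holding at most batch_size configs."""
    configs = [{**base, "temperature": t} for t in temperatures]
    for start in range(0, len(configs), batch_size):
        yield {"comparisons": configs[start:start + batch_size]}


base = {
    "provider": "openai",
    "model": "gpt-4",
    "system_prompt": "You are a helpful assistant.",
    "user_prompt": "Explain quantum computing in simple terms.",
}
batches = list(temperature_sweep(base, [0.0, 0.3, 0.5, 0.7, 1.0]))
print([len(batch["comparisons"]) for batch in batches])  # → [4, 1]
```

Each yielded dict can be passed directly as the arguments of a test_comparison call.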

Configuration

You can set API keys and optional tracing configurations using environment variables:

Required API Keys

OPENAI_API_KEY - Your OpenAI API key
ANTHROPIC_API_KEY - Your Anthropic API key

Optional Langfuse Tracing

The server supports Langfuse for tracing and observability of LLM calls. These settings are optional:

LANGFUSE_SECRET_KEY - Your Langfuse secret key
LANGFUSE_PUBLIC_KEY - Your Langfuse public key
LANGFUSE_HOST - URL of your Langfuse instance

If you don't want to use Langfuse tracing, simply leave these settings empty.
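
A partially filled Langfuse configuration is easy to miss, so a quick check before launching the server may be useful. This sketch only inspects the three variables named above. Whether the server actually needs all three (LANGFUSE_HOST in particular may have a default) is not stated in this README, so the "partial" result is a prompt to double-check rather than a definite error. The name langfuse_status is hypothetical.

```python
import os

# Variable names are taken from the list above.
LANGFUSE_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST")


def langfuse_status(environ=None):
    """Return ('off' | 'on' | 'partial', names of the unset or empty variables)."""
    environ = os.environ if environ is None else environ
    missing = [name for name in LANGFUSE_VARS if not environ.get(name)]
    if len(missing) == len(LANGFUSE_VARS):
        return "off", missing
    return ("on" if not missing else "partial"), missing


status, missing = langfuse_status({"LANGFUSE_SECRET_KEY": "sk", "LANGFUSE_PUBLIC_KEY": "pk"})
print(status, missing)  # → partial ['LANGFUSE_HOST']
```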
