Coval MCP Server
Enables AI assistants to interact with Coval's evaluation platform for launching and monitoring evaluation runs, managing agents and test sets, and retrieving evaluation metrics.
The official Model Context Protocol server for Coval - the AI evaluation platform.
This MCP server allows AI assistants like Claude Desktop and Cursor to interact with Coval's evaluation APIs, enabling you to:
- Launch and monitor evaluation runs
- Manage AI agents and test sets
- Retrieve evaluation metrics and results
Installation
npx @covalai/mcp-server
Quick Start
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "coval": {
      "command": "npx",
      "args": ["-y", "@covalai/mcp-server"],
      "env": {
        "COVAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
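If `claude_desktop_config.json` already lists other servers, pasting the snippet over the file would drop them. The following is a minimal Python sketch, assuming the file is plain JSON, that merges the `coval` entry into whatever is already there. The helper names `add_coval_server` and `update_config_file` are hypothetical and not part of the Coval package.

```python
import json
from pathlib import Path


def add_coval_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the `coval` MCP server entry added."""
    merged = dict(config)
    # Copy the existing server map so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["coval"] = {
        "command": "npx",
        "args": ["-y", "@covalai/mcp-server"],
        "env": {"COVAL_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the config at `path` (if it exists), merge, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_coval_server(existing, api_key), indent=2))
```

For example, `update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "your_api_key_here")` would add the entry while keeping any other servers and settings intact.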
Cursor
Add to .cursor/mcp.json in your project:
{
  "mcpServers": {
    "coval": {
      "command": "npx",
      "args": ["-y", "@covalai/mcp-server"],
      "env": {
        "COVAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
Remote Connection (Alternative)
{
  "mcpServers": {
    "coval": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.coval.dev/mcp",
        "--header",
        "X-API-KEY: ${COVAL_API_KEY}"
      ],
      "env": {
        "COVAL_API_KEY": "your_api_key_here"
      }
    }
  }
}
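In the remote configuration, the `${COVAL_API_KEY}` placeholder in the header argument is meant to be filled from the `env` block. The sketch below illustrates that idea in Python, assuming simple `${NAME}` substitution; it is not the code `mcp-remote` runs, and `expand_placeholders` is a hypothetical helper.

```python
import re


def expand_placeholders(arg: str, env: dict) -> str:
    """Replace each ${NAME} in `arg` with env[NAME]; unknown names are left as-is."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: env.get(m.group(1), m.group(0)),
        arg,
    )


header = expand_placeholders(
    "X-API-KEY: ${COVAL_API_KEY}",
    {"COVAL_API_KEY": "your_api_key_here"},
)
print(header)
```

Keeping the key in `env` rather than writing it directly into `args` means the secret is set in one place only.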
Get your API key from app.coval.dev/settings
Available Tools
| Tool | Description |
|---|---|
| `list_agents` | List all agents in your workspace |
| `get_agent` | Get details of a specific agent |
| `list_runs` | List evaluation runs |
| `get_run` | Get details of a specific run |
| `create_run` | Start a new evaluation run |
| `list_test_sets` | List available test sets |
| `get_test_set` | Get test set details |
| `list_test_cases` | List test cases in a test set |
| `create_test_case` | Add a test case to a test set |
| `get_metrics` | Get metrics for a run |
| `list_personas` | List available personas |
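MCP clients invoke these tools by sending JSON-RPC 2.0 `tools/call` requests. Below is a sketch of the payload a client might send for `get_run`. The argument name `run_id` is an assumption; the actual input schema for each tool is reported by the server in its `tools/list` response.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# `run_id` is a hypothetical argument name used for illustration only.
print(build_tool_call(1, "get_run", {"run_id": "abc123"}))
```

In normal use the assistant builds these requests itself, so you rarely need to write them by hand.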
Example Usage
Once connected, you can ask Claude things like:
- "Show me my recent evaluation runs"
- "List all my agents"
- "Run an evaluation of my customer-support-agent against the billing-inquiries test set"
- "What are the metrics for run abc123?"
Development
# Install dependencies
npm install
# Build
npm run build
# Test locally with MCP Inspector
npm run inspector
# Run tests
npm test
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `COVAL_API_KEY` | Yes | - | Your Coval API key |
| `COVAL_API_BASE_URL` | No | `https://api.coval.dev/v1` | API base URL |
| `LOG_LEVEL` | No | `info` | Logging level |
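The required/default semantics in the table can be sketched as follows. This is an illustration in Python, not the server's actual startup code, and `load_settings` is a hypothetical helper.

```python
import os

# Defaults taken from the table above.
DEFAULTS = {
    "COVAL_API_BASE_URL": "https://api.coval.dev/v1",
    "LOG_LEVEL": "info",
}


def load_settings(env=None) -> dict:
    """Resolve settings from `env` (defaults to os.environ), applying defaults."""
    env = os.environ if env is None else env
    api_key = env.get("COVAL_API_KEY")
    if not api_key:
        # The only required variable: fail early with a clear message.
        raise ValueError("COVAL_API_KEY is required")
    settings = {"COVAL_API_KEY": api_key}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings
```

Calling `load_settings({"COVAL_API_KEY": "your_api_key_here"})` returns the key together with the default base URL and log level.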
Documentation
License
MIT
Support