voting-mcp

Principled social-choice aggregation as MCP tools — with a benchmark that measures the accuracy lift over naive majority vote.

Almost every multi-agent system aggregates votes with Counter(votes).most_common(1), throwing away preference order and confidence. voting-mcp ships the real rules (Borda, Copeland, Condorcet, approval, STV, linear opinion pool) as callable MCP tools — each with its known axiomatic behavior and explicit, documented tie-breaking — plus a reproducible benchmark that aggregates a diverse ensemble of LLMs on a reasoning set and reports accuracy with bootstrap confidence intervals.
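To make the point about discarded preference order concrete, here is a minimal sketch comparing the naive first-choice count with a Borda count. The ballots and the function names (`naive_winner`, `borda_scores`) are made up for illustration and are not taken from the package.

```python
from collections import Counter

# Hypothetical ballots: each one is a full ranking, best first.
ballots = (
    [["A", "B", "C"]] * 3
    + [["B", "C", "A"]] * 2
    + [["C", "B", "A"]] * 2
)


def naive_winner(ballots):
    """Count first choices only; everything below rank 1 is ignored."""
    return Counter(ranking[0] for ranking in ballots).most_common(1)[0][0]


def borda_scores(ballots):
    """Give m-1 points for first place, m-2 for second, ..., 0 for last."""
    scores = Counter()
    for ranking in ballots:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return dict(scores)


print(naive_winner(ballots))   # A
print(borda_scores(ballots))   # {'A': 6, 'B': 9, 'C': 6}
```

A has the most first choices, but four of the seven voters rank it last; B is everyone's first or second choice and wins once the full rankings are counted.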

The server is pure compute: stdio transport, no network, no file writes, no secrets — clean against the OWASP MCP Top 10 by construction.

Install

# run the server directly (once published)
uvx voting-mcp

# or from source
git clone https://github.com/HrishiKabra/voting-mcp && cd voting-mcp
uv sync
uv run python -m voting_mcp.server

Add it to an MCP client (e.g. Claude Desktop claude_desktop_config.json):

{
  "mcpServers": {
    "voting": { "command": "uvx", "args": ["voting-mcp"] }
  }
}

Tools

Every tool takes a profile ({candidates, ballots}) and returns a Result with the full co-winner set (winners, so ties are never hidden), the single tie-broken winner (or null when none exists), a ranking, per-candidate scores, and a note.
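As a rough illustration of that input and output shape, the sketch below models a profile as a plain dict and the result as a dataclass with the fields listed above. The field types, the `plurality` implementation, and the note text are assumptions made for this example; the server's actual schema may differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    # Field names follow the description above; the types are assumed.
    winners: list[str]        # full co-winner set, so ties stay visible
    winner: Optional[str]     # single tie-broken winner, or None
    ranking: list[str]
    scores: dict[str, float]
    note: str


def plurality(profile: dict) -> Result:
    """Illustrative plurality rule with a lexicographic tie-break."""
    candidates = profile["candidates"]
    firsts = Counter(ballot[0] for ballot in profile["ballots"])
    scores = {c: float(firsts.get(c, 0)) for c in candidates}
    top = max(scores.values())
    winners = sorted(c for c in candidates if scores[c] == top)
    ranking = sorted(candidates, key=lambda c: (-scores[c], c))
    note = "tie broken lexicographically" if len(winners) > 1 else ""
    return Result(winners, winners[0], ranking, scores, note)


profile = {
    "candidates": ["A", "B", "C"],
    "ballots": [["B", "A", "C"], ["A", "B", "C"], ["C", "A", "B"],
                ["B", "C", "A"], ["A", "C", "B"]],
}
result = plurality(profile)
print(result.winners, result.winner)   # ['A', 'B'] A
```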

| Tool | Ballots | Notes |
| --- | --- | --- |
| borda | rankings | positional; Condorcet-inconsistent, clone-sensitive |
| copeland | rankings | Condorcet-consistent pairwise (+1 win, +0.5 tie) |
| condorcet | rankings | returns the pairwise winner or an explicit no-winner on a cycle |
| approval | approval sets | most-approved wins |
| stv | rankings | single-winner instant-runoff; clone-resistant |
| opinion_pool | distributions | linear pool; preserves confidence, not an argmax vote |
| plurality | rankings | baseline (most first choices) |
| majority | rankings | strict >50% or no winner |
| aggregate_rule | any | dispatch by a rule enum |
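The pairwise rules in the table can be sketched in a few lines. This is a minimal illustration that assumes every ballot is a complete ranking of all candidates; the function names are hypothetical and this is not the package's implementation.

```python
from itertools import combinations


def pairwise_wins(candidates, ballots):
    """wins[a][b] = number of ballots that rank a above b."""
    wins = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ranking in ballots:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in combinations(candidates, 2):
            if position[a] < position[b]:
                wins[a][b] += 1
            else:
                wins[b][a] += 1
    return wins


def copeland_scores(candidates, ballots):
    """+1 for each pairwise win, +0.5 for each pairwise tie."""
    wins = pairwise_wins(candidates, ballots)
    scores = {c: 0.0 for c in candidates}
    for a, b in combinations(candidates, 2):
        if wins[a][b] > wins[b][a]:
            scores[a] += 1.0
        elif wins[b][a] > wins[a][b]:
            scores[b] += 1.0
        else:
            scores[a] += 0.5
            scores[b] += 0.5
    return scores


def condorcet_winner(candidates, ballots):
    """Return the candidate who beats every other head-to-head, else None."""
    wins = pairwise_wins(candidates, ballots)
    for a in candidates:
        if all(wins[a][b] > wins[b][a] for b in candidates if b != a):
            return a
    return None


# A classic three-voter cycle: A beats B, B beats C, C beats A.
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
print(condorcet_winner(["A", "B", "C"], cycle))   # None
print(copeland_scores(["A", "B", "C"], cycle))    # {'A': 1.0, 'B': 1.0, 'C': 1.0}
```

On the cycle there is no Condorcet winner, and Copeland scores all three candidates equally, which is where an explicit tie-breaking policy becomes necessary.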

Tie-breaking is an explicit parameter (lexicographic default, none, or seeded random).
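One way to picture the three tie-breaking modes is the helper below. The function name `break_tie`, the mode strings, and the `seed` argument are assumptions for this sketch rather than the tools' real parameter names.

```python
import random


def break_tie(winners, mode="lexicographic", seed=None):
    """Reduce a co-winner set to one winner, or None if no single winner exists."""
    if not winners:
        return None
    ordered = sorted(winners)          # sort first so results never depend on input order
    if len(ordered) == 1:
        return ordered[0]
    if mode == "lexicographic":
        return ordered[0]
    if mode == "none":
        return None                    # report the tie instead of resolving it
    if mode == "random":
        return random.Random(seed).choice(ordered)   # reproducible for a fixed seed
    raise ValueError(f"unknown tie-break mode: {mode!r}")


print(break_tie(["B", "A"]))                  # A
print(break_tie(["B", "A"], mode="none"))     # None
```

Seeding a private `random.Random` instance, rather than the module-level generator, keeps the random mode repeatable without touching global state.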

Benchmark

Aggregate an ensemble of 5 models (one OpenAI-compatible client via OpenRouter) on ARC-Challenge and compare each rule to the naive majority vote:

uv sync --extra bench
uv run python -m bench.fetch_arc --limit 200
# prints a cost estimate and STOPS; add --yes to actually call the API, --mock for a free dry run
uv run python -m bench.run_ensemble --dataset bench/datasets/arc_challenge.jsonl --limit 200 --yes
uv run python -m bench.compare --dataset bench/datasets/arc_challenge.jsonl --limit 200

Every raw response is cached under bench/results/raw/; re-runs never re-call the API, so aggregation tweaks are free.
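The caching behavior described above could be implemented along these lines. This is a hedged sketch only: the key scheme (a SHA-256 of model and prompt), the JSON file layout, and the `cached_call` and `call_api` names are all hypothetical, not the benchmark's actual code.

```python
import hashlib
import json
from pathlib import Path


def cached_call(cache_dir, model, prompt, call_api):
    """Return the cached response for (model, prompt); call the API only on a miss."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        # Cache hit: no API call, so re-running aggregation costs nothing.
        return json.loads(path.read_text(encoding="utf-8"))["response"]
    response = call_api(model, prompt)
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"model": model, "prompt": prompt, "response": response}
    path.write_text(json.dumps(record), encoding="utf-8")
    return response
```

Here `call_api` stands for any callable that takes a model name and a prompt and returns a JSON-serializable response.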

Results

5-model ensemble (gpt-4o-mini · gemini-2.5-flash-lite · deepseek-v3 · claude-haiku-4.5 · glm-4.7), n = 200, bootstrap 95% CI. Two datasets of different difficulty; full write-up and both plots in RESULTS.md.

MMLU-Pro (hard, baseline 73.5%) — the informative case:

| Rule | Accuracy | 95% CI | Δ vs majority |
| --- | --- | --- | --- |
| opinion_pool | 0.755 | [0.695, 0.815] | +0.020 |
| majority_vote (baseline) | 0.735 | [0.679, 0.788] | |
| approval | 0.701 | [0.640, 0.757] | −0.035 |
| stv | 0.693 | [0.630, 0.750] | −0.043 |
| copeland | 0.647 | [0.580, 0.710] | −0.088 |
| condorcet | 0.620 | [0.550, 0.685] | −0.115 |
| majority (strict) | 0.590 | [0.520, 0.655] | −0.145 |
| borda | 0.472 | [0.405, 0.540] | −0.263 |
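For readers unfamiliar with bootstrap intervals, the sketch below shows a percentile bootstrap over per-question correctness. It assumes that method and 2,000 resamples; the benchmark's exact procedure is documented in RESULTS.md and may differ, and the input vector is illustrative rather than real benchmark output.

```python
import random


def bootstrap_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for accuracy from a list of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(correct)
    # Resample questions with replacement and recompute accuracy each time.
    means = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int(n_resamples * alpha / 2)]
    hi = means[min(n_resamples - 1, int(n_resamples * (1 - alpha / 2)))]
    return sum(correct) / n, lo, hi


# Illustrative input only: 147 correct answers out of 200 questions.
outcomes = [1] * 147 + [0] * 53
accuracy, lower, upper = bootstrap_ci(outcomes)
print(f"{accuracy:.3f} [{lower:.3f}, {upper:.3f}]")
```

With n = 200 the interval is several points wide on each side, which is why a +0.020 difference between two rules is not enough to separate them.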

The finding (honest): the value isn't "fancy voting beats majority." It's that the confidence-preserving rule (opinion_pool) wins when the crowd is uncertain: +2.0pp, the only rule above baseline, though its CI still overlaps the baseline's, so the result is suggestive, not conclusive. Forcing the distributions into full rankings actively hurts: borda collapses to 0.472, far below majority, because with 10 options the tail of the ranking is mostly noise. Aggregate the confidence; don't throw it away. On ARC-Challenge (baseline 96.8%, near-ceiling) nothing separates: every rule lands within overlapping CIs. See RESULTS.md.
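A toy case shows how pooling confidence can disagree with an argmax vote. The distributions and function names below are invented for illustration, and equal weights are assumed unless weights are passed in.

```python
from collections import Counter


def linear_opinion_pool(distributions, weights=None):
    """Weighted average of per-model probability distributions."""
    n = len(distributions)
    weights = weights or [1.0 / n] * n
    pooled = {}
    for weight, dist in zip(weights, distributions):
        for option, p in dist.items():
            pooled[option] = pooled.get(option, 0.0) + weight * p
    return pooled


def argmax_majority(distributions):
    """Each model votes for its top option; confidence is thrown away."""
    votes = [max(dist, key=dist.get) for dist in distributions]
    return Counter(votes).most_common(1)[0][0]


# Two models lean weakly toward A; one is strongly confident in B.
dists = [
    {"A": 0.55, "B": 0.45},
    {"A": 0.55, "B": 0.45},
    {"A": 0.05, "B": 0.95},
]
pooled = linear_opinion_pool(dists)
print(argmax_majority(dists))          # A
print(max(pooled, key=pooled.get))     # B
```

The majority vote treats a 55% lean the same as a 95% conviction; the pool averages the probabilities, so the one confident model outweighs two near coin-flips.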

Develop

uv run pytest -q
uv run ruff check .
uv run mypy --strict src
# exercise the tools in the MCP Inspector:
npx @modelcontextprotocol/inspector uv run python -m voting_mcp.server

Note: if you keep this repo under an iCloud-synced folder (e.g. ~/Desktop), iCloud can spawn duplicate .pth files that intermittently break the editable install. Tests use pythonpath=src; run the server with PYTHONPATH=src if an import fails, or move the repo off the synced folder.

License

MIT
