Võro MCP Server

An MCP server for working with the Võro language: local dictionary and corpus lookup, GiellaLT-backed analysis/spellcheck/grammar tools, and Neurotõlge translation.

What it's for

This MCP server gives language models practical tools for working with the Võro language. Võro is a lower-resource language, so general-purpose language models may make more mistakes with it than with widely supported languages such as English, or even Estonian.

By connecting the model to dictionaries, corpus search, morphological analysis, spellchecking, grammar checking, translation, and form generation, this server helps improve the model’s ability to understand, generate, correct, and translate Võro text.

It can be useful for tasks such as Võro translation, checking and improving generated text, exploring real usage examples, generating word forms, detecting unknown words, and supporting people who are learning or working with the Võro language.

Install

On Debian/Ubuntu, one command does the whole setup.

```
make setup        # scripts/run_local_ubuntu.sh; see `make help` for every task
```

You need `make`, Python, and the usual shell tools installed first. The `setup` target installs the HFST/Divvun system binaries (adding the Apertium package repo first if your system can't already find `divvun-gramcheck`), downloads the SQLite datasets and the prebuilt Giella models, creates `.venv`, and installs the package. Smoke-test it with `make test`.

Manual setup

For other platforms, or to see the pieces:

System binaries the Giella tools shell out to: `hfst-optimized-lookup`, `hfst-ospell`, `cg3`, `divvun-checker`. On Debian/Ubuntu they come from the Apertium nightly apt repo:
```
curl -fsSL https://apertium.projectjj.com/apt/install-nightly.sh | sudo bash
sudo apt-get install -y hfst hfst-ospell cg3 divvun-gramcheck perl gawk bash
```

The package, into a virtualenv (`make install`):
```
python3 -m venv .venv
. .venv/bin/activate
pip install -e .
```

Data and models, pulled from GitHub releases:
```
scripts/fetch_data.sh
scripts/fetch_giella.sh
```
Verify the external tools resolved (prints JSON and exits):
```
vro-mcp-check
```
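
The verification step can also be scripted. The following is a minimal Python sketch, not part of the package, that assumes only what this README states: the four binaries listed above should be on `PATH`, and `vro-mcp-check` prints JSON and exits. The function names are hypothetical, and because the structure of the JSON is not documented here, the sketch returns the parsed value without interpreting it.

```python
import json
import shutil
import subprocess

# Binaries the Giella tools shell out to, as listed in the manual setup.
REQUIRED_BINARIES = ["hfst-optimized-lookup", "hfst-ospell", "cg3", "divvun-checker"]


def missing_binaries(names=REQUIRED_BINARIES, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def run_setup_check(command=("vro-mcp-check",)):
    """Run the check command and return its stdout parsed as JSON.

    The exit code is deliberately not enforced, since this README does not
    say how the command signals problems; callers inspect the JSON instead.
    """
    result = subprocess.run(list(command), capture_output=True, text=True)
    return json.loads(result.stdout)
```

Calling `missing_binaries()` first gives a quick answer about the system packages; `run_setup_check()` then needs the virtualenv active so that `vro-mcp-check` resolves.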

Tools

| Tool | What it does |
| --- | --- |
| `lookup_word` | Dictionary lookup (en↔vro). |
| `find_usage_examples` | Full-text corpus search for real usage. |
| `word_exists_in_bag` | Fast check whether a word form has been seen. |
| `find_unknown_words` | List word forms in a text absent from the word bag. |
| `analyze_word` | GiellaLT morphological analysis. |
| `generate_forms` | GiellaLT generation for one exact lemma + tag analysis. |
| `spellcheck_vro` | Token-level spellcheck with suggestions. |
| `grammar_check_vro` | Sentence-level grammar check. |
| `lint_estonian_leakage` | Flag Estonian-looking endings in Võro text. |
| `suggest_correction` | Analyzer-verified fixes for a bad/unknown form. |
| `translate_vro` | Neurotõlge/TartuNLP translation. |
| `check_setup` | Report database and external Giella tool availability. |

Most lookup tools accept a single word or a list for batched queries.

The open dictionary currently covers English↔Võro only.
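
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for a single word and for a batch. The helper name `build_tool_call` and the argument key `word` are assumptions made for illustration; the real input schema for each tool is whatever the server reports in its `tools/list` response.

```python
import json


def build_tool_call(request_id, tool, arguments):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "word" is a hypothetical argument name, used here only to show the
# single-value and list forms side by side.
single = build_tool_call(1, "lookup_word", {"word": "house"})
batched = build_tool_call(2, "lookup_word", {"word": ["house", "water"]})

print(json.dumps(batched, ensure_ascii=False))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows what a batched query looks like on the wire under the stated assumption.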

Resources

Two Markdown grammar references are exposed over MCP:

- `vro://grammar/noun-cases`: noun/adjective/numeral/pronoun declension.
- `vro://grammar/verb-conjugation`: verb conjugation, moods, tenses, voice.

Configuration

The data lives under `data/` and is fetched for you. `make setup` (or `make data` + `make giella`) downloads everything, so you normally configure nothing. The fetch scripts honour `VRO_DATA_REPO`, `VRO_DATA_TAG`, and `VRO_GIELLA_TAG` to point at a different release.

All path defaults are repo-local; override any with environment variables or a local `.env` where the deploy script supports it.

| Variable | Default | Description |
| --- | --- | --- |
| `VRO_DICTIONARY_DB` | `./data/vro_dictionary.sqlite` | Dictionary SQLite path used by `lookup_word` and correction suggestions. |
| `VRO_CORPUS_DB` | `./data/vro_corpus.sqlite` | Corpus SQLite path used by `find_usage_examples`. |
| `VRO_WORD_BAG_DB` | `./data/vro_word_bag.sqlite` | Word-bag SQLite path used by seen/unknown word checks. |
| `VRO_NEUROTOLGE_BASE_URL` | `https://api.tartunlp.ai/translation/v2` | Neurotõlge/TartuNLP translation API base URL. |
| `VRO_ANALYZER_CMD` | `./tools/giella/bin/analyze-vro` | Command used for GiellaLT morphological analysis. |
| `VRO_GENERATOR_CMD` | `./tools/giella/bin/generate-vro` | Command used for one-analysis GiellaLT form generation. |
| `VRO_SPELLER_CMD` | `./tools/giella/bin/spellcheck-vro` | Command used for token spellchecking. |
| `VRO_GRAMMAR_CMD` | `./tools/giella/bin/grammar-check-vro` | Command used for sentence grammar checking. |
| `VRO_ANALYZER_MODEL` | `./data/giella-share/giella/vro/analyser-gt-desc.hfstol` | Model path used by `tools/giella/bin/analyze-vro`. |
| `VRO_GENERATOR_MODEL` | `./data/giella-share/giella/vro/generator-gt-norm.hfstol` | Model path used by `tools/giella/bin/generate-vro`. |
| `VRO_SPELLER_MODEL` | `./data/giella-share/voikko/3/vro.zhfst` | Speller archive path used by `tools/giella/bin/spellcheck-vro`. |
| `VRO_GRAMMAR_MODEL` | `./data/giella-share/voikko/4/vro.zcheck` | Grammar checker archive path used by `tools/giella/bin/grammar-check-vro`. |
| `VRO_SPELLER_MAX_SUGGESTIONS` | `10` | Maximum spelling suggestions returned per unknown token. |
| `VRO_DATA_REPO` | `Leo-Martin-Pala/voro-mcp` | GitHub repository used for dataset and Giella release downloads. |
| `VRO_DATA_TAG` | `data-v1` | GitHub release tag fetched for `vro-data.tar.xz` by `scripts/fetch_data.sh` and Modal release hydration. |
| `VRO_GIELLA_TAG` | `giella-v1` | GitHub release tag fetched for `giella-share.tar.xz` by `scripts/fetch_giella.sh` and Modal release hydration. |
| `VRO_GIELLA_BUILD_DIR` | `./.cache/giella-build` | Temporary build directory for `make giella-build`. |
| `VRO_GIELLA_ARTIFACT_DIR` | `./data/giella-share` | Output directory for locally built Giella artifacts. |
| `MCP_PATH` | `/mcp` locally; generated in `.env` for Modal deploys | Secret hosted HTTP path segment for Modal; local stdio clients do not need it. |
| `DATA_SOURCE` | `release` | Modal deploy data source: `release`, `local`, or `none`. |
| `FORCE_DATA` | `0` | Set to `1` to overwrite existing Modal Volume data. |
| `DATA_DIR` | `./data` | Local data directory used when `DATA_SOURCE=local`. |
| `NEW_SECRET` | `0` | Set to `1` to rotate `MCP_PATH` during deploy and save it to `.env`. |
| `LOCAL_SECRET` | `0` | Set to `1` to push the `MCP_PATH` from `.env`/environment as-is and fail if it is empty (used by `make deploy-local-secret`). |
| `MODAL_APP_NAME` | `vro-mcp` | Modal app name used by deploy/undeploy scripts. |
| `MODAL_VOLUME_NAME` | `vro-data` | Modal Volume name for SQLite data and Giella artifacts. |
| `MODAL_SECRET_NAME` | `vro-mcp-secret` | Modal secret name storing `MCP_PATH`. |
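
To illustrate the override rule, here is a minimal Python sketch of "environment variable if set, otherwise the documented default". It is not the server's actual configuration code: the helper names are hypothetical, only a few variables from the table are included, and two behaviours are assumptions of the sketch (an empty value counts as unset, and relative paths are resolved against the repository root).

```python
import os
from pathlib import Path

# A subset of the documented defaults, copied from the table above.
DEFAULTS = {
    "VRO_DICTIONARY_DB": "./data/vro_dictionary.sqlite",
    "VRO_CORPUS_DB": "./data/vro_corpus.sqlite",
    "VRO_WORD_BAG_DB": "./data/vro_word_bag.sqlite",
    "VRO_SPELLER_MAX_SUGGESTIONS": "10",
}


def resolve(name, env=None):
    """Return the environment value for `name`, else the documented default."""
    env = os.environ if env is None else env
    value = env.get(name)
    # Assumption: an empty string is treated the same as "not set".
    return value if value else DEFAULTS[name]


def resolve_path(name, repo_root, env=None):
    """Resolve a path setting; relative values are joined onto repo_root."""
    path = Path(resolve(name, env))
    return path if path.is_absolute() else Path(repo_root) / path
```

For example, `resolve_path("VRO_CORPUS_DB", "/absolute/path/to/vro-mcp-server")` yields the repo-local corpus path unless `VRO_CORPUS_DB` is set in the environment.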

Point a client at it

You don't run the server yourself. The MCP client launches the `vro-mcp-server` binary and talks to it over stdio, so all a client needs is the binary's absolute path. Run `make local-url` to print that path and ready-to-paste config for Claude Code and Codex.

Claude Code:
```
claude mcp add vro -- /absolute/path/to/vro-mcp-server/.venv/bin/vro-mcp-server
```

Codex:
```
codex mcp add vro -- /absolute/path/to/vro-mcp-server/.venv/bin/vro-mcp-server
```

Generic JSON MCP client configuration:

```
{
  "mcpServers": {
    "vro": {
      "command": "/absolute/path/to/vro-mcp-server/.venv/bin/vro-mcp-server",
      "cwd": "/absolute/path/to/vro-mcp-server"
    }
  }
}
```
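
If you would rather generate that JSON than edit the paths by hand, a small Python sketch along these lines could do it. The function name `client_config` is hypothetical, and the sketch assumes the virtualenv layout shown above (`.venv/bin/vro-mcp-server` inside the checkout); pass an absolute path, since the function does not resolve relative ones.

```python
import json
from pathlib import Path


def client_config(repo_root, server_name="vro"):
    """Build the generic `mcpServers` entry for a checkout at repo_root."""
    root = Path(repo_root)
    return {
        "mcpServers": {
            server_name: {
                # Entry point installed into the virtualenv by `pip install -e .`
                "command": str(root / ".venv" / "bin" / "vro-mcp-server"),
                "cwd": str(root),
            }
        }
    }


print(json.dumps(client_config("/absolute/path/to/vro-mcp-server"), indent=2))
```

`make local-url` already prints ready-to-paste configuration for Claude Code and Codex, so this is mainly useful for other clients that read the generic format.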

Deployment

To host the server in the cloud so Claude or ChatGPT (the web apps) can reach it, see DEPLOY.md.

License

Code is MIT (LICENSE). The SQLite datasets are CC-BY-SA-4.0 and the prebuilt GiellaLT models are GPL-3.0. Both are distributed as separate release assets, each bundling its own license and attribution. See NOTICE.md for scope and source summaries.
