DialectOS

Spanish dialect localization MCP server and CLI. It translates and QA-checks content across 25 regional variants with register control, structure preservation, and adversarial quality gates.


<p align="center"> <img src="assets/dialectos-hero.webp" alt="DialectOS: Spanish dialect localization MCP server. Translate and QA across 25 regional variants" width="100%"> </p>

<div align="center">

🌎 DialectOS

The first Model Context Protocol server built specifically for Spanish dialects.

DialectOS is an open-source Spanish dialect translation server that runs as an MCP (Model Context Protocol) tool and CLI. It translates English and other languages into 25 regional Spanish variants (Mexican, Argentinian, Colombian, Puerto Rican, and more) while preserving markdown structure, enforcing glossary terms, and applying adversarial quality gates that catch semantic drift before it reaches users.

Translate, detect, and adapt content across 25 regional Spanish variants while preserving markdown structure, code comments, and locale file formatting.


📖 Documentation · 🚀 Quick Start · 🛠️ MCP Tools · Agent Skill · 📦 Packages · 🤝 Contributing · 📋 Roadmap

</div>


Public Discovery

DialectOS is a Spanish localization and dialect QA system for AI agents, documentation teams, app developers, and support organizations. It provides MCP tools, CLI workflows, glossary enforcement, locale-file validation, and adversarial quality gates for regional Spanish variants.

AI discovery: llms.txt provides a compact project summary for AI assistants and search crawlers.

Best-fit searches: Spanish dialect translation MCP server, Spanish localization QA, Model Context Protocol translation tool, i18n validation CLI, regional Spanish translator, glossary enforcement, AI localization audit, Spanish launch certification.

Agent Skill

DialectOS includes a public agent skill at skills/dialectos/SKILL.md. Use $dialectos in compatible agent hosts when you want an agent to choose the right MCP or CLI workflow for regional Spanish translation, markdown preservation, locale-file validation, glossary enforcement, register checks, and launch-readiness QA.

Spanish Launch Certification

DialectOS is available as a paid Spanish localization launch audit. We certify your Spanish docs, app strings, support macros, or locale files across target dialects and deliver an MQM-aligned launch-readiness report.

✨ What makes DialectOS different?

| Feature | Google Translate | DeepL API | DialectOS |
| --- | --- | --- | --- |
| Spanish dialect awareness | ❌ Generic "Spanish" | ⚠️ Limited variants | ✅ 25 regional variants |
| MCP native integration | ❌ | ❌ | ✅ 17 MCP tools |
| Markdown structure preservation | ❌ | ❌ | ✅ Tables, code blocks, links intact |
| i18n locale file support | ❌ | ❌ | ✅ JSON locale diff & merge |
| Gender-neutral language | ❌ | ❌ | ✅ elles / latine / -x |
| Formality checking (tú vs usted) | ❌ | ❌ | ✅ Cross-dialect consistency |
| Adversarial quality gates | ❌ | ❌ | ✅ Semantic drift + structure validation |
| LLM-first dialect adaptation | ❌ Generic MT | ⚠️ Limited dialect control | ✅ Any OpenAI/Anthropic/LM Studio local LLM + dialect contracts |
| Translation validation (any provider) | ❌ | ❌ | ✅ dialectos validate: standalone correctness check |
| GitHub CI integration | ❌ | ❌ | ✅ Composite action for PR validation |
| Auto-glossary from corrections | ❌ | ❌ | ✅ Learns from user feedback |
| Public benchmark suite | ❌ | ❌ | ✅ 205 adversarial samples across 25 dialects |

🎯 Why this exists

"We shipped a product to Mexico using our Spain Spanish translations. Users thought we were being intentionally rude."

Spanish is not one uniform language: DialectOS models 25 regional variants with different vocabulary, formality levels, slang, and grammatical preferences. Existing translation tools treat Spanish as a monolith.

DialectOS solves this by:

  • Understanding regional differences (es-MX vs es-ES vs es-AR vs es-CO...)
  • Preserving technical document structure during translation
  • Providing glossary enforcement for consistent terminology
  • Adding semantic context, dialect grammar profiles, quality contracts, and quality gates that catch drift before it reaches users
  • Running as an MCP server so AI assistants can translate natively

🚀 Quick Start

Note: DialectOS packages are not yet published to npm. Setup requires cloning the repo and building from source (~2–5 minutes first time).

Full-app browser demo

The browser demo is not a static rule replacer: it calls a local DialectOS backend, which in turn calls the configured provider stack.

LLM_API_URL="http://127.0.0.1:1234/v1/chat/completions" \
LLM_API_FORMAT="openai" \
LLM_MODEL="your-local-model-name" \
LLM_ALLOW_LOCAL=1 \
pnpm demo

Open http://127.0.0.1:8080.

For the beginner container walkthrough, see docs/full-app-demo.md.

Local MCP setup

After running pnpm build, add the following to the configuration of Claude Desktop, Cursor, or any other MCP client:

{
  "mcpServers": {
    "dialectos": {
      "command": "node",
      "args": ["packages/mcp/dist/index.js"],
      "comment": "Local development setup; see README for clone and build instructions.",
      "env": {
        "LLM_API_URL": "https://your-llm-gateway/v1/chat/completions",
        "LLM_MODEL": "your-dialect-capable-model",
        "LLM_API_KEY": "your-key-if-required",
        "LLM_API_FORMAT": "openai",
        "ALLOWED_LOCALE_DIRS": "/path/to/locales"
      }
    }
  }
}

Recommended certified models

For v0.3.0, the recommended default cloud model is glm-4.5-air through the Z.ai international Anthropic-compatible endpoint. It passed basic, expanded adversarial, and long-document certification. Use glm-5.1 when you want the higher-confidence/premium option, and qwen3.5-9b via LM Studio for local/offline certification.

export LLM_API_URL="https://api.z.ai/api/anthropic/v1/messages"
export LLM_MODEL="glm-4.5-air"
export LLM_API_FORMAT="anthropic"
export LLM_API_KEY="..."

LM Studio local model testing

Start LM Studio's local server, then point DialectOS at any downloaded local model. LLM_API_FORMAT=lmstudio uses LM Studio's native REST API and loads the model just-in-time when needed.

LM_STUDIO_URL="http://127.0.0.1:1234" \
LLM_MODEL="publisher/model-key-or-api-identifier" \
LLM_API_FORMAT="lmstudio" \
pnpm dialect:eval -- --live --provider=llm --out=/tmp/dialectos-lmstudio-eval

Incremental provider certification

Use dialect:certify for long local-model or cloud-provider runs. It writes events.jsonl, progress.json, and an incrementally updated results.json after every sample, with per-sample timeout protection.

LM_STUDIO_URL="http://127.0.0.1:1234" \
LLM_MODEL="qwen3.5-9b" \
LLM_API_FORMAT="lmstudio" \
pnpm dialect:certify -- --live --provider=llm --sample-timeout-ms=300000 --out=/tmp/dialectos-certify
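
The per-sample timeout and write-after-every-sample behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `withTimeout`, `runSamples`, and the `SampleResult` shape are hypothetical names, and results are handed to a callback rather than written to events.jsonl.

```typescript
// Hypothetical sketch: run samples one at a time, cap each with a timeout,
// and report a result after every sample so a long run can be inspected midway.
type SampleResult = { id: string; status: "ok" | "timeout" | "error"; output?: string };

async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("sample-timeout")), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

async function runSamples(
  samples: { id: string; run: () => Promise<string> }[],
  timeoutMs: number,
  onResult: (result: SampleResult) => void, // e.g. append one JSON line to a log
): Promise<SampleResult[]> {
  const results: SampleResult[] = [];
  for (const sample of samples) {
    let result: SampleResult;
    try {
      const output = await withTimeout(sample.run(), timeoutMs);
      result = { id: sample.id, status: "ok", output };
    } catch (err) {
      const timedOut = err instanceof Error && err.message === "sample-timeout";
      result = { id: sample.id, status: timedOut ? "timeout" : "error" };
    }
    results.push(result);
    onResult(result); // persist incrementally, not only at the end of the run
  }
  return results;
}
```

A slow sample then costs at most `timeoutMs` before the run moves on, and everything recorded so far survives an interrupted run.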

Adversarial dialect certification

Use dialect:certify:adversarial to run paraphrase, dialect-collision, taboo-copy, placeholder, register, and repeatability traps. It wraps dialect:certify and writes a failure-matrix.md plus aggregate repeatability results.

pnpm dialect:certify:adversarial -- --live --provider=llm --repeat=2 --sample-timeout-ms=300000 --out=/tmp/dialectos-adversarial

Long-document certification

Use dialect:certify:documents to certify README/API-doc/locale JSON flows, not just sentence fixtures. It checks markdown structure, placeholders, URLs, code fences, API tables, and locale JSON outputs.

pnpm dialect:certify:documents -- --live --provider=llm --dialects=es-MX,es-PA,es-PR --out=/tmp/dialectos-doc-cert

Customer-facing certification report

Use dialect:report to turn certification artifacts into a customer-facing Markdown deliverable for paid launch audits.

pnpm dialect:report -- --input=audits/release-candidate-2026-04-22/model-matrix.json --out=customer-report.md --customer="Acme SaaS" --product="Spanish launch"

CLI install

# Install the v0.3.0 CLI tarball from the GitHub Release
pnpm add -g https://github.com/KyaniteLabs/DialectOS/releases/download/v0.3.0/dialectos-cli-0.3.0.tgz

# Or use a local source checkout
pnpm install --frozen-lockfile
pnpm build

# Translate to Mexican Spanish
dialectos translate "Hello world" --dialect es-MX

# Translate a README preserving structure
dialectos translate-readme README.md --dialect es-AR --output README.ar.md

# Validate an existing translation
dialectos validate --source "Click the button" --translated "Haz clic en el botón" --dialect es-MX

# Validate translation files
dialectos validate --source-file en.json --translated-file es-MX.json --dialect es-MX --format json

# View translation corpus statistics
dialectos corpus stats

# Run dialect quality benchmark
dialectos benchmark run --dialects es-MX,es-AR,es-ES

# Generate glossary suggestions from corrections
dialectos glossary suggest --min-occurrences 3

# Compare two glossary versions
dialectos glossary diff glossary-v1.json glossary-v2.json

# Detect missing i18n keys
dialectos i18n detect-missing ./locales/en.json ./locales/es.json

# List all supported dialects
dialectos dialects list

From source

git clone https://github.com/KyaniteLabs/DialectOS.git
cd DialectOS
pnpm install
pnpm build
pnpm test        # 662+ tests passing

🛠️ MCP Tools

Markdown Translation (4 tools)

| Tool | Description |
| --- | --- |
| translate_markdown | Translate while preserving tables, code blocks, links |
| extract_translatable | Extract only translatable text from markdown |
| translate_api_docs | Translate API docs with table cell-level translation |
| create_bilingual_doc | Side-by-side bilingual documents |
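
One common way to keep code blocks intact during translation is to swap them for placeholders, translate the remaining prose, and then restore them. The sketch below illustrates only that idea. `protectCodeFences` and `restoreCodeFences` are hypothetical helpers, not the `@dialectos/markdown-parser` API, and the `@@CODE_n@@` placeholder format is an assumption.

```typescript
// Hypothetical sketch: hide fenced code blocks behind placeholders so the
// translator never sees them, then put the original blocks back afterwards.
const FENCE = "`".repeat(3); // built at runtime to avoid a literal fence in this example

function protectCodeFences(markdown: string): { prose: string; codeBlocks: string[] } {
  const codeBlocks: string[] = [];
  const pattern = new RegExp(`${FENCE}[\\s\\S]*?${FENCE}`, "g");
  const prose = markdown.replace(pattern, (block) => {
    codeBlocks.push(block);
    return `@@CODE_${codeBlocks.length - 1}@@`;
  });
  return { prose, codeBlocks };
}

function restoreCodeFences(prose: string, codeBlocks: string[]): string {
  return prose.replace(/@@CODE_(\d+)@@/g, (_match, index: string) => codeBlocks[Number(index)]);
}

// Usage: the "translation" step is a stand-in that only touches prose.
const original = `Run the build:\n${FENCE}sh\npnpm build\n${FENCE}\nDone.`;
const { prose, codeBlocks } = protectCodeFences(original);
const translatedProse = prose
  .replace("Run the build:", "Ejecuta la compilación:")
  .replace("Done.", "Listo.");
const restored = restoreCodeFences(translatedProse, codeBlocks);
```

A structure check can then compare the number of placeholders before and after translation to detect a dropped or duplicated block.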

i18n Operations (6 tools)

| Tool | Description |
| --- | --- |
| detect_missing_keys | Compare locale files for missing keys |
| translate_missing_keys | Auto-translate missing keys |
| batch_translate_locales | Batch translate to multiple dialects |
| manage_dialect_variants | Create dialect-specific variants |
| check_formality | Check tú vs usted consistency |
| apply_gender_neutral | Apply gender-neutral language |
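
Comparing locale files for missing keys usually comes down to flattening both files into key paths and taking the difference. A minimal sketch of that idea follows. The `LocaleTree` type and both function names are hypothetical, and the actual `detect_missing_keys` tool may behave differently.

```typescript
// Hypothetical sketch: list keys present in the source locale but absent
// from the target locale, using dot-separated paths for nested objects.
type LocaleTree = { [key: string]: string | LocaleTree };

function flattenKeys(tree: LocaleTree, prefix = ""): string[] {
  const paths: string[] = [];
  for (const key of Object.keys(tree)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const value = tree[key];
    if (typeof value === "string") {
      paths.push(path);
    } else if (value !== undefined) {
      paths.push(...flattenKeys(value, path));
    }
  }
  return paths;
}

function detectMissingKeys(source: LocaleTree, target: LocaleTree): string[] {
  const present = new Set(flattenKeys(target));
  return flattenKeys(source).filter((path) => !present.has(path));
}

const en: LocaleTree = { nav: { home: "Home", settings: "Settings" }, save: "Save" };
const esMX: LocaleTree = { nav: { home: "Inicio" }, save: "Guardar" };
const missingKeys = detectMissingKeys(en, esMX); // ["nav.settings"]
```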

Translation (7 tools)

| Tool | Description |
| --- | --- |
| translate_text | Translate with semantic context, grammar profiles, and quality contracts |
| detect_dialect | Detect dialect from sample text |
| translate_code_comment | Translate comments, preserve code |
| translate_readme | Full README translation pipeline |
| search_glossary | Search 300+ source-attributed glossary terms |
| list_dialects | List all 25 supported dialects |
| research_regional_term | Research source-backed regional lexeme proposals without mutating runtime data |

📦 Packages

| Package | Version | Description | Tests |
| --- | --- | --- | --- |
| @dialectos/mcp | 0.3.0 | 17 MCP tools (stdio server) | 93 |
| @dialectos/cli | 0.3.0 | CLI: translate, validate, corpus, benchmark, glossary | 569 |
| @dialectos/providers | 0.3.0 | LLM, DeepL, LibreTranslate, MyMemory with circuit breaker + corpus | 152 |
| @dialectos/security | 0.3.0 | Rate limiting, SSRF protection, sanitization | 68 |
| @dialectos/types | 0.3.0 | Shared TypeScript types + glossary, profile, certification, and quality data | 54 |
| @dialectos/locale-utils | 0.3.0 | Locale file diff/merge utilities | 55 |
| @dialectos/markdown-parser | 0.3.0 | Structure-preserving markdown parser | 74 |

662+ tests across 7 packages plus docs contracts, demo-server contracts, and static-hardening checks


🛡️ Security & Quality

DialectOS has undergone adversarial security hardening:

  • 18 CVEs resolved via dependency overrides
  • SSRF protection on all provider endpoints
  • Circuit breaker with half-open probe locks
  • Atomic checkpoint writes with schema versioning
  • HTML injection detection in translated output
  • Semantic drift scoring: catches output that looks valid but whose meaning has changed
  • Provider capability negotiation: validates language support before API calls
  • Chaos harness for deterministic resilience testing

See SECURITY.md for details.
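
For readers unfamiliar with the pattern, a circuit breaker with a half-open probe lock can be sketched as below. This is a generic illustration of the technique, not the code in `@dialectos/providers`: the class, its method names, and the threshold and cooldown parameters are all assumptions.

```typescript
// Hypothetical sketch of a circuit breaker that allows a single trial call
// ("probe") after a cooldown, and blocks every other caller until it resolves.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private probeInFlight = false; // the probe lock
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  /** Returns true if the caller may attempt a provider call right now. */
  tryAcquire(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    if (this.state === "closed") return true;
    if (this.state === "half-open" && !this.probeInFlight) {
      this.probeInFlight = true; // only one probe at a time
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
    this.probeInFlight = false;
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed probe reopens immediately; otherwise open at the threshold.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
    this.probeInFlight = false;
  }

  current(): BreakerState {
    return this.state;
  }
}
```

Injecting the clock through `now` keeps the breaker deterministic under test, which is the property a chaos harness needs.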


🎨 Supported Dialects

| Code | Region | Example Difference |
| --- | --- | --- |
| es-ES | Spain | "Coche" (car), "Ordenador" (computer) |
| es-MX | Mexico | "Carro", "Computadora" |
| es-AR | Argentina | "Auto", "Computadora", "Che" |
| es-CO | Colombia | "Carro", "Computador", "Chévere" |
| es-CL | Chile | "Auto", "Computadora", "Caleta" |
| es-PE | Peru | "Carro", "Computadora", "Pe" |
| es-VE | Venezuela | "Carro", "Computadora", "Chamo" |
| es-UY | Uruguay | "Auto", "Computadora", "Bo" |
| es-GQ | Equatorial Guinea | "Carro", "Camisola", "Bacalao" |
| es-US | United States | "Carro", "Computadora", "Pocha" |
| es-PH | Philippines (Chavacano) | "Carro", "Jendeh", "Kame" |
| es-BZ | Belize | "Carro", "Breki", "Kriol" |
| es-AD | Andorra | "Carro", "Madriu", "Caldea" |

...and 12 more. Full list via dialectos dialects list.
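
To make the table concrete, regional vocabulary of this kind can be modelled as a lookup keyed by concept and dialect code. The entries below are taken from the table above. The `LEXICON` shape and the `lookupTerm` helper are hypothetical and do not reflect the DialectOS glossary format.

```typescript
// Regional vocabulary from the dialect table. The data shape and lookupTerm
// helper are hypothetical, not DialectOS's glossary format.
type Concept = "car" | "computer";

const LEXICON: Record<Concept, Record<string, string>> = {
  car: { "es-ES": "coche", "es-MX": "carro", "es-AR": "auto", "es-CO": "carro" },
  computer: {
    "es-ES": "ordenador",
    "es-MX": "computadora",
    "es-AR": "computadora",
    "es-CO": "computador",
  },
};

// Returns undefined for an unknown dialect instead of silently falling back
// to another region's word.
function lookupTerm(concept: Concept, dialect: string): string | undefined {
  return LEXICON[concept][dialect];
}

const mexicanCar = lookupTerm("car", "es-MX"); // "carro"
const spanishComputer = lookupTerm("computer", "es-ES"); // "ordenador"
```

Returning `undefined` for an unlisted dialect is a deliberate choice in this sketch: borrowing another region's term is exactly the failure the table is meant to prevent.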


🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                        MCP Client                            │
│              (Claude Desktop / Cursor / etc.)                │
└──────────────────────┬───────────────────────────────────────┘
                       │ stdio
┌──────────────────────▼───────────────────────────────────────┐
│                        @dialectos/mcp                        │
│              17 tools • JSON-RPC over stdio                  │
└──────────────────────┬───────────────────────────────────────┘
                       │
┌──────────────────────▼───────────────────────────────────────┐
│                        @dialectos/cli                        │
│   translate • validate • corpus • benchmark • glossary       │
│   ├─ Policy profiles (strict/balanced/permissive)            │
│   ├─ Quality gates (token/glossary/structure/semantic)       │
│   ├─ Translation corpus + auto-glossary                      │
│   └─ Checkpoint resumption + telemetry                       │
└──────────────────────┬───────────────────────────────────────┘
                       │
┌──────────────────────▼───────────────────────────────────────┐
│                     @dialectos/providers                     │
│   ┌─────────┐  ┌─────────────────┐  ┌──────────────────┐     │
│   │   LLM   │  │      DeepL      │  │  Libre/MyMemory  │     │
│   │ Primary │  │  Paid fallback  │  │ Generic fallback │     │
│   └─────────┘  └─────────────────┘  └──────────────────┘     │
│        │                │                    │               │
│        └────────────────┴────────────────────┘               │
│              Circuit Breaker + Rate Limiter                  │
└──────────────────────────────────────────────────────────────┘

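The "JSON-RPC over stdio" hop in the diagram means the client writes one JSON-RPC 2.0 message per line to the server's standard input. The sketch below builds a `tools/call` request as defined by the Model Context Protocol. The helper names are hypothetical, and the argument names `text` and `dialect` are assumptions, since the input schema of `translate_text` is not documented here.

```typescript
// Sketch of the JSON-RPC 2.0 message an MCP client sends to call a tool.
// The argument names ("text", "dialect") are assumptions, not a documented schema.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(id: number, tool: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: tool, arguments: args } };
}

// On the stdio transport each message is a single line of JSON ending in a newline.
function frameMessage(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

const toolCall = buildToolCall(1, "translate_text", { text: "Hello world", dialect: "es-MX" });
const wireLine = frameMessage(toolCall);
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the pipe.
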
📊 Quality Gates

Every translation passes through 4 quality dimensions:

Quality Score = tokenIntegrity × 25% + glossaryFidelity × 30% + structureIntegrity × 20% + semanticSimilarity × 25%

| Gate | What it checks | Example failure |
| --- | --- | --- |
| Token Integrity | Protected terms preserved | "Kyanite Labs" → "Cianita Labs" |
| Glossary Fidelity | Enforced terminology used | "API" → "Interfaz" (when glossary says "API") |
| Structure Integrity | Markdown structure intact | Missing code fence, broken table |
| Semantic Similarity | Meaning not drifted | "API is down" → "Hello world" |
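
The weighted sum in the formula above can be written out directly. The weights come from that formula. The `GateScores` interface, the `qualityScore` function, and the assumption that each gate reports a score between 0 and 1 are illustrative only.

```typescript
// Weights from the Quality Score formula; each gate score is assumed to lie in 0..1.
interface GateScores {
  tokenIntegrity: number;
  glossaryFidelity: number;
  structureIntegrity: number;
  semanticSimilarity: number;
}

const WEIGHTS: GateScores = {
  tokenIntegrity: 0.25,
  glossaryFidelity: 0.3,
  structureIntegrity: 0.2,
  semanticSimilarity: 0.25,
};

function qualityScore(scores: GateScores): number {
  return (
    scores.tokenIntegrity * WEIGHTS.tokenIntegrity +
    scores.glossaryFidelity * WEIGHTS.glossaryFidelity +
    scores.structureIntegrity * WEIGHTS.structureIntegrity +
    scores.semanticSimilarity * WEIGHTS.semanticSimilarity
  );
}

// Example: perfect tokens and structure, glossary at 0.5, semantics at 0.8.
const exampleScore = qualityScore({
  tokenIntegrity: 1,
  glossaryFidelity: 0.5,
  structureIntegrity: 1,
  semanticSimilarity: 0.8,
});
// 0.25 + 0.15 + 0.20 + 0.20 = 0.80, up to floating-point rounding
```

Because glossary fidelity carries the largest weight, a half-enforced glossary costs more (0.15 lost) than the moderate semantic drift in this example (0.05 lost).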

❓ FAQ

What is DialectOS? DialectOS is an open-source translation engine for Spanish regional dialects. It runs as an MCP server (for AI assistants like Claude) and a CLI tool for developers.

How is DialectOS different from Google Translate? Google Translate treats Spanish as one language. DialectOS understands 25 regional variants, preserves markdown structure, enforces glossaries, and applies quality gates that catch errors before they reach users.

What are Spanish dialects? Spanish varies significantly by country. Mexican Spanish uses "carro" for car; Spain uses "coche"; Argentina uses "auto". DialectOS handles these differences automatically.

Does DialectOS work with ChatGPT / Claude? It works with any MCP client. DialectOS is an MCP server, so Claude Desktop, Cursor, Windsurf, and other MCP clients can use its 17 translation tools natively.

Is DialectOS free? Yes. DialectOS v0.3.0 is released under Apache-2.0. See LICENSE for details.

What is MCP? Model Context Protocol is an open standard that lets AI assistants use external tools. DialectOS exposes 17 translation tools through MCP so AI agents can translate natively.

Can I use DialectOS for commercial projects? Yes. Apache-2.0 allows commercial use, modification, and redistribution subject to the license terms. See LICENSE for details.

How accurate is the translation? DialectOS applies 4 quality gates (token integrity, glossary fidelity, structure integrity, semantic similarity) and adversarial tests. Automated tests verify correctness across dialects.

🏷️ Badges

Add this badge to your project if you use DialectOS for translation:

[![Translated with DialectOS](https://img.shields.io/badge/translated%20with-DialectOS-d89b2b)](https://github.com/KyaniteLabs/DialectOS)

⚡ GitHub Action

Validate Spanish translations in CI on every pull request:

- uses: KyaniteLabs/DialectOS/action  # Planned; version pinning unavailable until first release
  with:
    dialect: es-MX
    source-dir: locales/en
    target-patterns: 'locales/es-MX/*.json'
    fail-on-blocking: true

Multi-dialect matrix:

strategy:
  matrix:
    dialect: [es-ES, es-MX, es-AR, es-CO]
steps:
  - uses: KyaniteLabs/DialectOS/action  # Planned; version pinning unavailable until first release
    with:
      dialect: ${{ matrix.dialect }}
      fail-on-blocking: true

See docs/github-action.md for full configuration options.


🤝 Contributing

We welcome contributors! See CONTRIBUTING.md for:

  • Setting up your development environment
  • Running the test suite
  • Submitting pull requests
  • Code style guidelines

Good first issues are tagged with the good first issue label and are a good starting point for newcomers.


📋 Roadmap

See ROADMAP.md for upcoming features including:

  • Portuguese dialect support (pt-BR, pt-PT)
  • Real-time collaborative translation
  • Custom provider plugins
  • OpenAI-compatible, Anthropic-compatible, and LM Studio local gateways via LLM_API_URL/LM_STUDIO_URL + LLM_MODEL + LLM_API_FORMAT
  • VS Code extension

📄 License

Apache-2.0. See LICENSE for details.


<div align="center">

Made with ❤️ by KyaniteLabs and contributors.

Star ⭐ this repo if it helps your project!

</div>


Part of KyaniteLabs

More from KyaniteLabs. Related projects:

  • mcp-video: guardrailed video-editing MCP server for AI agents
  • Epoch: time-estimation MCP server (PERT) for AI agents
  • checkyourself: local-first production-readiness checks for AI-built code

→ More at kyanitelabs.tech
