Hecatoncheire MCP

Multi-agent continuous development system with local LLM orchestration.

⚠️ Work in Progress — This project is under active development. Some features may not work or may behave unexpectedly. Testing and verification are required before any production use. If you find this useful and want to improve it — contributions are very welcome!

⚠️ В разработке — Проект находится в стадии активной разработки. Что-то может не работать или работать не так, как ожидается. Перед использованием необходимо тестировать и проверять. Если решите допилить и использовать — пожалуйста, контрибьютьте!

English | Русский

English

Overview

Hecatoncheire is an MCP (Model Context Protocol) server implementing a continuous multi-agent workflow where specialized AI agents collaborate on code development tasks. The system uses role separation and intelligent feedback loops to prevent common issues like scope creep and unnecessary complexity.

Problem Statement

A single agent working alone runs into three recurring problems:

- Cognitive bias when the same agent both writes and reviews code
- A tendency toward "improvement for improvement's sake"
- No independent check that the result still matches the original request

Solution

Multi-agent architecture with strict role separation:

- Writer Agent: Implements code based on acceptance criteria
- Validator Agent: Reviews code and provides targeted feedback
- Observer Agent: Local LLM that decomposes tasks and checks alignment

Architecture

User Request
    ↓
Observer (Local LLM)
    └─ Decomposes into acceptance criteria
    └─ Defines success conditions
    ↓
Writer
    └─ Implements code
    └─ Submits for review
    ↓
Validator
    └─ Reviews against criteria
    └─ Approves OR provides feedback
    ↓
[Loop until approved or max iterations]
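
The flow above can be sketched as a small control loop. This is an illustrative sketch, not the project's implementation: `run_task`, `Review`, and the three callables are hypothetical names standing in for the Observer, Writer, and Validator, and the default of 3 iterations mirrors the default listed under Limitations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool
    feedback: str = ""


def run_task(
    request: str,
    decompose: Callable[[str], str],
    write: Callable[[str, Optional[str]], str],
    review: Callable[[str, str], Review],
    max_iterations: int = 3,
) -> tuple[bool, str, int]:
    """Run the Observer -> Writer -> Validator loop.

    Returns (approved, last_code, iterations_used).
    """
    criteria = decompose(request)         # Observer: acceptance criteria
    feedback: Optional[str] = None
    code = ""
    for iteration in range(1, max_iterations + 1):
        code = write(criteria, feedback)  # Writer: implement or revise
        result = review(criteria, code)   # Validator: approve or give feedback
        if result.approved:
            return True, code, iteration
        feedback = result.feedback
    return False, code, max_iterations


# Stub agents for illustration only: the validator rejects undocumented code.
def stub_decompose(request: str) -> str:
    return f"REQUIREMENTS: {request}"


def stub_write(criteria: str, feedback: Optional[str]) -> str:
    docstring = '    """Documented."""\n' if feedback else ""
    return "def f():\n" + docstring + "    return 1\n"


def stub_review(criteria: str, code: str) -> Review:
    if '"""' in code:
        return Review(approved=True)
    return Review(approved=False, feedback="Missing docstring")


approved, code, used = run_task("demo", stub_decompose, stub_write, stub_review)
print(approved, used)  # True 2
```

Passing the agents in as callables keeps the loop independent of how each role is actually reached (MCP tool call, HTTP request, or a test stub).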

Core Components

Observer Agent

- Local LLM server (llama-cpp-python)
- Runs in the background and loads the model once
- Exposes an OpenAI-compatible HTTP API
- Functions:
  - Task decomposition into structured criteria
  - Alignment verification (prevents scope creep)
  - Objective evaluation of code against original intent

Writer Agent (Chat 1)

- Receives structured acceptance criteria
- Implements the code solution
- Submits via the write_code() tool
- Iterates based on Validator feedback

Validator Agent (Chat 2)

- Reviews submitted code
- Checks alignment with acceptance criteria
- Provides specific, actionable feedback
- Approves only when criteria are met

Workflow

1. Task Initialization

start_task("Create recursive factorial function in Python")

Observer decomposes into:

REQUIREMENTS: Function signature, base cases, recursive call
FORBIDDEN: Iterative approaches, external libraries
MINIMUM_VIABLE: Basic working implementation
SUCCESS_CRITERIA: Correct results for n=0,1,5
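
If the Observer's reply really is a set of `KEY: value` lines like the ones above, it could be turned into a dictionary along these lines. This is a sketch under that assumption; the actual output format may differ, and `parse_criteria` and `EXPECTED_KEYS` are hypothetical names, not part of the project.

```python
EXPECTED_KEYS = ("REQUIREMENTS", "FORBIDDEN", "MINIMUM_VIABLE", "SUCCESS_CRITERIA")


def parse_criteria(text: str) -> dict[str, str]:
    """Map each expected section name to the text that follows its colon."""
    criteria: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in EXPECTED_KEYS:
            criteria[key] = value.strip()
    # Fail loudly if the model skipped a section instead of guessing.
    missing = [k for k in EXPECTED_KEYS if k not in criteria]
    if missing:
        raise ValueError("Observer output is missing: " + ", ".join(missing))
    return criteria


observer_output = """\
REQUIREMENTS: Function signature, base cases, recursive call
FORBIDDEN: Iterative approaches, external libraries
MINIMUM_VIABLE: Basic working implementation
SUCCESS_CRITERIA: Correct results for n=0,1,5
"""

criteria = parse_criteria(observer_output)
print(criteria["FORBIDDEN"])  # Iterative approaches, external libraries
```

Raising on a missing section matters with a local model, since smaller models do not always follow the requested format.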

2. Implementation Phase

Writer implements based on criteria and submits:

write_code(
    code="def factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n-1)",
    description="Recursive factorial with base case"
)

3. Validation Phase

The Validator reviews the submission and either:

- Approves it (task complete)
- Provides feedback (the Writer iterates)

review_code(
    feedback="Missing docstring and type hints",
    approved=False
)

4. Iteration

Writer addresses feedback and resubmits. Loop continues until approval or max iterations reached.
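
For the feedback shown in step 3, a resubmission might look like the following sketch. It changes only what the Validator asked for (a docstring and type hints) and leaves the logic untouched; the assertions restate the SUCCESS_CRITERIA from step 1.

```python
def factorial(n: int) -> int:
    """Return n! computed recursively.

    Values of n less than or equal to 1 return 1 (the base case).
    """
    if n <= 1:
        return 1
    return n * factorial(n - 1)


# SUCCESS_CRITERIA: correct results for n = 0, 1, 5
assert factorial(0) == 1
assert factorial(1) == 1
assert factorial(5) == 120
```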

Configuration

All configuration is centralized in config.yaml:

Model Configuration

model:
  path: "/models/model.gguf"
  n_ctx: 4096
  n_threads: 8
  n_gpu_layers: -1
  tensor_split: "2,8"  # Multi-GPU split
  split_mode: 1
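
The tensor_split value reads as a comma-separated list of per-GPU weights. Assuming it follows the llama.cpp convention of relative proportions (an assumption, since config.yaml does not say so explicitly), "2,8" would place roughly 20% of the model on the first GPU and 80% on the second. `parse_tensor_split` below is a hypothetical helper for checking a value before starting the container, not part of the project.

```python
def parse_tensor_split(spec: str) -> list[float]:
    """Convert a string such as "2,8" into per-GPU fractions that sum to 1."""
    weights = [float(part) for part in spec.split(",") if part.strip()]
    if not weights or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError(f"invalid tensor_split: {spec!r}")
    total = sum(weights)
    return [w / total for w in weights]


print(parse_tensor_split("2,8"))  # [0.2, 0.8]
```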

Observer Configuration

observer:
  api_url: "http://localhost:8000"
  temperature: 0.65
  top_k: 40
  top_p: 0.9
  min_p: 0.05
  repeat_penalty: 1.1
  max_tokens: 512
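
As a rough illustration of how these settings could reach the Observer, the sketch below forwards them as fields of a request to the /v1/completions endpoint used in the Testing section. The helper names are hypothetical, and it is an assumption that the server accepts every sampling field shown here; the project's own client code may build its requests differently.

```python
import json
import urllib.request

# Values copied from the observer section of config.yaml.
OBSERVER = {
    "api_url": "http://localhost:8000",
    "temperature": 0.65,
    "top_k": 40,
    "top_p": 0.9,
    "min_p": 0.05,
    "repeat_penalty": 1.1,
    "max_tokens": 512,
}


def build_completion_request(prompt: str, observer: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the sampling settings."""
    payload = {k: v for k, v in observer.items() if k != "api_url"}
    payload["prompt"] = prompt
    return urllib.request.Request(
        url=observer["api_url"].rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, observer: dict, timeout: float = 60.0) -> str:
    """Send the request and return the first completion's text."""
    request = build_completion_request(prompt, observer)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Test prompt", OBSERVER)` requires the container to be running with port 8000 reachable from where the script runs.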

Prompts

All prompts are stored as YAML files in prompts/:

- system.yaml: Observer role definition
- decompose.yaml: Task decomposition template
- check_alignment.yaml: Alignment verification template

Installation

Requirements

- Docker with NVIDIA GPU support
- CUDA-compatible GPU with 8GB+ VRAM
- NVIDIA Container Toolkit

Setup

Clone repository:

git clone https://github.com/srose69/hecatoncheire.git
cd hecatoncheire

Configure model path in config.yaml:

model:
  path: "/models/your-model.gguf"

Update model mount in docker-compose.yml:

volumes:
  - /path/to/your/model.gguf:/models/your-model.gguf:ro
  - ./config.yaml:/app/config.yaml:ro

Build and run:

docker compose build
docker compose up -d

Configure MCP client (see mcp_config_example.json):

{
  "mcpServers": {
    "hecatoncheire": {
      "command": "docker",
      "args": ["exec", "-i", "hecatoncheire", "python", "src/hecatoncheire.py"]
    }
  }
}

Container Architecture

A single container runs two services:

- Observer Server (llama-cpp-python on port 8000)
- MCP Server (connects to localhost:8000)

Startup is sequential and managed by entrypoint.sh:

1. Starts the Observer server in the background
2. Waits for its health check to pass
3. Keeps the container alive so the MCP client can connect via docker exec
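
The wait-for-health-check step amounts to polling the Observer until it answers. entrypoint.sh is a shell script and its contents are not shown here, so the Python sketch below only illustrates the idea; the function names, the choice of /v1/models as the probe URL, and the retry counts are all assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def observer_is_up(url: str = "http://localhost:8000/v1/models",
                   timeout: float = 2.0) -> bool:
    """Return True if the Observer answers the probe URL with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not listening yet, or still loading the model


def wait_until_ready(probe: Callable[[], bool] = observer_is_up,
                     attempts: int = 30,
                     delay: float = 1.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A script like this can also be run from an MCP client machine before connecting, to distinguish "model still loading" from "container not running".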

Testing

Verify Observer API

curl http://localhost:8000/v1/models

curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Test prompt", "max_tokens": 50}'

Monitor Resources

docker logs hecatoncheire
nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv -l 1

Debugging

Observer not responding:

docker logs hecatoncheire | grep "Observer Server"
docker port hecatoncheire 8000

Out of memory:

- Reduce n_ctx in config.yaml
- Adjust tensor_split for multi-GPU setups
- Lower n_gpu_layers to offload fewer layers to the GPU

MCP connection issues:

- Verify the container is running: docker ps
- Check the MCP client config path
- Review container logs for startup errors

Design Principles

- Role Separation: each agent has a single, well-defined responsibility
- Continuous Flow: seamless handoffs without stop-restart cycles
- Alignment Checking: constant verification against original intent
- Local Processing: no external API dependencies
- Configuration as Code: all settings in version-controlled YAML

Limitations

- The loop stops after a configurable maximum number of iterations (default: 3)
- Requires a GPU with enough VRAM for the model
- State does not persist across container restarts
- Observer quality depends on the capabilities of the local model

Русский

Обзор

Hecatoncheire — MCP-сервер (Model Context Protocol), реализующий непрерывный мультиагентный рабочий процесс разработки. Специализированные AI-агенты совместно работают над задачами, используя разделение ролей и обратную связь для предотвращения расползания скоупа и избыточной сложности.

Проблема

Одноагентная разработка страдает от врождённых ограничений:

Когнитивное искажение: один и тот же агент пишет и ревьюит код
Склонность к «улучшениям ради улучшений»
Отсутствие объективной проверки соответствия задаче

Решение

Мультиагентная архитектура со строгим разделением ролей:

Writer — реализует код по критериям приёмки
Validator — ревьюит код и даёт обратную связь
Observer — локальная LLM, декомпозирует задачи и проверяет соответствие

Архитектура

Запрос пользователя
    ↓
Observer (локальная LLM)
    └─ Декомпозиция в критерии приёмки
    └─ Определение условий успеха
    ↓
Writer
    └─ Реализация кода
    └─ Отправка на ревью
    ↓
Validator
    └─ Ревью по критериям
    └─ Одобрение ИЛИ обратная связь
    ↓
[Цикл до одобрения или лимита итераций]

Компоненты

Observer

Локальный LLM-сервер (llama-cpp-python)
Работает в фоне, модель загружается один раз
OpenAI-совместимый HTTP API
Декомпозиция задач, проверка alignment, объективная оценка кода

Writer (Чат 1)

Получает структурированные критерии приёмки
Реализует решение
Отправляет через write_code()
Итерирует по фидбеку Validator

Validator (Чат 2)

Ревьюит код
Проверяет соответствие критериям
Даёт конкретный, actionable фидбек
Одобряет только при выполнении всех критериев

Установка

Требования

Docker с поддержкой NVIDIA GPU
CUDA-совместимая GPU с 8GB+ VRAM
NVIDIA Container Toolkit

Настройка

Клонировать репозиторий:

git clone https://github.com/srose69/hecatoncheire.git
cd hecatoncheire

Указать путь к модели в config.yaml:

model:
  path: "/models/your-model.gguf"

Обновить маунт модели в docker-compose.yml:

volumes:
  - /path/to/your/model.gguf:/models/your-model.gguf:ro
  - ./config.yaml:/app/config.yaml:ro

Собрать и запустить:

docker compose build
docker compose up -d

Настроить MCP-клиент (см. mcp_config_example.json):

{
  "mcpServers": {
    "hecatoncheire": {
      "command": "docker",
      "args": ["exec", "-i", "hecatoncheire", "python", "src/hecatoncheire.py"]
    }
  }
}

Отладка

Observer не отвечает:

docker logs hecatoncheire | grep "Observer Server"
docker port hecatoncheire 8000

Нехватка памяти:

Уменьшить n_ctx в config.yaml
Настроить tensor_split для мульти-GPU
Снизить n_gpu_layers

Проблемы с MCP:

Проверить контейнер: docker ps
Проверить конфиг MCP-клиента
Логи контейнера: docker logs hecatoncheire

Принципы

Разделение ролей — каждый агент отвечает за одну задачу
Непрерывный поток — бесшовные переходы между агентами
Проверка alignment — постоянная верификация соответствия исходному запросу
Локальная обработка — никаких внешних API
Конфигурация как код — все настройки в версионируемых YAML

Ограничения

Максимум итераций настраивается (по умолчанию: 3)
Требуется GPU с достаточным VRAM
Состояние не сохраняется между перезапусками контейнера
Качество Observer зависит от возможностей локальной модели

License

This project is licensed under the PolyForm Shield License 1.0.0. See LICENSE for details.

Commercial use, competition, and redistribution for commercial purposes are not permitted.
