mcplex

MCP server for local AI models -- expose Ollama, embeddings, vision, and vector memory to Claude Code and other MCP clients.

License: MIT · Python 3.10+ · MCP


What is this?

mcplex is a Model Context Protocol server that bridges your local AI models to any MCP client. It gives Claude Code (or any MCP-compatible tool) direct access to:

  • Ollama models -- generate text, chat, and list available models
  • Embeddings -- generate vector embeddings via local embedding models
  • Vision -- analyze images and extract text using local vision models (LLaVA, etc.)
  • Vector memory -- store and semantically search text using ChromaDB

Everything runs locally. No API keys needed. No data leaves your machine.

Features

| Category | Tools | Description |
|----------|-------|-------------|
| Text Generation | `generate` | One-shot text generation with any Ollama model |
| Chat | `chat` | Multi-turn conversation with message history |
| Embeddings | `embed` | Generate vector embeddings for text |
| Model Management | `list_models` | List all available Ollama models |
| Vision | `analyze_image` | Describe/analyze images with a vision model |
| OCR | `ocr_image` | Extract text from images |
| Memory Store | `memory_store` | Store text + metadata in ChromaDB |
| Memory Search | `memory_search` | Semantic search over stored memories |
| Memory List | `memory_list_collections` | List all memory collections |

Requirements

  • Python 3.10+
  • Ollama running locally (default: http://localhost:11434)
  • At least one Ollama model pulled (e.g., ollama pull qwen3:8b)

Installation

```bash
# From PyPI (when published)
pip install mcplex

# With vector memory support (quoted so zsh doesn't expand the brackets)
pip install "mcplex[memory]"

# From source
git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"
```

Claude Code Integration

Add mcplex to your Claude Code MCP configuration:

```json
{
  "mcpServers": {
    "mcplex": {
      "command": "mcplex",
      "args": []
    }
  }
}
```

Or if running from source:

```json
{
  "mcpServers": {
    "mcplex": {
      "command": "python",
      "args": ["-m", "mcplex.server"]
    }
  }
}
```

Once configured, Claude Code can use your local models directly:

"Use the generate tool to summarize this file with qwen3:8b"

"Embed these three paragraphs and store them in the 'research' collection"

"Analyze this screenshot and extract all visible text"

Tool Reference

generate

Send a prompt to a local Ollama model.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | `str` | required | The text prompt |
| `model` | `str` | `qwen3:8b` | Ollama model name |
| `temperature` | `float` | `0.7` | Sampling temperature (0.0-2.0) |
| `max_tokens` | `int` | `2048` | Maximum tokens to generate |
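Under the hood, these parameters presumably map onto Ollama's `/api/generate` endpoint (the endpoint and option names are Ollama's; the mapping shown here is an assumption about mcplex's internals, and the helper name is hypothetical). A minimal sketch of the request body:

```python
def build_generate_payload(
    prompt: str,
    model: str = "qwen3:8b",
    temperature: float = 0.7,
    max_tokens: int = 2048,
) -> dict:
    """Build an Ollama /api/generate request body.

    Ollama exposes the temperature and max-token controls under
    "options" (as "temperature" and "num_predict" respectively).
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single complete response
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }

payload = build_generate_payload("Summarize this file.")
```

POSTing a body like this to `http://localhost:11434/api/generate` returns the generated text in the `response` field.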

chat

Multi-turn chat with message history.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `messages` | `list[{role, content}]` | required | Message history |
| `model` | `str` | `qwen3:8b` | Ollama model name |
| `temperature` | `float` | `0.7` | Sampling temperature |
| `max_tokens` | `int` | `2048` | Maximum tokens |
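The `messages` parameter follows the familiar role/content schema. A sketch of a valid history (illustrative; the exact validation mcplex performs is not documented here):

```python
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What does MCP stand for?"},
    {"role": "assistant", "content": "Model Context Protocol."},
    {"role": "user", "content": "And what does an MCP server do?"},
]

# Every entry carries exactly these two keys; user/assistant turns
# alternate after an optional leading "system" message.
assert all(set(m) == {"role", "content"} for m in messages)
```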

embed

Generate vector embeddings.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str \| list[str]` | required | Text to embed |
| `model` | `str` | `nomic-embed-text` | Embedding model |
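Since `text` accepts either a single string or a list, it likely maps onto Ollama's batch-capable `/api/embed` endpoint, which takes an `input` field (an assumption about mcplex's internals; the helper name is hypothetical). A sketch:

```python
def build_embed_payload(text, model: str = "nomic-embed-text") -> dict:
    # Ollama's /api/embed accepts "input" as a string or a list of
    # strings; normalizing to a list keeps the response shape uniform.
    texts = text if isinstance(text, list) else [text]
    return {"model": model, "input": texts}
```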

list_models

List all available Ollama models. No parameters.

analyze_image

Analyze an image with a local vision model.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `image_path` | `str` | required | Path to image file |
| `prompt` | `str` | `"Describe this image in detail."` | Question/instruction |
| `model` | `str` | `llava` | Vision model name |
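Ollama's generate and chat APIs accept images as base64-encoded strings in an `images` array, so a tool like `analyze_image` presumably reads and encodes the file before sending the request (a sketch; the helper name is hypothetical):

```python
import base64


def encode_image(image_path: str) -> str:
    """Read an image file and return its contents as base64 ASCII text."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# The encoded string then travels in the request body, e.g.:
# {"model": "llava", "prompt": "Describe this image in detail.",
#  "images": [encode_image("screenshot.png")]}
```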

ocr_image

Extract text from an image.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `image_path` | `str` | required | Path to image file |
| `model` | `str` | `llava` | Vision model name |

memory_store

Store text in vector memory.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | required | Text to store |
| `metadata` | `dict` | `None` | Optional key-value metadata |
| `collection` | `str` | `"default"` | ChromaDB collection name |

memory_search

Semantic search over stored memories.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | `str` | required | Search query |
| `n_results` | `int` | `5` | Max results to return |
| `collection` | `str` | `"default"` | ChromaDB collection name |
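Semantic search works by embedding the query and ranking stored vectors by similarity; ChromaDB handles this internally. A minimal pure-Python sketch of the underlying idea (illustrative only, not mcplex's actual code):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 5):
    # stored: (text, vector) pairs; return the k texts most similar to the query
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```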

memory_list_collections

List all ChromaDB collections. No parameters.

Configuration

All configuration is via environment variables (or a .env file):

| Variable | Default | Description |
|----------|---------|-------------|
| `MCPLEX_OLLAMA_URL` | `http://localhost:11434` | Ollama server URL |
| `MCPLEX_DEFAULT_MODEL` | `qwen3:8b` | Default text model |
| `MCPLEX_EMBED_MODEL` | `nomic-embed-text` | Default embedding model |
| `MCPLEX_VISION_MODEL` | `llava` | Default vision model |
| `MCPLEX_CHROMA_PATH` | `./mcplex_data/chroma` | ChromaDB storage path |
| `MCPLEX_DEFAULT_TEMPERATURE` | `0.7` | Default sampling temperature |
| `MCPLEX_DEFAULT_MAX_TOKENS` | `2048` | Default max tokens |
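Export any of these variables before launching the server to override the default. A sketch of how such defaults are typically resolved at startup (illustrative, not mcplex's actual code):

```python
import os

# Fall back to the documented defaults when a variable is unset.
OLLAMA_URL = os.getenv("MCPLEX_OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.getenv("MCPLEX_DEFAULT_MODEL", "qwen3:8b")
DEFAULT_TEMPERATURE = float(os.getenv("MCPLEX_DEFAULT_TEMPERATURE", "0.7"))
DEFAULT_MAX_TOKENS = int(os.getenv("MCPLEX_DEFAULT_MAX_TOKENS", "2048"))
```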

Architecture

```
MCP Client (Claude Code, etc.)
    |
    | stdio (JSON-RPC)
    |
mcplex server (FastMCP)
    |
    +-- ollama_tools -----> Ollama API (HTTP)
    |                        localhost:11434
    +-- vision_tools -----> Ollama API (with images)
    |
    +-- memory_tools -----> ChromaDB (local persistent)
```

  • Transport: stdio (standard for CLI-based MCP clients)
  • Ollama communication: async HTTP via httpx
  • Vector storage: ChromaDB with persistent client (lazy-loaded)
  • No API keys required -- everything runs locally
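Over the stdio transport, each request is a JSON-RPC 2.0 message, and MCP tool invocations use the `tools/call` method. A sketch of what a `generate` call looks like on the wire (the argument values are illustrative):

```python
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate",
        "arguments": {"prompt": "Summarize this file.", "model": "qwen3:8b"},
    },
}

# Messages are serialized as single lines of JSON over stdin/stdout.
wire = json.dumps(request)
```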

Development

```bash
git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"

# Run tests
python -m pytest tests/ -v

# Run the server
mcplex
# or
python -m mcplex.server
```

License

MIT -- Copyright (c) 2026 Donald Havery
