# mcplex

MCP server for local AI models -- expose Ollama, embeddings, vision, and vector memory to Claude Code and other MCP clients.
## What is this?

mcplex is a Model Context Protocol server that bridges your local AI models to any MCP client. It gives Claude Code (or any MCP-compatible tool) direct access to:

- **Ollama models** -- generate text, chat, and list available models
- **Embeddings** -- generate vector embeddings via local embedding models
- **Vision** -- analyze images and extract text using local vision models (LLaVA, etc.)
- **Vector memory** -- store and semantically search text using ChromaDB

Everything runs locally. No API keys needed. No data leaves your machine.
## Features

| Category | Tools | Description |
|---|---|---|
| Text Generation | `generate` | One-shot text generation with any Ollama model |
| Chat | `chat` | Multi-turn conversation with message history |
| Embeddings | `embed` | Generate vector embeddings for text |
| Model Management | `list_models` | List all available Ollama models |
| Vision | `analyze_image` | Describe/analyze images with a vision model |
| OCR | `ocr_image` | Extract text from images |
| Memory Store | `memory_store` | Store text + metadata in ChromaDB |
| Memory Search | `memory_search` | Semantic search over stored memories |
| Memory List | `memory_list_collections` | List all memory collections |
## Requirements

- Python 3.10+
- Ollama running locally (default: `http://localhost:11434`)
- At least one Ollama model pulled (e.g., `ollama pull qwen3:8b`)
## Installation

```bash
# From PyPI (when published)
pip install mcplex

# With vector memory support (quoted so the brackets survive zsh globbing)
pip install "mcplex[memory]"

# From source
git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"
```
## Claude Code Integration

Add mcplex to your Claude Code MCP configuration:

```json
{
  "mcpServers": {
    "mcplex": {
      "command": "mcplex",
      "args": []
    }
  }
}
```

Or if running from source:

```json
{
  "mcpServers": {
    "mcplex": {
      "command": "python",
      "args": ["-m", "mcplex.server"]
    }
  }
}
```
Once configured, Claude Code can use your local models directly:

- "Use the generate tool to summarize this file with qwen3:8b"
- "Embed these three paragraphs and store them in the 'research' collection"
- "Analyze this screenshot and extract all visible text"
## Tool Reference

### generate

Send a prompt to a local Ollama model.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | `str` | required | The text prompt |
| `model` | `str` | `qwen3:8b` | Ollama model name |
| `temperature` | `float` | `0.7` | Sampling temperature (0.0-2.0) |
| `max_tokens` | `int` | `2048` | Maximum tokens to generate |
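Under the hood, an MCP client invokes `generate` with a JSON-RPC `tools/call` request. A sketch of that envelope (field names follow the MCP `tools/call` method as commonly implemented; treat this as illustrative, not a wire-format guarantee):

```python
def make_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Omitted parameters (temperature, max_tokens, ...) fall back to the
# defaults in the table above.
req = make_tool_call(1, "generate", {"prompt": "Say hello.", "model": "qwen3:8b"})
```

Your MCP client serializes this over stdio; you normally never build it by hand.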
### chat

Multi-turn chat with message history.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `messages` | `list[{role, content}]` | required | Message history |
| `model` | `str` | `qwen3:8b` | Ollama model name |
| `temperature` | `float` | `0.7` | Sampling temperature |
| `max_tokens` | `int` | `2048` | Maximum tokens |
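The `messages` list uses the usual role/content shape. A minimal sketch of maintaining a history across turns (the `add_turn` helper is illustrative, not part of mcplex):

```python
def add_turn(history: list[dict], role: str, content: str) -> list[dict]:
    """Append one {role, content} message; roles are 'system', 'user', or 'assistant'."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"unknown role: {role}")
    return history + [{"role": role, "content": content}]

history: list[dict] = []
history = add_turn(history, "system", "You are a concise assistant.")
history = add_turn(history, "user", "What is MCP?")
# Pass `history` as the chat tool's `messages` argument, then append the
# assistant's reply before the next user turn.
```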
### embed

Generate vector embeddings.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `text` | `str` \| `list[str]` | required | Text to embed |
| `model` | `str` | `nomic-embed-text` | Embedding model |
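Embedding vectors are typically compared by cosine similarity. A minimal pure-Python version (the vectors below are toy values standing in for real `embed` output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Identical direction -> 1.0; orthogonal -> 0.0
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # -> 1.0
```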
### list_models

List all available Ollama models. No parameters.

### analyze_image

Analyze an image with a local vision model.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `image_path` | `str` | required | Path to image file |
| `prompt` | `str` | `"Describe this image in detail."` | Question/instruction |
| `model` | `str` | `llava` | Vision model name |
### ocr_image

Extract text from an image.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `image_path` | `str` | required | Path to image file |
| `model` | `str` | `llava` | Vision model name |
### memory_store

Store text in vector memory.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `text` | `str` | required | Text to store |
| `metadata` | `dict` | `None` | Optional key-value metadata |
| `collection` | `str` | `"default"` | ChromaDB collection name |
### memory_search

Semantic search over stored memories.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `str` | required | Search query |
| `n_results` | `int` | `5` | Max results to return |
| `collection` | `str` | `"default"` | ChromaDB collection name |
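Conceptually, `memory_search` embeds the query and ranks stored entries by vector similarity. A toy in-memory version of that ranking (hand-made 2-D vectors in place of real embeddings; ChromaDB does this at scale):

```python
import math

def rank(query_vec: list[float], store: list[tuple], n_results: int = 5) -> list[str]:
    """Return the n_results stored texts closest to query_vec by cosine similarity."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    ordered = sorted(store, key=lambda tv: cos(query_vec, tv[1]), reverse=True)
    return [text for text, _vec in ordered[:n_results]]

store = [("cats", [0.9, 0.1]), ("stocks", [0.1, 0.9]), ("dogs", [0.8, 0.2])]
print(rank([1.0, 0.0], store, n_results=2))  # -> ['cats', 'dogs']
```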
### memory_list_collections

List all ChromaDB collections. No parameters.
## Configuration

All configuration is via environment variables (or a `.env` file):

| Variable | Default | Description |
|---|---|---|
| `MCPLEX_OLLAMA_URL` | `http://localhost:11434` | Ollama server URL |
| `MCPLEX_DEFAULT_MODEL` | `qwen3:8b` | Default text model |
| `MCPLEX_EMBED_MODEL` | `nomic-embed-text` | Default embedding model |
| `MCPLEX_VISION_MODEL` | `llava` | Default vision model |
| `MCPLEX_CHROMA_PATH` | `./mcplex_data/chroma` | ChromaDB storage path |
| `MCPLEX_DEFAULT_TEMPERATURE` | `0.7` | Default sampling temperature |
| `MCPLEX_DEFAULT_MAX_TOKENS` | `2048` | Default max tokens |
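For example, a `.env` file restating the defaults (edit any line to override):

```shell
# .env -- mcplex configuration (values shown are the defaults)
MCPLEX_OLLAMA_URL=http://localhost:11434
MCPLEX_DEFAULT_MODEL=qwen3:8b
MCPLEX_EMBED_MODEL=nomic-embed-text
MCPLEX_VISION_MODEL=llava
MCPLEX_CHROMA_PATH=./mcplex_data/chroma
MCPLEX_DEFAULT_TEMPERATURE=0.7
MCPLEX_DEFAULT_MAX_TOKENS=2048
```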
## Architecture

```
MCP Client (Claude Code, etc.)
        |
        | stdio (JSON-RPC)
        |
mcplex server (FastMCP)
        |
        +-- ollama_tools -----> Ollama API (HTTP)
        |                       localhost:11434
        +-- vision_tools -----> Ollama API (with images)
        |
        +-- memory_tools -----> ChromaDB (local persistent)
```

- **Transport:** stdio (standard for CLI-based MCP clients)
- **Ollama communication:** async HTTP via httpx
- **Vector storage:** ChromaDB with persistent client (lazy-loaded)
- **No API keys required** -- everything runs locally
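As a rough sketch of the ollama_tools leg: the `generate` parameters map onto Ollama's `/api/generate` request body roughly as below (field names from Ollama's HTTP API, where `num_predict` is the max-token option; the helper is illustrative, not mcplex's actual code, and the real POST goes through httpx):

```python
def ollama_generate_payload(prompt: str, model: str = "qwen3:8b",
                            temperature: float = 0.7,
                            max_tokens: int = 2048) -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # one complete response instead of token chunks
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }

payload = ollama_generate_payload("Summarize this README.")
# e.g. httpx.post("http://localhost:11434/api/generate", json=payload)
```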
## Development

```bash
git clone https://github.com/dbhavery/mcplex.git
cd mcplex
pip install -e ".[memory,dev]"

# Run tests
python -m pytest tests/ -v

# Run the server
mcplex
# or
python -m mcplex.server
```
## License

MIT -- Copyright (c) 2026 Donald Havery