claude-ollama-mcp

Lets Claude query and manage a local Ollama server — list models, inspect them, run generate/chat completions, pull or delete models.

Claude Ollama

Lets Claude Desktop query and manage a local Ollama server. List installed models, inspect them, run one-shot generate/chat completions against any local model, or pull/delete models from the registry — all without opening a terminal.

Typical use: comparing Claude's answer with a local model's answer to the same prompt, running cheap bulk completions against a quantized model, or checking custom training-checkpoint models you've imported into Ollama.

Requirements

A running Ollama server (ollama serve or the Ollama app).
Default endpoint is http://localhost:11434. Override via the ollama_url user config in Claude Desktop's extension settings if you run Ollama on a different host or port.
No npm dependencies — pure Node over the HTTP API.
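
As a rough illustration of the default-plus-override behavior described above, the sketch below resolves a base URL from an optional setting. The helper name `resolveBaseUrl` is hypothetical, and how server.js actually receives the ollama_url value is not shown in this README, so treat this as one plausible approach rather than the extension's code.

```javascript
// Hypothetical helper: falls back to the documented default when the
// ollama_url setting is blank, and rejects values that are not http(s) URLs.
const DEFAULT_OLLAMA_URL = "http://localhost:11434";

function resolveBaseUrl(configured) {
  // Missing, empty, or whitespace-only settings mean "use the default".
  const raw = (configured ?? "").trim();
  if (raw === "") return DEFAULT_OLLAMA_URL;

  // new URL() throws a TypeError on malformed input, surfacing typos early.
  const url = new URL(raw);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`ollama_url must be http or https, got ${url.protocol}`);
  }

  // Drop trailing slashes so paths such as /api/tags can be appended directly.
  return url.origin + url.pathname.replace(/\/+$/, "");
}
```

Normalizing the trailing slash in one place keeps every later request path free of double slashes.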

Install (Claude Desktop)

Download the latest Ollama.mcpb from the Releases page.
In Claude Desktop: Settings → Extensions → Extension Developer → Install Extension → pick the .mcpb.
(Optional) In the extension's settings, set Ollama server URL if you run Ollama on a non-default host/port. Leave blank for http://localhost:11434.

Tools

Tool	Annotation	Purpose
`ollama_status`	read-only	Health check + server version
`list_models`	read-only	Local models with size, digest, family, parameter size, quantization
`list_running`	read-only	Models currently loaded in VRAM
`show_model`	read-only	Model details: modelfile, parameters, template, capabilities
`generate`	open-world	One-shot text completion (non-streaming)
`chat`	open-world	Chat completion with message history (non-streaming)
`pull_model`	open-world	Download a model from the registry
`delete_model`	destructive	Remove a locally-installed model
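
To make the table more concrete, here is a sketch of how each tool could map onto Ollama's HTTP API and onto MCP tool annotation hints. The endpoint paths are the ones Ollama's REST API exposes and the hint names come from the MCP specification, but the pairing with this extension's tools is an assumption. The README does not show server.js, so the real wiring may differ.

```javascript
// Assumed mapping, not taken from server.js: each tool name paired with the
// Ollama REST endpoint it most plausibly wraps and an MCP annotation hint.
const TOOLS = {
  ollama_status: { method: "GET", path: "/api/version", annotations: { readOnlyHint: true } },
  list_models: { method: "GET", path: "/api/tags", annotations: { readOnlyHint: true } },
  list_running: { method: "GET", path: "/api/ps", annotations: { readOnlyHint: true } },
  show_model: { method: "POST", path: "/api/show", annotations: { readOnlyHint: true } },
  generate: { method: "POST", path: "/api/generate", annotations: { openWorldHint: true } },
  chat: { method: "POST", path: "/api/chat", annotations: { openWorldHint: true } },
  pull_model: { method: "POST", path: "/api/pull", annotations: { openWorldHint: true } },
  delete_model: { method: "DELETE", path: "/api/delete", annotations: { destructiveHint: true } },
};

// Returns a human-readable "METHOD URL" line for a tool, e.g. for logging.
function describeTool(name, baseUrl = "http://localhost:11434") {
  if (!Object.hasOwn(TOOLS, name)) throw new Error(`unknown tool: ${name}`);
  const tool = TOOLS[name];
  return `${tool.method} ${baseUrl}${tool.path}`;
}
```

Keeping the mapping in a single table like this makes it easy to see at a glance which tools can change state on the Ollama server.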

Example prompts

"Which local models do I have installed, and which one is currently loaded in VRAM?"

"Run forge:b6c1 on this prompt: '<blah>'. Compare that output to your own answer."

"Show me the modelfile for forge:b7c1 — I want to check the temperature setting."

"Pull llama3.1:70b." (expect a long wait for large models)

"Delete the forge:b5c3 model — I don't need that checkpoint anymore."
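
For readers curious what a prompt like "Run forge:b6c1 on this prompt" might turn into, the sketch below builds non-streaming request bodies for Ollama's /api/generate and /api/chat endpoints and pulls the text out of the reply. The function names are hypothetical. The field names (`model`, `prompt`, `messages`, `stream`, `response`, `message.content`) follow Ollama's documented API, but whether the extension builds its requests this way is an assumption.

```javascript
// Hypothetical helpers illustrating non-streaming generate/chat requests.

// Body for POST /api/generate: one prompt in, one completion out.
function buildGenerateBody(model, prompt) {
  return { model, prompt, stream: false };
}

// Body for POST /api/chat: a message history of { role, content } objects.
function buildChatBody(model, messages) {
  const allowed = ["system", "user", "assistant", "tool"];
  for (const m of messages) {
    if (!allowed.includes(m.role)) throw new Error(`unsupported role: ${m.role}`);
  }
  return { model, messages, stream: false };
}

// With stream: false Ollama answers with a single JSON object. The text is in
// `response` for /api/generate and in `message.content` for /api/chat.
function extractText(reply) {
  if (typeof reply.response === "string") return reply.response;
  if (reply.message && typeof reply.message.content === "string") {
    return reply.message.content;
  }
  throw new Error("reply has neither `response` nor `message.content`");
}
```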

Privacy policy

This extension runs entirely on your local machine and sends HTTP requests only to your Ollama server (default http://localhost:11434). No data leaves your machine unless you explicitly configure ollama_url to point at a remote Ollama instance, in which case the prompts and responses travel to that server.

The information visible to Claude includes:

All prompts and chat messages you pass to generate and chat (these go to the Ollama server, which may log them depending on its configuration).
Full text of completions returned by Ollama.
Metadata for every installed model (names, digests, sizes, quantization, modelfile contents).
Which models are currently loaded in VRAM and their size footprint.

If you have installed models containing proprietary fine-tunes or modelfiles with sensitive metadata, note that Claude will see that information when you call show_model or list_models.

delete_model is destructive and cannot be undone from this extension — the model must be re-pulled from the registry (or re-imported from source blobs) if deleted by mistake.

Troubleshooting

"cannot reach Ollama at http://localhost:11434 — is the server running?" — Start Ollama with ollama serve or launch the Ollama app. Verify with curl http://localhost:11434/ (should return "Ollama is running").
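
The same check can be scripted. The sketch below probes Ollama's /api/version endpoint and reports whether the server answered. It assumes Node 18 or later (for the global `fetch`). The `checkOllama` name and the shape of its return value are illustrative, not the extension's actual health check.

```javascript
// Sketch of a reachability probe. The fetch implementation is injectable so
// the function can be exercised without a live server.
async function checkOllama(baseUrl, fetchImpl = globalThis.fetch) {
  try {
    const res = await fetchImpl(`${baseUrl}/api/version`);
    if (!res.ok) {
      return { reachable: false, reason: `HTTP ${res.status} from ${baseUrl}` };
    }
    const body = await res.json();
    return { reachable: true, version: body.version };
  } catch (err) {
    // Connection refused, DNS failure, and similar network errors end up here.
    return { reachable: false, reason: `cannot reach Ollama at ${baseUrl}: ${err.message}` };
  }
}
```

A call such as `checkOllama("http://localhost:11434").then(console.log)` prints the result object. An unreachable server yields `reachable: false` instead of an exception.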

pull_model hangs for a long time — Ollama's pull API with stream: false blocks until the full download completes, which for multi-GB models can take many minutes. If you're pulling a huge model, run ollama pull <name> in a terminal instead — you'll see streaming progress there, and subsequent MCP calls will find the model already installed.
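
For context on what the terminal shows that a blocking call does not: when streaming is enabled, Ollama's pull endpoint emits newline-delimited JSON progress objects. The sketch below summarizes such a stream. The `status`, `total`, and `completed` field names follow Ollama's API documentation as far as I know, and `summarizePullStream` is a hypothetical helper, not part of the extension.

```javascript
// Hypothetical helper: reduce a newline-delimited JSON pull stream to the
// latest status and, when byte counts are present, a completion percentage.
function summarizePullStream(ndjson) {
  let last = null;
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines between objects
    last = JSON.parse(line);
  }
  if (last === null) return { status: "no output", percent: null };

  // Only layer-download updates carry total/completed byte counts.
  const hasCounts = last.total > 0 && typeof last.completed === "number";
  const percent = hasCounts ? Math.floor((last.completed / last.total) * 100) : null;
  return { status: last.status, percent };
}
```

With `stream: false` none of these intermediate objects are delivered, which is why the tool call appears to hang until the download finishes.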

Custom/remote Ollama endpoint — Set ollama_url in the extension's settings (e.g. http://192.168.1.42:11434). Requires restart of the extension.

list_running shows a model after you stopped using it — Ollama keeps models hot in VRAM for a configurable TTL (default 5 minutes). The expires_at timestamp tells you when it'll unload. This is Ollama's behavior, not the extension's.
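
To turn the expires_at timestamp into something easier to read, a small helper can compute the time remaining. This sketch assumes expires_at is an ISO 8601 timestamp that `Date.parse` accepts. `secondsUntilUnload` is a hypothetical name, not a function in the extension.

```javascript
// Hypothetical helper: seconds left before Ollama unloads a model, based on
// the expires_at value reported for a running model.
function secondsUntilUnload(expiresAt, now = new Date()) {
  const expiry = Date.parse(expiresAt);
  if (Number.isNaN(expiry)) return null; // unparseable timestamp

  // Clamp at zero: an expiry in the past means the model is due to unload.
  return Math.max(0, Math.round((expiry - now.getTime()) / 1000));
}
```

Passing `now` explicitly keeps the function deterministic, which helps when comparing several models against the same instant.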

Development

Single ~400-line Node.js script, zero npm dependencies. Rebuild the .mcpb:

cd bundle-source
zip -j ../Ollama.mcpb manifest.json package.json server.js README.md LICENSE icon.png glama.json

License

MIT. See LICENSE.

Related

claude-terminal-mcp — shell, filesystem, and background jobs.
claude-rocm-mcp — AMD GPU monitoring; pairs well for checking whether Ollama's loaded model is saturating VRAM.
claude-sessions-mcp — tmux session management for long-running jobs.
claude-linux-mcp — X11 desktop control.
