Florence-2 MCP Server
An MCP server for processing images using Florence-2.
jkawamoto
You can process images or PDF files, stored locally or on a web server, to extract text using OCR (Optical Character Recognition) or to generate descriptive captions summarizing the content of the images.
Installation
For Claude Desktop
To configure this server for Claude Desktop, edit the claude_desktop_config.json file and add the following entry under mcpServers:
{
  "mcpServers": {
    "florence-2": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/jkawamoto/mcp-florence2",
        "mcp-florence2"
      ]
    }
  }
}
After editing, restart the application. For more information, see: For Claude Desktop Users - Model Context Protocol.
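If the configuration file already contains other servers or settings, the entry has to be merged rather than pasted over the whole file. The following is a minimal sketch of such a merge using only the Python standard library. The function name add_florence2_server is hypothetical, and the location of claude_desktop_config.json depends on your operating system, so the path is passed in explicitly rather than assumed.

```python
import json
from pathlib import Path


def add_florence2_server(config_path: Path) -> dict:
    """Add the florence-2 entry under mcpServers, keeping all other settings.

    Hypothetical helper for illustration; it is not part of mcp-florence2.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            # Reuse whatever is already configured.
            config = json.loads(text)

    # Create the mcpServers section only if it is missing.
    servers = config.setdefault("mcpServers", {})
    servers["florence-2"] = {
        "command": "uvx",
        "args": [
            "--from",
            "git+https://github.com/jkawamoto/mcp-florence2",
            "mcp-florence2",
        ],
    }

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_florence2_server(Path("claude_desktop_config.json")) on a copy of your configuration file is a safe way to try it. Existing servers and unrelated keys are left untouched, and an existing florence-2 entry is overwritten.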
For Goose CLI
To enable the Florence-2 extension in Goose CLI, edit the configuration file ~/.config/goose/config.yaml to include the following entry:
extensions:
  florence-2:
    name: Florence-2
    cmd: uvx
    args: [ --from, git+https://github.com/jkawamoto/mcp-florence2, mcp-florence2 ]
    enabled: true
    type: stdio
For Goose Desktop
Add a new extension with the following settings:
- Type: Standard IO
- ID: florence-2
- Name: Florence-2
- Description: An MCP server for processing images using Florence-2
- Command: uvx --from git+https://github.com/jkawamoto/mcp-florence2 mcp-florence2
For more details on configuring MCP servers in Goose Desktop, refer to the documentation: Using Extensions - MCP Servers.
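The single command line used for Goose Desktop carries the same information as the separate command and argument fields in the Claude Desktop and Goose CLI configurations. As a small illustration (not something any of these clients require you to run), splitting the command line with Python's shlex module shows how the two forms correspond:

```python
import shlex

# The command line from the Goose Desktop settings above.
command_line = "uvx --from git+https://github.com/jkawamoto/mcp-florence2 mcp-florence2"

# The first token is the executable; the rest are its arguments.
cmd, *args = shlex.split(command_line)

print(cmd)   # uvx
print(args)  # ['--from', 'git+https://github.com/jkawamoto/mcp-florence2', 'mcp-florence2']
```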
Tools
ocr
Process an image file or URL using OCR to extract text.
Arguments:
- src: A file path or URL to the image file that needs to be processed.
caption
Process an image file or URL and generate captions for the image.
Arguments:
- src: A file path or URL to the image file that needs to be processed.
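Both tools take a single src argument, so a call to either one has the same shape. MCP clients such as Claude Desktop or Goose build these requests for you; the sketch below only illustrates what a tools/call request in the MCP JSON-RPC format might look like for this server. The helper name tool_call_request and the example file path and URL are hypothetical.

```python
import json
from itertools import count

# Simple counter for JSON-RPC request ids.
_request_ids = count(1)

# Tool names as listed in this README.
TOOLS = ("ocr", "caption")


def tool_call_request(tool: str, src: str) -> str:
    """Build a JSON-RPC 2.0 "tools/call" message for one of the two tools.

    Hypothetical helper for illustration; a real MCP client handles this.
    """
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": tool,
            # src is a file path or a URL, as described above.
            "arguments": {"src": src},
        },
    }
    return json.dumps(message)


# Example values only; replace with a real path or URL.
print(tool_call_request("ocr", "/path/to/scan.png"))
print(tool_call_request("caption", "https://example.com/photo.jpg"))
```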
License
This application is licensed under the MIT License. See the LICENSE file for more details.