Venice AI Image Generator MCP Server

Testing MCP server functionality with Venice and Gemini (images)

jhacksman


This project implements a Model Context Protocol (MCP) server that integrates with Venice AI for image generation with an approval/regeneration workflow.

What is MCP?

The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to Large Language Models (LLMs). It acts as a "USB-C port for AI applications," allowing LLMs to connect to various data sources and tools in a standardized way.

For more information, visit the official MCP introduction page.

Project Overview

This MCP server provides a bridge between LLMs (like Claude) and Venice AI's image generation capabilities. It enables LLMs to generate images based on text prompts and implements an interactive approval workflow with thumbs up/down feedback.

Key Features

Image Generation with Approval Workflow

The core functionality of this server is to:

  1. Generate images using Venice AI based on text prompts
  2. Display the generated image to the user with clickable thumbs up/down icons overlaid directly on the image
  3. Allow users to approve the image (clicking thumbs up) or request a regeneration (clicking thumbs down)
  4. Regenerate images with the same parameters if requested

Technical Implementation

The server implements several MCP tools:

  • generate_venice_image: Creates an image from a text prompt and returns it with approval options
  • approve_image: Marks an image as approved when the user gives a thumbs up
  • regenerate_image: Creates a new image with the same parameters when the user gives a thumbs down
  • list_available_models: Provides information about available Venice AI models
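The tool logic above can be sketched as plain Python functions over an in-memory cache. This is an illustrative sketch, not the project's actual code: in the real server each function would be registered as an MCP tool, and `call_venice_api` is a placeholder standing in for the actual Venice AI request:

```python
import uuid

# In-memory cache: image_id -> details and approval status of each image
image_cache: dict[str, dict] = {}

def call_venice_api(prompt: str, model: str) -> str:
    """Placeholder for the real Venice AI request; returns an image URL."""
    return f"https://venice.example/images/{uuid.uuid4().hex}.png"

def generate_venice_image(prompt: str, model: str = "default") -> dict:
    """Generate an image and cache it with approval status 'pending'."""
    image_id = uuid.uuid4().hex
    url = call_venice_api(prompt, model)
    image_cache[image_id] = {"prompt": prompt, "model": model,
                             "url": url, "status": "pending"}
    return {"image_id": image_id, "url": url}

def approve_image(image_id: str) -> dict:
    """Mark an image as approved when the user gives a thumbs up."""
    image_cache[image_id]["status"] = "approved"
    return image_cache[image_id]

def regenerate_image(image_id: str) -> dict:
    """Re-run generation with the same cached parameters (thumbs down)."""
    old = image_cache[image_id]
    old["status"] = "rejected"
    return generate_venice_image(old["prompt"], old["model"])
```

Caching the original prompt and model is what lets regenerate_image reuse the exact same parameters on a thumbs down.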

User Experience

From the user's perspective, the interaction flow is:

  1. User provides a text prompt to generate an image
  2. LLM calls the MCP server to generate the image
  3. LLM displays the image with clickable thumbs up/down icons overlaid directly on the image
  4. User clicks the thumbs up icon on the image to approve or thumbs down icon to regenerate
  5. If thumbs down, the process repeats until the user approves an image

Architecture

The server follows the MCP client-server architecture:

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│              │     │              │     │              │
│   LLM Host   │◄────┤  MCP Server  │◄────┤  Venice AI   │
│ (e.g. Claude)│     │              │     │     API      │
│              │     │              │     │              │
└──────────────┘     └──────────────┘     └──────────────┘

  1. LLM Host: The application running the LLM (e.g., Claude)
  2. MCP Server: Our server that implements the MCP protocol and connects to Venice AI
  3. Venice AI API: The external service that generates images

Implementation Details

MCP Server Components

The server consists of:

  1. FastMCP Server: The core server that handles MCP protocol communication
  2. Venice AI Integration: Code that interfaces with the Venice AI API
  3. Image Cache: In-memory storage for tracking generated images and their approval status
  4. Tool Definitions: Functions that LLMs can call to interact with the server
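One way the image cache entries (component 3) might be modeled; the field names here are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CachedImage:
    """One generated image tracked by the in-memory cache."""
    prompt: str
    model: str
    url: str
    status: str = "pending"  # pending -> approved | rejected
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# The cache itself maps an opaque image ID to its entry
cache: dict[str, CachedImage] = {}
```

Because the cache is in-memory, approval state is lost when the server restarts; persistent storage is listed under Future Enhancements.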

Data Flow

  1. LLM receives a prompt from the user
  2. LLM calls the generate_venice_image tool with the prompt
  3. Server sends request to Venice AI API
  4. Venice AI generates the image and returns a URL
  5. Server caches the image details and returns the URL with approval options
  6. LLM displays the image and approval options to the user
  7. User selects thumbs up or thumbs down
  8. LLM calls either approve_image or regenerate_image based on user selection
  9. If regenerating, the process repeats from step 3
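Steps 3-4 amount to a single HTTP call. A hedged sketch of building that request follows; the endpoint path and payload field names are assumptions for illustration, not Venice AI's documented API:

```python
import json
import urllib.request

def build_venice_request(prompt: str, api_key: str,
                         model: str = "default",
                         width: int = 1024, height: int = 1024):
    """Build (but do not send) an image-generation request.

    The URL and field names below are assumptions; consult the
    Venice AI API documentation for the real contract.
    """
    payload = {"model": model, "prompt": prompt,
               "width": width, "height": height}
    return urllib.request.Request(
        "https://api.venice.ai/api/v1/image/generate",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` (or `requests.post`) would yield the response from which the server extracts the image URL in step 4.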

Example Usage

When connected to an LLM like Claude, the interaction would look like:

User: Generate an image of a futuristic city skyline
Claude: I'll generate that image for you using Venice AI.

[Image of futuristic city skyline with clickable 👍 and 👎 icons overlaid on the image]

User: 👎 (Thumbs down)
Claude: Let me generate a new version for you.

[New image of futuristic city skyline with clickable 👍 and 👎 icons overlaid on the image]

User: 👍 (Thumbs up)
Claude: Great! I've saved this approved image for you.

Gemini Integration for Multi-View Generation

After a user approves an image (by clicking the thumbs up icon), the system automatically processes the approved image through Google's Gemini API to generate multiple consistent views of the 3D object:

  1. The approved Venice AI image is used as input to the Gemini view generation scripts
  2. Four different views are generated sequentially:
    • Front view (0°) - Generated first
    • Right view (90°) - Generated after front view completes
    • Left view (270°) - Generated after right view completes
    • Back view (180°) - Generated after left view completes
  3. Each view is displayed in a 4-up layout as it becomes available
  4. Each script waits for the previous script to complete successfully before executing
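The sequential ordering above can be sketched as a loop that only advances when the previous view succeeds. This is a sketch, not the project's actual code: the `run_script` callback is hypothetical and would wrap something like `subprocess.run([...], check=True)` so a failed script halts the chain:

```python
# The four views, in generation order, with their rotation angles
VIEW_ORDER = [("front", 0), ("right", 90), ("left", 270), ("back", 180)]

def generate_all_views(approved_image: str, run_script) -> list:
    """Run each view-generation script in order, waiting for each to
    complete successfully before starting the next.

    `run_script(view, angle, image)` should return the path of the
    generated view, or raise on failure (stopping the sequence).
    """
    results = []
    for view, angle in VIEW_ORDER:
        # Each call blocks until the previous view has completed
        results.append(run_script(view, angle, approved_image))
    return results
```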

4-Up View Approval Process

Each of the four generated views has its own thumbs up/down approval system:

  1. Each view in the 4-up display has thumbs up/down icons overlaid on the image
  2. If a user selects thumbs down for any specific view:
    • The corresponding Python script for that view is run again
    • The newly generated image replaces the rejected image in the 4-up display
    • This process repeats until the user approves the image with thumbs up
  3. Each view can be individually approved or regenerated
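The per-view loop can be sketched with two injected callbacks, both hypothetical stand-ins: one that re-runs the view's generation script and one that returns the user's thumbs up/down verdict:

```python
def approve_view(view: str, regenerate, get_verdict, max_tries: int = 10):
    """Regenerate one view until the user approves it with thumbs up.

    `regenerate(view)` returns a new image path for that view;
    `get_verdict(path)` returns True for thumbs up, False for thumbs down.
    """
    for _ in range(max_tries):
        path = regenerate(view)
        if get_verdict(path):
            return path  # approved: replaces the rejected image in the 4-up
    raise RuntimeError(f"{view} view was never approved")
```

The `max_tries` cap is a defensive addition not described in the workflow; as written above, the real loop repeats until the user approves.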

3D Model Generation

Once all four views are approved:

  1. The original Venice AI image and the four approved Gemini-generated views are processed using CUDA Multi-View Stereo
  2. This processing occurs on a dedicated Linux server on the network
  3. The CUDA Multi-View Stereo system converts the 2D images into a 3D model

This multi-view generation leverages Gemini's object consistency capabilities to create coherent representations of the 3D object from different angles while maintaining the same style, colors, and proportions as the original Venice AI image.

Future Enhancements

Potential future improvements include:

  1. Persistent Storage: Save approved images to a database
  2. Image Editing: Allow users to request specific modifications to generated images
  3. Multiple Image Generation: Generate several variations at once for the user to choose from
  4. Additional Views: Generate more angles beyond the four cardinal directions

Venice AI Integration

The server integrates with Venice AI's image generation API, which provides high-quality image generation capabilities. The API allows for:

  • Generating images from text prompts
  • Customizing image dimensions
  • Adjusting generation parameters
  • Using different models for different styles

Getting Started

To implement this server, you would need to:

  1. Install the FastMCP library
  2. Set up Venice AI API credentials
  3. Implement the MCP tools as described
  4. Run the server and connect it to an LLM host
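Under those steps, setup might look like the following; the package spec and environment-variable name are assumptions, not documented requirements of this project:

```shell
# Install the MCP Python SDK, which provides FastMCP
pip install "mcp[cli]"

# Venice AI credentials -- the variable name is an assumption
export VENICE_API_KEY="your-api-key"

# Run the server (stdio transport, suitable for an LLM host
# such as Claude Desktop)
python server.py
```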

MCP Resources

For more information about the Model Context Protocol and how to build MCP servers, see the official Model Context Protocol documentation.
