local_ai_gen

Local AI generation for images, audio, speech, and 3D models using open source models.

Local AI Generation MCP Server (local_ai_gen)

This project runs a local MCP (Model Context Protocol) server that exposes tools for:

  1. Text to Image
  2. Text to Music / Audio
  3. Text to Speech
  4. Image/Text to 3D Model

Models Used

  • Image Generation: segmind/SSD-1B, a fast, distilled version of SDXL that works well on consumer GPUs.
  • Music Generation: stabilityai/stable-audio-open-1.0, a high-quality open model for generating sound effects, short music tracks, and ambient audio. (Note: requires accepting the license terms.)
  • Speech Generation: Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice, a robust, multilingual text-to-speech model.
  • 3D Generation: stabilityai/TripoSR, a fast feed-forward model for image-to-3D reconstruction.

Requirements

Complete Setup Guide

1. Preparation and Hugging Face Authentication

The Stable Audio Open model is gated and requires you to accept its license before downloading.

  1. Go to stabilityai/stable-audio-open-1.0 on Hugging Face.
  2. Log in and accept the terms/license.
  3. Generate a User Access Token in your Hugging Face Settings.
  4. Run the Hugging Face CLI login locally:
    pip install -U "huggingface_hub"
    hf auth login

    (Paste your token when prompted. You do not need to add it as a git credential.)
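
Before downloading the gated model, it can help to confirm that a token is actually visible to Python. The following is a minimal sketch using only the standard library. It assumes the usual Hugging Face conventions (an HF_TOKEN environment variable, or a token file at $HF_HOME/token with HF_HOME defaulting to ~/.cache/huggingface); the helper name find_hf_token is hypothetical and not part of this project.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token(env: Mapping[str, str], home: Path) -> Optional[str]:
    """Return a Hugging Face token from the environment or token file, else None."""
    # 1. An explicit HF_TOKEN environment variable is checked first.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    # 2. Otherwise look for the token file written by the CLI login.
    #    Assumed location: $HF_HOME/token, with HF_HOME defaulting
    #    to ~/.cache/huggingface.
    hf_home = Path(env.get("HF_HOME") or home / ".cache" / "huggingface")
    token_file = hf_home / "token"
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip() or None
    return None


if __name__ == "__main__":
    found = find_hf_token(os.environ, Path.home())
    print("Token found" if found else "No token found; run the CLI login first")
```

Finding a token does not prove the license was accepted; the download of stabilityai/stable-audio-open-1.0 will still fail until the terms are accepted on the model page.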

2. Environment Installation

python -m venv .venv
source .venv/bin/activate
python setup.py

What python setup.py does:

  1. Creates output directories (generated_images, generated_audio, generated_models, models)
  2. Clones third_party/TripoSR if it is missing
  3. Installs everything from requirements.txt
  4. Compiles and installs torchmcubes against your active torch version (needed for 3D generation)
  5. Installs flash-attn
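
The first two steps above can be sketched as follows. This is an illustration under assumptions, not the contents of setup.py: the function names ensure_output_dirs and triposr_missing are hypothetical, and only the directory names come from the list above.

```python
from pathlib import Path

# Directory names taken from step 1 of the setup description.
OUTPUT_DIRS = ("generated_images", "generated_audio", "generated_models", "models")


def ensure_output_dirs(project_root: Path) -> list[Path]:
    """Create the output directories under project_root if they do not exist."""
    paths = []
    for name in OUTPUT_DIRS:
        path = project_root / name
        # exist_ok=True makes the step safe to repeat on later runs.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths


def triposr_missing(project_root: Path) -> bool:
    """Return True when third_party/TripoSR has not been cloned yet."""
    return not (project_root / "third_party" / "TripoSR").is_dir()


if __name__ == "__main__":
    root = Path(".")
    for created in ensure_output_dirs(root):
        print(f"ready: {created}")
    if triposr_missing(root):
        print("third_party/TripoSR is missing and would need to be cloned")
```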

Run the MCP server

To run the MCP server manually (via standard I/O):

python mcp_server/main.py

Note: The first call to each tool is slower because it downloads that model's weights into your Hugging Face cache.

Smoke Test

You can run the included smoke test script to verify all models are working correctly:

python smoke_test_generate.py

MCP Tools Exposed

  • generate_image
  • generate_audio
  • generate_speech
  • generate_3d_model
  • health_check
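
MCP clients invoke these tools with JSON-RPC 2.0 messages using the tools/call method. The sketch below builds such a request as a single line of JSON, the form used over standard I/O. The argument name prompt is an assumption for illustration; the real parameter names are defined by the server's tool schemas, and a real client also performs an initialization handshake first.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # "prompt" is a hypothetical argument name used only for this example.
    print(build_tool_call(1, "generate_image", {"prompt": "a lighthouse at dusk"}))
```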

Notes

  • Generated files are written to generated_images, generated_audio, and generated_models by default.
  • For generate_audio and generate_speech, override destination with tool arg output_dir (highest priority) or env var GENAI_OUTPUT_AUDIO_DIR
  • For generate_image, override destination with tool arg output_dir or env var GENAI_OUTPUT_IMAGE_DIR
  • For generate_3d_model, override destination with tool arg output_dir or env var GENAI_OUTPUT_MODEL_DIR
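
The override order described above can be sketched as a small resolver. This is a minimal illustration, not the server's actual code: the function name resolve_output_dir is hypothetical, and it assumes the image and 3D tools use the same priority (tool argument first, then environment variable, then default) that is stated for the audio tools.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_output_dir(
    output_dir: Optional[str],
    env_var: str,
    default: str,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Pick the destination: tool arg, then environment variable, then default."""
    if output_dir:
        # An explicit tool argument has the highest priority.
        return Path(output_dir)
    from_env = env.get(env_var)
    if from_env:
        return Path(from_env)
    return Path(default)


if __name__ == "__main__":
    print(resolve_output_dir(None, "GENAI_OUTPUT_AUDIO_DIR", "generated_audio"))
```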

Using with MCP Clients (Cursor, Claude Desktop, etc.)

To use this server in an MCP-compatible client, add the following to your mcp.json (or the respective MCP configuration file for your client). Make sure to replace <YOUR_PROJECT_PATH> with the absolute path to where you cloned this repository:

{
  "mcpServers": {
    "local_ai_gen": {
      "command": "<YOUR_PROJECT_PATH>/.venv/bin/python",
      "args": [
        "<YOUR_PROJECT_PATH>/mcp_server/main.py"
      ],
      "env": {}
    }
  }
}
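
To avoid typos when substituting <YOUR_PROJECT_PATH>, the same configuration can be generated from the repository location. The sketch below is a convenience under assumptions: the helper name build_mcp_config is hypothetical, and it assumes a POSIX-style virtual environment layout (.venv/bin/python) as in the JSON above.

```python
import json
from pathlib import Path


def build_mcp_config(project_path: Path) -> dict:
    """Build the mcpServers entry using absolute paths under project_path."""
    root = project_path.resolve()
    return {
        "mcpServers": {
            "local_ai_gen": {
                "command": str(root / ".venv" / "bin" / "python"),
                "args": [str(root / "mcp_server" / "main.py")],
                "env": {},
            }
        }
    }


if __name__ == "__main__":
    # Run from the repository root, then paste the output into your client's config.
    print(json.dumps(build_mcp_config(Path(".")), indent=2))
```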
