luma-mcp

by JochenYang · 59 · Score 46

Multi-Model Visual Understanding MCP Server providing image analysis capabilities to AI coding models that don't support native vision.

Tags: ai-llm, developer-tools, other

Forks: 8
Open issues: 1
Last commit: 1 mo ago
Indexed: 2d ago

Overview

Luma MCP is a visual understanding server that integrates multiple vision models: GLM-4.6V, DeepSeek-OCR, Qwen3-VL-Flash, Doubao-Seed-1.6, and Hunyuan-Vision-1.5. It exposes a single 'image_understand' tool that accepts images from local files, remote URLs, and Data URIs and passes them through a standardized preprocessing pipeline. The server is optimized for complex screenshots: it supports cropping large images and high-fidelity processing for text-dense content.
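
The three input kinds named above can be told apart from the source string alone. The following is a minimal sketch of one way to do that, not Luma MCP's actual implementation; the function name classify_image_source and the returned labels are assumptions made for illustration.

```python
from urllib.parse import urlparse


def classify_image_source(source: str) -> str:
    """Classify an image_source string as 'data_uri', 'url', or 'file'.

    Hypothetical helper: Luma MCP's real preprocessing may differ.
    """
    # Data URIs always start with the "data:" scheme (RFC 2397).
    if source.startswith("data:"):
        return "data_uri"
    # Treat only http/https as remote URLs; a Windows drive letter such as
    # "C:" also parses as a scheme, so an explicit allow-list is safer.
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    # Anything else is assumed to be a local file path.
    return "file"
```

Checking an allow-list of schemes rather than "has any scheme" keeps Windows paths from being mistaken for URLs.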

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you: Analyze the code in this screenshot
you: Evaluate the UI/UX of the interface in this screenshot
you: Help me debug the error message in this screenshot
you: Which vision models are supported?
you: What image formats are supported?

When to choose this

Choose Luma MCP when you need visual understanding capabilities for AI coding models that don't natively support image processing, especially when working with code screenshots, UI designs, or document images.

When NOT to choose this

Avoid Luma MCP if you need model-specific features that its unified approach does not expose, or if you need to process images larger than 10MB with complex processing needs.

Tools this server exposes

1 tool extracted from the README
  • image_understand({image_source: string, prompt: string})

    Analyze images from local files, URLs, or Data URIs based on user prompts
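
MCP clients invoke tools with a JSON-RPC 'tools/call' request. As a rough sketch, the request for this tool might be built as follows; the helper name build_image_understand_call, the sample path, and the sample prompt are illustrative assumptions, while the argument names come from the signature above.

```python
import json


def build_image_understand_call(request_id: int, image_source: str, prompt: str) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for the image_understand tool.

    Hypothetical helper for illustration; an MCP client normally does this for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "image_understand",
            "arguments": {"image_source": image_source, "prompt": prompt},
        },
    }


# Example values are placeholders, not taken from the README.
request = build_image_understand_call(1, "./error.png", "What does this error mean?")
print(json.dumps(request, indent=2))
```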

Comparable tools

vision-mcp, image-analyzer-mcp, multimodal-mcp

Installation

npx -y luma-mcp

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "luma": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "MODEL_PROVIDER": "zhipu",
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}

Replace the MODEL_PROVIDER value and the corresponding API key variable with those for your chosen provider.
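
If the config file already lists other MCP servers, the "luma" entry should be merged in rather than pasted over them. A minimal sketch of that merge, assuming the zhipu provider shown above; the function name add_luma_server is hypothetical, and reading and writing the actual file is left to the caller.

```python
import copy


def add_luma_server(config: dict, api_key: str) -> dict:
    """Return a copy of a Claude Desktop config with the 'luma' server added.

    Hypothetical helper: existing entries under 'mcpServers' are preserved.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers["luma"] = {
        "command": "npx",
        "args": ["-y", "luma-mcp"],
        # Provider name and key variable as shown in the config above.
        "env": {"MODEL_PROVIDER": "zhipu", "ZHIPU_API_KEY": api_key},
    }
    return updated
```

Load the existing file with json.load, pass the result through this function, and write it back with json.dump to keep other servers intact.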

FAQ

Which vision models are supported?
GLM-4.6V, DeepSeek-OCR, Qwen3-VL-Flash, Doubao-Seed-1.6, and Hunyuan-Vision-1.5
What image formats are supported?
JPG, PNG, WebP, and GIF, with a maximum input size of 10MB
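
A client can check these limits before sending a local image. The sketch below is illustrative only: the function name check_image is hypothetical, the format check uses the file extension rather than the file contents, and 10MB is assumed to mean 10 * 1024 * 1024 bytes, which the listing does not specify.

```python
from pathlib import Path

# Extensions for the formats listed above (JPG commonly appears as .jpg or .jpeg).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
# Assumption: "10MB" is interpreted as 10 MiB.
MAX_BYTES = 10 * 1024 * 1024


def check_image(path: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    extension = Path(path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes exceeds {MAX_BYTES}")
    return problems
```

For a real file, the size can be taken from Path(path).stat().st_size.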

Last updated · Auto-generated from public README + GitHub signals.