luma-mcp
by JochenYang · ★ 59 · Score 46
Multi-Model Visual Understanding MCP Server providing image analysis capabilities to AI coding models that don't support native vision.
Overview
Luma MCP is a versatile visual understanding server that integrates multiple vision models including GLM-4.6V, DeepSeek-OCR, Qwen3-VL-Flash, Doubao-Seed-1.6, and Hunyuan-Vision-1.5. It provides a unified 'image_understand' tool that processes images from local files, remote URLs, and Data URIs through a standardized preprocessing pipeline. The server is optimized for complex screenshots, supporting large image cropping and high-fidelity processing for text-dense scenarios.
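The three input kinds named above (local files, remote URLs, Data URIs) can be told apart from the source string alone. The following is a minimal sketch of that idea, not Luma's actual preprocessing code; `classifyImageSource` and `ImageSourceKind` are hypothetical names introduced for illustration, and the fallback to "local-file" is an assumption.

```typescript
// Hypothetical helper, not part of luma-mcp: guesses which of the three
// documented input kinds a source string refers to.
type ImageSourceKind = "data-uri" | "remote-url" | "local-file";

function classifyImageSource(source: string): ImageSourceKind {
  const trimmed = source.trim();

  // Data URIs for images start with "data:image/<subtype>;base64,".
  if (/^data:image\/[a-z0-9.+-]+;base64,/i.test(trimmed)) {
    return "data-uri";
  }

  // Remote images are addressed by http:// or https:// URLs.
  if (/^https?:\/\//i.test(trimmed)) {
    return "remote-url";
  }

  // Assumption: anything else is treated as a path on the local disk.
  return "local-file";
}
```

A caller could use the result to decide whether to read a file, fetch a URL, or decode base64 before handing the bytes to a vision model.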
When to choose this
Choose Luma MCP when you need visual understanding capabilities for AI coding models that don't natively support image processing, especially when working with code screenshots, UI designs, or document images.
When NOT to choose this
Avoid it if you need model-specific features that Luma's unified tool does not expose, or if you need to process images larger than the 10MB input limit.
Tools this server exposes
1 tool extracted from the README:
image_understand
image_understand({image_source: string, prompt: string})
Analyze images from local files, URLs, or Data URIs based on user prompts.
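For reference, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a request for `image_understand` using the two parameters listed above. In practice the AI client constructs this message for you; `buildImageUnderstandRequest` is a hypothetical helper, and the empty-string checks are an assumption rather than documented server behavior.

```typescript
// Argument shape taken from the tool signature listed in this page.
interface ImageUnderstandArgs {
  image_source: string; // local path, remote URL, or Data URI
  prompt: string;       // what to ask about the image
}

// Standard MCP "tools/call" JSON-RPC 2.0 request envelope.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ImageUnderstandArgs };
}

// Hypothetical helper: validates the arguments and wraps them in a request.
function buildImageUnderstandRequest(
  id: number,
  args: ImageUnderstandArgs
): ToolCallRequest {
  // Assumption: reject blank inputs early instead of sending them to the server.
  if (args.image_source.trim() === "") {
    throw new Error("image_source must not be empty");
  }
  if (args.prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "image_understand", arguments: args },
  };
}

// Example: ask about a local screenshot (the path is illustrative).
const exampleRequest = buildImageUnderstandRequest(1, {
  image_source: "./screenshots/login.png",
  prompt: "List the form fields visible in this screenshot.",
});
```

Serializing `exampleRequest` with `JSON.stringify` yields the message a client would write to the server's transport.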
Installation
npx -y luma-mcp
Claude Desktop Configuration
Add to your Claude Desktop configuration file (claude_desktop_config.json):
{
  "mcpServers": {
    "luma": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "MODEL_PROVIDER": "zhipu",
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}
Replace the MODEL_PROVIDER value and the corresponding API key variable with those of your chosen provider.
FAQ
- Which vision models are supported?
  GLM-4.6V, DeepSeek-OCR, Qwen3-VL-Flash, Doubao-Seed-1.6, and Hunyuan-Vision-1.5.
- What image formats are supported?
  JPG, PNG, WebP, and GIF, with a maximum input size of 10MB.
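The format and size limits above can be checked before an image is sent to the tool. This is a rough client-side sketch, not the server's own validation: `checkImageInput` is a hypothetical helper, "10MB" is assumed to mean 10 × 1024 × 1024 bytes, and accepting the `.jpeg` extension alongside `.jpg` is also an assumption.

```typescript
// Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
const MAX_INPUT_BYTES = 10 * 1024 * 1024;

// Formats listed in the FAQ; "jpeg" is assumed to be accepted like "jpg".
const SUPPORTED_EXTENSIONS = ["jpg", "jpeg", "png", "webp", "gif"];

type CheckResult = { ok: true } | { ok: false; reason: string };

// Hypothetical pre-flight check based on file name and size only.
function checkImageInput(fileName: string, sizeInBytes: number): CheckResult {
  const dot = fileName.lastIndexOf(".");
  // No dot, or a trailing dot, means there is no usable extension.
  if (dot === -1 || dot === fileName.length - 1) {
    return { ok: false, reason: "file has no extension" };
  }

  const extension = fileName.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    return { ok: false, reason: `unsupported format: ${extension}` };
  }

  if (sizeInBytes > MAX_INPUT_BYTES) {
    return { ok: false, reason: "image exceeds the 10MB input limit" };
  }

  return { ok: true };
}
```

Checking the extension is only a heuristic; a stricter version would inspect the file's leading bytes to confirm the actual format.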
Last updated · Auto-generated from public README + GitHub signals.