
DINO-X-MCP

by IDEA-Research · 114 · Score 44

The DINO-X MCP server gives LLMs visual perception through image object detection, localization, and captioning APIs.

Tags: ai-llm, media, developer-tools
Forks: 9
Open issues: 2
Last commit: 7 mo ago
Indexed: 2d ago

Overview

DINO-X MCP is an official server from IDEA-Research that brings fine-grained object detection and image understanding to multimodal applications. It supports multiple transport modes including STDIO and Streamable HTTP, with features like full-scene object detection, text-prompted object detection, human pose estimation, and visualization capabilities. The server provides structured outputs with object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you: Visual content analysis with object detection and localization
you: Inventory management through automated object counting
you: Human pose estimation for fitness and healthcare applications
you: What transport modes are supported?
you: What image formats are supported?

When to choose this

Choose DINO-X MCP when you need fine-grained visual understanding with structured outputs for multimodal AI applications, especially when working with object detection, localization, and image captioning tasks.

When NOT to choose this

Avoid it if you need fully local processing with no dependency on a hosted API, or if you need to pass image files by local path in HTTP mode, which accepts https:// image URLs only.

Tools this server exposes

4 tools extracted from the README
  • detect-all-objects

    Full-scene object detection to identify all objects in an image

  • detect-objects-by-text

    Text-prompted object detection to find specific objects based on text input

  • detect-human-pose-keypoints

    Human pose estimation to detect 17 keypoints in human figures

  • visualize-detection-result

    Create visualization of detection results with bounding boxes and labels
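
MCP clients invoke tools like these through a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such a request could look like for one of the four tools listed above. The argument names (`image_url`, `text_prompt`) and the `build_tool_call` helper are hypothetical placeholders for illustration; the server's real input schema is not given here and should be read from its `tools/list` response or README.

```python
import json
from itertools import count

# Tool names as listed in this catalog entry.
TOOLS = {
    "detect-all-objects",
    "detect-objects-by-text",
    "detect-human-pose-keypoints",
    "visualize-detection-result",
}

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one of the listed tools."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# NOTE: "image_url" and "text_prompt" are assumed names, not the server's schema.
request = build_tool_call(
    "detect-objects-by-text",
    {"image_url": "https://example.com/shelf.jpg", "text_prompt": "bottle"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this envelope for you; the sketch only makes the shape of the call visible.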

Comparable tools

clip-mcpvision-mcpimage-analysis-tools

Installation

Option A: Official Hosted Streamable HTTP (Recommended)

Add to your MCP client config:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
    }
  }
}

Option B: Use the NPM package locally (STDIO)

Install Node.js first, then configure:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
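
If you would rather generate these entries than edit them by hand, a minimal sketch along the following lines could work. It reproduces the two configurations shown above; the helper names `hosted_config` and `stdio_config` are made up for this example, and where your MCP client expects the resulting JSON to be saved depends on the client.

```python
import json
from urllib.parse import urlencode

HOSTED_URL = "https://mcp.deepdataspace.com/mcp"  # endpoint from Option A
NPM_PACKAGE = "@deepdataspace/dinox-mcp"          # package from Option B


def hosted_config(api_key: str) -> dict:
    """Option A: hosted Streamable HTTP, with the API key in the query string."""
    url = f"{HOSTED_URL}?{urlencode({'key': api_key})}"
    return {"mcpServers": {"dinox-mcp": {"url": url}}}


def stdio_config(api_key: str, image_dir: str) -> dict:
    """Option B: local STDIO server launched through npx."""
    return {
        "mcpServers": {
            "dinox-mcp": {
                "command": "npx",
                "args": ["-y", NPM_PACKAGE],
                "env": {
                    "DINOX_API_KEY": api_key,
                    "IMAGE_STORAGE_DIRECTORY": image_dir,
                },
            }
        }
    }


# Placeholder values; substitute your own key and directory.
print(json.dumps(hosted_config("your-api-key"), indent=2))
print(json.dumps(stdio_config("your-api-key-here", "/path/to/your/image/directory"), indent=2))
```

Building the URL with `urlencode` keeps a key containing reserved characters from breaking the query string.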

FAQ

What transport modes are supported?
DINO-X MCP supports STDIO (default) and Streamable HTTP modes. STDIO accepts both file:// and https:// image sources, while Streamable HTTP accepts https:// only.

What image formats are supported?
The server supports jpg, jpeg, webp, and png image formats.
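
The two answers above can be combined into a small client-side pre-check, sketched below. It assumes that the format can be judged from the file extension in the URI, which is a simplification; the server may inspect the image content itself. The function name `is_supported` and the transport labels `"stdio"` and `"http"` are illustrative, not part of the server's API.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# URI schemes each transport mode accepts, per the FAQ above.
ALLOWED_SCHEMES = {
    "stdio": {"file", "https"},
    "http": {"https"},
}
# Image formats listed in the FAQ above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".webp", ".png"}


def is_supported(uri: str, transport: str) -> bool:
    """Return True if the image URI looks acceptable for the given transport."""
    schemes = ALLOWED_SCHEMES.get(transport)
    if schemes is None:
        raise ValueError(f"unknown transport: {transport!r}")
    parsed = urlparse(uri)
    if parsed.scheme not in schemes:
        return False
    # Assumption: the extension in the URI path reflects the image format.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in ALLOWED_EXTENSIONS


for uri, mode in [
    ("file:///tmp/cat.png", "stdio"),
    ("file:///tmp/cat.png", "http"),
    ("https://example.com/cat.gif", "http"),
]:
    print(mode, uri, is_supported(uri, mode))
```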


Last updated · Auto-generated from public README + GitHub signals.