MCP Catalogs
Homeclaude-video-vision screenshot

claude-video-vision

by jordanrendric·613·Score 53

Claude Code plugin that gives Claude the ability to watch and understand videos through frame extraction and audio analysis.

mediaai-llmdeveloper-tools
75
Forks
13
Open issues
this month
Last commit
2d ago
Indexed

Overview

Claude-video-vision is a sophisticated MCP server that extends Claude's capabilities by adding multimodal video perception. It processes video files and YouTube URLs by extracting frames via ffmpeg and analyzing audio through multiple backends (Gemini API, local Whisper, or OpenAI API). The plugin serves as a perception layer, providing Claude with visual frames and timestamped audio transcriptions without performing interpretation itself. It features adaptive extraction based on context, supports multiple backends with different cost and privacy profiles, and includes an interactive setup wizard for easy configuration.

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you:Video content analysis and summarization
you:Tutorial and educational content processing
you:Surveillance footage review with timestamped context
you:What video formats are supported?
you:Is local processing available?

When to choose this

Choose this MCP server when you need Claude to analyze video content with both visual frames and audio transcription, especially for tutorials, presentations, or recordings where multimodal understanding is essential.

When NOT to choose this

Avoid this if you need high-frequency video processing with proprietary data, as it lacks authentication and may expose API keys; also consider alternatives if you need to process videos in bulk.

Tools this server exposes

6 tools extracted from the README
  • video_watch

    Extract frames and process audio from videos

  • video_analyze

    Analyze video structure with ffmpeg filters before extraction

  • video_detail

    Drill into specific cached or newly extracted moments

  • video_info

    Get video metadata without processing

  • video_configure

    Change settings for video processing

  • video_setup

    Check and guide dependency installation

Comparable tools

mcp-youtubemcp-media-analyzermcp-ffmpegvideo-mcp

Installation

Claude Desktop Installation

  1. Open Claude Desktop
  2. Go to Settings → Plugins
  3. Add marketplace repository: https://github.com/jordanrendric/claude-video-vision
  4. Install the plugin

Alternative Installation (via CLI)

# In Claude Code
/plugin marketplace add https://github.com/jordanrendric/claude-video-vision
/plugin install claude-video-vision

Dependencies

  • Node.js 20+
  • ffmpeg (auto-detected, setup wizard provides instructions)
  • yt-dlp (optional, for YouTube URLs)

FAQ

What video formats are supported?
The plugin supports all formats that ffmpeg can process, including MP4, MOV, AVI, and more. For YouTube URLs, it uses yt-dlp to download and process the video.
Is local processing available?
Yes, you can use whisper.cpp or Python openai-whisper for fully local audio processing with no API costs or external dependencies.

Compare claude-video-vision with

GitHub →

Last updated · Auto-generated from public README + GitHub signals.