
claude-video-vision
by jordanrendric · ★ 613 · Score 53
Claude Code plugin that gives Claude the ability to watch and understand videos through frame extraction and audio analysis.
Overview
claude-video-vision is an MCP server that adds multimodal video perception to Claude. It processes video files and YouTube URLs by extracting frames with ffmpeg and analyzing audio through one of several backends (Gemini API, local Whisper, or OpenAI API). The plugin acts as a perception layer: it hands Claude visual frames and timestamped audio transcriptions and leaves interpretation to Claude. Extraction adapts to context, the backends offer different cost and privacy profiles, and an interactive setup wizard handles configuration.
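To make the frame-extraction step concrete, here is a minimal sketch of how frames can be sampled with ffmpeg at a fixed rate. It is an illustration only, not the plugin's actual implementation: the function name, the output file pattern, and the default rate are assumptions, and the plugin's adaptive extraction is not modeled here.

```python
from pathlib import Path


def build_frame_extraction_cmd(video: str, out_dir: str, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg argument list that samples `fps` frames per second as JPEGs.

    Hypothetical helper for illustration; the plugin may invoke ffmpeg differently.
    """
    pattern = str(Path(out_dir) / "frame_%04d.jpg")  # frame_0001.jpg, frame_0002.jpg, ...
    return [
        "ffmpeg",
        "-i", video,          # input video file
        "-vf", f"fps={fps}",  # fps filter: keep `fps` frames per second of video
        "-q:v", "2",          # JPEG quality scale (lower values mean higher quality)
        pattern,
    ]


# To actually run it (requires ffmpeg on PATH and an existing output directory):
#   import subprocess
#   subprocess.run(build_frame_extraction_cmd("talk.mp4", "frames", fps=0.5), check=True)
```

A lower `fps` value yields fewer frames, which keeps the number of images passed to the model small for long recordings.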
When to choose this
Choose this MCP server when you need Claude to analyze video content with both visual frames and audio transcription, especially for tutorials, presentations, or recordings where multimodal understanding is essential.
When NOT to choose this
Avoid this for high-frequency processing of proprietary video, since the server lacks authentication and may expose API keys. Consider alternatives as well if you need to process videos in bulk.
Tools this server exposes
6 tools extracted from the README:
- video_watch: Extract frames and process audio from videos
- video_analyze: Analyze video structure with ffmpeg filters before extraction
- video_detail: Drill into specific cached or newly extracted moments
- video_info: Get video metadata without processing
- video_configure: Change settings for video processing
- video_setup: Check and guide dependency installation
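MCP clients invoke tools like these with a JSON-RPC `tools/call` request. The sketch below builds such a request for `video_info`. The envelope follows the MCP convention, but the `path` argument is a hypothetical parameter name chosen for illustration; the listing does not document each tool's real input schema, so check the server's tool definitions before relying on it.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` JSON-RPC request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# "path" is an assumed argument name, not taken from the plugin's schema.
print(make_tool_call(1, "video_info", {"path": "demo.mp4"}))
```

In normal use Claude issues these calls itself; the request shape is mainly useful when debugging the server by hand.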
Installation
Claude Desktop Installation
- Open Claude Desktop
- Go to Settings → Plugins
- Add marketplace repository: https://github.com/jordanrendric/claude-video-vision
- Install the plugin
Alternative Installation (via CLI)
# In Claude Code
/plugin marketplace add https://github.com/jordanrendric/claude-video-vision
/plugin install claude-video-vision
Dependencies
- Node.js 20+
- ffmpeg (auto-detected, setup wizard provides instructions)
- yt-dlp (optional, for YouTube URLs)
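Before running the setup wizard, it can help to confirm that the dependencies above are on your PATH. The following is a rough sketch of such a check, not the logic behind `video_setup`; the helper names are made up for this example.

```python
import shutil

REQUIRED = ("node", "ffmpeg")  # needed for the plugin to run
OPTIONAL = ("yt-dlp",)         # only needed for YouTube URLs


def check_dependencies(which=shutil.which) -> dict[str, list[str]]:
    """Report which required and optional executables are missing from PATH."""
    return {
        "missing_required": [tool for tool in REQUIRED if which(tool) is None],
        "missing_optional": [tool for tool in OPTIONAL if which(tool) is None],
    }


def node_major(version_output: str) -> int:
    """Parse the major version from `node --version` output such as 'v20.11.1'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


if __name__ == "__main__":
    print(check_dependencies())
```

Pass the output of `node --version` to `node_major` and compare the result against 20 to verify the Node.js requirement.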
FAQ
- What video formats are supported?
- The plugin supports all formats that ffmpeg can process, including MP4, MOV, AVI, and more. For YouTube URLs, it uses yt-dlp to download and process the video.
- Is local processing available?
- Yes, you can use whisper.cpp or Python openai-whisper for fully local audio processing, with no API costs and no calls to external services.
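As an illustration of the local route, the sketch below uses the Python openai-whisper package to produce timestamped transcript lines. It assumes `pip install openai-whisper` and ffmpeg are available; the function names, the "base" model choice, and the `[mm:ss]` line format are assumptions for this example and may differ from what the plugin emits.

```python
def format_segments(segments: list[dict]) -> list[str]:
    """Turn Whisper-style segments into '[mm:ss] text' lines."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return lines


def transcribe_locally(audio_path: str, model_name: str = "base") -> list[str]:
    """Transcribe an audio file on this machine with openai-whisper."""
    import whisper  # imported lazily so format_segments works without the package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return format_segments(result["segments"])
```

Calling `transcribe_locally("talk.mp3")` downloads the model weights on first use and then runs entirely on the local machine.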
Last updated · Auto-generated from public README + GitHub signals.