video-understanding


Author: admin | Source: ClawHub
Version: V 1.1.0
Security check: Passed
Downloads: 900
Price: Free
Favorites: 4

# Video Understanding (Gemini)

Analyze videos using Google Gemini's multimodal video understanding. Supports 1000+ video sources via yt-dlp.

## Requirements

- `yt-dlp`: `brew install yt-dlp` / `pip install yt-dlp`
- `ffmpeg`: `brew install ffmpeg` (for merging video+audio streams)
- `GEMINI_API_KEY` environment variable

## Default Output

Returns structured JSON:

- **transcript**: Verbatim transcript with `[MM:SS]` timestamps
- **description**: Visual description (people, setting, UI, text on screen, flow)
- **summary**: 2-3 sentence summary
- **duration_seconds**: Estimated duration
- **speakers**: Identified speakers

## Usage

### Analyze a video (structured JSON output)

```bash
uv run {baseDir}/scripts/analyze_video.py "<video-url>"
```

### Ask a question (adds "answer" field)

```bash
uv run {baseDir}/scripts/analyze_video.py "<video-url>" -q "What product is shown?"
```

### Override prompt entirely

```bash
uv run {baseDir}/scripts/analyze_video.py "<video-url>" -p "Custom prompt" --raw
```

### Download only (no analysis)

```bash
uv run {baseDir}/scripts/analyze_video.py "<video-url>" --download-only -o video.mp4
```

## Options

| Flag | Description | Default |
|------|-------------|---------|
| `-q` / `--question` | Question to answer (added to default fields) | none |
| `-p` / `--prompt` | Override entire prompt (ignores -q) | structured JSON |
| `-m` / `--model` | Gemini model | gemini-2.5-flash |
| `-o` / `--output` | Save output to file | stdout |
| `--keep` | Keep downloaded video file | false |
| `--download-only` | Download only, skip analysis | false |
| `--max-size` | Max file size in MB | 500 |
| `--raw` | Raw text output instead of JSON | false |

## How It Works

1. **YouTube URLs** → Passed directly to Gemini (no download needed)
2. **All other URLs** → Downloaded via yt-dlp → uploaded to Gemini File API → poll until processed
3. Gemini analyzes video with structured prompt → returns JSON
4. Temp files and Gemini uploads cleaned up automatically

## Supported Sources

Any URL supported by [yt-dlp](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md): Loom, YouTube, TikTok, Vimeo, Twitter/X, Instagram, Dailymotion, Twitch, and 1000+ more.

## Tips

- Use `-q` for targeted questions on top of the full analysis
- YouTube is fastest (no download step)
- Large videos (10min+) work fine: Gemini File API supports up to 2GB (free) / 20GB (paid)
- The script auto-installs Python dependencies via `uv`
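
The routing described under "How It Works" can be sketched roughly as follows. This is a minimal illustration and not the skill's actual `analyze_video.py`: the host list, the function names `is_youtube_url`, `plan_steps`, and `ytdlp_command`, and the choice of yt-dlp flags are all assumptions made for the example. It only decides which path a URL would take and builds a download command; it does not call Gemini or run yt-dlp.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube in this sketch (assumption; the real script may differ).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True when the URL's host is one of the YouTube hosts above."""
    host = (urlparse(url).hostname or "").lower()
    return host in YOUTUBE_HOSTS


def plan_steps(url: str) -> list[str]:
    """List the stages the README describes for a given URL."""
    if is_youtube_url(url):
        # YouTube URLs skip the download and upload stages entirely.
        return ["pass URL directly to Gemini", "analyze with structured prompt"]
    return [
        "download with yt-dlp",
        "upload to Gemini File API",
        "poll until processed",
        "analyze with structured prompt",
        "clean up temp file and upload",
    ]


def ytdlp_command(url: str, out_path: str, max_size_mb: int = 500) -> list[str]:
    """Build a yt-dlp command line (hypothetical flag choice for this sketch)."""
    return ["yt-dlp", "--max-filesize", f"{max_size_mb}M", "-o", out_path, url]


if __name__ == "__main__":
    for example in ("https://youtu.be/abc", "https://vimeo.com/1"):
        print(example, "->", plan_steps(example))
```

Keeping the routing decision in a pure function like `plan_steps` makes it easy to check without network access or an API key; the 500 MB default mirrors the `--max-size` default in the options table.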

Tags: skill, ai

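Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the video-understanding-1776310805 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the video-understanding-1776310805 skill

Install via command line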

skillhub install video-understanding-1776310805

Download

Download video-understanding v1.1.0 (free)

File size: 5.25 KB | Released: 2026-4-16 17:23

v1.1.0 (latest) 2026-4-16 17:23
Added proper metadata: declared yt-dlp, ffmpeg, and GEMINI_API_KEY requirements in frontmatter.
