视频理解
准备工作
视频输入
generateContent
发出请求。对于大于 20MB 的文件、时长超过大约 1 分钟的视频,或者您想在多个请求中重复使用文件时,请使用此方法。generateContent
。适用于文件较小(小于 20 MB)且时长较短的视频。上传视频文件
generateContent
请求中使用文件引用。VIDEO_PATH="path/to/sample.mp4"
MIME_TYPE=$(file -b --mime-type "${VIDEO_PATH}")
NUM_BYTES=$(wc -c < "${VIDEO_PATH}")
DISPLAY_NAME=VIDEO
tmp_header_file=upload-header.tmp
echo "Starting file upload..."
curl "https://generativelanguage.googleapis.com/upload/v1beta/files?key=${GOOGLE_API_KEY}" \
-D ${tmp_header_file} \
-H "X-Goog-Upload-Protocol: resumable" \
-H "X-Goog-Upload-Command: start" \
-H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
-H "Content-Type: application/json" \
-d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null
upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"
echo "Uploading video data..."
curl "${upload_url}" \
-H "Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Offset: 0" \
-H "X-Goog-Upload-Command: upload, finalize" \
--data-binary "@${VIDEO_PATH}" 2> /dev/null > file_info.json
file_uri=$(jq -r ".file.uri" file_info.json)
echo file_uri=$file_uri
echo "File uploaded successfully. File URI: ${file_uri}"
# --- 3. Generate content using the uploaded video file ---
echo "Generating content from video..."
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts":[
{"file_data":{"mime_type": "'"${MIME_TYPE}"'", "file_uri": "'"${file_uri}"'"}},
{"text": "Summarize this video. Then create a quiz with an answer key based on the information in this video."}]
}]
}' 2> /dev/null > response.json
jq -r ".candidates[].content.parts[].text" response.json
以内嵌方式传递视频数据
generateContent
,而无需使用 File API 上传视频文件。这适用于总请求大小小于 20MB 的较短视频。Argument list too long
错误,则表示文件的 base64 编码可能太长,无法在 curl 命令行中使用。对于较大的文件,请改用 File API 方法。VIDEO_PATH=/path/to/your/video.mp4
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
B64FLAGS="--input"
else
B64FLAGS="-w0"
fi
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts":[
{
"inline_data": {
"mime_type":"video/mp4",
"data": "'$(base64 $B64FLAGS $VIDEO_PATH)'"
}
},
{"text": "Please summarize the video in 3 sentences."}
]
}]
}' 2> /dev/null
添加 YouTube 网址
Part
。您可以添加 YouTube 网址,并附上提示,要求模型总结、翻译或以其他方式与视频内容互动。
提及内容中的时间戳
MM:SS
的时间戳,询问视频中特定时间点的内容。PROMPT="What are the examples given at 00:05 and 00:10 supposed to show us?"
转写视频并提供视觉描述
PROMPT="Transcribe the audio from this video, giving timestamps for salient events in the video. Also provide visual descriptions."
支持的视频格式
video/mp4
video/mpeg
video/mov
video/avi
video/x-flv
video/mpg
video/webm
video/wmv
video/3gpp
视频的技术详情
MM:SS
格式(例如,01:15
表示 1 分 15 秒)。contents
数组中的视频部分后面。后续步骤
修改于 2025-04-23 06:33:45