Skill name: aliyun-wan-digital-human
Category: provider

Model Studio Digital Human
Validation

```bash
mkdir -p output/aliyun-wan-digital-human
python -m py_compile skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py && echo py_compile_ok > output/aliyun-wan-digital-human/validate.txt
```

Pass criteria: the command exits 0 and output/aliyun-wan-digital-human/validate.txt is generated.
Output and evidence

- Save normalized request payloads, the chosen resolution, and task polling snapshots under output/aliyun-wan-digital-human/.
- Record the image/audio URLs and whether the input image passed detection.
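A minimal sketch of how these evidence artifacts might be written, assuming each payload or polling snapshot is held as a plain dict (the `save_evidence` helper name is hypothetical, not part of this skill's scripts):

```python
import json
import os
import time


def save_evidence(name, data, base_dir="output/aliyun-wan-digital-human"):
    """Write one evidence artifact (request payload or polling snapshot) as JSON."""
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, f"{name}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    return path


# Example: record the normalized request payload and one polling snapshot.
payload = {"model": "wan2.2-s2v", "image_url": "https://example.com/anchor.png"}
save_evidence("request", payload)
save_evidence(f"poll_{int(time.time())}", {"task_status": "RUNNING"})
```

Timestamping the polling snapshots keeps every poll result as a separate file instead of overwriting the last one.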
Use this skill for image- and audio-driven speaking, singing, or performing characters.
Critical model names

Use these exact model strings:

- wan2.2-s2v-detect
- wan2.2-s2v

Selection guidance:

- Run wan2.2-s2v-detect first to validate the image.
- Use wan2.2-s2v for the actual video generation job.
Prerequisites

- China mainland (Beijing) region only.
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
- Input audio should contain clear speech or singing, and the input image should depict a clear subject.
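The key lookup described above can be sketched as follows. The exact layout of ~/.alibabacloud/credentials is an assumption here (an INI-style file containing a dashscope_api_key entry); the `resolve_api_key` helper name is hypothetical:

```python
import configparser
import os


def resolve_api_key(credentials_path="~/.alibabacloud/credentials"):
    """Prefer DASHSCOPE_API_KEY from the environment, then fall back to the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key
    # Assumed INI-style credentials file; scan every section for the key.
    parser = configparser.ConfigParser()
    parser.read(os.path.expanduser(credentials_path))
    for section in parser.sections():
        if parser.has_option(section, "dashscope_api_key"):
            return parser.get(section, "dashscope_api_key")
    raise RuntimeError("No DashScope API key found in environment or credentials file")
```

The environment variable deliberately wins over the file, so a shell export can override a stored credential for one-off runs.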
Normalized interface (video.digital_human)

Detect Request

- model (string, optional): default wan2.2-s2v-detect
- image_url (string, required)

Generate Request

- model (string, optional): default wan2.2-s2v
- image_url (string, required)
- audio_url (string, required)
- resolution (string, optional): 480P or 720P
- scenario (string, optional): talk, sing, or perform

Response

- task_id (string)
- task_status (string)
- video_url (string, when finished)
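The generate request above can be assembled with a small builder. The field names and allowed values follow the interface listed here; the validation rules and the `build_generate_request` name are a sketch, not this skill's actual script:

```python
def build_generate_request(image_url, audio_url, resolution="480P", scenario="talk"):
    """Normalize a video.digital_human generate request, rejecting unsupported values."""
    if resolution not in ("480P", "720P"):
        raise ValueError(f"unsupported resolution: {resolution}")
    if scenario not in ("talk", "sing", "perform"):
        raise ValueError(f"unsupported scenario: {scenario}")
    # Inputs must be public HTTP/HTTPS URLs (see operational guidance below).
    for name, url in (("image_url", image_url), ("audio_url", audio_url)):
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be a public HTTP/HTTPS URL")
    return {
        "model": "wan2.2-s2v",
        "image_url": image_url,
        "audio_url": audio_url,
        "resolution": resolution,
        "scenario": scenario,
    }
```

Failing fast on an unsupported resolution or scenario keeps a bad value from surfacing only after a task has been submitted.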
Quick start

```bash
python skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py \
  --image-url https://example.com/anchor.png \
  --audio-url https://example.com/voice.mp3 \
  --resolution 720P \
  --scenario talk
```
Operational guidance

- Use a portrait, half-body, or full-body image with a clear face and stable framing.
- Match the audio length to the desired output duration; the output follows the audio length up to the model limit.
- Keep the image and audio as public HTTP/HTTPS URLs.
- If the image fails detection, do not proceed to video generation.
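Since tasks are asynchronous, the task_id from the response has to be polled until task_status reaches a terminal state. A generic polling sketch follows; `fetch_status` is a hypothetical callable standing in for the real task-query API (not specified here), and the SUCCEEDED/FAILED status strings are assumptions:

```python
import time


def wait_for_video(task_id, fetch_status, poll_interval=5.0, timeout=600.0):
    """Poll a task until it finishes; fetch_status(task_id) must return the normalized response dict."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = fetch_status(task_id)
        status = resp.get("task_status")
        if status == "SUCCEEDED":
            return resp["video_url"]  # present only when the task is finished
        if status == "FAILED":
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```

Injecting `fetch_status` keeps the loop testable with a stub and independent of whichever SDK or HTTP client actually queries the task.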
Output location

- Default output: output/aliyun-wan-digital-human/request.json
- Override the base directory with OUTPUT_DIR.
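The override can be sketched as below; whether OUTPUT_DIR replaces only the output/ base (as assumed here) or the whole path is an assumption, and the `resolve_output_path` name is hypothetical:

```python
import os


def resolve_output_path(filename="request.json"):
    """Resolve the output file path, honoring an OUTPUT_DIR base-directory override."""
    base = os.environ.get("OUTPUT_DIR", "output")
    out_dir = os.path.join(base, "aliyun-wan-digital-human")
    os.makedirs(out_dir, exist_ok=True)
    return os.path.join(out_dir, filename)
```

With OUTPUT_DIR unset this yields the default output/aliyun-wan-digital-human/request.json.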
References