Category: provider

Model Studio CosyVoice Voice Clone

Use the CosyVoice voice enrollment API to create cloned voices from public reference audio.

Critical model names

Use model="voice-enrollment" and one of these target_model values:

- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
- cosyvoice-v2

Recommended default in this repo:

- target_model="cosyvoice-v3.5-plus"

Region and compatibility

- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice cloning or voice design.
- The target_model used during enrollment must match the model used later in speech synthesis; otherwise synthesis fails.
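
The compatibility rules above can be checked before any enrollment call is made. The following is a minimal sketch, not code from this repo: the function names and the deployment labels "cn" and "intl" are hypothetical, and the check encodes only the rules stated in this section. A return value of None means no rule listed here blocks the combination, not that the service is guaranteed to accept it.

```python
from typing import Optional

# Models this section lists as available only in China mainland deployment mode.
CN_ONLY_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
# Models this section lists as not supporting voice clone in international mode.
INTL_NO_CLONE_MODELS = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def check_target_model(deployment: str, target_model: str) -> Optional[str]:
    """Return a reason string if a rule above blocks the combination, else None.

    `deployment` is a hypothetical label: "cn" (Beijing) or "intl" (Singapore).
    """
    if deployment not in ("cn", "intl"):
        raise ValueError(f"unknown deployment: {deployment!r}")
    if deployment == "intl" and target_model in CN_ONLY_MODELS:
        return f"{target_model} is available only in China mainland deployment mode"
    if deployment == "intl" and target_model in INTL_NO_CLONE_MODELS:
        return f"{target_model} does not support voice clone in international mode"
    return None


def check_synthesis_model(enrolled_target_model: str, synthesis_model: str) -> None:
    """Raise early if synthesis would use a different model than enrollment."""
    if enrolled_target_model != synthesis_model:
        raise ValueError(
            f"voice was enrolled for {enrolled_target_model}, "
            f"but synthesis requested {synthesis_model}"
        )
```

Running both checks locally turns a failed remote call into an immediate, readable error.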

Endpoint

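- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization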
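
Selecting the endpoint can be kept in one place so that the deployment mode is chosen once per run. This is a small sketch; the labels "cn" and "intl" and the function name are hypothetical, while the two URLs are the domestic and international customization endpoints documented for this skill.

```python
# Customization endpoints for the two deployment modes.
ENDPOINTS = {
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def endpoint_for(deployment: str) -> str:
    """Map a hypothetical deployment label to its endpoint URL."""
    try:
        return ENDPOINTS[deployment]
    except KeyError:
        raise ValueError(f"unknown deployment: {deployment!r}") from None
```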

Prerequisites

- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
- Provide a public audio URL for the enrollment sample.
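
One way to resolve the API key from the two locations named above is sketched here. Two points are assumptions, not facts from this document: that the environment variable takes precedence over the file, and that ~/.alibabacloud/credentials is an INI-style file with section headers. The function name is hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CREDENTIALS = Path.home() / ".alibabacloud" / "credentials"


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Path = DEFAULT_CREDENTIALS,
) -> str:
    """Return the key from DASHSCOPE_API_KEY, else from the credentials file."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key  # ASSUMPTION: the environment variable wins over the file

    # ASSUMPTION: INI-style file; the first section that defines
    # dashscope_api_key is used. Interpolation is disabled so that a "%"
    # inside the key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_path, encoding="utf-8")  # a missing file is skipped
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback="").strip()
        if value:
            return value
    raise RuntimeError(
        "Set DASHSCOPE_API_KEY or add dashscope_api_key to ~/.alibabacloud/credentials"
    )
```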

Normalized interface (cosyvoice.voice_clone)

Request

- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_sample_url (string, required): public audio URL
- language_hints (array[string], optional): only first item is used
- max_prompt_audio_length (float, optional): only for cosyvoice-v3.5-plus, cosyvoice-v3.5-flash, cosyvoice-v3-flash
- enable_preprocess (bool, optional): only for cosyvoice-v3.5-plus, cosyvoice-v3.5-flash, cosyvoice-v3-flash
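
The request constraints above can be enforced when the normalized request is assembled. The sketch below is illustrative and is not the repo's helper script: the function name is hypothetical, "letters/digits" is assumed to mean ASCII letters and digits, the URL check is a simple scheme test, and truncating language_hints to its first item is a design choice that mirrors the "only first item is used" rule.

```python
import re
from typing import Any, Dict, List, Optional

DEFAULT_TARGET_MODEL = "cosyvoice-v3.5-plus"
TARGET_MODELS = {
    "cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash",
    "cosyvoice-v3-plus", "cosyvoice-v3-flash", "cosyvoice-v2",
}
# Models that accept max_prompt_audio_length and enable_preprocess.
GATED_OPTION_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash", "cosyvoice-v3-flash"}
PREFIX_RE = re.compile(r"[A-Za-z0-9]{1,10}")  # ASSUMPTION: ASCII only


def build_clone_request(
    prefix: str,
    voice_sample_url: str,
    target_model: str = DEFAULT_TARGET_MODEL,
    language_hints: Optional[List[str]] = None,
    max_prompt_audio_length: Optional[float] = None,
    enable_preprocess: Optional[bool] = None,
) -> Dict[str, Any]:
    """Validate the fields and return a normalized request dict."""
    if target_model not in TARGET_MODELS:
        raise ValueError(f"unsupported target_model: {target_model!r}")
    if not PREFIX_RE.fullmatch(prefix):
        raise ValueError("prefix must be 1-10 letters or digits")
    if not voice_sample_url.startswith(("http://", "https://")):
        raise ValueError("voice_sample_url must be a public http(s) URL")

    request: Dict[str, Any] = {
        "model": "voice-enrollment",
        "target_model": target_model,
        "prefix": prefix,
        "voice_sample_url": voice_sample_url,
    }
    if language_hints:
        request["language_hints"] = [language_hints[0]]  # only the first item is used

    gated = {
        "max_prompt_audio_length": max_prompt_audio_length,
        "enable_preprocess": enable_preprocess,
    }
    for name, value in gated.items():
        if value is None:
            continue
        if target_model not in GATED_OPTION_MODELS:
            raise ValueError(f"{name} is not supported for {target_model}")
        request[name] = value
    return request
```

For example, build_clone_request("myvoice", "https://example.com/voice.wav", language_hints=["zh"]) yields a request for the default target model with a single language hint.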

Response

- voice_id (string): use this as the voice parameter in later TTS calls
- request_id (string)
- usage.count (number, optional)
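
A typed wrapper makes it harder to lose the voice_id between enrollment and synthesis. This is a sketch under one assumption: that usage.count appears in the normalized response as a nested object, {"usage": {"count": ...}}. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class CloneResult:
    voice_id: str                        # pass as the voice parameter in later TTS calls
    request_id: str
    usage_count: Optional[float] = None  # absent when the response has no usage data


def parse_clone_response(payload: Mapping[str, Any]) -> CloneResult:
    """Extract the normalized response fields, failing loudly without a voice_id."""
    voice_id = payload.get("voice_id")
    if not isinstance(voice_id, str) or not voice_id:
        raise ValueError("response does not contain a voice_id")
    usage = payload.get("usage") or {}  # ASSUMPTION: usage.count is nested under "usage"
    return CloneResult(
        voice_id=voice_id,
        request_id=str(payload.get("request_id", "")),
        usage_count=usage.get("count"),
    )
```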

Operational guidance

- For Chinese dialect reference audio, keep language_hints=["zh"]; control dialect style later in synthesis via text or instruct.
- For cosyvoice-v3.5-plus, supported language_hints include zh, en, fr, de, ja, ko, ru, pt, th, id, vi.
- Avoid frequent enrollment calls; each call creates a new custom voice and consumes quota.
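
One way to avoid repeated enrollment is to remember which samples have already been enrolled. The sketch below is a hypothetical local cache, not something this repo provides: it stores voice_id values in a JSON file keyed by target_model, prefix, and sample URL, and it takes the actual enrollment call as a caller-supplied function so that it assumes nothing about the client used.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def cached_enroll(
    cache_path: Path,
    request: Dict[str, str],
    enroll: Callable[[Dict[str, str]], str],
) -> str:
    """Return a cached voice_id if this exact sample was enrolled before.

    `enroll` is supplied by the caller and must return the new voice_id.
    """
    key = "|".join(
        [request["target_model"], request["prefix"], request["voice_sample_url"]]
    )
    cache: Dict[str, str] = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    if key in cache:
        return cache[key]  # no API call, no quota consumed

    voice_id = enroll(request)
    cache[key] = voice_id
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(cache, indent=2, sort_keys=True), encoding="utf-8")
    return voice_id
```

Because target_model is part of the key, a voice enrolled for one model is never reused for another, which keeps the cache consistent with the rule that enrollment and synthesis must use the same model.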

Local helper script

Prepare a normalized request JSON:

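```bash
python skills/ai/audio/aliyun-cosyvoice-voice-clone/scripts/prepare_cosyvoice_clone_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix myvoice \
  --voice-sample-url https://example.com/voice.wav \
  --language-hint zh
```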

Validation

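```bash
mkdir -p output/aliyun-cosyvoice-voice-clone
for f in skills/ai/audio/aliyun-cosyvoice-voice-clone/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-cosyvoice-voice-clone/validate.txt
```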

Pass criteria: command exits 0 and output/aliyun-cosyvoice-voice-clone/validate.txt is generated.

Output and evidence

- Save artifacts, command outputs, and API response summaries under output/aliyun-cosyvoice-voice-clone/.
- Include target_model, prefix, and sample URL in the evidence file.
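
The evidence file can be written in a single step once a run finishes. This is a minimal sketch: the file name evidence.json, the JSON layout, and the function name are hypothetical choices, since this document specifies only the output directory and the three required values.

```python
import json
from pathlib import Path
from typing import Any, Mapping, Optional

OUTPUT_DIR = Path("output/aliyun-cosyvoice-voice-clone")


def write_evidence(
    target_model: str,
    prefix: str,
    voice_sample_url: str,
    response_summary: Optional[Mapping[str, Any]] = None,
    output_dir: Path = OUTPUT_DIR,
) -> Path:
    """Write the required evidence fields plus an optional API response summary."""
    output_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "target_model": target_model,
        "prefix": prefix,
        "voice_sample_url": voice_sample_url,
        "response_summary": dict(response_summary or {}),
    }
    path = output_dir / "evidence.json"  # hypothetical file name
    path.write_text(
        json.dumps(record, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return path
```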

References

- references/api_reference.md
- references/sources.md

