# Aliyun Speech Transcriber
Use this skill to turn externally accessible media URLs into transcript results.
## Current scope
The current implementation covers **DashScope file transcription** only, using the `paraformer-v2` model. It follows the same pattern as the existing Java service.
## Required environment variables
- `ASR_DASHSCOPE_API_KEY`
Fallback, read when `ASR_DASHSCOPE_API_KEY` is not set:
- `DASHSCOPE_API_KEY`
Optional:
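- `ALIYUN_SPEECH_MODEL` - model name, defaults to `paraformer-v2`
- `ALIYUN_SPEECH_LANG_HINTS` - comma-separated language hints, defaults to `zh,en`
- `ALIYUN_SPEECH_POLL_SECONDS` - polling interval in seconds, defaults to `5`
- `ALIYUN_SPEECH_TIMEOUT_SECONDS` - overall timeout in seconds, defaults to `1800` (30 minutes)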
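The variables above could be resolved along the following lines. This is a minimal sketch, not the script's actual code: the `loadConfig` helper and the field names of the returned object are hypothetical, and treating invalid numeric values as "use the default" is an assumption.

```javascript
// Hypothetical helper: resolves settings from an environment-like object.
function loadConfig(env = process.env) {
  // Primary key first, then the documented fallback.
  const apiKey = env.ASR_DASHSCOPE_API_KEY || env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Fail fast before any network call is attempted.
    throw new Error('Set ASR_DASHSCOPE_API_KEY (or DASHSCOPE_API_KEY) before running.');
  }

  // Assumption: missing or non-positive numbers fall back to the default.
  const toPositiveInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };

  return {
    apiKey,
    model: env.ALIYUN_SPEECH_MODEL || 'paraformer-v2',
    languageHints: (env.ALIYUN_SPEECH_LANG_HINTS || 'zh,en')
      .split(',')
      .map((hint) => hint.trim())
      .filter(Boolean),
    pollSeconds: toPositiveInt(env.ALIYUN_SPEECH_POLL_SECONDS, 5),
    timeoutSeconds: toPositiveInt(env.ALIYUN_SPEECH_TIMEOUT_SECONDS, 1800),
  };
}
```

Passing the environment in as a parameter keeps the helper easy to test without touching `process.env`.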
## Inputs
Pass one or more externally accessible URLs:
```powershell
node scripts/transcribe.js --file-url "https://example.com/audio.mp3"
```
Multiple files:
```powershell
node scripts/transcribe.js --file-url "https://a.com/1.mp3" --file-url "https://a.com/2.mp3"
```
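A repeated flag such as `--file-url` can be collected with Node's built-in `util.parseArgs` (available in Node 18.3 and later). The sketch below is illustrative only; `parseFileUrls` is a hypothetical name, and the real `scripts/transcribe.js` may parse its arguments differently.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: collects every --file-url value from an argv-style array.
function parseFileUrls(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      // `multiple: true` gathers repeated flags into an array.
      'file-url': { type: 'string', multiple: true },
    },
  });

  const urls = values['file-url'] || [];
  if (urls.length === 0) {
    throw new Error('Pass at least one --file-url "<url>".');
  }
  return urls;
}
```

In a script this would typically be called as `parseFileUrls(process.argv.slice(2))`.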
## Output
The script returns JSON with:
- `success`
- `provider`
- `engine`
- `taskId`
- `requestId`
- `results`
- `text`
`text` is a best-effort plain-text extraction from the final JSON result.
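To illustrate what "best-effort" extraction might look like, here is a small sketch. It assumes each per-file result carries a `transcripts` array whose items have a `text` field and, optionally, a `sentences` array of `{ text }` objects. That shape is an assumption made for illustration, and `extractText` is a hypothetical helper rather than the script's own function.

```javascript
// Hypothetical helper. Assumed result shape:
//   { transcripts: [{ text: '...', sentences: [{ text: '...' }] }] }
function extractText(result) {
  const transcripts = Array.isArray(result?.transcripts) ? result.transcripts : [];

  const parts = transcripts.map((transcript) => {
    // Prefer the channel-level text when it is present and non-empty.
    if (typeof transcript?.text === 'string' && transcript.text.trim()) {
      return transcript.text.trim();
    }
    // Otherwise fall back to joining sentence-level text.
    const sentences = Array.isArray(transcript?.sentences) ? transcript.sentences : [];
    return sentences
      .map((s) => (typeof s?.text === 'string' ? s.text.trim() : ''))
      .filter(Boolean)
      .join(' ');
  });

  // Unknown or empty shapes yield an empty string instead of throwing.
  return parts.filter(Boolean).join('\n');
}
```

Because the extraction never throws on unexpected input, callers that need exact timing or per-sentence detail should read `results` directly rather than rely on `text`.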
## Chaining from Qiniu
Typical workflow:
1. Use `qiniu-upload` to upload a local file.
2. Prefer a signed private URL if the domain is not anonymously readable.
3. Pass the returned URL into this skill.
## Safety rules
- Never hardcode Aliyun credentials.
- Fail fast if neither `ASR_DASHSCOPE_API_KEY` nor `DASHSCOPE_API_KEY` is set.
- Only send URLs the user intends to transcribe.
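One way to back these rules with a check is to reject URLs that the transcription service could not fetch anyway, before anything is submitted. The sketch below is an assumption about what such a guard could look like; `checkTranscribableUrl` is a hypothetical name, and the list of rejected hosts is illustrative rather than exhaustive.

```javascript
// Hypothetical guard: accepts only http(s) URLs that do not point at
// loopback or private IPv4 addresses, and returns the normalized URL.
function checkTranscribableUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    throw new Error(`Not a valid URL: ${rawUrl}`);
  }

  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }

  const host = url.hostname.toLowerCase();
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
  const isPrivateIpv4 =
    isIpv4 &&
    (/^127\./.test(host) ||
      /^10\./.test(host) ||
      /^192\.168\./.test(host) ||
      /^172\.(1[6-9]|2\d|3[01])\./.test(host));
  const isLoopbackName =
    host === 'localhost' || host.endsWith('.localhost') || host === '[::1]';

  if (isPrivateIpv4 || isLoopbackName) {
    throw new Error(`URL is not externally accessible: ${rawUrl}`);
  }
  return url.href;
}
```

A check like this does not replace user intent: it only filters out URLs that are clearly unusable, so the caller still decides which files are sent.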