Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files, or extract lyrics with timestamps from audio.
Transcribe audio files to timestamped lyrics (LRC/SRT/JSON) via OpenAI Whisper or ElevenLabs Scribe API.
Before transcribing, you MUST check whether the user's API key is configured. Run the following command to check:
CODEBLOCK0
This command only reports whether the active provider's API key is set or empty — it does NOT print the actual key value. NEVER read or display the user's API key content. Do not use config --get on key fields or read config.json directly. The config --list command is safe — it automatically masks API keys as *** in output.
If the command reports the key is empty, you MUST stop and guide the user to configure it before proceeding. Do NOT attempt transcription without a valid key — it will fail.
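The two safety properties above (report only whether a key is set, and mask keys in listings) can be illustrated with a minimal Python sketch. The nested config layout and the helper names `mask_api_keys` and `key_is_set` are assumptions made for illustration; they are not taken from the skill's script, and the real `config --list` may treat empty keys differently.

```python
import copy


def mask_api_keys(config, mask="***"):
    """Return a deep copy of config with every non-empty 'api_key' replaced by mask.

    Assumption: keys live under nested dicts such as config["openai"]["api_key"].
    Empty keys are left empty here so that "not configured" stays visible.
    """
    masked = copy.deepcopy(config)

    def _walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "api_key" and isinstance(value, str) and value:
                    node[key] = mask
                else:
                    _walk(value)

    _walk(masked)
    return masked


def key_is_set(config, provider):
    """Report only whether the provider's key is non-empty, never the key itself."""
    return bool(config.get(provider, {}).get("api_key", ""))
```

With a check of this shape, the caller learns a boolean and nothing else, which is the reason the key check is preferred over reading the key fields directly.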
Use AskUserQuestion to ask the user to provide their API key, with the following options and guidance:
Set the API key:
cd "{project_root}/{.claude or .codex}/skills/acestep-lyrics-transcription/" && bash ./scripts/acestep-lyrics-transcription.sh config --set <provider>.api_key <KEY>
Switch the active provider, if needed:
cd "{project_root}/{.claude or .codex}/skills/acestep-lyrics-transcription/" && bash ./scripts/acestep-lyrics-transcription.sh config --set provider <provider_name>
After the key is configured, run config --check-key again to verify the key is set before proceeding.
If the API key is already configured, proceed directly to transcription without asking.
CODEBLOCK3
CODEBLOCK4
CRITICAL: After transcription, you MUST manually correct the LRC file before using it for MV rendering. Transcription models frequently produce errors on sung lyrics. When correcting:
- Keep the [MM:SS.cc] timestamps exactly as-is (timestamps from transcription are accurate).
- Leave out section markers such as [Verse] or [Chorus]; the LRC should only have timestamped text lines.
Transcribed (wrong):
CODEBLOCK5
Original lyrics reference:
CODEBLOCK6
Corrected (right):
CODEBLOCK7
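The correction rule (fix the words, keep every timestamp) can be sketched in Python as follows. This is a simplified illustration, not the skill's own tooling: the function name `replace_lrc_text` is hypothetical, and it assumes the corrected lyrics have already been split into exactly one text line per timestamped LRC line, which in practice usually takes manual alignment.

```python
import re

# Matches a line that starts with an LRC time tag such as [01:23.45]
LRC_LINE = re.compile(r"^(\[\d{2}:\d{2}\.\d{2}\])(.*)$")


def replace_lrc_text(lrc_text, corrected_lines):
    """Swap in corrected lyric text while leaving each [MM:SS.cc] tag untouched.

    Lines without a time tag (blank lines, markers like [Verse]) are dropped,
    so the result contains only timestamped text lines.
    """
    stamps = []
    for line in lrc_text.splitlines():
        match = LRC_LINE.match(line.strip())
        if match:
            stamps.append(match.group(1))
    if len(stamps) != len(corrected_lines):
        raise ValueError(
            f"{len(stamps)} timestamped lines but {len(corrected_lines)} corrections"
        )
    return "\n".join(f"{tag}{text.strip()}" for tag, text in zip(stamps, corrected_lines))
```

Raising on a line-count mismatch is a deliberate choice here: silently shifting text onto the wrong timestamp would be harder to notice than a failed run.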
Config file: INLINECODE9
CODEBLOCK8
| Option | Default | Description |
|---|---|---|
| INLINECODE10 | INLINECODE11 | Active provider: openai or INLINECODE13 |
| INLINECODE14 | lrc | Default output: lrc, srt, or json |
| openai.api_key | "" | OpenAI API key |
| openai.api_url | https://api.openai.com/v1 | OpenAI API base URL |
| openai.model | whisper-1 | OpenAI model (whisper-1 for word timestamps) |
| elevenlabs.api_key | "" | ElevenLabs API key |
| elevenlabs.api_url | https://api.elevenlabs.io/v1 | ElevenLabs API base URL |
| elevenlabs.model | scribe_v2 | ElevenLabs model |
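Option names such as `openai.api_key` suggest dotted paths into a nested config. The Python sketch below shows one way dotted keys could map onto nested dictionaries. It assumes the config is stored as nested JSON objects; the helpers `set_dotted` and `get_dotted` are hypothetical names for illustration and do not describe how the skill's script is actually implemented.

```python
def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"] = value for a dotted key "a.b", creating levels as needed."""
    parts = dotted_key.split(".")
    node = config
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value


def get_dotted(config, dotted_key, default=None):
    """Look up a dotted key, returning default when any level is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node
```

Under this assumption, a top-level option like `provider` and a nested one like `elevenlabs.model` go through the same code path.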
| Provider | Model | Word Timestamps | Pricing |
|---|---|---|---|
| OpenAI | whisper-1 | Yes (segment + word) | $0.006/min |
| ElevenLabs | scribe_v2 | Yes (word) | |
- whisper-1 is the only OpenAI model supporting word-level timestamps
- scribe_v2 returns word-level timestamps with type filtering
CODEBLOCK9
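To show how word-level timestamps can become LRC lines, here is a minimal Python sketch. It assumes each word arrives as a dict with `word`, `start`, and `end` fields in seconds, and it starts a new line whenever the silence between words exceeds a threshold. The field names, the `max_gap` heuristic, and the function names are assumptions for illustration; the skill's script may segment lines differently.

```python
def lrc_tag(seconds):
    """Format a time in seconds as an LRC tag [MM:SS.cc] (cc = centiseconds)."""
    centis = int(round(seconds * 100))
    minutes, rem = divmod(centis, 6000)
    secs, cc = divmod(rem, 100)
    return f"[{minutes:02d}:{secs:02d}.{cc:02d}]"


def words_to_lrc(words, max_gap=1.0):
    """Group timestamped words into LRC lines, splitting on pauses longer than max_gap."""
    lines, current, prev_end = [], [], None
    for w in words:
        if current and w["start"] - prev_end > max_gap:
            lines.append(current)
            current = []
        current.append(w)
        prev_end = w["end"]
    if current:
        lines.append(current)
    # Each line is tagged with the start time of its first word
    return "\n".join(
        lrc_tag(line[0]["start"]) + " ".join(w["word"] for w in line) for line in lines
    )
```

A pause-based split is only a heuristic; sung phrases often run together, which is one reason the output still needs the manual correction pass described earlier.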
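This skill can be installed through conversation on the following platforms:
Help me install SkillHub and the acestep-lyrics-transcription-1776419933 skill
Set SkillHub as my preferred skill installation source, then help me install the acestep-lyrics-transcription-1776419933 skill
skillhub install acestep-lyrics-transcription-1776419933
File size: 8.44 KB | Published: 2026-4-17 18:34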