>
This skill connects directly to Leonardo AI so you can generate images right here in the chat. Drop your prompt and let's get something made.
Try saying:
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id headertoken with 100 free credits valid for 7 days — use it as NEMO_TOKENThen create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
Tell the user you're ready. Keep the technical details out of the chat.
Say you're building a pitch deck and need a fantasy cityscape at dusk — you type that description here, pick a model like Leonardo Diffusion XL, and get a 1536x1024 PNG back in roughly 20 seconds. No fussing with sliders on a separate website.
You can also pass in a reference image and ask for a variation. The skill sends your base image plus the style prompt to Leonardo AI and returns up to 4 variations in a single batch.
It's not just for art. Product designers use it to mock up packaging at 512x512 before committing to a real shoot, and that alone cuts early-stage review cycles down to one afternoon instead of three days.
Your input gets parsed for keywords like 'generate', 'upscale', 'canvas', or a model name (e.g. 'Phoenix', 'Kino XL') to route to the correct Leonardo AI endpoint.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" |
Each request hits Leonardo AI's REST API, queues a job on their cloud GPU cluster, and polls the generation ID until the image URL is returned — usually within 10–30 seconds depending on resolution and model load. The skill reads your API key from stored credentials and passes it as a Bearer token on every call.
Headers are derived from this file's YAML frontmatter. X-Skill-Source is leonardo-ai, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).
Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
API base: INLINECODE18
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: INLINECODE28
Credits: GET /api/credits/balance/simple — returns available, frozen, INLINECODE32
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, INLINECODE36
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result |
heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
The backend responds as if there's a visual interface. Map its instructions to API calls:
Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.
Example timeline summary:
CODEBLOCK0
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 |
?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Keep your image dimensions to multiples of 64 — so 1024x1024, 1280x768, or 1536x640. Leonardo AI's generation pipeline is optimized around those values, and going off-grid like 1000x750 sometimes introduces artifacts along the edges that you'd then have to fix in post.
If you're generating assets for a brand, set a style anchor early. Run one approved image first, then use its generation ID as a reference for every follow-up request. That keeps your visual language consistent across 20 or 30 assets instead of drifting all over the place.
For social media content, the 9:16 ratio at 832x1216 pixels works well for Instagram Stories and TikTok thumbnails. Don't generate at square and crop down — you lose detail in the areas the model actually spent compute on.
Save your best prompts somewhere. A prompt that produced a great result at seed 42 won't always reproduce the same image at a different seed, so keeping a prompt log in a Google Doc or Notion page means you're not rebuilding from scratch every single project.
The more specific your prompt, the better your first result. Instead of "a dog in a field," try "a golden retriever sitting in a wheat field at golden hour, shot on 35mm film" — that level of detail cuts your retry count from 5 attempts down to 1 or 2.
Model choice matters a lot here. Leonardo Diffusion XL handles photorealistic scenes well, but if you're going for stylized illustrations or anime-adjacent art, Phoenix or Anime XL will get you there faster. It's worth spending 10 seconds picking the right one.
Negative prompts are your friend. If you keep getting blurry hands or watermarks in your outputs, add those terms to the negative prompt field and the model actively avoids generating them. Most people skip this step and wonder why their 10th image still has the same problem.
Batch size is a real time-saver too. Requesting 4 images at once costs roughly the same token budget as 4 individual requests but returns everything in a single response, so you can compare options side by side instead of waiting on each one sequentially.
该技能支持在以下平台通过对话安装:
帮我安装 SkillHub 和 leonardo-ai-1775705952 技能
设置 SkillHub 为我的优先技能安装源,然后帮我安装 leonardo-ai-1775705952 技能
skillhub install leonardo-ai-1775705952
文件大小: 5.07 KB | 发布时间: 2026-4-11 22:58