UNITH Digital Humans Skill

Create, configure, update, and deploy AI-powered Digital Human avatars using the UNITH API.

Quick Overview

UNITH digital humans are AI avatars that can speak, converse, and interact with users. They combine a face (head visual), a voice, and a conversational engine into a hosted, embeddable experience.

Base API URL: https://platform-api.unith.ai
Docs: https://docs.unith.ai

Prerequisites

The user must supply the following credentials (stored as environment variables):

| Variable | Description | How to obtain |
| --- | --- | --- |
| INLINECODE1 | Account email | Register at https://unith.ai |
| INLINECODE2 | Non-expiring secret key | UNITH dashboard → Manage Account → "Secret Key" section → Generate |

⚠️ The secret key is displayed only once. If lost, the user must delete and regenerate it.

Authentication

All API calls require a Bearer token (valid 7 days). Use the auth script:

CODEBLOCK0

This validates credentials, retries on network errors, and exports UNITH_TOKEN. On failure, it prints specific guidance (wrong key, expired token, etc.).
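Because the token is valid for 7 days, a cached copy can be reused safely while it is a little younger than that. The following is a minimal sketch of such a freshness check, assuming the token is stored as plain text in a cache file. The function name `cached_token_is_fresh` is hypothetical and this is not the auth script's actual implementation.

```shell
# Sketch only: not the skill's real auth script.
# cached_token_is_fresh is a hypothetical helper name.
# Succeeds (exit 0) when the cache file can be reused, fails otherwise.
cached_token_is_fresh() {
  cache="$1"
  # An empty path is treated as "caching disabled".
  [ -n "$cache" ] || return 1
  # -s: the file exists and is not empty.
  [ -s "$cache" ] || return 1
  # find prints the path only if it was modified less than 6 days ago,
  # which leaves a one-day margin before the 7-day token expires.
  [ -n "$(find "$cache" -mtime -6 2>/dev/null)" ]
}
```

A caller could read the file into `UNITH_TOKEN` when the check succeeds and fall back to a fresh authentication request when it fails.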

Workflow: Creating a Digital Human

Step 1: Choose an Operating Mode

Ask the user what they want the digital human to do. Map their answer to one of 5 modes:

| Mode | INLINECODE4 value | Use case | Output |
| --- | --- | --- | --- |
| Text-to-Video | INLINECODE5 | Generate an MP4 video of the avatar speaking provided text | MP4 file |
| Open Dialogue | | | |

Complexity spectrum (simple → sophisticated):

- Simplest: ttt. Just text in, video out. No knowledge base needed.
- Standard: oc. Conversational with a system prompt. Good for general assistants.
- Knowledge-grounded: doc_qa. Upload documents, avatar answers from them. Best for support/FAQ.
- Workflow-driven: voiceflow. Structured conversation paths. Requires a Voiceflow account.
- Most flexible: plugin. BYO conversational engine. Maximum control.
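The mapping from a user's request to a mode can be sketched as a simple keyword match. This is illustrative only: the function name `suggest_mode` and the keywords are assumptions drawn from the example requests under Common Patterns, and a real agent would confirm the choice with the user.

```shell
# Illustrative sketch: suggest_mode and its keyword list are hypothetical,
# not part of the UNITH scripts.
suggest_mode() {
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *video*)                          echo "ttt" ;;
    *support*|*faq*)                  echo "doc_qa" ;;
    *"own llm"*|*"own engine"*)       echo "plugin" ;;
    *onboarding*|*guided*|*workflow*) echo "voiceflow" ;;
    # Fall back to open conversation for general assistants.
    *)                                echo "oc" ;;
  esac
}
```

For example, `suggest_mode "I want a customer support avatar"` prints `doc_qa`, while a request with no recognized keyword falls through to `oc`.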

Step 2: List Available Faces

CODEBLOCK1

Each face has an id (used as headVisualId in creation). Faces can be:

- Public: Available to all organizations
- Private: Available only to the user's organization
- Custom (BYOF): User uploads a video of a real person (currently managed by UNITH)

Present the available faces to the user and let them choose.

Step 3: List Available Voices

CODEBLOCK2

Voices come from three providers: elevenlabs, azure, and audiostack. Present the options to the user. Voices have performance rankings; faster voices are better for real-time conversation.

Step 4: Create the Digital Human

Build a JSON payload file (see references/api-payloads.md for the schema per mode), then:

CODEBLOCK3

The script validates required fields, checks mode-specific requirements, retries on server errors, and prints the publicUrl on success.

Step 5 (doc_qa only): Upload Knowledge Document

For doc_qa mode, the digital human needs a knowledge document:

CODEBLOCK4

The script checks file existence/size, uses a longer timeout for uploads, and provides guidance on next steps.
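A pre-flight check of this kind might look like the sketch below. The function name `check_upload_file` and the `MAX_UPLOAD_BYTES` limit are assumptions for illustration. The real script's size limit is not stated here, so the 10 MiB default is only a placeholder.

```shell
# Hypothetical pre-flight check, not the actual upload script.
# MAX_UPLOAD_BYTES is an assumed placeholder; the real limit is not documented here.
MAX_UPLOAD_BYTES="${MAX_UPLOAD_BYTES:-10485760}"

check_upload_file() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "error: file not found: $file" >&2
    return 1
  fi
  # wc may pad its output with spaces on some systems, so strip them.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -eq 0 ]; then
    echo "error: file is empty: $file" >&2
    return 2
  fi
  if [ "$size" -gt "$MAX_UPLOAD_BYTES" ]; then
    echo "error: $file is $size bytes, above the assumed limit of $MAX_UPLOAD_BYTES" >&2
    return 3
  fi
  echo "ok: $file ($size bytes)"
}
```

Distinct return codes let a caller tell a missing file from an empty or oversized one without parsing the message.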

Step 6: Test and Iterate

The digital human is live at the publicUrl from Step 4. The user should:

1. Visit the URL and test the conversation
2. Update configuration as needed (see below)

Updating a Digital Human

Use the update script to modify any parameter except the face (changing face requires creating a new head):

CODEBLOCK5

Listing Existing Digital Humans

CODEBLOCK6

Deleting a Digital Human

CODEBLOCK7

This permanently removes the digital human and cannot be undone.

Agent note: Always pass --confirm when calling this script. Without it, the script prompts for interactive input and will hang.

Embedding

Digital humans can be embedded in websites/apps. See references/embedding.md for code snippets and configuration options.

Scripts

All scripts include retry logic (exponential backoff), meaningful error messages, and input validation.

| Script | Purpose |
| --- | --- |
| INLINECODE26 | Shared utilities: retry wrapper, colored logging, error parsing |
| INLINECODE27 | Authenticate and export UNITH_TOKEN (with 6-day token caching) |
| scripts/list-resources.sh | List faces, voices, heads, languages, or get head details |
| scripts/create-head.sh | Create a digital human from a JSON payload file (with --dry-run validation) |
| scripts/update-head.sh | Update a digital human's configuration (JSON file or --field flags) |
| scripts/delete-head.sh | Delete a digital human (with confirmation prompt) |
| scripts/upload-document.sh | Upload knowledge document to a doc_qa head |

Configuration via environment variables:

- UNITH_MAX_RETRIES: max retry attempts (default: 3)
- INLINECODE38: initial delay between retries in seconds (default: 2, doubles each retry)
- INLINECODE39: curl timeout in seconds (default: 30, 120 for uploads)
- INLINECODE40: connection timeout in seconds (default: 10)
- INLINECODE41: token cache file path (default: /tmp/.unith_token_cache, set empty to disable)
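To show how the retry settings interact, here is a minimal sketch of a retry wrapper with exponential backoff. It is not the shared utility shipped with the skill. `UNITH_MAX_RETRIES` is taken from the list above, while `RETRY_DELAY_SECONDS` is a hypothetical stand-in for the initial-delay variable, whose real name is not shown here. The sketch also assumes that "max retry attempts" counts total attempts.

```shell
# Sketch only: retry_with_backoff is a hypothetical helper.
# RETRY_DELAY_SECONDS is an assumed name for the initial-delay setting.
retry_with_backoff() {
  max_attempts="${UNITH_MAX_RETRIES:-3}"
  delay="${RETRY_DELAY_SECONDS:-2}"
  attempt=1
  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    # Double the delay before the next attempt.
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

With the defaults, a command that keeps failing is tried three times, with waits of 2 and 4 seconds between attempts.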

Detailed API Reference

For full payload schemas, configuration parameters, and mode-specific details:

CODEBLOCK8

Common Patterns

"I want a quick video of someone saying X" → ttt mode, minimal config
"I want a customer support avatar" → doc_qa mode with knowledge docs
"I want an AI sales rep" → oc mode with a sales personality prompt
"I want to connect my own LLM" → plugin mode with webhook URL
"I want a guided onboarding flow" → voiceflow mode with Voiceflow API key

Information to Collect from the User

Before creating, ask for:

1. Purpose / use case → determines operating mode
2. Face preference → list available faces for selection
3. Voice preference → language, accent, gender, speed priority
4. Alias → display name for the digital human
5. Language → speech recognition and UI language (e.g., en-US, es-ES)
6. Greeting message → initial message the avatar says
7. System prompt (for oc/doc_qa) → personality and behavior instructions
8. Knowledge documents (for doc_qa) → files to upload
9. Voiceflow API key (for voiceflow) → from their Voiceflow account
10. Plugin URL (for plugin) → webhook endpoint for their custom engine
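The mode-specific items in this checklist can be captured in a small lookup, sketched below. The function name `mode_specific_inputs` is hypothetical, and the printed labels are descriptive text from the list above, not API field names.

```shell
# Sketch only: mode_specific_inputs is a hypothetical helper.
# Prints the extra items to collect for a mode, one per line.
# The labels are human-readable descriptions, not payload field names.
mode_specific_inputs() {
  case "$1" in
    ttt)       ;;  # no mode-specific items listed
    oc)        echo "system prompt" ;;
    doc_qa)    echo "system prompt"; echo "knowledge documents" ;;
    voiceflow) echo "Voiceflow API key" ;;
    plugin)    echo "plugin URL" ;;
    *)         echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

An agent could call this after the mode is chosen and ask only for the items it prints, in addition to the items common to every mode (face, voice, alias, language, greeting).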

