caid-multi-agent: CAID Multi-Agent

Coordinate multiple sub-agents to collaboratively complete long-horizon software engineering tasks using the CAID (Centralized Asynchronous Isolated Delegation) paradigm. Use when tasks require complex multi-file edits, interdependent subtasks, parallelizable work, or when a single agent would take too long. This skill implements branch-and-merge coordination with git worktree isolation, dependency-aware task delegation, and structured integration. CRITICAL: Never use CAID as a fallback after si

Author: admin | Source: ClawHub
Version: 1.2.0
Security check: Passed
Downloads: 78
Price: Free
Favorites: 0

CAID Multi-Agent Coordination

This skill implements the Centralized Asynchronous Isolated Delegation (CAID) paradigm for coordinating multiple agents working on shared artifacts.

⚠️ CRITICAL WARNINGS FROM PAPER:

  • Use CAID from the outset: don't run single-agent first as a fallback. The sequential strategy costs nearly 2x with minimal gain.
  • Physical worktree isolation is mandatory: soft isolation (instruction-only) degrades performance on complex tasks.
  • Engineer limits are strict: 2 for PaperBench-style, 4 for Commit0-style, never exceed 8.
  • Higher cost/runtime trade-off: CAID improves accuracy, not speed. Integration is sequential and test-gated.

Core Principles

  1. Centralized Task Delegation: a manager agent decomposes tasks into dependency-aware units
  2. Asynchronous Execution: multiple engineer agents work concurrently
  3. Isolated Workspaces: each agent works in its own isolated branch/worktree
  4. Structured Integration: progress is merged via git commit/merge with test verification

When to Use This Skill

Use CAID from the outset for:

  • Long-horizon tasks with multiple interdependent files
  • Clear dependency structure (imports, test mappings)
  • Parallelizable work exists
  • Integration can be verified by executable tests

Don't use as fallback: running a single agent first and then switching to CAID is inefficient (cost and runtime are nearly additive, with minimal performance gain).

Use single-agent for:

  • Isolated, single-file changes
  • No clear parallelization opportunities
  • Exploratory/research-oriented tasks

Coordination Workflow

0. Manager Pre-Setup (CRITICAL)

Before ANY delegation, the manager must:

  1. Prepare runtime environment
     - Ensure dependencies are installed
     - Set up a virtual environment
  2. Organize entry points
     - Create main entry files
     - Ensure import paths work
  3. Add minimal function stubs
     - Empty function definitions so imports don't fail
     - Type signatures if available
  4. Commit to main branch
     - All engineer branches are created from a consistent base
     - Without this, engineers start from divergent states

CODEBLOCK0

1. Task Analysis & Dependency Graph Creation

Manager's role: Before delegating, analyze the task structure:

  • Identify atomic units of work (files, functions, modules)
  • Build a dependency graph: G=(V,E) where edges indicate dependencies
  • Define Ready(v) ⇔ all dependencies of v are completed
  • Only delegate tasks that are Ready (all dependencies satisfied)
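
As a minimal sketch of the Ready(v) rule above, the dependency graph can be held as a mapping from each unit to the set of units it depends on. The file names and the `ready_tasks` helper are hypothetical and only illustrate the definition; they are not taken from the paper or the skill.

```python
from typing import Dict, Set


def ready_tasks(deps: Dict[str, Set[str]], completed: Set[str]) -> Set[str]:
    """Return units that are not yet completed and whose dependencies are all completed."""
    return {
        task
        for task, prereqs in deps.items()
        if task not in completed and prereqs <= completed
    }


# Hypothetical file-level graph: each key depends on the files in its value set.
deps = {
    "utils.py": set(),
    "models.py": {"utils.py"},
    "api.py": {"models.py", "utils.py"},
    "cli.py": {"api.py"},
}

first_wave = ready_tasks(deps, completed=set())
second_wave = ready_tasks(deps, completed={"utils.py"})
```

Here only `utils.py` is Ready at the start, and `models.py` becomes Ready once `utils.py` has been integrated.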

Commit0-style tasks (clear file structure):

  1. Check import statements to identify file-level dependencies
  2. Collect executable test cases from repository
  3. Examine which files tests exercise
  4. Identify components to implement earlier (upstream dependencies)
  5. Delegate at file level first — only split to function level if file has many unimplemented functions

PaperBench-style tasks (inferred structure):

  1. Read paper to identify main contribution
  2. Infer implementation order from contribution
  3. Use max 2 engineers — manager task is harder, more agents destabilize

Dependency graph construction:
CODEBLOCK1

2. Workspace Isolation Setup

Create PHYSICALLY isolated worktrees (not soft isolation):

CODEBLOCK2

⚠️ WARNING: Soft isolation (same workspace, instruction-level constraints) degrades performance to below single-agent on PaperBench. Physical git worktree isolation is mandatory.

Key isolation principles:

  • Each engineer operates in its own git worktree (physical filesystem isolation)
  • All worktrees are derived from the main branch
  • Engineers modify files only within their assigned workspace
  • Restricted files (shared across engineers): __init__.py, config files, global constants — engineers must NOT commit changes to these
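
The isolation principles above could be enforced with two small helpers: one that builds the `git worktree add` command for an engineer, and one that flags changes to restricted files before a commit is accepted. This is a sketch under assumptions: the branch naming scheme, the worktree path layout, and the exact restricted file names are hypothetical and should be adapted to the repository.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical restricted names; adjust to the files your engineers actually share.
RESTRICTED_NAMES = {"__init__.py", "config.py", "constants.py", "pyproject.toml"}


def worktree_add_command(engineer_id: str, base_branch: str = "main") -> List[str]:
    """Build a `git worktree add` command that creates a new branch from the base branch."""
    branch = f"engineer/{engineer_id}"        # hypothetical branch naming scheme
    path = f"../worktrees/{engineer_id}"      # hypothetical worktree location
    return ["git", "worktree", "add", "-b", branch, path, base_branch]


def restricted_changes(changed_files: Iterable[str]) -> List[str]:
    """Return the changed files whose base name is on the restricted list."""
    return sorted(
        f for f in changed_files if PurePosixPath(f).name in RESTRICTED_NAMES
    )


command = worktree_add_command("eng1")
violations = restricted_changes(["pkg/__init__.py", "pkg/models.py"])
```

The command list can be passed to `subprocess.run` by the manager; a non-empty `violations` list would be a reason to reject the engineer's commit.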

3. Dependency-Aware Task Delegation

STRICT Engineer Limits:

| Task Type | Max Engineers | Why |
|---|---|---|
| PaperBench-style | 2 | Inferred dependencies; more destabilizes |
| Commit0-style | 4 | Clear file structure; test-guided |
| General SWE | 2-4 | Balance parallelism vs integration overhead |
| Absolute max | 8 | Beyond this, coordination tax exceeds gains |

⚠️ Critical: Increasing engineers beyond optimal degrades performance due to integration overhead and conflict resolution costs.

Task prioritization heuristics:
Manager should prioritize tasks that:

  1. Enable earlier test execution (expose evaluation signals sooner)
  2. Lie closer to upstream of dependency chain
  3. Are simpler functions before complex ones
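
One possible way to turn these heuristics into an ordering is a lexicographic sort key. The `Candidate` fields and the precedence among the three heuristics are assumptions made for illustration; the source lists the heuristics without fixing how they are weighed against each other.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    unlocks_tests: int  # tests that become runnable once this task lands (hypothetical metric)
    depth: int          # distance from the roots of the dependency graph, 0 = most upstream
    complexity: int     # rough size estimate, e.g. number of unimplemented functions


def prioritize(candidates: List[Candidate]) -> List[Candidate]:
    """Order tasks: more unlocked tests first, then more upstream, then simpler."""
    return sorted(candidates, key=lambda c: (-c.unlocks_tests, c.depth, c.complexity))


candidates = [
    Candidate("cli.py", unlocks_tests=0, depth=2, complexity=5),
    Candidate("utils.py", unlocks_tests=3, depth=0, complexity=2),
    Candidate("models.py", unlocks_tests=3, depth=0, complexity=4),
    Candidate("api.py", unlocks_tests=1, depth=1, complexity=3),
]
ordered_names = [c.name for c in prioritize(candidates)]
```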

Round definition:

One round = complete cycle of delegation → implementation → dependency update

Recommended iteration limits (from paper experiments):

| Role | Max Iterations |
|---|---|
| Manager | 50 |
| Each Engineer | 80 |
| Total Rounds | ~22 (varies by task) |

Delegation algorithm:

CODEBLOCK3

Task assignment JSON format (structured communication — NO free-form dialog):

CODEBLOCK4

Key: All communication uses structured JSON, not free-form dialog. This prevents inter-agent misalignment (primary failure mode in multi-agent systems).
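
To illustrate what "structured JSON only" can mean in practice, the sketch below serializes a task assignment and rejects malformed messages on receipt. The field names (`task_id`, `engineer_id`, `files`, `depends_on`, `test_command`) are hypothetical; they are not the skill's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from typing import List


@dataclass
class TaskAssignment:
    # All field names are illustrative assumptions, not the skill's real schema.
    task_id: str
    engineer_id: str
    files: List[str]
    depends_on: List[str] = field(default_factory=list)
    test_command: str = ""


REQUIRED_KEYS = {"task_id", "engineer_id", "files"}
ALLOWED_KEYS = {f.name for f in fields(TaskAssignment)}


def encode(assignment: TaskAssignment) -> str:
    """Serialize an assignment to a JSON string with stable key order."""
    return json.dumps(asdict(assignment), sort_keys=True)


def decode(payload: str) -> TaskAssignment:
    """Parse a JSON message, rejecting missing or unknown keys."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("assignment must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    unknown = set(data) - ALLOWED_KEYS
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    return TaskAssignment(**data)


original = TaskAssignment("t1", "eng1", ["pkg/models.py"], depends_on=["t0"])
round_tripped = decode(encode(original))
```

Rejecting unknown keys keeps free-form text from leaking into the channel through ad hoc fields.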

4. Asynchronous Execution Loop

Event loop pattern:

  1. Delegate → Manager assigns tasks to available engineers
  2. Execute → Engineers work concurrently in isolated worktrees
  3. Self-Verify → Engineer runs tests, fixes failures
  4. Complete → Engineer submits commit when ALL tests pass
  5. Integrate → Manager attempts merge to main
  6. Conflict Resolution (if needed) → Responsible engineer resolves
  7. Update → Manager updates dependency graph
  8. Repeat → Continue until all tasks complete or limits reached
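
The loop above can be sketched with `asyncio`, using a stand-in coroutine in place of a real engineer. This is a simulation under assumptions: `engineer` does no real work, merges always succeed (so the conflict-resolution step is omitted), and one round is counted per wait-and-integrate cycle. The names `run_manager` and `engineer` are hypothetical.

```python
import asyncio
from typing import Dict, List, Set


async def engineer(task: str) -> str:
    """Stand-in for an engineer: implement, self-verify, and return the finished task."""
    await asyncio.sleep(0)  # placeholder for real work in an isolated worktree
    return task


async def run_manager(
    deps: Dict[str, Set[str]], max_engineers: int, max_rounds: int
) -> List[str]:
    merged: List[str] = []                  # order in which tasks were integrated
    done: Set[str] = set()
    running: Dict[asyncio.Task, str] = {}
    rounds = 0
    while len(done) < len(deps) and rounds < max_rounds:
        # Delegate: fill free engineer slots with Ready tasks.
        in_flight = set(running.values())
        ready = sorted(
            t for t, pre in deps.items()
            if t not in done and t not in in_flight and pre <= done
        )
        for t in ready[: max_engineers - len(running)]:
            running[asyncio.create_task(engineer(t))] = t
        if not running:
            break  # nothing Ready and nothing running: unresolved dependencies
        # Execute: wait until at least one engineer submits.
        finished, _ = await asyncio.wait(
            set(running), return_when=asyncio.FIRST_COMPLETED
        )
        # Integrate sequentially, then update the dependency state.
        for fut in sorted(finished, key=lambda f: running[f]):
            running.pop(fut)
            task = fut.result()
            merged.append(task)
            done.add(task)
        rounds += 1
    for fut in running:  # round limit reached: stop remaining engineers
        fut.cancel()
    return merged


deps = {"a": set(), "b": set(), "c": {"a", "b"}}
merged = asyncio.run(run_manager(deps, max_engineers=2, max_rounds=22))
```

In this example `a` and `b` run concurrently, and `c` is only delegated after both have been integrated.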

Engineer self-verification (MANDATORY before submission):

  • Run relevant tests that import/reference modified files
  • If no explicit mapping, run repository's default test command
  • Any failed test or runtime exception MUST be resolved
  • Use concrete error logs and tracebacks for iterative refinement
  • Only submit commit after ALL tests pass
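
A minimal sketch of the self-verification gate, assuming the test command is supplied as an argument list: it runs the command in the engineer's worktree and returns both the verdict and the captured output, so tracebacks can feed the next fix attempt. The helper name `self_verify` and the 600-second default timeout are assumptions.

```python
import subprocess
import sys
from typing import List, Tuple


def self_verify(
    test_command: List[str], cwd: str = ".", timeout: int = 600
) -> Tuple[bool, str]:
    """Run the test command; return (passed, combined stdout and stderr)."""
    try:
        proc = subprocess.run(
            test_command, cwd=cwd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    return proc.returncode == 0, proc.stdout + proc.stderr


# Two tiny stand-in "test suites" run with the current interpreter.
passing = self_verify([sys.executable, "-c", "assert 1 + 1 == 2"])
failing = self_verify([sys.executable, "-c", "raise RuntimeError('boom')"])
```

An engineer would commit only when the first element is `True`; otherwise the second element carries the error log to work from.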

5. Integration via Merge

Merge workflow:

CODEBLOCK5

The main branch is the single source of truth throughout execution.

6. Context Management for Manager

To prevent context explosion, the manager uses the LLMSummarizingCondenser pattern:

CODEBLOCK6

Compressed execution history format:
CODEBLOCK7
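
The general idea behind a summarizing condenser can be sketched as follows: once the history exceeds a size limit, keep the first few events and the most recent ones, and replace the middle with a single summary entry. This is a generic illustration, not the OpenHands `LLMSummarizingCondenser` implementation; the `condense` function, its parameters, and the stub summarizer are assumptions. In practice `summarize` would call an LLM.

```python
from typing import Callable, List


def condense(
    history: List[str],
    summarize: Callable[[List[str]], str],
    max_size: int = 8,
    keep_first: int = 2,
) -> List[str]:
    """Replace the middle of an over-long history with one summary entry."""
    if len(history) <= max_size:
        return list(history)
    keep_last = max_size // 2
    head = history[:keep_first]
    middle = history[keep_first : len(history) - keep_last]
    tail = history[len(history) - keep_last :]
    return head + [summarize(middle)] + tail


def stub_summarizer(events: List[str]) -> str:
    # Stand-in for an LLM call; only records how many events were folded away.
    return f"[summary of {len(events)} events]"


history = [f"event {i}" for i in range(12)]
condensed = condense(history, stub_summarizer)
```

With the defaults, twelve events shrink to seven: the first two, one summary, and the last four.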

7. Worktree Synchronization & Cleanup

State synchronization when main advances:

CODEBLOCK8

Worktree cleanup (after completion or limit reached):

CODEBLOCK9

Worktrees are deleted after all assigned tasks are completed or when the engineer reaches the predefined iteration limit.

8. Termination Conditions

  • Success: All units completed and integrated into main
  • Failure: Maximum rounds/iterations reached with unresolved tasks
  • Incomplete: Task considered incomplete if any units remain unresolved

Manager iteration limits (from paper):

  • Manager: 50 iterations
  • Each engineer: 80 iterations
  • Total rounds: ~22 (varies by task)

9. Manager Final Review

After the asynchronous loop completes, the manager does a final review before submitting the final product.

Final review checklist:

  1. Verify all tasks from dependency graph are completed
  2. Run full test suite: INLINECODE4
  3. Check integration completeness (all commits merged)
  4. Review any unresolved errors or warnings
  5. Validate final state matches expected outcome
  6. Submit final product only after verification

CODEBLOCK10

Implementation Guidelines

Using OpenClaw Sub-agents

For OpenClaw, the sessions_spawn tool enables parallel agent execution:

Spawn engineer agents:

CODEBLOCK11

Check progress:

CODEBLOCK12

Worktree Synchronization

When main advances, update worktrees:

CODEBLOCK13

This ensures engineers work from latest integrated state.

Verification Intensity vs Efficiency Trade-off

From paper analysis (Section 4.4):

| Strategy | Pass Rate | Runtime | When to Use |
|---|---|---|---|
| Round-Manager Review | 60.2% | 3689s | Maximum correctness required |
| Engineer Self-Verification | 55.1% | 2244s | Default - balanced |
| Efficiency-Prioritized | 54.0% | 1909s | Time-critical, acceptable risk |

Default: Engineer self-verification without repeated manager review.
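
To make the trade-off concrete, the short calculation below compares each strategy with the default using only the pass rates and runtimes reported above. The variable names are illustrative.

```python
# (pass rate in %, runtime in seconds), as reported in the section above
strategies = {
    "Round-Manager Review": (60.2, 3689),
    "Engineer Self-Verification": (55.1, 2244),
    "Efficiency-Prioritized": (54.0, 1909),
}

base_rate, base_runtime = strategies["Engineer Self-Verification"]

# For each strategy: (pass-rate change in points, runtime as a multiple of the default)
comparison = {
    name: (round(rate - base_rate, 1), round(runtime / base_runtime, 2))
    for name, (rate, runtime) in strategies.items()
}
```

Relative to the default, round-manager review gains 5.1 points at about 1.64x the runtime, while the efficiency-prioritized strategy gives up 1.1 points for about 0.85x the runtime.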

Common Pitfalls & Solutions

| Pitfall | Solution |
|---|---|
| Using CAID as fallback after single-agent fails | Use from outset; sequential costs ~2x with minimal gain |
| Soft isolation (instruction-only) | Mandatory git worktree physical isolation |
| Too many engineers (>4-8) | Strict limits: 2 PaperBench, 4 Commit0, 8 absolute max |
| Skipping manager pre-setup | Always prepare runtime/stubs/entry points first |
| Skipping manager final review | Always do final verification before submission |
| Merge conflicts from concurrent edits | Group dependent files; engineer resolves own conflicts |
| Not cleaning up worktrees | Delete worktrees after completion/limit reached |
| Agents develop inconsistent views | Structured JSON only; no free-form dialog |
| Silent interference between agents | Explicit merge with test verification |
| Tasks not clearly defined | Build dependency graph before ANY delegation |
| Integration failures discovered late | Self-verification mandatory before commit |
| Context explosion | Use LLMSummarizingCondenser pattern |
| Missing restricted files | Mark __init__.py, configs as restricted |

Cost/Runtime Expectations

CAID trade-offs (vs single-agent):

  • Higher API cost: multiple agents mean more LLM calls
  • Similar or longer wall-clock time: integration is sequential and test-gated
  • Substantially higher accuracy: +26.7% PaperBench, +14.3% Commit0

When worth it: Long-horizon shared-artifact tasks where correctness matters more than speed.

Example Workflows

See references/examples.md for concrete implementation examples including:

  • Commit0-style library implementation
  • PaperBench-style paper reproduction
  • Bug fixing (single-file vs multi-file)
  • Feature addition with API and frontend

References

  • Paper: "Effective Strategies for Asynchronous Software Engineering Agents" (arXiv:2603.21489v1)
  • GitHub: https://github.com/JiayiGeng/async-swe-agents
  • Built on OpenHands agent SDK principles

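Tags

skill, ai

Install via Chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the caid-multi-agent-1775884048 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the caid-multi-agent-1775884048 skill

Install via Command Line

skillhub install caid-multi-agent-1775884048

Download

caid-multi-agent v1.2.0 (free)

File size: 12.52 KB | Published: 2026-4-12 09:17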

v1.2.0 (latest), 2026-4-12 09:17
Add manager final review phase, worktree cleanup instructions, state synchronization commands, specific iteration limits (manager:50, engineer:80, rounds:~22), round definition, and cleanup examples
