batch-file-processor
# Batch File Processor
Process large numbers of files in parallel using sub-agents, so the main agent's context window never overflows.
## Workflow
### 1. List files
```bash
find <directory> -type f -name "*.md" | sort
```
### 2. Group
Split into batches of 2-4 files each (3 is optimal).
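A minimal shell sketch of this step, reusing the listing from step 1; the `files.txt` and `batch_` names are illustrative:
```bash
# Save the sorted file list, then split it into batches of 3 paths each
find <directory> -type f -name "*.md" | sort > files.txt
split -l 3 files.txt batch_   # produces batch_aa, batch_ab, ...: one batch file per sub-agent
```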
### 3. Dispatch sub-agents
One sub-agent per batch. Task template:
```
Read the following files completely and generate a brief summary (under 50 words) for each.
1. /path/to/file1.md
2. /path/to/file2.md
3. /path/to/file3.md
Return ONLY a JSON array:
[{"file": "relative/path/file1.md", "summary": "..."},...]
```
Key parameters:
- `mode`: "run" (one-shot task)
- `runTimeoutSeconds`: 120 (increase to 180 for large files)
- `label`: descriptive label, e.g. `idx-project-batch1`
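One way to fill the template per batch, assuming the `batch_*` files from step 2 (the `task_*.txt` names are illustrative; how the prompt text is handed to the spawn tool depends on your runtime):
```bash
# Turn each batch file into a numbered task prompt matching the template above
for b in batch_*; do
  {
    echo "Read the following files completely and generate a brief summary (under 50 words) for each."
    nl -ba -w1 -s'. ' "$b"
    echo 'Return ONLY a JSON array: [{"file": "relative/path.md", "summary": "..."}, ...]'
  } > "task_${b}.txt"
done
```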
### 4. Collect results
Sub-agents push results on completion. Use `sessions_yield` to wait and collect incrementally.
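If each sub-agent's reply is saved to its own file (the `result_batch_*.json` naming is illustrative), a quick validation pass before merging catches malformed output:
```bash
# Flag any result that is not a JSON array before merging
for f in result_batch_*.json; do
  jq -e 'type == "array"' "$f" > /dev/null || echo "WARN: $f is not a JSON array"
done
```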
### 5. Compile output
Once all results are in, the main agent compiles the final deliverable (index file, report, etc.).
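A minimal compile step for the default summarization task, assuming the validated per-batch results from step 4:
```bash
# Concatenate all per-batch JSON arrays into a single index
jq -s 'add' result_batch_*.json > index.json
```
From `index.json` the main agent can render the final deliverable without re-reading any source file.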
## Rules
- **2-4 files per sub-agent** — never let one sub-agent process an entire directory sequentially
- **Read full file content** — no head/tail truncation; partial reads produce incomplete summaries
- **Standardize output format** — JSON makes it easy for the main agent to parse and merge
- **One spawn per turn** — system limitation; use multiple spawn + yield cycles
## Anti-patterns
| Mistake | Consequence |
|---------|-------------|
| `head -20` to skim only the top of each file | Poor summary quality, key information missed |
| One sub-agent processes entire directory | Context overflow, timeout failure |
| Main agent reads all files sequentially | Context window exhausted, later files unreadable |
| One sub-agent per directory, regardless of size | Large directories time out; small ones waste sub-agent capacity |
## Benchmarks
70 files → 25 sub-agents (2-3 files each) → parallel execution → completed in 5 minutes → high-accuracy summaries
## Task Template Variants
### File summarization (default)
```
Generate a brief summary (under 50 words) for each file.
```
### Information extraction
```
Extract the following fields from each file: project name, budget, key contacts, risks.
Return JSON: [{"file": "...", "project": "...", "budget": "...", "contacts": [...], "risks": [...]}]
```
### Content classification
```
Classify each file by checking for these topics: security, compliance, migration.
Return JSON: [{"file": "...", "has_security": true/false, "has_compliance": true/false, "has_migration": true/false}]
```
### Code analysis
```
Analyze each source file: count lines, list imports/dependencies, identify main functions.
Return JSON: [{"file": "...", "lines": N, "imports": [...], "main_functions": [...]}]
```
Tags: skill, ai