
agent-paddleocr-vision

Multi-language document understanding with PaddleOCR

Author: admin | Source: ClawHub
Version: V 1.1.0
Security check: Passed
Downloads: 226
Favorites: 0

# Agent PaddleOCR Vision

**OCR with Agent Actions, powered by PaddleOCR only.** Automatically classifies documents and provides actionable prompts.

## What It Does

- OCR extraction via **PaddleOCR cloud API** (requires credentials)
- 11 document types: invoice, business card, receipt, table, contract, ID card, passport, bank statement, driver's license, tax form, general
- Action suggestion with structured parameters
- Batch processing
- Searchable PDF generation (with bbox alignment)

## Quick Start

```bash
# Install dependencies
pip3 install -r scripts/requirements.txt

# Configure PaddleOCR API
export PADDLEOCR_DOC_PARSING_API_URL=https://your-api.paddleocr.com/layout-parsing
export PADDLEOCR_ACCESS_TOKEN=your_token

# Process a file
python3 scripts/doc_vision.py --file-path ./invoice.jpg --pretty --make-searchable-pdf
```

## Batch

```bash
python3 scripts/doc_vision.py --batch-dir ./inbox --output-dir ./out
```

## Output

See `docs/README.zh.md` for full JSON schema and integration guide.

## Supported Types

| Type | Actions |
|------|---------|
| Invoice | create_expense, archive, tax_report |
| Business Card | add_contact, save_vcard |
| Receipt | create_expense, split_bill |
| Table | export_csv, analyze_data |
| Contract | summarize, extract_dates, flag_obligations |
| ID Card | extract_id_info, verify_age |
| Passport | store_passport_info, check_validity |
| Bank Statement | categorize_transactions, generate_report |
| Driver's License | store_license_info, check_expiry |
| Tax Form | summarize_tax, suggest_deductions |
| General | summarize, translate, search_keywords |

## Configuration

Required environment variables:

- `PADDLEOCR_DOC_PARSING_API_URL`: API endpoint ending in `/layout-parsing`
- `PADDLEOCR_ACCESS_TOKEN`: Access token

Optional:

- `PADDLEOCR_DOC_PARSING_TIMEOUT`: Default 600 seconds

## Searchable PDF

With `--make-searchable-pdf`, the script embeds an OCR text layer aligned to the original layout using bounding boxes.
Requires `pdf2image` + `poppler` (system) and `reportlab`, `pypdf`, `pillow` (Python).

## Full Documentation

Detailed usage, troubleshooting, and development guide available in multiple languages under `docs/`:

- 中文: `docs/README.zh.md`
- English: `docs/README.en.md`
- Español: `docs/README.es.md`
- العربية: `docs/README.ar.md`

## License

MIT-0

---

**Made for OpenClaw.** Let your agent see and act.
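
The README's Configuration section names two required environment variables and one optional timeout. Below is a minimal sketch of a pre-flight check an integrator might run before calling `scripts/doc_vision.py`. It is not taken from the skill's own code: `OcrConfig` and `load_config` are hypothetical names, and the strict suffix check and positive-integer timeout rule are assumptions drawn only from the descriptions in that section.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default taken from the Configuration section (600 seconds).
DEFAULT_TIMEOUT_SECONDS = 600


@dataclass(frozen=True)
class OcrConfig:
    """Hypothetical container for the three documented settings."""
    api_url: str
    access_token: str
    timeout_seconds: int


def load_config(env: Optional[Mapping[str, str]] = None) -> OcrConfig:
    """Read and validate the documented variables; raise ValueError on problems.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env

    api_url = env.get("PADDLEOCR_DOC_PARSING_API_URL", "").strip()
    token = env.get("PADDLEOCR_ACCESS_TOKEN", "").strip()

    if not api_url:
        raise ValueError("PADDLEOCR_DOC_PARSING_API_URL is not set")
    # Assumption: the endpoint must end in exactly /layout-parsing.
    if not api_url.endswith("/layout-parsing"):
        raise ValueError("PADDLEOCR_DOC_PARSING_API_URL must end in /layout-parsing")
    if not token:
        raise ValueError("PADDLEOCR_ACCESS_TOKEN is not set")

    # Optional variable: fall back to the documented default when unset.
    raw_timeout = env.get("PADDLEOCR_DOC_PARSING_TIMEOUT", "").strip()
    timeout = int(raw_timeout) if raw_timeout else DEFAULT_TIMEOUT_SECONDS
    if timeout <= 0:
        raise ValueError("PADDLEOCR_DOC_PARSING_TIMEOUT must be a positive integer")

    return OcrConfig(api_url, token, timeout)
```

Calling `load_config()` with no arguments reads the real environment. Failing early on a missing token or a malformed endpoint is usually cheaper than discovering the problem partway through a batch run.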

Tags

skill ai

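Install via chat

This skill can be installed through chat on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the agent-paddleocr-vision-1776075303 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the agent-paddleocr-vision-1776075303 skill

Install via command line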

skillhub install agent-paddleocr-vision-1776075303

Download Zip package

Download agent-paddleocr-vision v1.1.0

File size: 54.45 KB | Released: 2026-4-14 15:56

Version history

v1.1.0 (latest), 2026-4-14 15:56
- Documentation moved to the new docs/ directory with multi-language support (Arabic, English, Spanish, Chinese).
- Removed template files for document types (e.g., bank_statement, business_card, invoice).
- Cleaned up project structure by deleting unused and redundant files.
- README and integration details now consolidated and easier to navigate.
