返回顶部
p

pdf-toolkit

Run a local script to work with PDF files, DOCX documents, OCR, and text-to-speech. Use the read tool to load this SKILL.md, then exec the uv run command inside it. Do NOT use sessions_spawn. Triggers: read pdf, extract text from pdf, merge pdfs, split pdf, rotate pdf, ocr pdf, read docx, create docx, text to speech, convert to mp3, pdf info, pdf pages.

作者: admin | 来源: ClawHub
源自
ClawHub
版本
V 0.0.3
安全检测
已通过
573
下载量
0
收藏
概述
安装方式
版本历史

pdf-toolkit

## System Dependencies - `uv` must already be installed because this skill is executed with `uv run`, and `uv` installs the Python dependencies declared in `src/main.py`. - `ffmpeg` is needed for `tts` because the speech output is normalized and written as an `.mp3` file through `ffmpeg`. - `tesseract` is needed for `ocr` because it performs the actual optical character recognition on scanned page images. - `pdfimages` is also needed for `ocr` because it extracts page images from PDFs before those images are passed to `tesseract`; `pdfimages` comes from `poppler`. - `pandoc` is optional for `convert` because it can convert between many document formats when text-based conversion is possible. - `libreoffice` is an optional alternative to `pandoc` for `convert` because it can handle document conversions that `pandoc` may not support well. ## File Access And Network Behavior - This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace. - The `/root/.openclaw/workspace/...` paths in the command examples show where the skill entrypoint lives. They do not restrict which files the skill can access. - The `tts` command uses `edge-tts`, which sends the input text to an external text-to-speech service over the network to generate audio. - Do not use `tts` with sensitive or private text unless you are comfortable sending that text off-host. - All other commands run locally on the host, subject to the optional local binaries documented below. # Skill: PDF Toolkit ## When to use - User wants to extract text, tables, or images from a PDF. - User wants to get metadata or page count from a PDF. - User wants to merge, split, or rotate a PDF. - User wants to create a new PDF from plain text or Markdown. - User wants to read or write a DOCX file. - User wants to OCR a scanned PDF (requires `tesseract` on host). - User wants to convert text or a document to an MP3 audio file (requires `ffmpeg` on host). - User wants to convert between document formats (requires `pandoc` or `libreoffice` on host). - User wants to check which optional system tools are available. ## When NOT to use - User wants to view or render a PDF visually — use a PDF viewer. - User wants to fill in PDF form fields — this skill does not support AcroForms. - User wants to edit an existing PDF's text in-place — use a dedicated PDF editor. ## Commands ### Check available tools ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor ``` ### Get PDF metadata and page count ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info <pdf_path> ``` ### Extract text from a PDF ```bash # All pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path> # Specific pages (1-indexed, comma-separated or ranges) uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path> --pages 1,3,5-8 ``` ### Extract tables from a PDF ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path> --pages 2-4 ``` ### Extract images from a PDF ```bash # Saves images to current directory by default uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path> --output-dir /path/to/output ``` ### Merge PDFs ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge <pdf1> <pdf2> [<pdf3> ...] --output merged.pdf ``` ### Split a PDF ```bash # Split into individual pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --output-dir /path/to/output # Extract a page range into a new PDF uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --pages 2-5 --output extracted.pdf ``` ### Rotate pages in a PDF ```bash # Rotate all pages 90 degrees clockwise uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 90 --output rotated.pdf # Rotate specific pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 180 --pages 1,3 --output rotated.pdf ``` ### Create a PDF from text ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf ``` ### Read a DOCX file ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx <docx_path> ``` ### Write a DOCX file ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx ``` ### OCR a scanned PDF (requires tesseract) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path> --pages 1-3 --lang eng ``` ### Convert text or document to speech (requires ffmpeg) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello, world!" --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file input.txt --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file document.pdf --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello" --voice en-GB-SoniaNeural --output speech.mp3 ``` ### Convert document formats (requires pandoc or libreoffice) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py convert <input_path> --output <output_path> ``` ### Examples ```bash # Inspect a PDF uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info report.pdf # Pull text from pages 1–3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text report.pdf --pages 1-3 # Merge two PDFs uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge a.pdf b.pdf --output combined.pdf # OCR a scanned document uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr scan.pdf # Read a Word document uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx report.docx # Text to MP3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Welcome to the future." --output welcome.mp3 # Check what is available on this host uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor ``` ## Chat Delivery - When this skill is used in a chat interface that supports file attachments, such as Telegram, any generated output file should be sent back to the user as an attachment after successful creation or conversion. - This applies to commands that create files, including `create-pdf`, `write-docx`, `extract-images`, `merge`, `split`, `rotate`, `tts`, and `convert`. - If a temporary output file is created in the Claw runtime temporary folder for delivery, delete that temporary file immediately after the file has been sent successfully to the user. - Do not delete files that were written to a user-requested destination outside the Claw temporary folder. - If the chat environment cannot send file attachments, report the output path clearly instead of claiming the file was delivered. ## Output - Plain text with labeled sections separated by blank lines. - Errors are prefixed with `Error:`. - The `doctor` command shows a table of available and missing tools. ## Notes - `uv run` reads the inline `# /// script` dependency block in `main.py` and auto-installs Python packages in an isolated environment — no pip install or venv setup needed. - **Core features** (info, extract-text, extract-tables, merge, split, rotate, create-pdf, read-docx, write-docx) work with `uv` alone — no system binaries required. - **OCR** requires `tesseract` installed on the host (`brew install tesseract` / `apt install tesseract-ocr`). Also needs `pdfimages` from poppler (`brew install poppler`). - **TTS** requires `ffmpeg` installed on the host (`brew install ffmpeg` / `apt install ffmpeg`). - **Document conversion** requires `pandoc` or `libreoffice` on the host. - Run `doctor` first if you are unsure which features are available.

标签

skill ai

通过对话安装

该技能支持在以下平台通过对话安装:

OpenClaw WorkBuddy QClaw Kimi Claude

方式一:安装 SkillHub 和技能

帮我安装 SkillHub 和 pdf-toolkit-1776062180 技能

方式二:设置 SkillHub 为优先技能安装源

设置 SkillHub 为我的优先技能安装源,然后帮我安装 pdf-toolkit-1776062180 技能

通过命令行安装

skillhub install pdf-toolkit-1776062180

下载 Zip 包

⬇ 下载 pdf-toolkit v0.0.3

文件大小: 20.81 KB | 发布时间: 2026-4-14 10:08

v0.0.3 最新 2026-4-14 10:08
### v0.0.3 Changelog

- Added "Chat Delivery" section to SKILL.md with guidelines for sending generated files as chat attachments and cleaning up temporary files.
- Clarified file delivery and deletion behavior for output files in chat interfaces.
- No functional code or command changes; documentation update only.

Archiver·手机版·闲社网·闲社论坛·羊毛社区· 多链控股集团有限公司 · 苏ICP备2025199260号-1

Powered by Discuz! X5.0   © 2024-2025 闲社网·线报更新论坛·羊毛分享社区·http://xianshe.com

p2p_official_large
返回顶部