pdf-toolkit

## System Dependencies - `uv` must already be installed because this skill is executed with `uv run`, and `uv` installs the Python dependencies declared in `src/main.py`. - `ffmpeg` is needed for `tts` because the speech output is normalized and written as an `.mp3` file through `ffmpeg`. - `tesseract` is needed for `ocr` because it performs the actual optical character recognition on scanned page images. - `pdfimages` is also needed for `ocr` because it extracts page images from PDFs before those images are passed to `tesseract`; `pdfimages` comes from `poppler`. - `pandoc` is optional for `convert` because it can convert between many document formats when text-based conversion is possible. - `libreoffice` is an optional alternative to `pandoc` for `convert` because it can handle document conversions that `pandoc` may not support well. ## File Access And Network Behavior - This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace. - The `/root/.openclaw/workspace/...` paths in the command examples show where the skill entrypoint lives. They do not restrict which files the skill can access. - The `tts` command uses `edge-tts`, which sends the input text to an external text-to-speech service over the network to generate audio. - Do not use `tts` with sensitive or private text unless you are comfortable sending that text off-host. - All other commands run locally on the host, subject to the optional local binaries documented below. # Skill: PDF Toolkit ## When to use - User wants to extract text, tables, or images from a PDF. - User wants to get metadata or page count from a PDF. - User wants to merge, split, or rotate a PDF. - User wants to create a new PDF from plain text or Markdown. - User wants to read or write a DOCX file. - User wants to OCR a scanned PDF (requires `tesseract` on host). - User wants to convert text or a document to an MP3 audio file (requires `ffmpeg` on host). - User wants to convert between document formats (requires `pandoc` or `libreoffice` on host). - User wants to check which optional system tools are available. ## When NOT to use - User wants to view or render a PDF visually — use a PDF viewer. - User wants to fill in PDF form fields — this skill does not support AcroForms. - User wants to edit an existing PDF's text in-place — use a dedicated PDF editor. ## Commands ### Check available tools ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor ``` ### Get PDF metadata and page count ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info <pdf_path> ``` ### Extract text from a PDF ```bash # All pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path> # Specific pages (1-indexed, comma-separated or ranges) uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path> --pages 1,3,5-8 ``` ### Extract tables from a PDF ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path> --pages 2-4 ``` ### Extract images from a PDF ```bash # Saves images to current directory by default uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path> --output-dir /path/to/output ``` ### Merge PDFs ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge <pdf1> <pdf2> [<pdf3> ...] --output merged.pdf ``` ### Split a PDF ```bash # Split into individual pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --output-dir /path/to/output # Extract a page range into a new PDF uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --pages 2-5 --output extracted.pdf ``` ### Rotate pages in a PDF ```bash # Rotate all pages 90 degrees clockwise uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 90 --output rotated.pdf # Rotate specific pages uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 180 --pages 1,3 --output rotated.pdf ``` ### Create a PDF from text ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf ``` ### Read a DOCX file ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx <docx_path> ``` ### Write a DOCX file ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx ``` ### OCR a scanned PDF (requires tesseract) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path> uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path> --pages 1-3 --lang eng ``` ### Convert text or document to speech (requires ffmpeg) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello, world!" --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file input.txt --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file document.pdf --output speech.mp3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello" --voice en-GB-SoniaNeural --output speech.mp3 ``` ### Convert document formats (requires pandoc or libreoffice) ```bash uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py convert <input_path> --output <output_path> ``` ### Examples ```bash # Inspect a PDF uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info report.pdf # Pull text from pages 1–3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text report.pdf --pages 1-3 # Merge two PDFs uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge a.pdf b.pdf --output combined.pdf # OCR a scanned document uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr scan.pdf # Read a Word document uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx report.docx # Text to MP3 uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Welcome to the future." --output welcome.mp3 # Check what is available on this host uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor ``` ## Chat Delivery - When this skill is used in a chat interface that supports file attachments, such as Telegram, any generated output file should be sent back to the user as an attachment after successful creation or conversion. - This applies to commands that create files, including `create-pdf`, `write-docx`, `extract-images`, `merge`, `split`, `rotate`, `tts`, and `convert`. - If a temporary output file is created in the Claw runtime temporary folder for delivery, delete that temporary file immediately after the file has been sent successfully to the user. - Do not delete files that were written to a user-requested destination outside the Claw temporary folder. - If the chat environment cannot send file attachments, report the output path clearly instead of claiming the file was delivered. ## Output - Plain text with labeled sections separated by blank lines. - Errors are prefixed with `Error:`. - The `doctor` command shows a table of available and missing tools. ## Notes - `uv run` reads the inline `# /// script` dependency block in `main.py` and auto-installs Python packages in an isolated environment — no pip install or venv setup needed. - **Core features** (info, extract-text, extract-tables, merge, split, rotate, create-pdf, read-docx, write-docx) work with `uv` alone — no system binaries required. - **OCR** requires `tesseract` installed on the host (`brew install tesseract` / `apt install tesseract-ocr`). Also needs `pdfimages` from poppler (`brew install poppler`). - **TTS** requires `ffmpeg` installed on the host (`brew install ffmpeg` / `apt install ffmpeg`). - **Document conversion** requires `pandoc` or `libreoffice` on the host. - Run `doctor` first if you are unsure which features are available.

pdf-toolkit

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

pdf-toolkit

pdf-toolkit

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement