
firecrawl

Web scraping and content extraction using Firecrawl API. Use when users need to crawl websites, extract structured data, convert web pages to markdown, scrape multiple URLs, or build knowledge bases from web content. Supports single page extraction, site-wide crawling, batch processing, and structured data extraction with CSS selectors.

Author: admin | Source: ClawHub
Version: v1.0.0
Security check: passed
Downloads: 112
Favorites: 0


# Firecrawl Skill

Powerful web scraping powered by [Firecrawl](https://github.com/mendableai/firecrawl) - turn websites into LLM-ready markdown.

## Overview

Firecrawl provides APIs for:

- **Scrape** - Single-page extraction to markdown
- **Crawl** - Entire-site crawling with depth control
- **Map** - URL discovery from a starting point
- **Batch** - Multiple-URL processing
- **Extract** - Structured data extraction with schemas

## Prerequisites

1. **Firecrawl API key** - Get a free-tier key at https://firecrawl.dev
2. Install Python dependencies: `requests`

## Configuration

Set the environment variable:

```bash
export FIRECRAWL_API_KEY="fc-your-api-key"
```

## Usage

### Single-Page Scraping

```bash
# Basic scrape
firecrawl scrape https://example.com

# With specific options
firecrawl scrape https://example.com --formats markdown,html --only-main-content

# Wait for JS rendering
firecrawl scrape https://spa-app.com --wait-for 2000
```

### Site Crawling

```bash
# Crawl an entire site (up to the limit)
firecrawl crawl https://docs.example.com --limit 50

# With depth control
firecrawl crawl https://blog.example.com --max-depth 2 --limit 100

# Include/exclude patterns
firecrawl crawl https://site.com --include "/blog/*" --exclude "/admin/*"

# Custom formats
firecrawl crawl https://docs.example.com --formats markdown,links
```

### URL Mapping

```bash
# Discover all URLs from a site
firecrawl map https://example.com

# With a search term
firecrawl map https://docs.python.org --search "tutorial"
```

### Batch Processing

```bash
# Scrape multiple URLs
firecrawl batch urls.txt --output ./scraped/

# From a JSON list
firecrawl batch urls.json --formats markdown --concurrency 5
```

### Structured Extraction

```bash
# Extract specific data using CSS selectors
firecrawl extract https://example.com/products \
  --schema '{"name": ".product-title", "price": ".price", "description": ".desc"}'

# Extract using a schema file
firecrawl extract https://news.example.com/article --schema article-schema.json
```

## Output Formats

### Markdown

Clean, LLM-ready markdown with:

- Headings preserved
- Links converted to markdown format
- Images with alt text
- Tables formatted as markdown tables

### HTML

Raw or cleaned HTML.

### Links

Extracted link lists for further crawling.

### Screenshot

Page screenshot (if requested).

## Use Cases

### Knowledge Base Building

```bash
# Crawl a documentation site
firecrawl crawl https://docs.framework.com --limit 200 -o ./kb/

# Merge into a single file for RAG
cat ./kb/*.md > knowledge-base.md
```

### Research & Analysis

```bash
# Scrape competitor pricing
firecrawl batch competitors.txt --extract pricing-schema.json

# Monitor blog updates
firecrawl map https://blog.company.com --since 2024-01-01
```

### Content Migration

```bash
# Export old CMS content
firecrawl crawl https://old-site.com --formats markdown,html -o ./export/
```

## Scripts

All functionality is implemented in `scripts/firecrawl.py`, which:

- Handles API authentication
- Applies automatic rate limiting
- Retries failed requests
- Tracks progress for large crawls

## Integration

Works well with:

- `markdown-sync-pro` - Sync scraped content to Notion/GitHub
- `arxiv-paper` - Combine with academic paper downloads
- `maybe-finance` - Scrape financial data for analysis
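The bundled `scripts/firecrawl.py` is not shown on this page, but a minimal sketch of what such a wrapper might look like follows, combining authentication and the retry behavior the Scripts section describes. The endpoint path (`POST https://api.firecrawl.dev/v1/scrape`), the payload field names (`formats`, `onlyMainContent`), and the response shape are assumptions based on Firecrawl's public documentation, not guaranteed to match the skill's actual script.

```python
import os
import time


API_BASE = "https://api.firecrawl.dev/v1"  # assumed public endpoint


def build_scrape_request(url, formats=("markdown",), only_main_content=True):
    """Build headers and payload for a single-page scrape request.

    The API key is read from the FIRECRAWL_API_KEY environment variable,
    matching the Configuration section above.
    """
    api_key = os.environ.get("FIRECRAWL_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "url": url,
        "formats": list(formats),
        "onlyMainContent": only_main_content,  # assumed field name
    }
    return headers, payload


def scrape(url, retries=3, backoff=2.0):
    """POST to /scrape, retrying with exponential backoff on 429/5xx."""
    import requests  # third-party; listed under Prerequisites

    headers, payload = build_scrape_request(url)
    for attempt in range(retries):
        resp = requests.post(
            f"{API_BASE}/scrape", json=payload, headers=headers, timeout=60
        )
        if resp.status_code == 429 or resp.status_code >= 500:
            # Rate limited or server error: wait and retry.
            time.sleep(backoff ** attempt)
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError(f"scrape failed after {retries} attempts: {url}")
```

A call such as `scrape("https://example.com")` would then return the parsed JSON response; the keys under which markdown content appears depend on the API version in use.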

Tags: skill, ai

Install via Conversation

This skill can be installed via conversation on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Method 1: Install SkillHub and the skill

Help me install SkillHub and the web-scraper-firecrawl-1776115814 skill

Method 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the web-scraper-firecrawl-1776115814 skill

Install via Command Line

skillhub install web-scraper-firecrawl-1776115814

Download Zip Package

⬇ Download firecrawl v1.0.0

File size: 5.73 KB | Published: 2026-04-14 10:17

v1.0.0 (latest) - 2026-04-14 10:17
Initial release
