Firecrawl thoroughly crawls websites, ensuring comprehensive data extraction while bypassing web-blocking mechanisms. Here’s how it works:

  1. URL Analysis: Begins with the specified URL and identifies links from the sitemap when one is available. If no sitemap is found, it crawls the site by following links.

  2. Recursive Traversal: Recursively follows each link to uncover all subpages.

  3. Content Scraping: Gathers content from every visited page while handling any complexities like JavaScript rendering or rate limits.

  4. Result Compilation: Converts collected data into clean markdown or structured output, perfect for LLM processing or any other task.

This method guarantees an exhaustive crawl and data collection from any starting URL.
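
For intuition, here is a minimal, hypothetical sketch of steps 1-3 in Python. It is not Firecrawl's implementation: it assumes the third-party requests and beautifulsoup4 packages, stays on one domain, and skips the hard parts Firecrawl handles for you (sitemaps, JavaScript rendering, rate limits, blockers).

from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def crawl(start_url, max_pages=100):
    """Follow same-domain links breadth-first, collecting raw HTML per page."""
    domain = urlparse(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        html = requests.get(url, timeout=10).text
        pages[url] = html  # step 3: gather content from every visited page
        for link in BeautifulSoup(html, 'html.parser').find_all('a', href=True):
            absolute = urljoin(url, link['href'])
            # step 2: follow each discovered link to uncover subpages
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages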


/crawl endpoint

Used to crawl a URL and all accessible subpages. This submits a crawl job and returns a job ID to check the status of the crawl.


pip install firecrawl-py


from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="YOUR_API_KEY")

# Crawl the site (mendable.ai here, matching the sample response below),
# excluding any paths that match blog/*
crawl_result = app.crawl_url('mendable.ai', {'crawlerOptions': {'excludes': ['blog/*']}})

# Get the markdown
for result in crawl_result:
    print(result['markdown'])
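
Here, the excludes pattern under crawlerOptions skips any page whose path matches blog/*. With the default blocking call, crawl_result is the list of scraped pages, so the loop prints each page's markdown.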

Job ID Response

If you are not using the SDK, or prefer a webhook or a different polling method, you can set wait_until_done to False. This will return a jobId instead of the crawl results.
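
Assuming the SDK call from the example above, a non-blocking submission looks roughly like this (the exact return shape may vary across SDK versions):

job = app.crawl_url('mendable.ai',
                    {'crawlerOptions': {'excludes': ['blog/*']}},
                    wait_until_done=False)
job_id = job['jobId']  # hold on to this for polling or webhook correlation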

When calling the API directly (for example with cURL), /crawl always returns a jobId that you can use to check the status of the crawl.

{ "jobId": "1234-5678-9101" }

Check Crawl Job

Used to check the status of a crawl job and get its result.

status = app.check_crawl_status(job_id)
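
Since the job runs asynchronously, a common pattern is to poll until the status reaches completed. A minimal sketch built on check_crawl_status, with field names taken from the response shown below:

import time

while True:
    status = app.check_crawl_status(job_id)
    if status['status'] == 'completed':
        pages = status['data']  # list of scraped pages, as in the response below
        break
    time.sleep(5)  # avoid polling the API more often than necessary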


  "status": "completed",
  "current": 22,
  "total": 22,
  "data": [
      "content": "Raw Content ",
      "markdown": "# Markdown Content",
      "provider": "web-scraper",
      "metadata": {
        "title": "Mendable | AI for CX and Sales",
        "description": "AI for CX and Sales",
        "language": null,
        "sourceURL": ""