Learn how to improve your Firecrawl scraping with advanced options.
To scrape a single page, use the /scrape endpoint.
You can use the /scrape endpoint to scrape a PDF link and get the text content of the PDF. You can disable this by setting pageOptions.parsePDF to false.
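As a rough illustration of the PDF setting described above, the sketch below builds a request body with pageOptions.parsePDF turned off. The top-level url key, the helper name build_pdf_scrape_payload, and the example URL are assumptions made for this sketch; only pageOptions.parsePDF comes from the text.

```python
import json


def build_pdf_scrape_payload(url: str, parse_pdf: bool = True) -> dict:
    """Build a /scrape request body (hypothetical helper).

    pageOptions.parsePDF is the setting named in the text; the "url" key
    is an assumption about the request shape.
    """
    return {"url": url, "pageOptions": {"parsePDF": parse_pdf}}


# Placeholder PDF link; PDF text extraction is switched off for this request.
payload = build_pdf_scrape_payload("https://example.com/report.pdf", parse_pdf=False)
print(json.dumps(payload, indent=2))
```

Leaving parse_pdf at its default of True keeps the behavior the text describes, where the text content of the PDF is returned.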
With the /scrape endpoint, you can customize the scraping behavior with the pageOptions parameter. Here are the available options:
onlyMainContent (boolean). Default: false
includeHtml (boolean): adds an html key to the response. Default: false
includeRawHtml (boolean): adds a rawHtml key to the response. Default: false
screenshot (boolean). Default: false
waitFor (integer). Default: 0
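One way to keep pageOptions consistent with the option names and types listed above is a small builder that starts from the listed defaults and rejects unknown keys or wrong types. This is a minimal sketch, not part of any Firecrawl SDK; the helper name build_page_options and the override values are illustrative.

```python
import json

# Option names, types, and defaults as listed in the options above.
PAGE_OPTION_DEFAULTS = {
    "onlyMainContent": False,
    "includeHtml": False,
    "includeRawHtml": False,
    "screenshot": False,
    "waitFor": 0,
}


def build_page_options(**overrides) -> dict:
    """Return a pageOptions dict: the listed defaults plus validated overrides."""
    options = dict(PAGE_OPTION_DEFAULTS)
    for name, value in overrides.items():
        if name not in PAGE_OPTION_DEFAULTS:
            raise KeyError(f"unknown pageOptions key: {name}")
        expected = type(PAGE_OPTION_DEFAULTS[name])
        # Exact type check: bool is a subclass of int in Python, so this also
        # rejects True/False for the integer option and 0/1 for boolean ones.
        if type(value) is not expected:
            raise TypeError(f"{name} must be {expected.__name__}")
        options[name] = value
    return options


page_options = build_page_options(onlyMainContent=True, includeHtml=True, waitFor=1000)
print(json.dumps(page_options, indent=2))
```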
With includeHtml set to true, the response includes the HTML content under the html key.
With the /scrape endpoint, you can specify options for extracting structured information from the page content using the extractorOptions parameter. Here are the available options:
Extraction mode (string), one of ["llm-extraction", "llm-extraction-from-raw-html"]:
llm-extraction: Extracts information from the cleaned and parsed content.
llm-extraction-from-raw-html: Extracts information directly from the raw HTML.
string
object
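The two extraction modes above could be selected as in the sketch below. The key name mode inside extractorOptions is an assumption, since the option name is not shown in this text; the top-level url key and the example URL are assumptions as well. Only the two mode values and the extractorOptions parameter name come from the text.

```python
import json

# The two mode values listed above.
EXTRACTION_MODES = ("llm-extraction", "llm-extraction-from-raw-html")


def build_extractor_options(mode: str) -> dict:
    """Return an extractorOptions dict after checking the mode value.

    The "mode" key name is assumed for illustration.
    """
    if mode not in EXTRACTION_MODES:
        raise ValueError(f"mode must be one of {EXTRACTION_MODES}, got {mode!r}")
    return {"mode": mode}


extract_payload = {
    "url": "https://example.com/pricing",  # placeholder URL
    "extractorOptions": build_extractor_options("llm-extraction"),
}
print(json.dumps(extract_payload, indent=2))
```

Choosing llm-extraction-from-raw-html instead would, per the descriptions above, run extraction on the raw HTML rather than on the cleaned and parsed content.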
You can set the timeout with the timeout parameter, in milliseconds.
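Because the timeout is expressed in milliseconds, a small conversion helper can prevent accidentally passing seconds. In this sketch the placement of timeout at the top level of the request body, the url key, and the helper name seconds_to_timeout_ms are all assumptions.

```python
import json


def seconds_to_timeout_ms(seconds: float) -> int:
    """Convert a timeout in seconds to whole milliseconds."""
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return int(round(seconds * 1000))


timeout_payload = {
    "url": "https://example.com",  # placeholder URL
    "timeout": seconds_to_timeout_ms(15),  # 15 seconds expressed in milliseconds
}
print(json.dumps(timeout_payload, indent=2))
```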
To crawl a website, use the /crawl endpoint. This endpoint lets you specify a base URL to crawl, and all accessible subpages will be crawled.
With the /crawl endpoint, you can customize the crawling behavior with the crawlerOptions parameter. Here are the available options:
includes (array), e.g. ["/blog/*", "/products/*"]
excludes (array), e.g. ["/admin/*", "/login/*"]
returnOnlyUrls (boolean): if true, the response will only include a list of URLs instead of the full document data. Default: false
maxDepth (integer). Default: 2
mode (string), one of ["default", "fast"]: fast mode crawls websites without a sitemap 4x faster but may be less accurate and is not recommended for heavily JavaScript-rendered websites. Default: default
limit (integer). Default: 10000
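The crawlerOptions listed above can be assembled into a request body as sketched below. The top-level url key and the example URL are assumptions. The would_crawl helper is only a local approximation of how include and exclude patterns might behave, using Python's fnmatch module; the service's actual pattern matching may differ.

```python
import json
from fnmatch import fnmatchcase

crawl_payload = {
    "url": "https://example.com",  # placeholder base URL
    "crawlerOptions": {
        "includes": ["/blog/*", "/products/*"],
        "excludes": ["/admin/*", "/login/*"],
        "returnOnlyUrls": False,
        "maxDepth": 2,
        "mode": "default",
        "limit": 10000,
    },
}


def would_crawl(path: str, includes: list, excludes: list) -> bool:
    """Local preview only: excluded patterns win, then includes must match.

    An empty includes list is treated as "include everything" here,
    which is an assumption of this sketch.
    """
    if any(fnmatchcase(path, pattern) for pattern in excludes):
        return False
    if not includes:
        return True
    return any(fnmatchcase(path, pattern) for pattern in includes)


options = crawl_payload["crawlerOptions"]
for path in ["/blog/post-1", "/products/widget", "/admin/users", "/about"]:
    print(path, would_crawl(path, options["includes"], options["excludes"]))
print(json.dumps(crawl_payload, indent=2))
```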
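With includes set to ["/blog/*", "/products/*"], the crawler only crawls pages matching /blog/* and /products/*. With excludes set to ["/admin/*", "/login/*"], it skips pages matching /admin/* and /login/*.
You can combine the pageOptions and crawlerOptions parameters to customize the full crawling behavior.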
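For example, a crawl can be limited to pages matching /blog/* and /products/* while pageOptions controls how each crawled page is scraped.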
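Combining both parameters in one /crawl request might look like the sketch below, which prepares an HTTP request without sending it. The API URL, the bearer-token Authorization header, the top-level url key, and the placeholder key value are assumptions for illustration; check the API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint URL and placeholder credential, for illustration only.
API_URL = "https://api.firecrawl.dev/v0/crawl"
API_KEY = "YOUR_API_KEY"

combined_payload = {
    "url": "https://example.com",  # placeholder base URL
    "crawlerOptions": {
        "includes": ["/blog/*", "/products/*"],
        "maxDepth": 2,
    },
    "pageOptions": {
        "onlyMainContent": True,
    },
}

# Build the request object; nothing is sent until urlopen() is called.
request = urllib.request.Request(
    API_URL,
    data=json.dumps(combined_payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to urllib.request.urlopen, which is left out here so the example runs without network access or a real key.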