Firecrawl allows you to turn entire websites into LLM-ready markdown.
The updated `/scrape` endpoint has been redesigned for enhanced reliability and ease of use. The structure of the new `/scrape` request body is as follows:
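You can now choose which formats your output is returned in, and you can request more than one at a time through `formats`. Supported formats include `"html"`, `"rawHtml"`, `"screenshot"`, `"screenshot@fullPage"`, and `"json"`.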
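As a rough illustration of requesting several formats at once, the sketch below assembles a request body with a root-level `formats` list. The helper name `build_scrape_body` and the `url` field are assumptions made for this example and are not taken from the text above; the block only builds and prints the body, it does not call the API.

```python
import json


def build_scrape_body(url: str, formats: list[str]) -> dict:
    """Assemble a v1-style /scrape body (hypothetical helper, illustration only)."""
    # Drop duplicate format names while keeping the caller's order.
    unique_formats = list(dict.fromkeys(formats))
    # "url" is an assumed field name; "formats" is the root-level list
    # that replaces the old per-format boolean flags.
    return {"url": url, "formats": unique_formats}


body = build_scrape_body("https://example.com", ["html", "screenshot", "html"])
print(json.dumps(body, indent=2))
```

Keeping `formats` as a plain list means adding another output type is a one-item change rather than a new boolean flag.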
The table below outlines the changes to the request body parameters for the `/scrape` endpoint in V1.
Parameter | Change | Description |
---|---|---|
`onlyIncludeTags` | Moved and Renamed | Moved to root level and renamed to `includeTags`. |
`removeTags` | Moved and Renamed | Moved to root level and renamed to `excludeTags`. |
`onlyMainContent` | Moved | Moved to root level. `true` by default. |
`waitFor` | Moved | Moved to root level. |
`headers` | Moved | Moved to root level. |
`parsePDF` | Moved | Moved to root level. |
`extractorOptions` | No Change | |
`timeout` | No Change | |
`pageOptions` | Removed | The `pageOptions` parameter is no longer needed. The scrape options were moved to root level. |
`replaceAllPathsWithAbsolutePaths` | Removed | No longer needed. Every path is now absolute by default. |
`includeHtml` | Removed | Add `"html"` to `formats` instead. |
`includeRawHtml` | Removed | Add `"rawHtml"` to `formats` instead. |
`screenshot` | Removed | Add `"screenshot"` to `formats` instead. |
`fullPageScreenshot` | Removed | Add `"screenshot@fullPage"` to `formats` instead. |
`extractorOptions` | Removed | Use the `"json"` format with a `jsonOptions` object instead. |
The new `json` format is described in the llm-extract section.
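The `/scrape` changes in the table above can be sketched as a small translation function. This is a minimal sketch, not an official migration tool: the function name `migrate_scrape_body` is hypothetical, and it assumes the old options were nested under `pageOptions`, as the "moved to root level" wording suggests. It deliberately leaves `extractorOptions` untouched, since the shape of `jsonOptions` is not described here.

```python
# Old pageOptions keys that were moved to root level under a new name.
RENAMED = {"onlyIncludeTags": "includeTags", "removeTags": "excludeTags"}
# Old pageOptions keys that were moved to root level with the same name.
MOVED = ("onlyMainContent", "waitFor", "headers", "parsePDF")
# Old boolean flags that are now expressed as entries in "formats".
FLAG_TO_FORMAT = {
    "includeHtml": "html",
    "includeRawHtml": "rawHtml",
    "screenshot": "screenshot",
    "fullPageScreenshot": "screenshot@fullPage",
}


def migrate_scrape_body(old: dict) -> dict:
    """Translate an old-style /scrape body to the V1 layout (hypothetical helper)."""
    # Root-level keys such as timeout are copied unchanged; pageOptions is dropped.
    new = {key: value for key, value in old.items() if key != "pageOptions"}
    formats = list(new.get("formats", []))
    for key, value in (old.get("pageOptions") or {}).items():
        if key in RENAMED:
            new[RENAMED[key]] = value
        elif key in MOVED:
            new[key] = value
        elif key in FLAG_TO_FORMAT and value and FLAG_TO_FORMAT[key] not in formats:
            formats.append(FLAG_TO_FORMAT[key])
        # Anything else, e.g. replaceAllPathsWithAbsolutePaths, is discarded.
    if formats:
        new["formats"] = formats
    return new


old_body = {
    "url": "https://example.com",  # assumed field name
    "timeout": 30000,
    "pageOptions": {
        "onlyIncludeTags": ["article"],
        "removeTags": ["nav"],
        "waitFor": 500,
        "includeHtml": True,
        "fullPageScreenshot": True,
        "replaceAllPathsWithAbsolutePaths": True,
    },
}
print(migrate_scrape_body(old_body))
```

The function returns a new dictionary and does not modify the body passed in, so the old request can be kept for comparison.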
We’ve also updated the `/crawl` endpoint on `v1`. Check out the improved request body below:
The table below outlines the changes to the request body parameters for the `/crawl` endpoint in V1.
Parameter | Change | Description |
---|---|---|
`pageOptions` | Renamed | Renamed to `scrapeOptions`. |
`includes` | Moved and Renamed | Moved to root level. Renamed to `includePaths`. |
`excludes` | Moved and Renamed | Moved to root level. Renamed to `excludePaths`. |
`allowBackwardCrawling` | Moved and Renamed | Moved to root level. Renamed to `allowBackwardLinks`. |
`allowExternalLinks` | Moved | Moved to root level. |
`maxDepth` | Moved | Moved to root level. |
`ignoreSitemap` | Moved | Moved to root level. |
`limit` | Moved | Moved to root level. |
`crawlerOptions` | Removed | The `crawlerOptions` parameter is no longer needed. The crawl options were moved to root level. |
`timeout` | Removed | Use `timeout` in `scrapeOptions` instead. |
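The `/crawl` changes in the table above can be sketched the same way. The function name `migrate_crawl_body` is hypothetical, and two assumptions are made that the table does not spell out: that the old crawl options were nested under `crawlerOptions`, and that the removed `timeout` was a root-level key. The contents of `pageOptions` are copied into `scrapeOptions` as they are, so any keys inside would still need the `/scrape` renames applied separately.

```python
# Old crawlerOptions keys that were moved to root level under a new name.
CRAWL_RENAMED = {
    "includes": "includePaths",
    "excludes": "excludePaths",
    "allowBackwardCrawling": "allowBackwardLinks",
}
# Old crawlerOptions keys that were moved to root level with the same name.
CRAWL_MOVED = ("allowExternalLinks", "maxDepth", "ignoreSitemap", "limit")


def migrate_crawl_body(old: dict) -> dict:
    """Translate an old-style /crawl body to the V1 layout (hypothetical helper)."""
    handled = ("crawlerOptions", "pageOptions", "timeout")
    new = {key: value for key, value in old.items() if key not in handled}
    for key, value in (old.get("crawlerOptions") or {}).items():
        if key in CRAWL_RENAMED:
            new[CRAWL_RENAMED[key]] = value
        elif key in CRAWL_MOVED:
            new[key] = value
    # pageOptions is renamed to scrapeOptions; inner keys are copied unchanged.
    scrape_options = dict(old.get("pageOptions") or {})
    # Assumption: the removed timeout lived at root level and now belongs here.
    if "timeout" in old:
        scrape_options["timeout"] = old["timeout"]
    if scrape_options:
        new["scrapeOptions"] = scrape_options
    return new


old_crawl = {
    "url": "https://example.com",  # assumed field name
    "timeout": 15000,
    "crawlerOptions": {
        "includes": ["blog/*"],
        "allowBackwardCrawling": True,
        "limit": 10,
    },
    "pageOptions": {"onlyMainContent": True},
}
print(migrate_crawl_body(old_crawl))
```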