Note: A v2 version of this API is now available, with better performance and greater reliability in batch processing.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
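As a minimal sketch of the header format described above, the helper below builds the header dictionary for an HTTP client. The function name build_auth_headers is hypothetical, and the use of the standard Authorization header name is an assumption based on the usual Bearer scheme, not something this page states.

```python
def build_auth_headers(token: str) -> dict:
    """Return request headers carrying the token in the form `Bearer <token>`.

    Assumption: the token is sent in the standard `Authorization` header.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty and free of surrounding whitespace")
    return {"Authorization": f"Bearer {token}"}


# Example with a placeholder token (not a real credential).
headers = build_auth_headers("YOUR-AUTH-TOKEN")
```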
Body
The URL to scrape
A webhook specification object.
Maximum number of concurrent scrapes. This parameter allows you to set a concurrency limit for this batch scrape. If not specified, the batch scrape adheres to your team's concurrency limit.
If invalid URLs are specified in the urls array, they will be ignored rather than failing the entire request. A batch scrape is created from the remaining valid URLs, and the invalid URLs are returned in the invalidURLs field of the response.
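The partitioning behaviour described above can be illustrated locally. This is only a sketch: the service's actual validation rules are not documented here, so the check below (an http or https scheme plus a host) is an assumption, and split_urls is a hypothetical helper rather than part of any SDK.

```python
from urllib.parse import urlparse


def split_urls(urls):
    """Partition urls into (valid, invalid) using an assumed validity rule."""
    valid, invalid = [], []
    for url in urls:
        try:
            parsed = urlparse(url)
            ok = parsed.scheme in ("http", "https") and bool(parsed.netloc)
        except ValueError:
            # urlparse rejects some malformed inputs outright.
            ok = False
        (valid if ok else invalid).append(url)
    return valid, invalid


valid, invalid = split_urls(["https://example.com", "not a url"])
# `valid` would feed the batch scrape; `invalid` mirrors what the
# invalidURLs response field is described as containing.
```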
If true, this will enable zero data retention for this batch scrape. To enable this feature, please contact help@firecrawl.dev.
Only return the main content of the page excluding headers, navs, footers, etc.
Tags to include in the output.
Tags to exclude from the output.
Returns a cached version of the page if it is younger than this age in milliseconds. If a cached version of the page is older than this value, the page will be scraped. If you do not need extremely fresh data, enabling this can speed up your scrapes by 500%. Defaults to 0, which disables caching.
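The caching rule above reduces to a small decision, sketched here under stated assumptions. The function and parameter names are hypothetical, and the behaviour when the cached age exactly equals the threshold is not specified by the text, so this sketch treats it as a fresh scrape.

```python
from typing import Optional


def should_use_cache(cached_age_ms: Optional[int], max_age_ms: int = 0) -> bool:
    """Return True when a cached page would be served instead of re-scraped."""
    if max_age_ms <= 0:
        # The default of 0 disables caching.
        return False
    if cached_age_ms is None:
        # Nothing is cached for this page.
        return False
    # Served only when strictly younger than the threshold (assumption for equality).
    return cached_age_ms < max_age_ms


one_hour_ms = 60 * 60 * 1000
use_cached = should_use_cache(cached_age_ms=10 * 60 * 1000, max_age_ms=one_hour_ms)
```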
Headers to send with the request. Can be used to send cookies, user-agent, etc.
Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load.
Set to true if you want to emulate scraping from a mobile device. Useful for testing responsive pages and taking mobile screenshots.
Skip TLS certificate verification when making requests
Timeout in milliseconds for the request
Controls how PDF files are processed during scraping. When true, the PDF content is extracted and converted to markdown format, with billing based on the number of pages (1 credit per page). When false, the PDF file is returned in base64 encoding with a flat rate of 1 credit total.
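The billing rule above can be written out as arithmetic. This is a sketch of the stated pricing only; pdf_credits and its parameter names are hypothetical, and any other charges that might apply to a request are not modelled.

```python
def pdf_credits(page_count: int, parse_pdf: bool) -> int:
    """Credits for one PDF under the rule described above."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # Parsed to markdown: 1 credit per page. Returned as base64: flat 1 credit.
    return page_count if parse_pdf else 1


parsed_cost = pdf_credits(page_count=12, parse_pdf=True)
raw_cost = pdf_credits(page_count=12, parse_pdf=False)
```

For a 12-page document this gives 12 credits when the PDF is parsed and 1 credit when it is returned as base64.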
JSON options object
Actions to perform on the page before grabbing the content
Location settings for the request. When specified, this will use an appropriate proxy if available and emulate the corresponding language and timezone settings. Defaults to 'US' if not specified.
Removes all base64 images from the output, which may be overwhelmingly long. The image's alt text remains in the output, but the URL is replaced with a placeholder.
Enables ad-blocking and cookie popup blocking.
Specifies the type of proxy to use.
- basic: Proxies for scraping sites with none to basic anti-bot solutions. Fast and usually works.
- stealth: Stealth proxies for scraping sites with advanced anti-bot solutions. Slower, but more reliable on certain sites. Costs up to 5 credits per request.
- auto: Firecrawl will automatically retry scraping with stealth proxies if the basic proxy fails. If the retry with stealth is successful, 5 credits will be billed for the scrape. If the first attempt with basic is successful, only the regular cost will be billed.
If you do not specify a proxy, Firecrawl will default to basic.
Allowed values: basic, stealth, auto
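The billing described for the auto proxy mode can be sketched as follows. The function name and arguments are hypothetical, the regular cost is passed in because this page does not state it, and the case where both attempts fail is left undefined (None) because the text does not cover it.

```python
from typing import Optional

STEALTH_RETRY_CREDITS = 5  # stated above for a successful stealth retry


def auto_proxy_credits(basic_ok: bool, stealth_ok: bool, regular_cost: int) -> Optional[int]:
    """Credits billed for one scrape in auto mode, per the description above."""
    if basic_ok:
        # First attempt with the basic proxy succeeded: regular cost only.
        return regular_cost
    if stealth_ok:
        # Basic failed, retry with stealth succeeded.
        return STEALTH_RETRY_CREDITS
    # Both attempts failed: billing is not specified on this page.
    return None


first_try = auto_proxy_credits(basic_ok=True, stealth_ok=False, regular_cost=1)
after_retry = auto_proxy_credits(basic_ok=False, stealth_ok=True, regular_cost=1)
```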
If true, the page will be stored in the Firecrawl index and cache. Setting this to false is useful if your scraping activity may have data protection concerns. Using some parameters associated with sensitive scraping (actions, headers) will force this parameter to be false.
Formats to include in the output.
Options for change tracking (Beta). Only applicable when 'changeTracking' is included in formats. The 'markdown' format must also be specified when using change tracking.
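The two constraints above lend themselves to a client-side check before a request is sent. This is a sketch only: change_tracking_problems is a hypothetical helper, and it assumes formats is a list of plain strings such as 'markdown' and 'changeTracking'.

```python
def change_tracking_problems(formats, has_change_tracking_options=False):
    """Return human-readable problems with a formats list, or an empty list."""
    problems = []
    tracking = "changeTracking" in formats
    if tracking and "markdown" not in formats:
        problems.append("'changeTracking' requires 'markdown' in formats")
    if has_change_tracking_options and not tracking:
        problems.append("change tracking options only apply when 'changeTracking' is in formats")
    return problems


ok = change_tracking_problems(["markdown", "changeTracking"], has_change_tracking_options=True)
missing_markdown = change_tracking_problems(["changeTracking"])
```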
Response
Successful response