Batch Scrape
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
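As an illustration, the Authorization header can be assembled like this (the token value is a placeholder, not a real credential):

```python
token = "fc-YOUR-TOKEN"  # placeholder; substitute your real auth token
headers = {"Authorization": f"Bearer {token}"}
```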
Body
Actions to perform on the page before grabbing the content.
Tags to exclude from the output.
Formats to include in the output. Available options: markdown, html, rawHtml, links, screenshot, extract, screenshot@fullPage, json.
Headers to send with the request. Can be used to send cookies, user-agent, etc.
If invalid URLs are specified in the urls array, they will be ignored. Rather than failing the entire request, a batch scrape will be created using the remaining valid URLs, and the invalid URLs will be returned in the invalidURLs field of the response.
Tags to include in the output.
Extract object
Location settings for the request. When specified, this will use an appropriate proxy if available and emulate the corresponding language and timezone settings. Defaults to 'US' if not specified.
Set to true if you want to emulate scraping from a mobile device. Useful for testing responsive pages and taking mobile screenshots.
Only return the main content of the page, excluding headers, navs, footers, etc.
Removes all base64 images from the output, as they may be excessively long. The image's alt text remains in the output, but the URL is replaced with a placeholder.
Skip TLS certificate verification when making requests.
Timeout in milliseconds for the request.
Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load.
The URL to send the webhook to. This will trigger for batch scrape started (batch_scrape.started), every page scraped (batch_scrape.page), and when the batch scrape is completed (batch_scrape.completed or batch_scrape.failed). The response payload will be the same as that of the /scrape endpoint.
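Taken together, a request body using the options described above might be built like the following sketch. The exact field names (urls, formats, ignoreInvalidURLs, onlyMainContent, mobile, skipTlsVerification, timeout, waitFor, webhook, headers) are assumptions based on the descriptions in this reference, not a confirmed schema:

```python
import json

# Hypothetical batch scrape request body; field names assumed from this reference.
payload = {
    "urls": ["https://example.com", "not a url"],  # invalid entries tolerated below
    "formats": ["markdown", "links"],
    "ignoreInvalidURLs": True,           # invalid URLs reported in the response instead of failing
    "onlyMainContent": True,             # drop headers, navs, footers
    "mobile": False,                     # set True to emulate a mobile device
    "skipTlsVerification": False,
    "timeout": 30000,                    # request timeout, in milliseconds
    "waitFor": 1000,                     # delay before fetching content, in milliseconds
    "webhook": {"url": "https://example.com/hooks/batch"},  # hypothetical receiver URL
    "headers": {"User-Agent": "my-crawler/1.0"},
}
body = json.dumps(payload)
```

The serialized body would then be sent as JSON with the Bearer authentication header described above.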
Response
If ignoreInvalidURLs is true, this is an array containing the invalid URLs that were specified in the request. If there were no invalid URLs, this will be an empty array. If ignoreInvalidURLs is false, this field will be undefined.
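A client can surface the skipped URLs by reading that field. The response shape below is an assumed sample for illustration (the id and URL values are made up); only the invalidURLs field name comes from this reference:

```python
import json

# Assumed sample response when ignoreInvalidURLs is true.
sample = '{"success": true, "id": "batch-abc123", "invalidURLs": ["not a url"]}'
resp = json.loads(sample)

# Field is absent (undefined) when ignoreInvalidURLs is false, so default to [].
invalid = resp.get("invalidURLs") or []
for u in invalid:
    print(f"Ignored invalid URL: {u}")
```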