/scrape endpoint

The /scrape endpoint has been redesigned in V1 for better reliability and ease of use. The new request body is structured as follows:

{
  "url": "<string>",
  "formats": ["markdown", "html", "rawHtml", "links", "screenshot"],
  "includeTags": ["<string>"],
  "excludeTags": ["<string>"],
  "headers": { "<key>": "<value>" },
  "waitFor": 123,
  "timeout": 123
}
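
For reference, a call to the new endpoint could look like the TypeScript sketch below. The base URL (https://api.firecrawl.dev/v1) and the Bearer-token Authorization header are assumptions; adjust them to your own setup and API key.

const scrapeResponse = await fetch("https://api.firecrawl.dev/v1/scrape", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // Assumed auth scheme; substitute your own API key handling.
    Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
  },
  body: JSON.stringify({
    url: "https://example.com",
    formats: ["markdown", "links"],
    waitFor: 1000,   // assumed to be milliseconds
    timeout: 30000,  // assumed to be milliseconds
  }),
});
const scrapeResult = await scrapeResponse.json();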

Formats

You can now choose which formats you want your output in, and you can specify several at once. Supported formats are:

  • Markdown (markdown)
  • HTML (html)
  • Raw HTML (rawHtml), with no modifications
  • Screenshot (screenshot or screenshot@fullPage)
  • Links (links)

By default, the output will include only the markdown format.
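
As a small sketch of the formats option, the body below requests markdown plus a full-page screenshot (the URL is illustrative):

// Request body selecting two output formats, including a full-page screenshot.
const body = {
  url: "https://example.com",
  formats: ["markdown", "screenshot@fullPage"],
};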

Details on the new request body

The table below outlines the changes to the request body parameters for the /scrape endpoint in V1.

Parameter | Change | Description
onlyIncludeTags | Moved and Renamed | Moved to root level. Renamed to includeTags.
removeTags | Moved and Renamed | Moved to root level. Renamed to excludeTags.
onlyMainContent | Moved | Moved to root level. true by default.
waitFor | Moved | Moved to root level.
headers | Moved | Moved to root level.
parsePDF | Moved | Moved to root level.
timeout | No Change |
pageOptions | Removed | No need for the pageOptions parameter. The scrape options were moved to root level.
replaceAllPathsWithAbsolutePaths | Removed | No longer needed. Every path now defaults to an absolute path.
includeHtml | Removed | Add "html" to formats instead.
includeRawHtml | Removed | Add "rawHtml" to formats instead.
screenshot | Removed | Add "screenshot" to formats instead.
fullPageScreenshot | Removed | Add "screenshot@fullPage" to formats instead.
extractorOptions | Removed | Use the "extract" format with an extract object instead.

The new extract format is described in the llm-extract section.
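
To make the mapping concrete, here is a before/after sketch. The V0 body is reconstructed from the table above purely for illustration (your existing V0 requests may use a subset of these fields); the V1 body expresses the same intent with root-level options and formats.

// V0-style /scrape body (reconstructed for illustration):
const v0ScrapeBody = {
  url: "https://example.com",
  pageOptions: {
    onlyIncludeTags: ["article"],
    removeTags: ["nav"],
    onlyMainContent: true,
    waitFor: 1000,
    includeHtml: true,
    screenshot: true,
  },
};

// Equivalent V1 body: pageOptions is gone, options live at the root level,
// and the html/screenshot flags become entries in formats.
const v1ScrapeBody = {
  url: "https://example.com",
  formats: ["markdown", "html", "screenshot"],
  includeTags: ["article"],
  excludeTags: ["nav"],
  onlyMainContent: true,
  waitFor: 1000,
};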

/crawl endpoint

We’ve also updated the /crawl endpoint in V1. Check out the improved request body below:

{
  "url": "<string>",
  "excludePaths": ["<string>"],
  "includePaths": ["<string>"],
  "maxDepth": 2,
  "ignoreSitemap": true,
  "limit": 10,
  "allowBackwardLinks": true,
  "allowExternalLinks": true,
  "scrapeOptions": {
    // same options as in /scrape
    "formats": ["markdown", "html", "rawHtml", "screenshot", "links"],
    "headers": { "<key>": "<value>" },
    "includeTags": ["<string>"],
    "excludeTags": ["<string>"],
    "onlyMainContent": true,
    "waitFor": 123
  }
}
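
A /crawl request in TypeScript might then look like the sketch below; as before, the base URL, auth header, and path patterns are assumptions, not part of the spec above.

const crawlResponse = await fetch("https://api.firecrawl.dev/v1/crawl", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`, // assumed auth scheme
  },
  body: JSON.stringify({
    url: "https://example.com",
    includePaths: ["blog/.*"],   // illustrative path pattern
    excludePaths: ["admin/.*"],  // illustrative path pattern
    maxDepth: 2,
    limit: 10,
    scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
  }),
});
const crawlJob = await crawlResponse.json();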

Details on the new request body

The table below outlines the changes to the request body parameters for the /crawl endpoint in V1.

Parameter | Change | Description
pageOptions | Renamed | Renamed to scrapeOptions.
includes | Moved and Renamed | Moved to root level. Renamed to includePaths.
excludes | Moved and Renamed | Moved to root level. Renamed to excludePaths.
allowBackwardCrawling | Moved and Renamed | Moved to root level. Renamed to allowBackwardLinks.
allowExternalLinks | Moved | Moved to root level.
maxDepth | Moved | Moved to root level.
ignoreSitemap | Moved | Moved to root level.
limit | Moved | Moved to root level.
crawlerOptions | Removed | No need for the crawlerOptions parameter. The crawl options were moved to root level.
timeout | Removed | Use timeout in scrapeOptions instead.
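
As with /scrape, a before/after sketch may help. The V0 body is reconstructed from the table above for illustration only, and the field values are placeholders.

// V0-style /crawl body (reconstructed for illustration):
const v0CrawlBody = {
  url: "https://example.com",
  crawlerOptions: {
    includes: ["blog/.*"],
    excludes: ["admin/.*"],
    allowBackwardCrawling: true,
    maxDepth: 2,
    limit: 10,
  },
  pageOptions: {
    onlyMainContent: true,
  },
};

// Equivalent V1 body: crawlerOptions folds into the root level and
// pageOptions becomes scrapeOptions.
const v1CrawlBody = {
  url: "https://example.com",
  includePaths: ["blog/.*"],
  excludePaths: ["admin/.*"],
  allowBackwardLinks: true,
  maxDepth: 2,
  limit: 10,
  scrapeOptions: {
    formats: ["markdown"],
    onlyMainContent: true,
  },
};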