POST /scrape
curl --request POST \
  --url https://api.firecrawl.dev/v0/scrape \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "url": "<string>",
  "pageOptions": {
    "headers": {},
    "includeHtml": false,
    "includeRawHtml": false,
    "onlyIncludeTags": [
      "<string>"
    ],
    "onlyMainContent": false,
    "removeTags": [
      "<string>"
    ],
    "replaceAllPathsWithAbsolutePaths": false,
    "screenshot": false,
    "fullPageScreenshot": false,
    "waitFor": 0
  },
  "extractorOptions": {},
  "timeout": 30000
}'
{
  "success": true,
  "data": {
    "markdown": "<string>",
    "content": "<string>",
    "html": "<string>",
    "rawHtml": "<string>",
    "metadata": {
      "title": "<string>",
      "description": "<string>",
      "language": "<string>",
      "sourceURL": "<string>",
      "<any other metadata>": "<string>",
      "pageStatusCode": 123,
      "pageError": "<string>"
    },
    "llm_extraction": {},
    "warning": "<string>"
  }
}

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
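
As a minimal sketch of the header format described above, the following Python helper builds the two headers used in the curl example. The environment variable name `FIRECRAWL_API_KEY` and the helper name `auth_headers` are assumptions made for illustration; they do not come from this reference.

```python
import os


def auth_headers(token: str) -> dict[str, str]:
    """Return the bearer-auth and content-type headers for the scrape request."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# FIRECRAWL_API_KEY is a hypothetical variable name chosen for this sketch;
# the "<token>" fallback mirrors the placeholder in the curl example.
headers = auth_headers(os.environ.get("FIRECRAWL_API_KEY", "<token>"))
```

Reading the token from the environment keeps it out of source control; any other secret store would work equally well.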

Body

application/json
url (string, required)

The URL to scrape

pageOptions (object)

extractorOptions (object)

Options for extraction of structured information from the page content. Note: LLM-based extraction is not performed by default and only occurs when explicitly configured. The 'markdown' mode simply returns the scraped markdown and is the default mode for scraping.

timeout (integer, default: 30000)

Timeout in milliseconds for the request
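
The body fields above could be assembled as in the following sketch, which uses only the Python standard library and builds the request without sending it. The function name `build_scrape_request` and its parameters are hypothetical; the endpoint URL, field names, and the 30000 ms default are taken from this page.

```python
import json
import urllib.request

SCRAPE_URL = "https://api.firecrawl.dev/v0/scrape"


def build_scrape_request(url, token, timeout_ms=30000, page_options=None):
    """Build (but do not send) a POST request for the scrape endpoint."""
    body = {"url": url, "timeout": timeout_ms}
    # pageOptions is optional, so it is only included when provided.
    if page_options:
        body["pageOptions"] = page_options
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_scrape_request(
    "https://example.com",
    "<token>",
    page_options={"onlyMainContent": True},
)
```

To send it, pass `req` to `urllib.request.urlopen`. Note that `urlopen` takes its own timeout in seconds, so a client-side value somewhat above `timeout_ms / 1000` is a reasonable choice.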

Response

200 application/json
Successful response

success (boolean)

data (object)