Scrape
curl --request POST \
  --url https://api.firecrawl.dev/v0/scrape \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "<string>",
    "pageOptions": {
      "onlyMainContent": true,
      "includeHtml": true
    },
    "extractorOptions": {
      "mode": "llm-extraction",
      "extractionPrompt": "<string>",
      "extractionSchema": {}
    },
    "timeout": 123
  }'

{
  "success": true,
  "data": {
    "markdown": "<string>",
    "content": "<string>",
    "html": "<string>",
    "metadata": {
      "title": "<string>",
      "description": "<string>",
      "language": "<string>",
      "sourceURL": "<string>"
    }
  }
}
Authorizations
Authorization: Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
url: The URL to scrape.
pageOptions.onlyMainContent: Only return the main content of the page, excluding headers, navs, footers, etc.
pageOptions.includeHtml: Include the raw HTML content of the page. Adds an html key to the response.
extractorOptions: Options for LLM-based extraction of structured information from the page content.
extractorOptions.mode: The extraction mode to use. Currently supports llm-extraction.
extractorOptions.extractionPrompt: A prompt describing what information to extract from the page.
extractorOptions.extractionSchema: The schema for the data to be extracted.
timeout: Timeout in milliseconds for the request.
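The body fields above can be assembled programmatically. The following is a minimal Python sketch using only the standard library, not an official client. The helper name build_scrape_request and its parameters are hypothetical. Omitting extractorOptions and timeout when they are not supplied is an assumption, since this reference does not say which fields are optional.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this reference.
API_URL = "https://api.firecrawl.dev/v0/scrape"


def build_scrape_request(token, url, prompt=None, schema=None, timeout_ms=None):
    """Build (but do not send) a POST request for the scrape endpoint.

    Hypothetical helper: field names mirror the documented request body.
    """
    body = {
        "url": url,
        "pageOptions": {"onlyMainContent": True, "includeHtml": True},
    }
    # Assumption: extractorOptions may be left out when no extraction is wanted.
    if prompt is not None:
        body["extractorOptions"] = {
            "mode": "llm-extraction",
            "extractionPrompt": prompt,
            "extractionSchema": schema or {},
        }
    # Assumption: timeout may be left out; when present it is in milliseconds.
    if timeout_ms is not None:
        body["timeout"] = timeout_ms
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send it. Note that the timeout field in the body is in milliseconds, whereas the timeout argument of urlopen is in seconds.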
Response
data.html: Raw HTML content of the page if includeHtml is true.
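Reading the response can be sketched in the same way. The helper below is hypothetical and relies only on the field names shown in the example response. Treating a missing or false success value as an error, and expecting html to be absent when includeHtml is not set, are assumptions: this reference does not document the failure shape.

```python
import json


def read_scrape_response(raw):
    """Parse a scrape response body and return its main parts.

    Hypothetical helper based on the example response in this reference.
    """
    payload = json.loads(raw)
    # Assumption: a missing or false "success" value signals a failed scrape.
    if not payload.get("success"):
        raise RuntimeError("scrape did not report success")
    data = payload.get("data", {})
    metadata = data.get("metadata", {})
    return {
        "markdown": data.get("markdown"),
        "html": data.get("html"),  # None if the key is absent
        "title": metadata.get("title"),
        "source_url": metadata.get("sourceURL"),
    }
```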