Extract structured data from pages using LLMs
The `/extract` endpoint simplifies collecting structured data from any number of URLs or entire domains. Provide a list of URLs, optionally with wildcards (e.g., `example.com/*`), and a prompt or schema describing the information you want. Firecrawl handles the details of crawling, parsing, and collating large or small datasets.
You can pass `/extract` specific pages, wildcard patterns, or both:

- `https://firecrawl.dev/some-page`
- `https://firecrawl.dev/*`

When a URL ends in `/*`, Firecrawl will automatically crawl and parse all URLs it can discover in that domain, then extract the requested data. This feature is experimental; email help@firecrawl.com if you have issues.
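As a sketch of what such a request could look like (the `https://api.firecrawl.dev/v1/extract` path, header names, and the example schema fields are illustrative assumptions, not confirmed API details):

```shell
# Build a request body for /extract: a wildcard URL plus a prompt and a
# JSON schema describing the desired output. Schema fields are illustrative.
cat > extract_payload.json <<'EOF'
{
  "urls": ["https://firecrawl.dev/*"],
  "prompt": "Extract the company mission and whether the project is open source.",
  "schema": {
    "type": "object",
    "properties": {
      "company_mission": { "type": "string" },
      "is_open_source": { "type": "boolean" }
    },
    "required": ["company_mission"]
  }
}
EOF

# Send it with your API key (shown here, not executed):
# curl -X POST https://api.firecrawl.dev/v1/extract \
#   -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @extract_payload.json
```

The wildcard `https://firecrawl.dev/*` asks Firecrawl to discover and parse every page on the domain before extracting.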
A few request options to note:

- URLs may include wildcards (`/*`) for broader crawling.
- With `enableWebSearch` set to `true`, extraction can follow links outside the specified domain.
- You can provide a `prompt` without a schema. The underlying model will choose a structure for you, which can be useful for more exploratory or flexible requests.
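Dropping the schema, a prompt-only request might look like this sketch (the endpoint path and header names are assumptions for illustration):

```shell
# Prompt-only extraction: no schema, so the model chooses the output structure.
cat > prompt_only_payload.json <<'EOF'
{
  "urls": ["https://firecrawl.dev/*"],
  "prompt": "Summarize what this company does and list its main products."
}
EOF

# Send it with your API key (shown here, not executed):
# curl -X POST https://api.firecrawl.dev/v1/extract \
#   -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @prompt_only_payload.json
```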
Setting `enableWebSearch = true` in your request will expand the crawl beyond the provided URL set. This can capture supporting or related information from linked pages.
Here’s an example that extracts information about dash cams, enriching the results with data from related pages:
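A sketch of such a request (the retailer URL and prompt wording are placeholders, and the endpoint path is an assumption; `enableWebSearch` is the option described above):

```shell
# Extract dash cam details, letting enableWebSearch pull in supporting
# information from related pages beyond the listed URLs.
cat > dashcam_payload.json <<'EOF'
{
  "urls": ["https://example.com/dash-cams/*"],
  "prompt": "Extract the name, price, and key features of each dash cam, including details found on related review pages.",
  "enableWebSearch": true
}
EOF

# Send it with your API key (shown here, not executed):
# curl -X POST https://api.firecrawl.dev/v1/extract \
#   -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @dashcam_payload.json
```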
The `/extract` endpoint now supports extracting structured data using a prompt alone, without needing specific URLs. This is useful for research or when exact URLs are unknown. This capability is currently in Alpha.
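A sketch of a URL-less request, assuming this mode is invoked simply by omitting the `urls` field (the endpoint path and headers are likewise assumptions):

```shell
# No urls field: Firecrawl discovers relevant pages from the prompt alone.
cat > no_urls_payload.json <<'EOF'
{
  "prompt": "Find the pricing tiers currently offered by Firecrawl and their monthly cost."
}
EOF

# Send it with your API key (shown here, not executed):
# curl -X POST https://api.firecrawl.dev/v1/extract \
#   -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @no_urls_payload.json
```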
Since `/extract` is still in Beta, features and performance will continue to evolve. We welcome bug reports and feedback to help us improve.
For complex extraction tasks that require navigating across multiple pages or interacting with page elements, use the FIRE-1 agent with the `/v1/extract` endpoint.
Example (cURL):
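The following is a hedged sketch; in particular, the `agent` field used to select FIRE-1 is an assumption about the request shape:

```shell
# Ask FIRE-1 to handle multi-page navigation (e.g., pagination) during
# extraction. The agent selector field is an assumed shape, not confirmed.
cat > fire1_payload.json <<'EOF'
{
  "urls": ["https://example.com/careers"],
  "prompt": "Extract every job title and location, following pagination links where needed.",
  "agent": { "model": "FIRE-1" }
}
EOF

# Send it with your API key (shown here, not executed):
# curl -X POST https://api.firecrawl.dev/v1/extract \
#   -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @fire1_payload.json
```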
FIRE-1 is already live and available in preview.