Firecrawl can track changes between the current version of a page and a previous one, and tell you whether the page has been updated.
Using the `changeTracking` format, you can monitor changes on a website and receive information about:

- `previousScrapeAt`: The timestamp of the previous scrape that the current page is being compared against (`null` if there is no previous scrape).
- `changeStatus`: The result of the comparison between the two page versions.
  - `new`: This page did not exist or was not discovered before (it usually has a `null` `previousScrapeAt`).
  - `same`: This page's content has not changed since the last scrape.
  - `changed`: This page's content has changed since the last scrape.
  - `removed`: This page was removed since the last scrape.
- `visibility`: The visibility of the current page/URL.
  - `visible`: This page is visible, meaning that its URL was discovered through an organic route (through links on other visible pages or the sitemap).
  - `hidden`: This page is not visible, meaning it is still available on the web but is no longer discoverable via the sitemap or by crawling the site. Hidden pages can only be identified if they were visible, and captured, during a previous crawl or scrape.

To use change tracking, include `'changeTracking'` in the `formats` array (a list in Python) when scraping a URL.
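A minimal sketch of building such a request body and reading the fields described above, using only the Python standard library. The helper names `build_scrape_payload` and `parse_change_tracking`, the `ChangeTracking` class, the ISO 8601 timestamp format, and the sample document are assumptions for illustration; check the exact request and response shape against the API version you use.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


def build_scrape_payload(url: str) -> dict[str, Any]:
    """Build a request body asking for markdown plus change tracking."""
    # The markdown format must accompany changeTracking.
    return {"url": url, "formats": ["markdown", "changeTracking"]}


@dataclass
class ChangeTracking:
    previous_scrape_at: Optional[datetime]  # None when there is no previous scrape
    change_status: str                      # "new", "same", "changed" or "removed"
    visibility: str                         # "visible" or "hidden"


def parse_change_tracking(doc: dict[str, Any]) -> Optional[ChangeTracking]:
    """Read the changeTracking object from a scraped document, if present."""
    raw = doc.get("changeTracking")
    if raw is None:
        return None
    ts = raw.get("previousScrapeAt")
    # Assumption: timestamps are ISO 8601 strings. A trailing "Z" is replaced
    # because datetime.fromisoformat rejects it before Python 3.11.
    previous = datetime.fromisoformat(ts.replace("Z", "+00:00")) if ts else None
    return ChangeTracking(previous, raw["changeStatus"], raw["visibility"])


# Hypothetical document, shaped after the fields listed above.
sample_doc = {
    "markdown": "# Pricing\n...",
    "changeTracking": {
        "previousScrapeAt": "2025-01-01T12:00:00Z",
        "changeStatus": "changed",
        "visibility": "visible",
    },
}

tracking = parse_change_tracking(sample_doc)
if tracking is not None and tracking.change_status == "changed":
    print(f"Changed since {tracking.previous_scrape_at:%Y-%m-%d}")
```

Returning `None` when the object is absent keeps callers from assuming that tracking data is always available.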
The `git-diff` mode provides a traditional diff format similar to Git's output. It shows line-by-line changes, with additions and deletions marked.
Example output:
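The samples in the sketch below are illustrative, not captured from a real scrape. It assumes the diff text follows unified-diff conventions (`@@` hunk headers, `+` for additions, `-` for deletions) and that the structured form nests `files`, `chunks`, and `changes` with a `type` on each change. The `content` key and the helper names `count_diff_lines` and `tally_change_types` are hypothetical.

```python
from collections import Counter


def count_diff_lines(diff_text: str) -> tuple[int, int]:
    """Return (additions, deletions) found inside the hunks of a unified diff."""
    added = removed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True      # a hunk header starts the changed lines
        elif line.startswith("diff "):
            in_hunk = False     # a new file section begins with headers again
        elif in_hunk and line.startswith("+"):
            added += 1
        elif in_hunk and line.startswith("-"):
            removed += 1
    return added, removed


def tally_change_types(diff_json: dict) -> Counter:
    """Count changes by type across every file and chunk of a structured diff."""
    counts = Counter()
    for file in diff_json.get("files", []):
        for chunk in file.get("chunks", []):
            for change in chunk.get("changes", []):
                counts[change.get("type", "unknown")] += 1
    return counts


# Hypothetical diff of a page's markdown between two scrapes.
sample_diff = """\
--- a/content.md
+++ b/content.md
@@ -1,3 +1,3 @@
 # Pricing
-Starter plan: $10/month
+Starter plan: $12/month
 Contact us for enterprise pricing.
"""

# The same change in a hypothetical structured form ("content" is assumed).
sample_structured = {"files": [{"chunks": [{"changes": [
    {"type": "normal", "content": "# Pricing"},
    {"type": "delete", "content": "Starter plan: $10/month"},
    {"type": "add", "content": "Starter plan: $12/month"},
    {"type": "normal", "content": "Contact us for enterprise pricing."},
]}]}]}

print(count_diff_lines(sample_diff))          # (1, 1)
print(tally_change_types(sample_structured))
```

Counting only inside hunks keeps the `---` and `+++` file headers from being mistaken for a deletion and an addition.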
The structured form of the diff includes:

- `files`: Array of changed files (in a web context, typically just one).
- `chunks`: Sections of changes within a file.
- `changes`: Individual line changes, each with a type (`add`, `delete`, `normal`).

The `json` mode provides a structured comparison of specific fields extracted from the content. This is useful for tracking changes in specific data points rather than the entire content.
Example output:
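The sample in the sketch below is illustrative, not captured from a real scrape. It assumes each tracked field is reported as a pair of `previous` and `current` values; that shape, the field names `title` and `price`, and the helper `changed_fields` are assumptions, so adjust them to the response you actually receive.

```python
from typing import Any


def changed_fields(comparison: dict[str, dict[str, Any]]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (previous, current)} for fields whose value differs."""
    return {
        name: (pair.get("previous"), pair.get("current"))
        for name, pair in comparison.items()
        if pair.get("previous") != pair.get("current")
    }


# Hypothetical per-field comparison for a pricing page.
sample_comparison = {
    "title": {"previous": "Pricing", "current": "Pricing"},
    "price": {"previous": "$10/month", "current": "$12/month"},
}

for name, (old, new) in changed_fields(sample_comparison).items():
    print(f"{name}: {old!r} -> {new!r}")  # price: '$10/month' -> '$12/month'
```

Filtering to the fields that differ is what makes this mode convenient for alerts on a single data point, such as a price.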
- The `markdown` format must also be specified when using the `changeTracking` format. Other formats may also be specified in addition.
- Previous scrapes to compare against are matched on several criteria, including the `markdown` format and the `tag` parameter.
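- Crawling the same URLs with different `includePaths`/`excludePaths` will have inconsistencies when using `changeTracking`.
- Scraping the same URLs with different `includeTags`/`excludeTags`/`onlyMainContent` will have inconsistencies when using `changeTracking`.
- Pages are also compared against previous scrapes that used only the `markdown` format without the `changeTracking` format.
- Comparisons are scoped to your team: the first time you scrape a URL, its `changeStatus` will always be `new`, even if other Firecrawl users have scraped it before.
- It is recommended to monitor the `warning` field of the resulting document, and to handle the `changeTracking` object potentially missing from the response.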
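The last recommendation can be sketched as a small defensive reader. The function name `safe_change_status` and the logger name are hypothetical, and the sample documents are made up; only the `warning`, `changeTracking`, and `changeStatus` keys come from the text above.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("change_tracking")


def safe_change_status(doc: dict[str, Any]) -> Optional[str]:
    """Return the change status, or None when tracking data is unavailable."""
    warning = doc.get("warning")
    if warning:
        # Surface the warning instead of silently ignoring it.
        logger.warning("Scrape returned a warning: %s", warning)

    tracking = doc.get("changeTracking")
    if not isinstance(tracking, dict):
        # The object may be missing, so treat that as "no comparison available".
        logger.warning("No changeTracking object in the response")
        return None
    return tracking.get("changeStatus")


# Hypothetical documents: one without tracking data, one with it.
print(safe_change_status({"markdown": "# Home", "warning": "tracking unavailable"}))
print(safe_change_status({"markdown": "# Home", "changeTracking": {"changeStatus": "same"}}))
```

Returning `None` rather than raising lets a monitoring job skip the comparison for that page and continue with the rest.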
The `git-diff` mode has no additional cost. However, if you use the `json` mode for structured data comparison, the page scrape will cost 5 credits per page.