Get Crawl Status
This endpoint retrieves the status of a crawl job. If the job is not completed, the response includes content within partial_data. Once the job is completed, the content is available under data.
We recommend keeping track of the crawl jobs yourself as the crawl status results can expire after 24 hours.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
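As a minimal sketch, the header described above can be built like this. The helper name build_auth_headers is hypothetical; the only thing taken from this page is the Bearer <token> form, carried in the standard HTTP Authorization header.

```python
def build_auth_headers(token: str) -> dict[str, str]:
    """Build the Authorization header in the form 'Bearer <token>'."""
    # Reject empty tokens or stray whitespace, which would produce a malformed header.
    if not token or token != token.strip():
        raise ValueError("token must be a non-empty string without surrounding whitespace")
    return {"Authorization": f"Bearer {token}"}
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.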
Path Parameters
ID of the crawl job
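Because the job ID travels in the URL path, it should be percent-encoded before being substituted. The sketch below shows one way to do that. The base URL and the /crawl/status/ path segment are placeholders, since this page does not give the endpoint URL; substitute the real values from your API reference.

```python
from urllib.parse import quote


def crawl_status_url(base_url: str, job_id: str) -> str:
    """Build a status URL with the job ID as a path parameter.

    The '/crawl/status/' segment is a hypothetical placeholder.
    """
    # safe='' also encodes '/', so an odd job ID cannot alter the path structure.
    return f"{base_url.rstrip('/')}/crawl/status/{quote(job_id, safe='')}"
```

For example, crawl_status_url("https://api.example.com", "abc-123") yields "https://api.example.com/crawl/status/abc-123".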
Response
Status of the job (completed, active, failed, paused)
Current page number
Total number of pages
Data returned from the job (null when it is in progress)
Partial documents returned as the site is being crawled (streaming). This feature is currently in alpha; expect breaking changes. When a page is ready, it is appended to the partial_data array, so there is no need to wait for the entire website to be crawled. When the crawl is done, partial_data becomes empty and the result is available in data. The array holds a maximum of 50 items: when a new item is added to a full array, the oldest item (at the top of the array) is removed.
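Since partial_data keeps only the 50 most recent items, a client that polls repeatedly may want to accumulate documents on its own side. The following is a sketch under stated assumptions, not the service's implementation: simulate_partial_window only models the eviction behaviour described above, and the "url" key used to de-duplicate documents in the usage example is hypothetical, because this page does not describe the document shape.

```python
from collections import deque
from typing import Any, Callable, Hashable, Iterable

MAX_PARTIAL_ITEMS = 50  # limit stated in the description of partial_data


def simulate_partial_window(pages: Iterable[Any]) -> list[Any]:
    """Model the rolling window: once full, the oldest item is dropped."""
    window: deque = deque(maxlen=MAX_PARTIAL_ITEMS)
    for page in pages:
        window.append(page)  # deque discards the leftmost item when full
    return list(window)


def documents_from(response: dict) -> list[Any]:
    """Pick data for a completed job, otherwise partial_data."""
    if response.get("status") == "completed":
        return response.get("data") or []
    return response.get("partial_data") or []


def merge_new(
    seen: set,
    collected: list,
    docs: Iterable[Any],
    key: Callable[[Any], Hashable],
) -> None:
    """Append documents not seen in earlier polls, identified by key(doc)."""
    for doc in docs:
        k = key(doc)
        if k not in seen:
            seen.add(k)
            collected.append(doc)
```

A polling loop would call documents_from on each response and pass the result to merge_new, for example with key=lambda d: d["url"] if your documents carry such a field. Polling often enough that fewer than 50 pages arrive between requests avoids missing items that have been evicted from the window.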