GET /crawl/status/{jobId}

This endpoint retrieves the status of a crawl job. If the job is not completed, the response includes content within partial_data. Once the job is completed, the content is available under data.

We recommend keeping track of the crawl jobs yourself as the crawl status results can expire after 24 hours.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

jobId
string
required

ID of the crawl job
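
As an illustration of the path and authorization requirements above, here is a minimal sketch that builds the request using only the Python standard library. The base URL is a hypothetical placeholder (the real host is not given in this section), and build_status_request is an illustrative helper name, not part of any official client.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical placeholder: substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_status_request(job_id: str, token: str, base_url: str = BASE_URL) -> Request:
    """Build a GET request for /crawl/status/{jobId} with a Bearer token."""
    # Percent-encode the job ID so it stays a single path segment.
    url = f"{base_url.rstrip('/')}/crawl/status/{quote(job_id, safe='')}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")
```

The returned object can be passed to urllib.request.urlopen, and the response body decoded with json.loads.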

Response

200 - application/json
status
string

Status of the job (completed, active, failed, paused)
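
A client typically polls this endpoint until the status stops changing. The sketch below assumes that completed and failed are terminal states and that active and paused mean "keep waiting"; the documentation lists the four values but does not state which are final. The fetch_status callable is a hypothetical stand-in for whatever code performs the HTTP call and returns the decoded JSON body.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these two statuses are terminal; "active" and "paused" are not.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_crawl(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status, then return that response."""
    for attempt in range(max_polls):
        response = fetch_status()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"crawl job not finished after {max_polls} polls")
```

Injecting sleep keeps the loop easy to test and lets callers swap in their own back-off strategy.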

current
integer

Current page number

total
integer

Total number of pages
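
The current and total fields can be combined into a rough progress figure. This is a minimal sketch that assumes current counts pages processed so far out of total; crawl_progress is an illustrative helper, and it returns None when either field is missing or total is not positive.

```python
from typing import Any, Dict, Optional


def crawl_progress(response: Dict[str, Any]) -> Optional[float]:
    """Return progress as a fraction in [0, 1], or None if it cannot be computed."""
    current = response.get("current")
    total = response.get("total")
    if not isinstance(current, int) or not isinstance(total, int) or total <= 0:
        return None
    # Clamp in case current briefly exceeds total while the crawl discovers pages.
    return max(0.0, min(current / total, 1.0))
```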

data
object[]

Data returned from the job (null when it is in progress)

partial_data
object[]

Partial documents returned while the site is still being crawled (streaming). This feature is currently in alpha; expect breaking changes. When a page is ready, it is appended to the partial_data array, so there is no need to wait for the entire website to be crawled. When the crawl is done, partial_data becomes empty and the result is available in data. The array in the response holds at most 50 items: the oldest item (at the top of the array) is removed when a new item is added.
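
Because partial_data is a sliding window of at most 50 items, a client that wants every page must collect items across polls before older ones are dropped. The sketch below is one way to do that. It assumes documents are JSON-serializable and identifies each one by its canonical JSON form, since the document schema is not described here; if documents carry a stable identifier such as a URL, that would be a better key. PartialDataCollector is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List, Set


class PartialDataCollector:
    """Accumulate documents seen in partial_data across successive polls."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.documents: List[Dict[str, Any]] = []

    def update(self, response: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Record documents not seen before and return only the new ones."""
        new_docs = []
        # partial_data may be missing, null, or empty once the crawl is done.
        for doc in response.get("partial_data") or []:
            # Assumption: canonical JSON is a good enough identity for a document.
            key = json.dumps(doc, sort_keys=True)
            if key not in self._seen:
                self._seen.add(key)
                new_docs.append(doc)
        self.documents.extend(new_docs)
        return new_docs
```

Polling often enough that fewer than 50 new pages arrive between calls is what keeps this approach from missing items; once the job is completed, the full result should be read from data instead.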