Get Crawl Status - Firecrawl Docs

Get the status of a crawl job

curl --request GET \
  --url https://api.firecrawl.dev/v1/crawl/{id} \
  --header 'Authorization: Bearer <token>'

{
  "completed": 123,
  "creditsUsed": 123,
  "data": [
    {
      "html": "<string>",
      "links": [
        "<string>"
      ],
      "markdown": "<string>",
      "metadata": {
        "<any other metadata> ": "<string>",
        "description": "<string>",
        "error": "<string>",
        "keywords": "<string>",
        "language": "<string>",
        "ogLocaleAlternate": [
          "<string>"
        ],
        "sourceURL": "<string>",
        "statusCode": 123,
        "title": "<string>"
      },
      "rawHtml": "<string>",
      "screenshot": "<string>"
    }
  ],
  "expiresAt": "2023-11-07T05:31:56Z",
  "next": "<string>",
  "status": "<string>",
  "total": 123
}

GET /crawl/{id}

Note: A new v2 version of this API is now available, with improved features and performance.

Authorization

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path parameters

id

string<uuid>

required

The ID of the crawl job.
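
The authorization header and the `id` path parameter combine into a single GET request. The following is a minimal sketch using only Python's standard library; the helper name `build_status_request` is hypothetical, and the base URL is taken from the curl example above. It builds the request without sending it.

```python
import urllib.request

# Base URL copied from the curl example on this page.
API_BASE = "https://api.firecrawl.dev/v1"


def build_status_request(crawl_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a crawl job's status.

    `crawl_id` is the UUID of the crawl job; `token` is your auth token.
    The function name is illustrative and not part of any Firecrawl SDK.
    """
    return urllib.request.Request(
        url=f"{API_BASE}/crawl/{crawl_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the response body is JSON in the shape shown at the top of this page.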

Response

Successful response

completed

integer

The number of pages that have been crawled successfully.

creditsUsed

integer

The number of credits used by this crawl.

data

object[]

The crawled data.

expiresAt

string<date-time>

The date and time at which the crawl job expires.

next

string | null

The URL for retrieving the next 10MB of data. This field is returned if the crawl has not finished or if the response exceeds 10MB.
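
Because results larger than 10MB are split across responses, a client typically follows `next` until it is null. The sketch below assumes the crawl has already finished and that each follow-up response has the same shape as the first one. `collect_documents`, `fetch_json`, and `max_pages` are hypothetical names; `fetch_json` stands for any function you supply that performs an authorized GET on a URL and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, List


def collect_documents(
    first_page: Dict[str, Any],
    fetch_json: Callable[[str], Dict[str, Any]],
    max_pages: int = 1000,
) -> List[Dict[str, Any]]:
    """Gather the `data` entries from a response and all of its `next` pages.

    `max_pages` is a safety limit chosen for this sketch, not an API value.
    """
    documents = list(first_page.get("data") or [])
    next_url = first_page.get("next")
    pages_fetched = 0
    while next_url:
        if pages_fetched >= max_pages:
            raise RuntimeError("too many pages; stopping to avoid an endless loop")
        page = fetch_json(next_url)
        pages_fetched += 1
        documents.extend(page.get("data") or [])
        next_url = page.get("next")
    return documents
```

Injecting `fetch_json` keeps the pagination logic independent of any particular HTTP library.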

status

string

The current status of the crawl job. One of scraping, completed, or failed.
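
Since `status` stays at scraping until the job ends, a caller usually polls this endpoint until it sees completed or failed. The following is a rough sketch of such a loop. `wait_for_crawl`, `get_status`, `poll_seconds`, and `max_polls` are hypothetical names and defaults chosen for illustration; `get_status` is any function you supply that calls this endpoint and returns the decoded JSON body.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which the job will not change any more, per the field description.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_crawl(
    get_status: Callable[[], Dict[str, Any]],
    poll_seconds: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the crawl reports a terminal status, then return that response."""
    for attempt in range(max_polls):
        body = get_status()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"crawl still not finished after {max_polls} polls")
```

The `sleep` parameter exists only so the loop can be exercised without real delays.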

total

integer

The total number of pages the crawl attempted.
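
Taken together, `completed` and `total` give a rough progress indicator. A small sketch follows, with the hypothetical helper name `crawl_progress`; it treats a missing or zero `total` as no progress, which is an assumption of this example rather than documented behavior.

```python
from typing import Any, Dict


def crawl_progress(body: Dict[str, Any]) -> float:
    """Return completed/total as a fraction between 0.0 and 1.0."""
    total = body.get("total") or 0
    completed = body.get("completed") or 0
    if total <= 0:
        # Nothing attempted yet (or the field is absent): report no progress.
        return 0.0
    return min(completed / total, 1.0)
```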
