Get Batch Scrape Status

Get the status of a batch scrape job.

curl --request GET \
  --url https://api.firecrawl.dev/v1/batch/scrape/{id} \
  --header 'Authorization: Bearer <token>'

{
  "completed": 123,
  "creditsUsed": 123,
  "data": [
    {
      "html": "<string>",
      "links": [
        "<string>"
      ],
      "markdown": "<string>",
      "metadata": {
        "<any other metadata> ": "<string>",
        "description": "<string>",
        "error": "<string>",
        "keywords": "<string>",
        "language": "<string>",
        "ogLocaleAlternate": [
          "<string>"
        ],
        "sourceURL": "<string>",
        "statusCode": 123,
        "title": "<string>"
      },
      "rawHtml": "<string>",
      "screenshot": "<string>"
    }
  ],
  "expiresAt": "2023-11-07T05:31:56Z",
  "next": "<string>",
  "status": "<string>",
  "total": 123
}

GET /batch/scrape/{id}

Note: A new v2 version of this API is now available, with improved status tracking and monitoring.

Authorization

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
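As a rough illustration of the header described above, the following Python sketch builds (but does not send) the same request as the curl example, using only the standard library. The helper name build_status_request, the all-zero job ID, and the token value fc-EXAMPLE are hypothetical placeholders, not values from the API.

```python
import urllib.request

# Base URL taken from the curl example on this page.
API_BASE = "https://api.firecrawl.dev/v1"


def build_status_request(job_id: str, token: str) -> urllib.request.Request:
    """Build the GET request for a batch scrape job's status without sending it."""
    return urllib.request.Request(
        url=f"{API_BASE}/batch/scrape/{job_id}",
        # Bearer authentication: "Bearer <token>" in the Authorization header.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder job ID and token, for illustration only.
req = build_status_request("00000000-0000-0000-0000-000000000000", "fc-EXAMPLE")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would perform the actual call; that step is left out here so the sketch runs without network access or a real token.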

Path Parameters

id

string<uuid>

required

The ID of the batch scrape job.

Response

Successful response

completed

integer

The number of pages that have been successfully scraped.

creditsUsed

integer

The number of credits used for the batch scrape.

data

object[]

The data of the batch scrape.

expiresAt

string<date-time>

The date and time when the batch scrape will expire.

next

string | null

The URL to retrieve the next 10MB of data. Returned if the batch scrape is not completed or if the response is larger than 10MB.
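One way to consume a paginated result is to keep following the next URL until it is null, concatenating each page's data array. The sketch below assumes the job has already completed (since next is also returned while a job is still running) and takes the HTTP call as an injected function so it can run offline. The names collect_all_data and fetch_page and the example.invalid URLs are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def collect_all_data(first_page: Dict, fetch_page: Callable[[str], Dict]) -> List[Dict]:
    """Follow `next` URLs until none remains, concatenating each page's `data`."""
    results: List[Dict] = list(first_page.get("data", []))
    next_url: Optional[str] = first_page.get("next")
    while next_url:
        page = fetch_page(next_url)  # in real use: an authenticated GET on next_url
        results.extend(page.get("data", []))
        next_url = page.get("next")
    return results


# Stubbed pages standing in for real API responses (illustrative only).
pages = {
    "https://example.invalid/page2": {
        "data": [{"markdown": "B"}],
        "next": "https://example.invalid/page3",
    },
    "https://example.invalid/page3": {"data": [{"markdown": "C"}], "next": None},
}
first = {"data": [{"markdown": "A"}], "next": "https://example.invalid/page2"}

all_docs = collect_all_data(first, pages.__getitem__)
print([doc["markdown"] for doc in all_docs])
```

In real use, fetch_page would send the same Authorization header as the initial request and decode the JSON body.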

status

string

The current status of the batch scrape. Can be scraping, completed, or failed.
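Because the status stays at scraping until the job finishes, a client will typically poll this endpoint. The following is a minimal polling sketch, assuming completed and failed are the only terminal values, as listed above. The function wait_for_batch, its parameters, and the stubbed responses are hypothetical; the status fetch is injected so the example runs without network access.

```python
import time
from typing import Callable, Dict

# Terminal values taken from the status description above.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_batch(
    get_status: Callable[[], Dict],
    poll_interval: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call get_status() until the job reaches a terminal status or polls run out."""
    for attempt in range(max_polls):
        body = get_status()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if attempt < max_polls - 1:
            sleep(poll_interval)  # wait before asking again
    raise TimeoutError(f"batch scrape still running after {max_polls} polls")


# Stubbed sequence of responses standing in for repeated GET calls.
responses = iter([
    {"status": "scraping", "completed": 1, "total": 3},
    {"status": "scraping", "completed": 2, "total": 3},
    {"status": "completed", "completed": 3, "total": 3},
])
final = wait_for_batch(lambda: next(responses), sleep=lambda _seconds: None)
print(final["status"], final["completed"], final["total"])
```

The poll interval and the cap on attempts are arbitrary choices for the sketch; suitable values depend on batch size and rate limits.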

total

integer

The total number of pages that were attempted to be scraped.
