POST /crawl
Crawl multiple URLs based on options
curl --request POST \
  --url https://api.firecrawl.dev/v2/crawl \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "url": "<string>",
  "prompt": "<string>",
  "excludePaths": [
    "<string>"
  ],
  "includePaths": [
    "<string>"
  ],
  "maxDiscoveryDepth": 123,
  "sitemap": "include",
  "ignoreQueryParameters": false,
  "limit": 10000,
  "crawlEntireDomain": false,
  "allowExternalLinks": false,
  "allowSubdomains": false,
  "delay": 123,
  "maxConcurrency": 123,
  "webhook": {
    "url": "<string>",
    "headers": {},
    "metadata": {},
    "events": [
      "completed"
    ]
  },
  "scrapeOptions": {
    "formats": [
      "markdown"
    ],
    "onlyMainContent": true,
    "includeTags": [
      "<string>"
    ],
    "excludeTags": [
      "<string>"
    ],
    "maxAge": 172800000,
    "headers": {},
    "waitFor": 0,
    "mobile": false,
    "skipTlsVerification": true,
    "timeout": 123,
    "parsers": [
      "pdf"
    ],
    "actions": [
      {
        "type": "wait",
        "milliseconds": 2,
        "selector": "#my-element"
      }
    ],
    "location": {
      "country": "US",
      "languages": [
        "en-US"
      ]
    },
    "removeBase64Images": true,
    "blockAds": true,
    "proxy": "auto",
    "storeInCache": true
  },
  "zeroDataRetention": false
}'
{
  "success": true,
  "id": "<string>",
  "url": "<string>"
}

What's new in v2

Tell the crawl what you want

Describe what you want to crawl in plain English:
{
  "url": "https://example.com",
  "prompt": "Only crawl blog posts and documentation, ignore marketing pages"
}
This maps the prompt to a set of crawl parameters that are used to run the crawl.
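The prompt-driven flow can be sketched in Python. The endpoint and request body follow the curl example above; the helper name and the use of the requests package are illustrative, not part of the API:

```python
# Sketch of a prompt-driven v2 crawl request. build_prompt_crawl_request is a
# hypothetical helper; only the endpoint URL and JSON body shape come from the
# reference above.

def build_prompt_crawl_request(url, prompt):
    """Build the JSON body for a prompt-driven /v2/crawl request."""
    return {"url": url, "prompt": prompt}

body = build_prompt_crawl_request(
    "https://example.com",
    "Only crawl blog posts and documentation, ignore marketing pages",
)

# To actually send it (requires the requests package and a real API key):
# import requests
# resp = requests.post(
#     "https://api.firecrawl.dev/v2/crawl",
#     json=body,
#     headers={"Authorization": "Bearer <token>"},
# )

print(body)
```

Any explicitly set parameters alongside the prompt override the generated equivalents, per the prompt parameter description below.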

Improved sitemap control

In v1, sitemap usage was a boolean. In v2, the sitemap option lets you choose:
  • "include" (default): use the sitemap while also discovering other pages.
  • "skip": skip the sitemap entirely.

New crawl options

  • crawlEntireDomain - Crawl the entire domain, not just child pages
  • maxDiscoveryDepth - Control crawl depth (replaces maxDepth)
{
  "url": "https://example.com/features",
  "crawlEntireDomain": true,
  "maxDiscoveryDepth": 2,
  "sitemap": "include"
}
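The maxDiscoveryDepth cutoff can be pictured as a depth-limited breadth-first walk. This toy model over an in-memory link graph only illustrates the semantics (root at depth 0, stop following links at the cutoff); the real crawler fetches live pages:

```python
# Toy model of maxDiscoveryDepth: breadth-first discovery over a static link
# graph. Illustrative only; not Firecrawl's implementation.
from collections import deque

def discover(links, start, max_depth):
    """Return the set of URLs reachable within max_depth link hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # pages at the cutoff depth are kept, but not expanded
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

links = {
    "/": ["/features", "/pricing"],
    "/features": ["/features/feature-1"],
    "/features/feature-1": ["/features/feature-1/tips"],
}

print(sorted(discover(links, "/", 2)))
# With max_depth=2, the depth-3 page /features/feature-1/tips is not discovered.
```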

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
url
string<uri>
required

The base URL to start crawling from

prompt
string

A prompt to use to generate the crawler options (all the parameters below) from natural language. Explicitly set parameters will override the generated equivalents.

excludePaths
string[]

URL pathname regex patterns that exclude matching URLs from the crawl. For example, if you set "excludePaths": ["blog/.*"] for the base URL firecrawl.dev, any results matching that pattern will be excluded, such as https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap.

includePaths
string[]

URL pathname regex patterns that include matching URLs in the crawl. Only the paths that match the specified patterns will be included in the response. For example, if you set "includePaths": ["blog/.*"] for the base URL firecrawl.dev, only results matching that pattern will be included, such as https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap.
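How these patterns filter URLs can be sketched as follows. The exact matching logic (here, re.search against the pathname with its leading slash stripped) is an assumption for illustration, not Firecrawl's implementation:

```python
# Illustrative include/exclude pathname filtering. path_allowed is a
# hypothetical helper; only the "blog/.*" example comes from the docs above.
import re
from urllib.parse import urlparse

def path_allowed(url, include=None, exclude=None):
    path = urlparse(url).path.lstrip("/")
    if exclude and any(re.search(p, path) for p in exclude):
        return False  # excludePaths wins over everything
    if include:
        return any(re.search(p, path) for p in include)
    return True  # no includePaths: everything not excluded is allowed

print(path_allowed("https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap",
                   include=["blog/.*"]))  # True
print(path_allowed("https://www.firecrawl.dev/pricing",
                   include=["blog/.*"]))  # False
```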

maxDiscoveryDepth
integer

Maximum depth to crawl based on discovery order. The root site and sitemapped pages have a discovery depth of 0. For example, if you set it to 1 and set sitemap: 'skip', you will only crawl the entered URL and the URLs linked on that page.

sitemap
enum<string>
default:include

Sitemap mode when crawling. If you set it to 'skip', the crawler will ignore the website sitemap and only crawl the entered URL and discover pages from there onwards.

Available options:
skip,
include
ignoreQueryParameters
boolean
default:false

Do not re-scrape the same path with different (or no) query parameters
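The deduplication this option implies can be pictured as stripping the query string before comparing URLs. This normalization is illustrative; the crawler's internals may differ:

```python
# Sketch of query-parameter-insensitive deduplication: /page?a=1 and
# /page?a=2 normalize to the same URL. Illustrative only.
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    scheme, netloc, path, _query, _frag = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))

seen = {normalize(u) for u in [
    "https://example.com/page?utm_source=x",
    "https://example.com/page?utm_source=y",
    "https://example.com/page",
]}
print(len(seen))  # all three normalize to the same URL
```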

limit
integer
default:10000

Maximum number of pages to crawl. Default limit is 10000.

crawlEntireDomain
boolean
default:false

Allows the crawler to follow internal links to sibling or parent URLs, not just child paths.

false: Only crawls deeper (child) URLs, e.g. /features/feature-1 → /features/feature-1/tips ✅; won't follow /pricing or / ❌

true: Crawls any internal links, including siblings and parents, e.g. /features/feature-1 → /pricing, /, etc. ✅

Use true for broader internal coverage beyond nested paths.
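The crawlEntireDomain=false behavior amounts to a child-path test. A simple prefix check is an assumption used for illustration here, not the exact rule the crawler applies:

```python
# Toy check for crawlEntireDomain=false: only URLs under the starting path
# (on the same host) are followed. Illustrative only.
from urllib.parse import urlparse

def is_child_path(start_url, candidate_url):
    start = urlparse(start_url)
    cand = urlparse(candidate_url)
    if cand.netloc != start.netloc:
        return False
    base = start.path.rstrip("/") + "/"
    return cand.path.startswith(base)

print(is_child_path("https://example.com/features/feature-1",
                    "https://example.com/features/feature-1/tips"))  # True
print(is_child_path("https://example.com/features/feature-1",
                    "https://example.com/pricing"))                  # False
```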

allowExternalLinks
boolean
default:false

Allows the crawler to follow links to external websites.

allowSubdomains
boolean
default:false

Allows the crawler to follow links to subdomains of the main domain.

delay
number

Delay in seconds between scrapes. This helps respect website rate limits.

maxConcurrency
integer

Maximum number of concurrent scrapes. This parameter allows you to set a concurrency limit for this crawl. If not specified, the crawl adheres to your team's concurrency limit.

webhook
object

A webhook specification object.

scrapeOptions
object

zeroDataRetention
boolean
default:false

If true, this will enable zero data retention for this crawl. To enable this feature, please contact help@firecrawl.dev

Response

Successful response

success
boolean
id
string
url
string<uri>
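A minimal sketch of handling the response: the returned url field is used as the status endpoint to poll. The example values ("abc123" and the status URL) are hypothetical, and the shape of the status payload is an assumption:

```python
# Sketch of consuming a /v2/crawl response. The id and url values here are
# made-up placeholders matching the response schema above.
import json

response = json.loads(
    '{"success": true, "id": "abc123",'
    ' "url": "https://api.firecrawl.dev/v2/crawl/abc123"}'
)

# To poll for status (requires the requests package and a real token):
# import requests
# status = requests.get(response["url"],
#                       headers={"Authorization": "Bearer <token>"}).json()

print(response["success"], response["id"])
```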