Webhooks let you receive real-time notifications about your Firecrawl operations as they progress. Instead of polling for status updates, you provide an endpoint and Firecrawl sends it an HTTP POST request each time an event occurs.

Overview

Webhooks are supported for:

  • Crawl operations - Get notified as pages are crawled and when crawls complete
  • Batch scrape operations - Receive updates for each URL scraped in a batch

Basic Configuration

Configure webhooks by adding a webhook object to your request:

JSON
{
  "webhook": {
    "url": "https://your-domain.com/webhook",
    "metadata": {
      "any_key": "any_value"
    },
    "events": ["started", "page", "completed", "failed"]
  }
}

Configuration Options

Field     Type    Required  Description
url       string  Yes       Your webhook endpoint URL
headers   object  No        Custom headers to include in webhook requests
metadata  object  No        Custom data included in all webhook payloads
events    array   No        Event types to receive (default: all events)
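
The options above can also be assembled in code. The following is a minimal sketch, not part of any Firecrawl SDK: `buildWebhookConfig`, `ALLOWED_EVENTS`, and the `X-Webhook-Source` header are hypothetical names chosen for illustration. It assumes the short event names used in the configuration example (`started`, `page`, `completed`, `failed`) and leaves out optional fields that are not set.

```javascript
// Short event names as they appear in the configuration example (assumed to be the full set).
const ALLOWED_EVENTS = ['started', 'page', 'completed', 'failed'];

// Hypothetical helper: builds a webhook object, omitting optional fields that are not provided.
function buildWebhookConfig({ url, headers, metadata, events }) {
  if (typeof url !== 'string' || url.length === 0) {
    throw new Error('webhook url is required');
  }
  if (events !== undefined) {
    const unknown = events.filter((name) => !ALLOWED_EVENTS.includes(name));
    if (unknown.length > 0) {
      throw new Error(`unknown events: ${unknown.join(', ')}`);
    }
  }

  const config = { url };
  if (headers !== undefined) config.headers = headers;
  if (metadata !== undefined) config.metadata = metadata;
  if (events !== undefined) config.events = events;
  return config;
}

// Example: only page and completed events, with an illustrative custom header.
const webhookConfig = buildWebhookConfig({
  url: 'https://your-domain.com/webhook',
  headers: { 'X-Webhook-Source': 'firecrawl' }, // hypothetical header name
  metadata: { any_key: 'any_value' },
  events: ['page', 'completed'],
});

console.log(JSON.stringify({ webhook: webhookConfig }, null, 2));
```

Leaving `events` out of the object relies on the documented default of receiving all events.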

Event Types

Crawl Events

Event            Description                  When Triggered
crawl.started    Crawl job initiated          When crawl begins
crawl.page       Individual page scraped      After each page is successfully scraped
crawl.completed  Crawl finished successfully  When all pages are processed
crawl.failed     Crawl encountered an error   When crawl fails or is cancelled

Batch Scrape Events

Event                   Description                 When Triggered
batch_scrape.started    Batch scrape job initiated  When batch scrape begins
batch_scrape.page       Individual URL scraped      After each URL is successfully scraped
batch_scrape.completed  Batch scrape finished       When all URLs are processed
batch_scrape.failed     Batch scrape failed         When batch scrape fails or is cancelled
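
Every event type in the two tables has the form `<operation>.<event>`, so a handler can split it once and branch on either part. The sketch below illustrates that idea; `parseEventType` and `expectedTypes` are hypothetical helper names. It also assumes, from the examples in this document, that the short names in the `events` configuration array correspond to the suffix after the dot.

```javascript
const OPERATIONS = ['crawl', 'batch_scrape'];
const EVENTS = ['started', 'page', 'completed', 'failed'];

// Splits "crawl.page" into { operation: "crawl", event: "page" }.
// Returns null for anything that is not one of the documented types.
function parseEventType(type) {
  if (typeof type !== 'string') return null;
  const dot = type.indexOf('.');
  if (dot === -1) return null;
  const operation = type.slice(0, dot);
  const event = type.slice(dot + 1);
  if (!OPERATIONS.includes(operation) || !EVENTS.includes(event)) return null;
  return { operation, event };
}

// Lists the payload types to expect for an operation, given the short
// names subscribed to in the webhook configuration (assumed mapping).
function expectedTypes(operation, events = EVENTS) {
  return events.map((event) => `${operation}.${event}`);
}

console.log(parseEventType('batch_scrape.completed'));
console.log(expectedTypes('crawl', ['started', 'page']));
```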

Webhook Payload Structure

All webhook payloads follow this structure:

{
  "success": true,
  "type": "crawl.page",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [...],
  "metadata": {
    "user_id": "12345",
    "project": "web-scraping"
  },
  "error": null,
  "timestamp": "2024-01-15T10:30:00Z"
}

Payload Fields

Field      Type     Description
success    boolean  Whether the operation was successful
type       string   Event type (e.g., crawl.page, batch_scrape.completed)
id         string   Unique identifier for the crawl/batch scrape job
data       array    Scraped content (populated for page events)
metadata   object   Custom metadata from your webhook configuration
error      string   Error message (present when success is false)
timestamp  string   ISO 8601 timestamp of when the event occurred
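
Before acting on a delivery, it can help to check that the body has the fields described above. This is a sketch under stated assumptions: `validatePayload` is a hypothetical helper, and `metadata` and `timestamp` are treated as optional because not every sample payload in this document includes them.

```javascript
// Returns a list of problems; an empty list means the payload has the documented shape.
function validatePayload(payload) {
  if (payload === null || typeof payload !== 'object') {
    return ['payload must be an object'];
  }
  const problems = [];
  if (typeof payload.success !== 'boolean') problems.push('success must be a boolean');
  if (typeof payload.type !== 'string') problems.push('type must be a string');
  if (typeof payload.id !== 'string') problems.push('id must be a string');
  if (!Array.isArray(payload.data)) problems.push('data must be an array');
  // The error field carries a message only when the operation failed.
  if (payload.success === false && typeof payload.error !== 'string') {
    problems.push('error must be a string when success is false');
  }
  // Assumption: timestamp is optional, but must be a parseable date string when present.
  if (payload.timestamp !== undefined && Number.isNaN(Date.parse(payload.timestamp))) {
    problems.push('timestamp must be a parseable date string');
  }
  return problems;
}

const samplePayload = {
  success: true,
  type: 'crawl.page',
  id: '550e8400-e29b-41d4-a716-446655440000',
  data: [],
  metadata: { user_id: '12345', project: 'web-scraping' },
  error: null,
  timestamp: '2024-01-15T10:30:00Z',
};

console.log(validatePayload(samplePayload)); // no problems for the sample payload
```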

Examples

Crawl with Webhook

cURL
curl -X POST https://api.firecrawl.dev/v1/crawl \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev",
      "limit": 100,
      "webhook": {
        "url": "https://your-domain.com/webhook",
        "metadata": {
          "any_key": "any_value"
        },
        "events": ["started", "page", "completed"]
      }
    }'

Batch Scrape with Webhook

cURL
curl -X POST https://api.firecrawl.dev/v1/batch/scrape \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "urls": [
        "https://example.com/page1",
        "https://example.com/page2",
        "https://example.com/page3"
      ],
      "webhook": {
        "url": "https://your-domain.com/webhook",
        "metadata": {
          "any_key": "any_value"
        },
        "events": ["started", "page", "completed"]
      }
    }'
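
The same two requests can be prepared from Node.js. This is a sketch, assuming Node 18 or later for the global `fetch`; `buildRequest` is a hypothetical helper, while the endpoints, headers, and bodies mirror the cURL examples above.

```javascript
// Hypothetical helper: returns the URL and fetch() options for a Firecrawl v1 endpoint.
function buildRequest(endpoint, apiKey, body) {
  return {
    url: `https://api.firecrawl.dev/v1/${endpoint}`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Webhook object shared by both requests, copied from the cURL examples.
const webhook = {
  url: 'https://your-domain.com/webhook',
  metadata: { any_key: 'any_value' },
  events: ['started', 'page', 'completed'],
};

const crawlRequest = buildRequest('crawl', 'YOUR_API_KEY', {
  url: 'https://docs.firecrawl.dev',
  limit: 100,
  webhook,
});

const batchRequest = buildRequest('batch/scrape', 'YOUR_API_KEY', {
  urls: [
    'https://example.com/page1',
    'https://example.com/page2',
    'https://example.com/page3',
  ],
  webhook,
});

console.log(crawlRequest.url, batchRequest.url);

// To send a request (inside an async function):
//   const response = await fetch(crawlRequest.url, crawlRequest.options);
```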

Webhook Endpoint Example

The following Express (Node.js) server shows one way to handle each event type in your application:

const express = require('express');
const app = express();

app.post('/webhook', express.json(), (req, res) => {
  const { success, type, id, data, metadata, error } = req.body;
  
  switch (type) {
    case 'crawl.started':
    case 'batch_scrape.started':
      console.log(`${type.split('.')[0]} ${id} started`);
      break;
      
    case 'crawl.page':
    case 'batch_scrape.page':
      if (success && Array.isArray(data) && data.length > 0) {
        console.log(`Page scraped: ${data[0].metadata.sourceURL}`);
        // Process the scraped page data
        processScrapedPage(data[0]);
      }
      break;
      
    case 'crawl.completed':
    case 'batch_scrape.completed':
      console.log(`${type.split('.')[0]} ${id} completed successfully`);
      break;
      
    case 'crawl.failed':
    case 'batch_scrape.failed':
      console.error(`${type.split('.')[0]} ${id} failed: ${error}`);
      break;
  }
  
  // Always respond with 200 to acknowledge receipt
  res.status(200).send('OK');
});

function processScrapedPage(pageData) {
  // Your processing logic here
  console.log('Processing:', pageData.metadata.title);
}

app.listen(3000, () => {
  console.log('Webhook server listening on port 3000');
});

Event-Specific Payloads

started Events

{
  "success": true,
  "type": "crawl.started",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [],
  "metadata": {
    "user_id": "12345",
    "project": "web-scraping"
  },
  "error": null
}

page Events

{
  "success": true,
  "type": "crawl.page",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [
    {
      "markdown": "# Page Title\n\nPage content...",
      "metadata": {
        "title": "Page Title",
        "description": "Page description",
        "sourceURL": "https://example.com/page1",
        "statusCode": 200
      }
    }
  ],
  "metadata": {
    "any_key": "any_value"
  },
  "error": null
}

completed Events

{
  "success": true,
  "type": "crawl.completed",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [],
  "metadata": {
    "any_key": "any_value"
  },
  "error": null
}

failed Events

{
  "success": false,
  "type": "crawl.failed",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [],
  "metadata": {
    "any_key": "any_value"
  },
  "error": "Error message"
}
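
Because every payload carries the job `id`, the four event kinds above can be folded into a per-job status. The sketch below keeps that state in memory; `createJobTracker` and the status labels (`unknown`, `running`, `completed`, `failed`) are hypothetical names, and nothing here assumes that events arrive in any particular order.

```javascript
// Hypothetical in-memory tracker keyed by job id.
function createJobTracker() {
  const jobs = new Map();

  return {
    // Applies one webhook payload and returns the updated job record.
    apply(payload) {
      const event = payload.type.split('.').pop();
      const job = jobs.get(payload.id) ?? { status: 'unknown', pages: 0, error: null };

      if (event === 'started') {
        job.status = 'running';
      } else if (event === 'page') {
        // Count scraped documents only for successful page events.
        if (payload.success && Array.isArray(payload.data)) {
          job.pages += payload.data.length;
        }
      } else if (event === 'completed') {
        job.status = 'completed';
      } else if (event === 'failed') {
        job.status = 'failed';
        job.error = payload.error;
      }

      jobs.set(payload.id, job);
      return job;
    },

    get(id) {
      return jobs.get(id);
    },
  };
}

const tracker = createJobTracker();
const jobId = '550e8400-e29b-41d4-a716-446655440000';
tracker.apply({ success: true, type: 'crawl.started', id: jobId, data: [], error: null });
tracker.apply({ success: true, type: 'crawl.page', id: jobId, data: [{ markdown: '# Page Title' }], error: null });
tracker.apply({ success: true, type: 'crawl.completed', id: jobId, data: [], error: null });
console.log(tracker.get(jobId));
```

A production handler would likely persist this state rather than hold it in process memory, since a restart loses the map.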

Monitoring and Debugging

Testing Your Webhook

During local development, use a tunneling tool such as ngrok to give your local server a public HTTPS URL:

# Expose local server
ngrok http 3000

# Use the ngrok URL in your webhook configuration
# https://abc123.ngrok.io/webhook

Webhook Logs

Monitor webhook delivery in your application:

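// express.json() parses the JSON body; without it req.body is undefined
app.post('/webhook', express.json(), (req, res) => {
  console.log('Webhook received:', {
    timestamp: new Date().toISOString(), // time of receipt on your server
    type: req.body.type,
    id: req.body.id,
    success: req.body.success
  });
  
  res.status(200).send('OK');
});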

Common Issues

Webhook Not Receiving Events

  1. Check URL accessibility - Ensure your endpoint is publicly accessible
  2. Verify HTTPS - Webhook URLs must use HTTPS
  3. Check firewall settings - Allow incoming connections to your webhook port
  4. Review event filters - Ensure you’re subscribed to the correct event types
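
The first two checks in this list can be partly automated before a job is submitted. The sketch below uses the standard `URL` class; `diagnoseWebhookUrl` is a hypothetical helper, and the list of local hostnames is an illustrative assumption rather than an exhaustive test of public reachability.

```javascript
// Hostnames that are never reachable from outside your machine (illustrative, not exhaustive).
const LOCAL_HOSTNAMES = ['localhost', '127.0.0.1', '0.0.0.0', '[::1]'];

// Returns a list of likely problems with a webhook URL; empty means none were found.
function diagnoseWebhookUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return ['not a valid URL'];
  }

  const problems = [];
  if (parsed.protocol !== 'https:') {
    problems.push('URL must use HTTPS');
  }
  if (LOCAL_HOSTNAMES.includes(parsed.hostname)) {
    problems.push('hostname is local and not publicly accessible');
  }
  return problems;
}

console.log(diagnoseWebhookUrl('https://your-domain.com/webhook'));
console.log(diagnoseWebhookUrl('http://localhost:3000/webhook'));
```

A clean result only rules out these two mistakes; firewall rules and event filters still need to be checked separately.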