> ## Documentation Index
> Fetch the complete documentation index at: https://docs.firecrawl.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Welcome to V1

> Firecrawl allows you to turn entire websites into LLM-ready markdown

Firecrawl V1 is here! With that we introduce a more reliable and developer friendly API.

Here is what's new:

* Output Formats for `/scrape`. Choose what formats you want your output in.
* New [`/map` endpoint](/features/map) for getting most of the URLs of a webpage.
* Developer friendly API for `/crawl/{id}` status.
* 2x Rate Limits for all plans.
* [Go SDK](/sdks/go) and [Rust SDK](/sdks/rust)
* Teams support
* API Key Management in the dashboard.
* `onlyMainContent` is now default to `true`.
* `/crawl` webhooks and websocket support.

## Scrape Formats

You can now choose what formats you want your output in. You can specify multiple output formats. Supported formats are:

* Markdown (markdown)
* HTML (html)
* Raw HTML (rawHtml) (with no modifications)
* Screenshot (screenshot or screenshot\@fullPage)
* Links (links)
* JSON (json) - structured output

Output keys will match the format you choose.

<CodeGroup>
  ```python Python theme={null}
  from firecrawl import FirecrawlApp

  app = FirecrawlApp(
    # No API key needed to get started — add one for higher rate limits:
    # api_key="fc-YOUR_API_KEY",
  )

  # Scrape a website:
  scrape_result = app.scrape_url('firecrawl.dev', formats=['markdown', 'html'])
  print(scrape_result)
  ```

  ```js Node theme={null}
  import { Firecrawl, ScrapeResponse } from 'firecrawl';

  const app = new Firecrawl({
    // No API key needed to get started — add one for higher rate limits:
    // apiKey: "fc-YOUR_API_KEY",
  });

  // Scrape a website:
  const scrapeResult = await app.scrapeUrl('firecrawl.dev', { formats: ['markdown', 'html'] }) as ScrapeResponse;

  if (!scrapeResult.success) {
    throw new Error(`Failed to scrape: ${scrapeResult.error}`)
  }

  console.log(scrapeResult)
  ```

  ```go Go theme={null}
  import (
  	"fmt"
  	"log"

  	"github.com/mendableai/firecrawl-go"
  )

  func main() {
  	apiKey := ""
  	apiUrl := "https://api.firecrawl.dev"
  	version := "v1"

  	// No API key needed to get started — add one for higher rate limits:
  	app, err := firecrawl.NewFirecrawlApp(apiKey, apiUrl, version)
  	if err != nil {
  		log.Fatalf("Failed to initialize FirecrawlApp: %v", err)
  	}

  	// Scrape a website
  	scrapeResult, err := app.ScrapeUrl("https://firecrawl.dev", map[string]any{
  		"formats": []string{"markdown", "html"},
  	})
  	if err != nil {
  		log.Fatalf("Failed to scrape URL: %v", err)
  	}

  	fmt.Println(scrapeResult)
  }
  ```

  ```rust Rust theme={null}
  use firecrawl::{FirecrawlApp, scrape::{ScrapeOptions, ScrapeFormats}};

  #[tokio::main]
  async fn main() {
      // No API key needed to get started — add one for higher rate limits:
      let app = FirecrawlApp::new("").expect("Failed to initialize FirecrawlApp");

      let options = ScrapeOptions {
          formats vec! [ ScrapeFormats::Markdown, ScrapeFormats::HTML ].into(),
          ..Default::default()
      };

      let scrape_result = app.scrape_url("https://firecrawl.dev", options).await;

      match scrape_result {
          Ok(data) => println!("Scrape Result:\n{}", data.markdown.unwrap()),
          Err(e) => eprintln!("Map failed: {}", e),
      }
  }
  ```

  ```bash cURL theme={null}
  # No API key needed to get started — add -H "Authorization: Bearer YOUR_API_KEY" for higher rate limits:
  curl -X POST https://api.firecrawl.dev/v1/scrape \
      -H 'Content-Type: application/json' \
      -d '{
        "url": "https://docs.firecrawl.dev",
        "formats" : ["markdown", "html"]
      }'
  ```
</CodeGroup>

### Response

SDKs will return the data object directly. cURL will return the payload exactly as shown below.

```json theme={null}
{
  "success": true,
  "data" : {
    "markdown": "Launch Week I is here! [See our Day 2 Release 🚀](https://www.firecrawl.dev/blog/launch-week-i-day-2-doubled-rate-limits)[💥 Get 2 months free...",
    "html": "<!DOCTYPE html><html lang=\"en\" class=\"light\" style=\"color-scheme: light;\"><body class=\"__variable_36bd41 __variable_d7dc5d font-inter ...",
    "metadata": {
      "title": "Home - Firecrawl",
      "description": "Firecrawl crawls and converts any website into clean markdown.",
      "language": "en",
      "keywords": "Firecrawl,Markdown,Data,Mendable,Langchain",
      "robots": "follow, index",
      "ogTitle": "Firecrawl",
      "ogDescription": "Turn any website into LLM-ready data.",
      "ogUrl": "https://www.firecrawl.dev/",
      "ogImage": "https://www.firecrawl.dev/og.png?123",
      "ogLocaleAlternate": [],
      "ogSiteName": "Firecrawl",
      "sourceURL": "https://firecrawl.dev",
      "statusCode": 200
    }
  }
}
```

## Introducing /map (Alpha)

The easiest way to go from a single url to a map of the entire website.

### Usage

<CodeGroup>
  ```python Python theme={null}
  from firecrawl import FirecrawlApp

  app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

  # Map a website:
  map_result = app.map_url('https://firecrawl.dev')
  print(map_result)
  ```

  ```js Node theme={null}
  import { Firecrawl, MapResponse } from 'firecrawl';

  const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});

  const mapResult = await app.mapUrl('https://firecrawl.dev') as MapResponse;

  if (!mapResult.success) {
      throw new Error(`Failed to map: ${mapResult.error}`)
  }

  console.log(mapResult)
  ```

  ```go Go theme={null}
  import (
  	"fmt"
  	"log"

  	"github.com/mendableai/firecrawl-go"
  )

  func main() {
  	// Initialize the FirecrawlApp with your API key
  	apiKey := "fc-YOUR_API_KEY"
  	apiUrl := "https://api.firecrawl.dev"
  	version := "v1"

  	app, err := firecrawl.NewFirecrawlApp(apiKey, apiUrl, version)
  	if err != nil {
  		log.Fatalf("Failed to initialize FirecrawlApp: %v", err)
  	}

  	// Map a website
  	mapResult, err := app.MapUrl("https://firecrawl.dev", nil)
  	if err != nil {
  		log.Fatalf("Failed to map URL: %v", err)
  	}

  	fmt.Println(mapResult)
  }
  ```

  ```rust Rust theme={null}
  use firecrawl::FirecrawlApp;

  #[tokio::main]
  async fn main() {
      // Initialize the FirecrawlApp with the API key
      let app = FirecrawlApp::new("fc-YOUR_API_KEY").expect("Failed to initialize FirecrawlApp");

      let map_result = app.map_url("https://firecrawl.dev", None).await;

      match map_result {
          Ok(data) => println!("Mapped URLs: {:#?}", data),
          Err(e) => eprintln!("Map failed: {}", e),
      }
  }
  ```

  ```bash cURL theme={null}
  curl -X POST https://api.firecrawl.dev/v1/map \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer YOUR_API_KEY' \
      -d '{
        "url": "https://firecrawl.dev"
      }'
  ```
</CodeGroup>

### Response

SDKs will return the data object directly. cURL will return the payload exactly as shown below.

```json theme={null}
{
  "status": "success",
  "links": [
    "https://firecrawl.dev",
    "https://www.firecrawl.dev/pricing",
    "https://www.firecrawl.dev/blog",
    "https://www.firecrawl.dev/playground",
    "https://www.firecrawl.dev/smart-crawl",
    ...
  ]
}
```

## WebSockets

To crawl a website with WebSockets, use the `Crawl URL and Watch` method.

<CodeGroup>
  ```python Python theme={null}
  # inside an async function...
  nest_asyncio.apply()

  # Define event handlers
  def on_document(detail):
      print("DOC", detail)

  def on_error(detail):
      print("ERR", detail['error'])

  def on_done(detail):
      print("DONE", detail['status'])

      # Function to start the crawl and watch process
  async def start_crawl_and_watch():
      # Initiate the crawl job and get the watcher
      watcher = app.crawl_url_and_watch('firecrawl.dev', limit=5)

      # Add event listeners
      watcher.add_event_listener("document", on_document)
      watcher.add_event_listener("error", on_error)
      watcher.add_event_listener("done", on_done)

      # Start the watcher
      await watcher.connect()

  # Run the event loop
  await start_crawl_and_watch()
  ```

  ```js Node theme={null}
  const watch = await app.crawlUrlAndWatch('mendable.ai', { excludePaths: ['blog/*'], limit: 5});

  watch.addEventListener("document", doc => {
    console.log("DOC", doc.detail);
  });

  watch.addEventListener("error", err => {
    console.error("ERR", err.detail.error);
  });

  watch.addEventListener("done", state => {
    console.log("DONE", state.detail.status);
  });
  ```
</CodeGroup>

## JSON format

LLM extraction is now available in v1 under the `json` format. To extract structured from a page, you can pass a schema to the endpoint or just provide a prompt.

<CodeGroup>
  ```python Python theme={null}
  from firecrawl import JsonConfig, FirecrawlApp
  from pydantic import BaseModel
  app = FirecrawlApp(api_key="<YOUR_API_KEY>")

  class ExtractSchema(BaseModel):
      company_mission: str
      supports_sso: bool
      is_open_source: bool
      is_in_yc: bool

  json_config = JsonConfig(
      schema=ExtractSchema
  )

  llm_extraction_result = app.scrape_url(
      'https://firecrawl.dev',
      formats=["json"],
      json_options=json_config,
      only_main_content=False,
      timeout=120000
  )

  print(llm_extraction_result.json)
  ```

  ```js Node theme={null}
  import { Firecrawl } from "firecrawl";
  import { z } from "zod";

  const app = new Firecrawl({
    apiKey: "fc-YOUR_API_KEY"
  });

  // Define schema to extract contents into
  const schema = z.object({
    company_mission: z.string(),
    supports_sso: z.boolean(),
    is_open_source: z.boolean(),
    is_in_yc: z.boolean()
  });

  const scrapeResult = await app.scrapeUrl("https://docs.firecrawl.dev/", {
    formats: ["json"],
    jsonOptions: { schema: schema }
  });

  if (!scrapeResult.success) {
    throw new Error(`Failed to scrape: ${scrapeResult.error}`)
  }

  console.log(scrapeResult.json);
  ```

  ```bash cURL theme={null}
  curl -X POST https://api.firecrawl.dev/v1/scrape \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer YOUR_API_KEY' \
      -d '{
        "url": "https://docs.firecrawl.dev/",
        "formats": ["json"],
        "jsonOptions": {
          "schema": {
            "type": "object",
            "properties": {
              "company_mission": {
                        "type": "string"
              },
              "supports_sso": {
                        "type": "boolean"
              },
              "is_open_source": {
                        "type": "boolean"
              },
              "is_in_yc": {
                        "type": "boolean"
              }
            },
            "required": [
              "company_mission",
              "supports_sso",
              "is_open_source",
              "is_in_yc"
            ]
          }
        }
      }'
  ```
</CodeGroup>

Output:

```json JSON theme={null}
{
    "success": true,
    "data": {
      "json": {
        "company_mission": "AI-powered web scraping and data extraction",
        "supports_sso": true,
        "is_open_source": true,
        "is_in_yc": true
      },
      "metadata": {
        "title": "Firecrawl",
        "description": "AI-powered web scraping and data extraction",
        "robots": "follow, index",
        "ogTitle": "Firecrawl",
        "ogDescription": "AI-powered web scraping and data extraction",
        "ogUrl": "https://firecrawl.dev/",
        "ogImage": "https://firecrawl.dev/og.png",
        "ogLocaleAlternate": [],
        "ogSiteName": "Firecrawl",
        "sourceURL": "https://firecrawl.dev/"
      },
    }
}
```

### Extracting without schema (New)

You can now extract without a schema by just passing a `prompt` to the endpoint. The llm chooses the structure of the data.

<CodeGroup>
  ```bash cURL theme={null}
  curl -X POST https://api.firecrawl.dev/v1/scrape \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer YOUR_API_KEY' \
      -d '{
        "url": "https://docs.firecrawl.dev/",
        "formats": ["json"],
        "jsonOptions": {
          "prompt": "Extract the company mission from the page."
        }
      }'
  ```
</CodeGroup>

Output:

```json JSON theme={null}
{
    "success": true,
    "data": {
      "json": {
        "company_mission": "AI-powered web scraping and data extraction",
      },
      "metadata": {
        "title": "Firecrawl",
        "description": "AI-powered web scraping and data extraction",
        "robots": "follow, index",
        "ogTitle": "Firecrawl",
        "ogDescription": "AI-powered web scraping and data extraction",
        "ogUrl": "https://firecrawl.dev/",
        "ogImage": "https://firecrawl.dev/og.png",
        "ogLocaleAlternate": [],
        "ogSiteName": "Firecrawl",
        "sourceURL": "https://firecrawl.dev/"
      },
    }
}
```

## New Crawl Webhook

You can now pass a `webhook` parameter to the `/crawl` endpoint. This will send a POST request to the URL you specify when the crawl is started, updated and completed.

The webhook will now trigger for every page crawled and not just the whole result at the end.

```bash cURL theme={null}
curl -X POST https://api.firecrawl.dev/v2/crawl \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev",
      "limit": 100,
      "webhook": {
        "url": "https://your-domain.com/webhook",
        "metadata": {
          "any_key": "any_value"
        },
        "events": ["started", "page", "completed"]
      }
    }'
```

### Webhook Events

There are now 4 types of events:

* `crawl.started` - Triggered when the crawl is started.
* `crawl.page` - Triggered for every page crawled.
* `crawl.completed` - Triggered when the crawl is completed to let you know it's done.
* `crawl.failed` - Triggered when the crawl fails.

### Webhook Response

* `success` - If the webhook was successful in crawling the page correctly.
* `type` - The type of event that occurred.
* `id` - The ID of the crawl.
* `data` - The data that was scraped (Array). This will only be non empty on `crawl.page` and will contain 1 item if the page was scraped successfully. The response is the same as the `/scrape` endpoint.
* `error` - If the webhook failed, this will contain the error message.

## Migrating from V0

> ⚠️ **Deprecation Notice**: V0 endpoints will be deprecated on April 1st, 2025. Please migrate to V1 endpoints before then to ensure uninterrupted service.

## /scrape endpoint

The updated `/scrape` endpoint has been redesigned for enhanced reliability and ease of use. The structure of the new `/scrape` request body is as follows:

```json theme={null}
{
  "url": "<string>",
  "formats": ["markdown", "html", "rawHtml", "links", "screenshot", "json"],
  "includeTags": ["<string>"],
  "excludeTags": ["<string>"],
  "headers": { "<key>": "<value>" },
  "waitFor": 123,
  "timeout": 123
}
```

### Formats

You can now choose what formats you want your output in. You can specify multiple output formats. Supported formats are:

* Markdown (markdown)
* HTML (html)
* Raw HTML (rawHtml) (with no modifications)
* Screenshot (screenshot or screenshot\@fullPage)
* Links (links)
* JSON (json)

By default, the output will be include only the markdown format.

### Details on the new request body

The table below outlines the changes to the request body parameters for the `/scrape` endpoint in V1.

| Parameter                          | Change            | Description                                                                                           |
| ---------------------------------- | ----------------- | ----------------------------------------------------------------------------------------------------- |
| `onlyIncludeTags`                  | Moved and Renamed | Moved to root level. And renamed to `includeTags`.                                                    |
| `removeTags`                       | Moved and Renamed | Moved to root level. And renamed to `excludeTags`.                                                    |
| `onlyMainContent`                  | Moved             | Moved to root level. `true` by default.                                                               |
| `waitFor`                          | Moved             | Moved to root level.                                                                                  |
| `headers`                          | Moved             | Moved to root level.                                                                                  |
| `parsePDF`                         | Moved             | Moved to root level.                                                                                  |
| `extractorOptions`                 | No Change         |                                                                                                       |
| `timeout`                          | No Change         |                                                                                                       |
| `pageOptions`                      | Removed           | No need for `pageOptions` parameter. The scrape options were moved to root level.                     |
| `replaceAllPathsWithAbsolutePaths` | Removed           | `replaceAllPathsWithAbsolutePaths` is not needed anymore. Every path is now default to absolute path. |
| `includeHtml`                      | Removed           | add `"html"` to `formats` instead.                                                                    |
| `includeRawHtml`                   | Removed           | add `"rawHtml"` to `formats` instead.                                                                 |
| `screenshot`                       | Removed           | add `"screenshot"` to `formats` instead.                                                              |
| `fullPageScreenshot`               | Removed           | add `"screenshot@fullPage"` to `formats` instead.                                                     |
| `extractorOptions`                 | Removed           | Use `"json"` format instead with `jsonOptions` object.                                                |

The new `json` format is described in the [llm-extract](/features/extract) section.

## /crawl endpoint

We've also updated the `/crawl` endpoint on `v1`. Check out the improved body request below:

```json theme={null}
{
  "url": "<string>",
  "excludePaths": ["<string>"],
  "includePaths": ["<string>"],
  "maxDepth": 2,
  "ignoreSitemap": true,
  "limit": 10,
  "allowBackwardLinks": true,
  "allowExternalLinks": true,
  "scrapeOptions": {
    // same options as in /scrape
    "formats": ["markdown", "html", "rawHtml", "screenshot", "links"],
    "headers": { "<key>": "<value>" },
    "includeTags": ["<string>"],
    "excludeTags": ["<string>"],
    "onlyMainContent": true,
    "waitFor": 123
  }
}
```

### Details on the new request body

The table below outlines the changes to the request body parameters for the `/crawl` endpoint in V1.

| Parameter               | Change            | Description                                                                         |
| ----------------------- | ----------------- | ----------------------------------------------------------------------------------- |
| `pageOptions`           | Renamed           | Renamed to `scrapeOptions`.                                                         |
| `includes`              | Moved and Renamed | Moved to root level. Renamed to `includePaths`.                                     |
| `excludes`              | Moved and Renamed | Moved to root level. Renamed to `excludePaths`.                                     |
| `allowBackwardCrawling` | Moved and Renamed | Moved to root level. Renamed to `allowBackwardLinks`.                               |
| `allowExternalLinks`    | Moved             | Moved to root level.                                                                |
| `maxDepth`              | Moved             | Moved to root level.                                                                |
| `ignoreSitemap`         | Moved             | Moved to root level.                                                                |
| `limit`                 | Moved             | Moved to root level.                                                                |
| `crawlerOptions`        | Removed           | No need for `crawlerOptions` parameter. The crawl options were moved to root level. |
| `timeout`               | Removed           | Use `timeout` in `scrapeOptions` instead.                                           |
