# Welcome to V1

> Firecrawl allows you to turn entire websites into LLM-ready markdown

Firecrawl V1 is here! With it, we introduce a more reliable and developer-friendly API.

Here is what's new:

* Output formats for `/scrape`. Choose which formats you want your output in.
* New [`/map` endpoint](/features/map) for getting most of the URLs of a website.
* Developer-friendly API for `/crawl/{id}` status.
* 2x rate limits for all plans.
* [Go SDK](/sdks/go) and [Rust SDK](/sdks/rust)
* Teams support
* API key management in the dashboard.
* `onlyMainContent` now defaults to `true`.
* `/crawl` webhooks and WebSocket support.

## Scrape Formats

You can now choose which formats you want your output in. You can specify multiple output formats. Supported formats are:

* Markdown (markdown)
* HTML (html)
* Raw HTML (rawHtml) (with no modifications)
* Screenshot (screenshot or screenshot\@fullPage)
* Links (links)
* JSON (json) - structured output

Output keys will match the format you choose.
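As a rough illustration of the format list above, the sketch below builds a `/scrape` request body and reads the matching keys back out of a response `data` object. The helper names (`build_scrape_body`, `output_key`, `select_outputs`) are hypothetical and not part of any Firecrawl SDK, and the mapping of `screenshot@fullPage` to a `screenshot` output key is an assumption rather than documented behavior.

```python
import json

# Format names copied from the list of supported formats.
SUPPORTED_FORMATS = {
    "markdown", "html", "rawHtml",
    "screenshot", "screenshot@fullPage",
    "links", "json",
}


def build_scrape_body(url, formats):
    """Return a /scrape request body, rejecting format names that are not listed."""
    unknown = [f for f in formats if f not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    return {"url": url, "formats": list(formats)}


def output_key(fmt):
    """Guess the response key for a format name.

    Assumption: "screenshot@fullPage" is returned under the "screenshot" key;
    every other format is returned under its own name.
    """
    return fmt.split("@", 1)[0]


def select_outputs(data, formats):
    """Pick the requested formats out of a response `data` object (hypothetical helper)."""
    return {fmt: data.get(output_key(fmt)) for fmt in formats}


body = build_scrape_body("https://firecrawl.dev", ["markdown", "html"])
print(json.dumps(body, indent=2))
```

Validating the format names on the client side is only a convenience; the API remains the authority on which formats it accepts.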
```python Python theme={null}
from firecrawl import FirecrawlApp

app = FirecrawlApp(
    # No API key needed to get started; add one for higher rate limits:
    # api_key="fc-YOUR_API_KEY",
)

# Scrape a website:
scrape_result = app.scrape_url('firecrawl.dev', formats=['markdown', 'html'])
print(scrape_result)
```

```js Node theme={null}
import { Firecrawl, ScrapeResponse } from 'firecrawl';

const app = new Firecrawl({
  // No API key needed to get started; add one for higher rate limits:
  // apiKey: "fc-YOUR_API_KEY",
});

// Scrape a website:
const scrapeResult = await app.scrapeUrl('firecrawl.dev', { formats: ['markdown', 'html'] }) as ScrapeResponse;

if (!scrapeResult.success) {
  throw new Error(`Failed to scrape: ${scrapeResult.error}`)
}

console.log(scrapeResult)
```

```go Go theme={null}
package main

import (
	"fmt"
	"log"

	"github.com/mendableai/firecrawl-go"
)

func main() {
	// No API key needed to get started; add one for higher rate limits:
	apiKey := ""
	apiUrl := "https://api.firecrawl.dev"
	version := "v1"

	app, err := firecrawl.NewFirecrawlApp(apiKey, apiUrl, version)
	if err != nil {
		log.Fatalf("Failed to initialize FirecrawlApp: %v", err)
	}

	// Scrape a website
	scrapeResult, err := app.ScrapeUrl("https://firecrawl.dev", map[string]any{
		"formats": []string{"markdown", "html"},
	})
	if err != nil {
		log.Fatalf("Failed to scrape URL: %v", err)
	}

	fmt.Println(scrapeResult)
}
```

```rust Rust theme={null}
use firecrawl::{FirecrawlApp, scrape::{ScrapeOptions, ScrapeFormats}};

#[tokio::main]
async fn main() {
    // No API key needed to get started; add one for higher rate limits:
    let app = FirecrawlApp::new("").expect("Failed to initialize FirecrawlApp");

    // Request both markdown and HTML output
    let options = ScrapeOptions {
        formats: vec![ScrapeFormats::Markdown, ScrapeFormats::HTML].into(),
        ..Default::default()
    };

    let scrape_result = app.scrape_url("https://firecrawl.dev", options).await;

    match scrape_result {
        Ok(data) => println!("Scrape Result:\n{}", data.markdown.unwrap()),
        Err(e) => eprintln!("Scrape failed: {}", e),
    }
}
```

```bash cURL theme={null}
# No API key needed to get started; add -H "Authorization: Bearer YOUR_API_KEY" for higher rate limits:
curl -X POST https://api.firecrawl.dev/v1/scrape \
    -H 'Content-Type: application/json' \
    -d '{
      "url": "https://docs.firecrawl.dev",
      "formats" : ["markdown", "html"]
    }'
```

### Response

SDKs will return the data object directly. cURL will return the payload exactly as shown below.

```json theme={null}
{
  "success": true,
  "data" : {
    "markdown": "Launch Week I is here! [See our Day 2 Release 🚀](https://www.firecrawl.dev/blog/launch-week-i-day-2-doubled-rate-limits)[💥 Get 2 months free...",
    "html": "
```

```python Python theme={null}
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# Map a website:
map_result = app.map_url('https://firecrawl.dev')
print(map_result)
```

```js Node theme={null}
import { Firecrawl, MapResponse } from 'firecrawl';

const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});

const mapResult = await app.mapUrl('https://firecrawl.dev') as MapResponse;

if (!mapResult.success) {
  throw new Error(`Failed to map: ${mapResult.error}`)
}

console.log(mapResult)
```

```go Go theme={null}
package main

import (
	"fmt"
	"log"

	"github.com/mendableai/firecrawl-go"
)

func main() {
	// Initialize the FirecrawlApp with your API key
	apiKey := "fc-YOUR_API_KEY"
	apiUrl := "https://api.firecrawl.dev"
	version := "v1"

	app, err := firecrawl.NewFirecrawlApp(apiKey, apiUrl, version)
	if err != nil {
		log.Fatalf("Failed to initialize FirecrawlApp: %v", err)
	}

	// Map a website
	mapResult, err := app.MapUrl("https://firecrawl.dev", nil)
	if err != nil {
		log.Fatalf("Failed to map URL: %v", err)
	}

	fmt.Println(mapResult)
}
```

```rust Rust theme={null}
use firecrawl::FirecrawlApp;

#[tokio::main]
async fn main() {
    // Initialize the FirecrawlApp with the API key
    let app = FirecrawlApp::new("fc-YOUR_API_KEY").expect("Failed to initialize FirecrawlApp");

    let map_result = app.map_url("https://firecrawl.dev", None).await;

    match map_result {
        Ok(data) => println!("Mapped URLs: {:#?}", data),
        Err(e) => eprintln!("Map failed: {}", e),
    }
}
```

```bash cURL theme={null}
curl -X POST https://api.firecrawl.dev/v1/map \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://firecrawl.dev"
    }'
```

### Response

SDKs will return the data object directly. cURL will return the payload exactly as shown below.

```json theme={null}
{
  "status": "success",
  "links": [
    "https://firecrawl.dev",
    "https://www.firecrawl.dev/pricing",
    "https://www.firecrawl.dev/blog",
    "https://www.firecrawl.dev/playground",
    "https://www.firecrawl.dev/smart-crawl",
    ...
  ]
}
```

## WebSockets

To crawl a website with WebSockets, use the `Crawl URL and Watch` method.

```python Python theme={null}
import nest_asyncio

# inside an async function...
nest_asyncio.apply()

# Define event handlers
def on_document(detail):
    print("DOC", detail)

def on_error(detail):
    print("ERR", detail['error'])

def on_done(detail):
    print("DONE", detail['status'])

# Function to start the crawl and watch process
async def start_crawl_and_watch():
    # Initiate the crawl job and get the watcher
    watcher = app.crawl_url_and_watch('firecrawl.dev', limit=5)

    # Add event listeners
    watcher.add_event_listener("document", on_document)
    watcher.add_event_listener("error", on_error)
    watcher.add_event_listener("done", on_done)

    # Start the watcher
    await watcher.connect()

# Run the event loop
await start_crawl_and_watch()
```

```js Node theme={null}
const watch = await app.crawlUrlAndWatch('mendable.ai', { excludePaths: ['blog/*'], limit: 5});

watch.addEventListener("document", doc => {
  console.log("DOC", doc.detail);
});

watch.addEventListener("error", err => {
  console.error("ERR", err.detail.error);
});

watch.addEventListener("done", state => {
  console.log("DONE", state.detail.status);
});
```

## JSON format

LLM extraction is now available in v1 under the `json` format. To extract structured data from a page, you can pass a schema to the endpoint or just provide a prompt.
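To make the two modes concrete, here is a minimal sketch of how the request body might be assembled in either case, using only the field names that appear in this guide (`formats`, `jsonOptions`, `schema`, `prompt`). The helpers `object_schema` and `json_scrape_body` are hypothetical, the schema builder only covers flat objects with a few primitive types, and sending nothing but these fields is an assumption for illustration.

```python
import json

# Map Python types to JSON Schema type names (only the types this sketch handles).
_JSON_TYPES = {str: "string", bool: "boolean", int: "integer", float: "number"}


def object_schema(fields):
    """Build a flat JSON Schema object in which every field is required."""
    return {
        "type": "object",
        "properties": {name: {"type": _JSON_TYPES[tp]} for name, tp in fields.items()},
        "required": list(fields),
    }


def json_scrape_body(url, schema=None, prompt=None):
    """Build a /scrape body for the `json` format from a schema or a prompt."""
    if schema is None and prompt is None:
        raise ValueError("provide a schema or a prompt")
    options = {}
    if schema is not None:
        options["schema"] = schema
    if prompt is not None:
        # Assumption: a prompt may be sent on its own or alongside a schema.
        options["prompt"] = prompt
    return {"url": url, "formats": ["json"], "jsonOptions": options}


# Schema mode: field names taken from the examples in this guide.
schema = object_schema({"company_mission": str, "supports_sso": bool})
print(json.dumps(json_scrape_body("https://docs.firecrawl.dev/", schema=schema), indent=2))

# Prompt mode: the model chooses the structure.
print(json.dumps(
    json_scrape_body("https://docs.firecrawl.dev/",
                     prompt="Extract the company mission from the page."),
    indent=2,
))
```

The resulting dictionaries can be passed as the JSON payload of a POST request to `/v1/scrape` with whichever HTTP client you prefer.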
```python Python theme={null}
from firecrawl import JsonConfig, FirecrawlApp
from pydantic import BaseModel

app = FirecrawlApp(api_key="")

# Fields the LLM should extract from the page
class ExtractSchema(BaseModel):
    company_mission: str
    supports_sso: bool
    is_open_source: bool
    is_in_yc: bool

json_config = JsonConfig(
    schema=ExtractSchema
)

llm_extraction_result = app.scrape_url(
    'https://firecrawl.dev',
    formats=["json"],
    json_options=json_config,
    only_main_content=False,
    timeout=120000
)

print(llm_extraction_result.json)
```

```js Node theme={null}
import { Firecrawl } from "firecrawl";
import { z } from "zod";

const app = new Firecrawl({
  apiKey: "fc-YOUR_API_KEY"
});

// Define schema to extract contents into
const schema = z.object({
  company_mission: z.string(),
  supports_sso: z.boolean(),
  is_open_source: z.boolean(),
  is_in_yc: z.boolean()
});

const scrapeResult = await app.scrapeUrl("https://docs.firecrawl.dev/", {
  formats: ["json"],
  jsonOptions: { schema: schema }
});

if (!scrapeResult.success) {
  throw new Error(`Failed to scrape: ${scrapeResult.error}`)
}

console.log(scrapeResult.json);
```

```bash cURL theme={null}
curl -X POST https://api.firecrawl.dev/v1/scrape \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev/",
      "formats": ["json"],
      "jsonOptions": {
        "schema": {
          "type": "object",
          "properties": {
            "company_mission": { "type": "string" },
            "supports_sso": { "type": "boolean" },
            "is_open_source": { "type": "boolean" },
            "is_in_yc": { "type": "boolean" }
          },
          "required": [
            "company_mission",
            "supports_sso",
            "is_open_source",
            "is_in_yc"
          ]
        }
      }
    }'
```

Output:

```json JSON theme={null}
{
  "success": true,
  "data": {
    "json": {
      "company_mission": "AI-powered web scraping and data extraction",
      "supports_sso": true,
      "is_open_source": true,
      "is_in_yc": true
    },
    "metadata": {
      "title": "Firecrawl",
      "description": "AI-powered web scraping and data extraction",
      "robots": "follow, index",
      "ogTitle": "Firecrawl",
      "ogDescription": "AI-powered web scraping and data extraction",
      "ogUrl": "https://firecrawl.dev/",
      "ogImage": "https://firecrawl.dev/og.png",
      "ogLocaleAlternate": [],
      "ogSiteName": "Firecrawl",
      "sourceURL": "https://firecrawl.dev/"
    }
  }
}
```

### Extracting without schema (New)

You can now extract without a schema by just passing a `prompt` to the endpoint. The LLM chooses the structure of the data.

```bash cURL theme={null}
curl -X POST https://api.firecrawl.dev/v1/scrape \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev/",
      "formats": ["json"],
      "jsonOptions": {
        "prompt": "Extract the company mission from the page."
      }
    }'
```

Output:

```json JSON theme={null}
{
  "success": true,
  "data": {
    "json": {
      "company_mission": "AI-powered web scraping and data extraction"
    },
    "metadata": {
      "title": "Firecrawl",
      "description": "AI-powered web scraping and data extraction",
      "robots": "follow, index",
      "ogTitle": "Firecrawl",
      "ogDescription": "AI-powered web scraping and data extraction",
      "ogUrl": "https://firecrawl.dev/",
      "ogImage": "https://firecrawl.dev/og.png",
      "ogLocaleAlternate": [],
      "ogSiteName": "Firecrawl",
      "sourceURL": "https://firecrawl.dev/"
    }
  }
}
```

## New Crawl Webhook

You can now pass a `webhook` parameter to the `/crawl` endpoint. This will send a POST request to the URL you specify when the crawl is started, updated, and completed.

The webhook now triggers for every page crawled, not just once with the whole result at the end.

```bash cURL theme={null}
curl -X POST https://api.firecrawl.dev/v1/crawl \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev",
      "limit": 100,
      "webhook": {
        "url": "https://your-domain.com/webhook",
        "metadata": {
          "any_key": "any_value"
        },
        "events": ["started", "page", "completed"]
      }
    }'
```

### Webhook Events

There are now 4 types of events:

* `crawl.started` - Triggered when the crawl is started.
* `crawl.page` - Triggered for every page crawled.
* `crawl.completed` - Triggered when the crawl is completed to let you know it's done.
* `crawl.failed` - Triggered when the crawl fails.

### Webhook Response

* `success` - Whether the page was crawled successfully.
* `type` - The type of event that occurred.
* `id` - The ID of the crawl.
* `data` - The data that was scraped (Array). This will only be non-empty on `crawl.page` and will contain 1 item if the page was scraped successfully. The response is the same as the `/scrape` endpoint.
* `error` - If the webhook failed, this will contain the error message.

## Migrating from V0

> ⚠️ **Deprecation Notice**: V0 endpoints will be deprecated on April 1st, 2025. Please migrate to V1 endpoints before then to ensure uninterrupted service.

## /scrape endpoint

The updated `/scrape` endpoint has been redesigned for enhanced reliability and ease of use. The structure of the new `/scrape` request body is as follows:

```json theme={null}
{
  "url": "",
  "formats": ["markdown", "html", "rawHtml", "links", "screenshot", "json"],
  "includeTags": [""],
  "excludeTags": [""],
  "headers": { "": "" },
  "waitFor": 123,
  "timeout": 123
}
```

### Formats

You can now choose which formats you want your output in. You can specify multiple output formats. Supported formats are:

* Markdown (markdown)
* HTML (html)
* Raw HTML (rawHtml) (with no modifications)
* Screenshot (screenshot or screenshot\@fullPage)
* Links (links)
* JSON (json)

By default, the output will include only the markdown format.

### Details on the new request body

The table below outlines the changes to the request body parameters for the `/scrape` endpoint in V1.

| Parameter                          | Change            | Description                                                                                        |
| ---------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------- |
| `onlyIncludeTags`                  | Moved and Renamed | Moved to root level and renamed to `includeTags`.                                                  |
| `removeTags`                       | Moved and Renamed | Moved to root level and renamed to `excludeTags`.                                                  |
| `onlyMainContent`                  | Moved             | Moved to root level. `true` by default.                                                            |
| `waitFor`                          | Moved             | Moved to root level.                                                                               |
| `headers`                          | Moved             | Moved to root level.                                                                               |
| `parsePDF`                         | Moved             | Moved to root level.                                                                               |
| `timeout`                          | No Change         |                                                                                                    |
| `pageOptions`                      | Removed           | No need for the `pageOptions` parameter. The scrape options were moved to root level.              |
| `replaceAllPathsWithAbsolutePaths` | Removed           | `replaceAllPathsWithAbsolutePaths` is no longer needed. Every path is now absolute by default.     |
| `includeHtml`                      | Removed           | Add `"html"` to `formats` instead.                                                                 |
| `includeRawHtml`                   | Removed           | Add `"rawHtml"` to `formats` instead.                                                              |
| `screenshot`                       | Removed           | Add `"screenshot"` to `formats` instead.                                                           |
| `fullPageScreenshot`               | Removed           | Add `"screenshot@fullPage"` to `formats` instead.                                                  |
| `extractorOptions`                 | Removed           | Use the `"json"` format instead, with the `jsonOptions` object.                                    |

The new `json` format is described in the [llm-extract](/features/extract) section.

## /crawl endpoint

We've also updated the `/crawl` endpoint on `v1`. Check out the improved request body below:

```json theme={null}
{
  "url": "",
  "excludePaths": [""],
  "includePaths": [""],
  "maxDepth": 2,
  "ignoreSitemap": true,
  "limit": 10,
  "allowBackwardLinks": true,
  "allowExternalLinks": true,
  "scrapeOptions": {
    // same options as in /scrape
    "formats": ["markdown", "html", "rawHtml", "screenshot", "links"],
    "headers": { "": "" },
    "includeTags": [""],
    "excludeTags": [""],
    "onlyMainContent": true,
    "waitFor": 123
  }
}
```

### Details on the new request body

The table below outlines the changes to the request body parameters for the `/crawl` endpoint in V1.

| Parameter               | Change            | Description                                                                             |
| ----------------------- | ----------------- | --------------------------------------------------------------------------------------- |
| `pageOptions`           | Renamed           | Renamed to `scrapeOptions`.                                                             |
| `includes`              | Moved and Renamed | Moved to root level. Renamed to `includePaths`.                                         |
| `excludes`              | Moved and Renamed | Moved to root level. Renamed to `excludePaths`.                                         |
| `allowBackwardCrawling` | Moved and Renamed | Moved to root level. Renamed to `allowBackwardLinks`.                                   |
| `allowExternalLinks`    | Moved             | Moved to root level.                                                                    |
| `maxDepth`              | Moved             | Moved to root level.                                                                    |
| `ignoreSitemap`         | Moved             | Moved to root level.                                                                    |
| `limit`                 | Moved             | Moved to root level.                                                                    |
| `crawlerOptions`        | Removed           | No need for the `crawlerOptions` parameter. The crawl options were moved to root level. |
| `timeout`               | Removed           | Use `timeout` in `scrapeOptions` instead.                                               |
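The renames in the two migration tables can be applied mechanically. The sketch below is one possible way to convert a V0 request body into its V1 shape. It assumes that V0 nested scrape options under `pageOptions` and crawl options under `crawlerOptions` (as the "Removed" rows suggest), that a V0 `timeout` sat at the root of the body, and that `markdown` should always be requested explicitly. The function names are hypothetical, and `extractorOptions` is deliberately left out because it has no one-to-one mapping onto `jsonOptions` in the tables.

```python
# V0 name -> V1 name for scrape options that moved to the root level.
SCRAPE_RENAMES = {
    "onlyIncludeTags": "includeTags",
    "removeTags": "excludeTags",
    "onlyMainContent": "onlyMainContent",
    "waitFor": "waitFor",
    "headers": "headers",
    "parsePDF": "parsePDF",
}

# V0 boolean flags that became entries in `formats`.
FLAG_FORMATS = {
    "includeHtml": "html",
    "includeRawHtml": "rawHtml",
    "screenshot": "screenshot",
    "fullPageScreenshot": "screenshot@fullPage",
}

# V0 name -> V1 name for crawl options that moved to the root level.
CRAWL_RENAMES = {
    "includes": "includePaths",
    "excludes": "excludePaths",
    "allowBackwardCrawling": "allowBackwardLinks",
    "allowExternalLinks": "allowExternalLinks",
    "maxDepth": "maxDepth",
    "ignoreSitemap": "ignoreSitemap",
    "limit": "limit",
}


def migrate_scrape_options(page_options):
    """Translate a V0 `pageOptions` object into V1 scrape options."""
    out = {"formats": ["markdown"]}  # assumption: always keep markdown output
    for old, new in SCRAPE_RENAMES.items():
        if old in page_options:
            out[new] = page_options[old]
    for flag, fmt in FLAG_FORMATS.items():
        if page_options.get(flag):
            out["formats"].append(fmt)
    return out


def migrate_scrape(v0):
    """Convert a V0 /scrape body: scrape options move to the root level."""
    v1 = {"url": v0["url"], **migrate_scrape_options(v0.get("pageOptions", {}))}
    if "timeout" in v0:
        v1["timeout"] = v0["timeout"]
    return v1


def migrate_crawl(v0):
    """Convert a V0 /crawl body: crawl options move to root, `pageOptions` becomes `scrapeOptions`."""
    v1 = {"url": v0["url"]}
    crawler_options = v0.get("crawlerOptions", {})
    for old, new in CRAWL_RENAMES.items():
        if old in crawler_options:
            v1[new] = crawler_options[old]
    scrape_options = migrate_scrape_options(v0.get("pageOptions", {}))
    if "timeout" in v0:
        scrape_options["timeout"] = v0["timeout"]  # timeout now lives in scrapeOptions
    v1["scrapeOptions"] = scrape_options
    return v1


old_body = {
    "url": "https://docs.firecrawl.dev",
    "crawlerOptions": {"includes": ["blog/*"], "limit": 10},
    "pageOptions": {"removeTags": ["nav"], "includeHtml": True},
}
print(migrate_crawl(old_body))
```

Options that were removed outright, such as `replaceAllPathsWithAbsolutePaths`, are simply dropped because they never appear in the rename tables.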