Scraping with Firecrawl

Firecrawl converts web pages into markdown, which makes it well suited to LLM applications. Here’s why:

  1. Complexities Managed: Handles proxies, caching, rate limits, and JavaScript-blocked content for smooth scraping.

  2. Dynamic Content: Gathers data from JavaScript-rendered websites, PDFs, images, etc.

  3. Markdown or structured data conversion: Converts collected data into clean markdown or structured output, ready for LLM processing or other downstream tasks.

For more details, refer to the Scrape Endpoint API Reference.

Scrape a URL

/scrape endpoint

Used to scrape a URL and get its content.
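For readers calling the endpoint without an SDK, the request can be sketched roughly as follows. This is a minimal illustration, not the definitive request format: the base URL `https://api.firecrawl.dev/v0/scrape`, the Bearer authorization header, and the `{"url": ...}` JSON body are assumptions to confirm against the Scrape Endpoint API Reference, and `build_scrape_request` is a hypothetical helper name. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed endpoint URL; confirm it against the Scrape Endpoint API Reference.
SCRAPE_ENDPOINT = "https://api.firecrawl.dev/v0/scrape"


def build_scrape_request(api_key: str, url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /scrape endpoint."""
    # Assumed request body: a JSON object carrying the URL to scrape.
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_scrape_request("YOUR_API_KEY", "https://mendable.ai")
# To send it with a real API key: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.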

Installation

pip install firecrawl-py

Usage

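from firecrawl import FirecrawlApp

# Create a client with your Firecrawl API key
app = FirecrawlApp(api_key="YOUR_API_KEY")

# Scrape a single URL; the SDK returns the data object directly
content = app.scrape_url("https://mendable.ai")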

Response

SDKs return the data object directly. cURL returns the payload exactly as shown below.

{
  "success": true,
  "data": {
    "content": "Raw Content ",
    "markdown": "# Markdown Content",
    "provider": "web-scraper",
    "metadata": {
      "title": "Mendable | AI for CX and Sales",
      "description": "AI for CX and Sales",
      "language": null,
      "sourceURL": "https://www.mendable.ai/"
    }
  }
}
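As a rough illustration of working with the payload above, the sketch below parses the same sample JSON, checks the `success` flag, and pulls out the markdown and metadata. `unwrap_data` is a hypothetical helper, not part of the SDK; it mirrors the unwrapping the SDKs are described as doing, and the error handling is an assumption about how a caller might treat a failed response.

```python
import json

# The sample response shown above, as returned by a raw HTTP call.
SAMPLE_PAYLOAD = json.loads("""
{
  "success": true,
  "data": {
    "content": "Raw Content ",
    "markdown": "# Markdown Content",
    "provider": "web-scraper",
    "metadata": {
      "title": "Mendable | AI for CX and Sales",
      "description": "AI for CX and Sales",
      "language": null,
      "sourceURL": "https://www.mendable.ai/"
    }
  }
}
""")


def unwrap_data(payload: dict) -> dict:
    """Return the inner data object, raising if the scrape did not succeed."""
    if not payload.get("success"):
        raise RuntimeError("Scrape request did not succeed")
    return payload["data"]


data = unwrap_data(SAMPLE_PAYLOAD)
markdown = data["markdown"]                  # markdown ready for LLM input
title = data["metadata"].get("title")
language = data["metadata"].get("language")  # JSON null becomes None
```

With an SDK, the returned object already corresponds to `data` here, so the unwrapping step only applies to raw HTTP responses.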