Canonical Firecrawl Python source of truth for agents. Generated from SDK source (firecrawl-py / firecrawl 4.22.1) and the v2 OpenAPI spec. Method names, parameters, and return types match the v2 client in firecrawl/v2/client.py unless noted.

Install

pip install firecrawl-py

Authenticate

import os
from firecrawl import Firecrawl

client = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))
# client = Firecrawl(api_key="fc-...", api_url="https://api.firecrawl.dev")
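A missing key surfaces as a late HTTP error on the first request; a small guard makes the failure explicit at construction time. This is an illustrative sketch, not part of the SDK (the helper name `require_api_key` is made up):

```python
import os

def require_api_key(env="FIRECRAWL_API_KEY"):
    """Fail fast with a clear message instead of a late HTTP 401."""
    key = os.environ.get(env)
    if not key:
        raise RuntimeError(f"Set {env} before constructing the client")
    return key
```

Call it once before `Firecrawl(api_key=require_api_key())` so misconfiguration is caught immediately.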

When To Use What

  • search: use when you start with a query and need discovery.
  • scrape: use when you already have a URL and want page content.
  • interact: use when the page needs clicks, forms, or post-scrape browser actions.

Search

Why use it

Use search to discover relevant pages from a query, then pick URLs to scrape or interact with. You can constrain results to a site with site:, for example site:docs.firecrawl.dev crawl webhooks.

Preferred SDK method

client.search(query, **options) → SearchData (firecrawl.v2.types)

Return value

Returns a SearchData model with optional lists:
  • web — web hits (SearchResultWeb or full Document when scrape_options hydrates content)
  • news — news hits (SearchResultNews or Document)
  • images — image hits (SearchResultImages or Document)
Omitted buckets are None when the API did not return that key.

Simple Example

results = client.search("site:docs.firecrawl.dev webhook retries")
for item in results.web or []:
    print(getattr(item, "url", None), getattr(item, "title", None))
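Because each bucket can mix lightweight hits with hydrated Document objects (when scrape_options is set), the getattr pattern above generalizes into a small duck-typed helper. This is an illustrative sketch using stand-in objects, not SDK code (hydrated Documents may expose url/title via metadata instead; adjust attribute access accordingly):

```python
from types import SimpleNamespace

def summarize_hits(items):
    """Return (url, title) pairs from a results bucket that may be None
    and may mix hit objects with hydrated documents."""
    pairs = []
    for item in items or []:  # buckets are None when the API omits them
        pairs.append((getattr(item, "url", None), getattr(item, "title", None)))
    return pairs

# Stand-in objects for demonstration; real code would pass results.web.
hits = [SimpleNamespace(url="https://docs.firecrawl.dev", title="Docs")]
print(summarize_hits(hits))  # [('https://docs.firecrawl.dev', 'Docs')]
```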

Complex Example

from firecrawl.v2.types import ScrapeOptions

results = client.search(
    "site:docs.firecrawl.dev crawl webhooks",
    sources=["web", "news"],
    categories=["research"],
    limit=10,
    tbs="qdr:m",
    location="San Francisco,California,United States",
    ignore_invalid_urls=True,
    timeout=300000,
    scrape_options=ScrapeOptions(
        formats=[
            "markdown",
            "links",
            {"type": "json", "prompt": "Extract title and key endpoints."},
        ],
        only_main_content=True,
        include_tags=["main", "article"],
        exclude_tags=["nav", "footer"],
        wait_for=1000,
        actions=[
            {"type": "click", "selector": "button#accept"},
            {"type": "wait", "milliseconds": 750},
            {"type": "scrape"},
        ],
    ),
)

Parameters

  • query
    • Type: str
    • Use when: you need a search query.
    • Notes: use site:example.com to limit results to a domain.
  • sources
    • Type: list of source names or Source objects
    • Use when: you want to control which sources are searched.
    • Confirmed values:
      • "web": web index results
      • "news": news results
      • "images": image results
      • Source(type="web" | "news" | "images"): typed source object form
  • categories
    • Type: list of category names or Category objects
    • Use when: you want to filter results by category.
    • Confirmed values:
      • "github": GitHub-focused results
      • "research": research and academic results
      • "pdf": PDF-focused results
      • Category(type="github" | "research" | "pdf"): typed category object form
  • limit
    • Type: int
    • Use when: you want to cap results.
    • Notes: defaults to 5 in the SDK model.
  • tbs
    • Type: str
    • Use when: you need a time-based filter (for example qdr:d, qdr:w, or qdr:m; combine sort-by-date with a recency window as sbd:1,qdr:m).
  • location
    • Type: str
    • Use when: you want localized results.
  • ignore_invalid_urls
    • Type: bool
    • Use when: you want to drop URLs that cannot be scraped by other endpoints.
  • timeout
    • Type: int
    • Use when: you need a request timeout in milliseconds.
    • Notes: defaults to 300000 in the SDK model.
  • scrape_options
    • Type: ScrapeOptions
    • Use when: you want to scrape each search result (see Scrape parameters for fields).
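The tbs values follow Google-style time filters (qdr:h/d/w/m/y for recency, sbd:1 for sort-by-date, comma-combined). A tiny helper can compose them; this is an illustrative sketch of the string convention, not part of the SDK:

```python
# Google-style tbs recency windows; sbd:1 sorts by date and combines with a comma.
TBS_RECENCY = {"hour": "qdr:h", "day": "qdr:d", "week": "qdr:w",
               "month": "qdr:m", "year": "qdr:y"}

def make_tbs(recency, sort_by_date=False):
    """Compose a tbs string such as "qdr:m" or "sbd:1,qdr:d"."""
    parts = ["sbd:1"] if sort_by_date else []
    parts.append(TBS_RECENCY[recency])
    return ",".join(parts)

print(make_tbs("month"))                   # qdr:m
print(make_tbs("day", sort_by_date=True))  # sbd:1,qdr:d
```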

Scrape

Why use it

Use scrape when you already have a URL and want structured content in one or more formats.

Preferred SDK method

client.scrape(url, **options) → Document (firecrawl.v2.types)

Simple Example

doc = client.scrape("https://docs.firecrawl.dev", formats=["markdown"])

Complex Example

doc = client.scrape(
    "https://example.com/pricing",
    formats=[
        "markdown",
        "links",
        {"type": "json", "prompt": "Extract plan names and prices."},
        {"type": "screenshot", "full_page": True, "quality": 80, "viewport": {"width": 1280, "height": 720}},
        {"type": "changeTracking", "modes": ["git-diff"], "tag": "pricing"},
        {"type": "attributes", "selectors": [{"selector": "a", "attribute": "href"}]},
    ],
    headers={"User-Agent": "FirecrawlDocsBot/1.0"},
    only_main_content=True,
    wait_for=1000,
    parsers=[{"type": "pdf", "mode": "auto", "max_pages": 5}],
    actions=[
        {"type": "click", "selector": "#accept"},
        {"type": "wait", "milliseconds": 750},
        {"type": "scrape"},
    ],
    location={"country": "US", "languages": ["en-US"]},
    remove_base64_images=True,
    fast_mode=True,
    block_ads=True,
    proxy="auto",
    max_age=86400000,
    store_in_cache=True,
    profile={"name": "docs-session", "save_changes": True},
)
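The returned Document carries one attribute per requested format (for example markdown, links, json); reading them defensively avoids surprises when a format was not requested or returned nothing. An illustrative sketch with a stand-in object, not SDK code:

```python
from types import SimpleNamespace

def collect_outputs(doc, fields=("markdown", "links", "json")):
    """Gather whichever format outputs are present on a Document-like object."""
    return {f: getattr(doc, f, None) for f in fields
            if getattr(doc, f, None) is not None}

# Stand-in for a scraped Document; real code would pass client.scrape(...).
doc = SimpleNamespace(markdown="# Pricing", links=["https://example.com/signup"])
print(collect_outputs(doc))  # {'markdown': '# Pricing', 'links': ['https://example.com/signup']}
```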

Parameters

  • url
    • Type: str
    • Use when: you want to scrape a specific page.
  • formats
    • Type: list of format strings or format objects
    • Use when: you want multiple output formats.
    • Confirmed format strings:
      • "markdown": markdown content
      • "html": cleaned HTML
      • "rawHtml" or "raw_html": raw HTML
      • "links": page links
      • "images": image URLs
      • "screenshot": screenshot output
      • "summary": summary output
      • "changeTracking" or "change_tracking": change tracking output
      • "attributes": attribute extraction
      • "branding": branding profile output
      • "audio": audio extraction
    • Object-only format types:
      • {"type": "json", ...}: JSON extraction. Use an object, not the plain string "json".
      • {"type": "query", "prompt": "..."}: question-answer output. Use an object, not the plain string "query".
    • Format object fields:
      • type: one of the format strings above, or "json" / "query" for object-only formats
      • prompt: for type: "query" and optional for type: "json"
      • schema: JSON schema for type: "json" or for change tracking JSON mode
      • modes: array of "git-diff" or "json" for type: "changeTracking"
      • tag: change tracking tag for type: "changeTracking"
      • full_page, quality, viewport: screenshot options for type: "screenshot"
      • selectors: array of {selector, attribute} for type: "attributes"
  • headers
    • Type: dict of str to str
    • Use when: you need custom request headers.
  • include_tags
    • Type: list of str
    • Use when: you want to include only specific HTML tags.
  • exclude_tags
    • Type: list of str
    • Use when: you want to exclude specific HTML tags.
  • only_main_content
    • Type: bool
    • Use when: you want to strip nav, footer, and other boilerplate.
  • timeout
    • Type: int
    • Use when: you need a timeout in milliseconds.
  • wait_for
    • Type: int
    • Use when: you need to wait for the page to render (milliseconds).
  • mobile
    • Type: bool
    • Use when: you want a mobile viewport.
  • parsers
    • Type: list of parser names or parser objects
    • Use when: you need file parsing controls (for example PDF parsing).
    • Confirmed values:
      • "pdf": enable PDF parsing
      • { "type": "pdf", "mode": "fast" | "auto" | "ocr", "max_pages": int }: PDF parser options
  • actions
    • Type: list of action objects
    • Use when: you need lightweight pre-scrape actions.
    • Confirmed action types:
      • wait: milliseconds or selector required
      • screenshot: full_page, quality, viewport optional
      • click: selector required
      • write: text required (click to focus the input first)
      • press: key required
      • scroll: direction (up or down) required, selector optional
      • scrape: no additional fields
      • executeJavascript: script required
      • pdf: format (A0, A1, A2, A3, A4, A5, A6, Letter, Legal, Tabloid, Ledger), landscape, scale optional
  • location
    • Type: dict with country and languages
    • Use when: you need geo or language-aware scraping.
  • skip_tls_verification
    • Type: bool
    • Use when: you need to skip TLS verification.
  • remove_base64_images
    • Type: bool
    • Use when: you want to drop base64 images from markdown output.
  • fast_mode
    • Type: bool
    • Use when: you want faster scrapes with reduced fidelity.
  • block_ads
    • Type: bool
    • Use when: you want ad and cookie popup blocking.
  • proxy
    • Type: str
    • Use when: you need proxy control.
    • Confirmed values: "basic", "stealth", "enhanced", "auto"
  • max_age
    • Type: int
    • Use when: you want cached data up to a maximum age (milliseconds).
  • min_age
    • Type: int
    • Use when: you want cached data only if it is at least this old (milliseconds).
    • Notes: only on ScrapeOptions passed as scrape_options to client.search. The client.scrape method does not accept min_age.
  • store_in_cache
    • Type: bool
    • Use when: you want Firecrawl to cache the result.
  • profile
    • Type: dict with name and optional save_changes or saveChanges
    • Use when: you want a persistent browser profile shared across scrapes and interactions.
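Because the object-form formats are plain dicts, a formats list can be assembled programmatically instead of inlined. A sketch; the schema below is a made-up example for a pricing page:

```python
def json_format(schema=None, prompt=None):
    """Build a JSON-extraction format object (the object form is required
    for "json"; the plain string "json" is not accepted)."""
    fmt = {"type": "json"}
    if prompt:
        fmt["prompt"] = prompt
    if schema:
        fmt["schema"] = schema
    return fmt

plan_schema = {  # hypothetical schema, for illustration only
    "type": "object",
    "properties": {"plans": {"type": "array", "items": {"type": "string"}}},
}
formats = ["markdown", "links", json_format(schema=plan_schema)]
print(formats)
```

The resulting list drops straight into `client.scrape(url, formats=formats)`.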

Interact

Why use it

Use interact when a page requires browser actions or code execution after a scrape starts.

Preferred SDK method

client.interact(job_id, code=None, *, prompt=None, language="node", timeout=None) → BrowserExecuteResponse (firecrawl.v2.types). prompt is keyword-only: call client.interact(job_id, prompt="...") or client.interact(job_id, code="..."). At least one of code or prompt must be non-empty (validated in firecrawl/v2/methods/scrape.py).

Simple Example

doc = client.scrape("https://example.com", formats=["markdown"])
job_id = doc.metadata.scrape_id if doc.metadata else None
if not job_id:
    raise RuntimeError("Missing scrape_id from scrape response")

result = client.interact(job_id, prompt="Click the pricing tab and summarize the plans.")

Complex Example

result = client.interact(
    "<scrapeJobId>",
    code="print(await page.title())",
    language="python",
    timeout=60,
)

Stop session

client.stop_interaction(job_id) ends the scrape-bound browser session. Returns BrowserDeleteResponse (success, optional session_duration_ms, credits_billed, error).
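Because the session holds browser resources, pairing interact with stop_interaction in a finally block is a reasonable pattern. The sketch below uses a stand-in client so the shape is clear; real code would use the authenticated Firecrawl client and a real scrape job ID:

```python
class FakeClient:
    """Stand-in with the same method names as the real client."""
    def interact(self, job_id, **kwargs):
        return {"success": True, "output": "done"}
    def stop_interaction(self, job_id):
        return {"success": True}

client = FakeClient()
job_id = "<scrapeJobId>"
try:
    result = client.interact(job_id, prompt="Summarize the page.")
finally:
    cleanup = client.stop_interaction(job_id)  # always release the session
print(result, cleanup)
```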

Parameters

  • job_id
    • Type: str
    • Use when: you have a scrape job ID from document.metadata.scrape_id.
  • code
    • Type: str
    • Use when: you want to run code in the browser session (optional if prompt is set).
  • prompt
    • Type: str
    • Use when: you want the browser agent to follow a natural-language instruction (optional if code is set).
  • language
    • Type: str
    • Use when: you need a specific runtime.
    • Confirmed values: "python", "node", "bash"
    • Notes: defaults to "node" in the SDK.
  • timeout
    • Type: int
    • Use when: you need an execution timeout in seconds.

Return value

Returns BrowserExecuteResponse: success, optional live_view_url, interactive_live_view_url, output, stdout, result, stderr, exit_code, killed, error (API camelCase is normalized to snake_case on the model).
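The camelCase-to-snake_case normalization can be pictured with a small helper. This illustrates the mapping only; it is not the SDK's actual implementation (which relies on model field aliases):

```python
import re

def camel_to_snake(name):
    """liveViewUrl -> live_view_url, exitCode -> exit_code."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

api_payload = {"liveViewUrl": "https://...", "exitCode": 0, "killed": False}
normalized = {camel_to_snake(k): v for k, v in api_payload.items()}
print(normalized)  # {'live_view_url': 'https://...', 'exit_code': 0, 'killed': False}
```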

Notes

  • Deprecated aliases: scrape_execute → interact; stop_interactive_browser and delete_scrape_browser → stop_interaction.
  • The top-level Firecrawl client exposes v2 methods directly; v1 remains under client.v1.
  • The bundled v2 OpenAPI snippet for POST /v2/scrape/{jobId}/interact may only document code; the Python SDK and server accept either code or prompt for this endpoint.

Source Of Truth

  • firecrawl/apps/python-sdk/pyproject.toml
  • firecrawl/apps/python-sdk/firecrawl/__init__.py
  • firecrawl/apps/python-sdk/firecrawl/client.py
  • firecrawl/apps/python-sdk/firecrawl/v2/client.py
  • firecrawl/apps/python-sdk/firecrawl/v2/types.py
  • firecrawl/apps/python-sdk/firecrawl/v2/methods/search.py
  • firecrawl/apps/python-sdk/firecrawl/v2/methods/scrape.py
  • firecrawl/apps/python-sdk/firecrawl/v2/utils/validation.py
  • firecrawl-docs/api-reference/v2-openapi.json