> ## Documentation Index
> Fetch the complete documentation index at: https://docs.firecrawl.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Python Source of Truth

> Canonical Firecrawl Python source of truth for agents using key endpoints like search, scrape, and interact.

Canonical Firecrawl Python source of truth for agents. Generated from SDK source (`firecrawl-py` / `firecrawl` **4.22.1**) and the v2 OpenAPI spec. Method names, parameters, and return types match the v2 client in `firecrawl/v2/client.py` unless noted.

## Install

```bash theme={null}
pip install firecrawl-py
```

## Authenticate

```py theme={null}
import os
from firecrawl import Firecrawl

client = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))
# client = Firecrawl(api_key="fc-...", api_url="https://api.firecrawl.dev")
```

## When To Use What

* `search`: use when you start with a query and need discovery.
* `scrape`: use when you already have a URL and want page content.
* `interact`: use when the page needs clicks, forms, or post-scrape browser actions.
* `support/ask`: use when a Firecrawl API call fails or returns unexpected results and you need a diagnosis.
* `support/docs-search`: use when you need to look up Firecrawl documentation.

## Search

### Why use it

Use search to discover relevant pages from a query, then pick URLs to scrape or interact with. You can constrain results to a site with `site:`, for example `site:docs.firecrawl.dev crawl webhooks`.

### Preferred SDK method

`client.search(query, **options)` → `SearchData` (`firecrawl.v2.types`)

### Return value

Returns a `SearchData` model with optional lists:

* `web` — web hits (`SearchResultWeb` or full `Document` when `scrape_options` hydrates content)
* `news` — news hits (`SearchResultNews` or `Document`)
* `images` — image hits (`SearchResultImages` or `Document`)

Omitted buckets are `None` when the API did not return that key.

**Wrong turn to avoid:** `search()` does not return `{ data: [...] }`. Do not access `result.data`. Web results are in `result.web`, news in `result.news`, images in `result.images`.

### Simple Example

```py theme={null}
results = client.search("site:docs.firecrawl.dev webhook retries")
for item in results.web or []:
    print(getattr(item, "url", None), getattr(item, "title", None))
```

### Complex Example

```py theme={null}
from firecrawl.v2.types import ScrapeOptions

results = client.search(
    "site:docs.firecrawl.dev crawl webhooks",
    sources=["web", "news"],
    categories=["research"],
    limit=10,
    tbs="qdr:m",
    location="San Francisco,California,United States",
    ignore_invalid_urls=True,
    timeout=300000,
    scrape_options=ScrapeOptions(
        formats=[
            "markdown",
            "links",
            {"type": "json", "prompt": "Extract title and key endpoints."},
        ],
        only_main_content=True,
        include_tags=["main", "article"],
        exclude_tags=["nav", "footer"],
        wait_for=1000,
        actions=[
            {"type": "click", "selector": "button#accept"},
            {"type": "wait", "milliseconds": 750},
            {"type": "scrape"},
        ],
    ),
)
```

### Parameters

* `query`
  * Type: str
  * Use when: you need a search query.
  * Notes: use `site:example.com` to limit results to a domain.

* `sources`
  * Type: list of source names or `Source` objects
  * Use when: you want to control which sources are searched.
  * Confirmed values:
    * `"web"`: web index results
    * `"news"`: news results
    * `"images"`: image results
    * `Source(type="web" | "news" | "images")`: typed source object form

* `categories`
  * Type: list of category names or `Category` objects
  * Use when: you want to filter results by category.
  * Confirmed values:
    * `"github"`: GitHub-focused results
    * `"research"`: research and academic results
    * `"pdf"`: PDF-focused results
    * `Category(type="github" | "research" | "pdf")`: typed category object form

* `limit`
  * Type: int
  * Use when: you want to cap results.
  * Notes: defaults to `5` in the SDK model.

* `tbs`
  * Type: str
  * Use when: you need a time-based filter (for example `qdr:d`, `qdr:w`, `sbd:1,qdr:m`).

* `location`
  * Type: str
  * Use when: you want localized results.

* `ignore_invalid_urls`
  * Type: bool
  * Use when: you want to drop URLs that cannot be scraped by other endpoints.

* `timeout`
  * Type: int
  * Use when: you need a request timeout in milliseconds.
  * Notes: defaults to `300000` in the SDK model.

* `scrape_options`
  * Type: `ScrapeOptions`
  * Use when: you want to scrape each search result (see Scrape parameters for fields).

## Scrape

### Why use it

Use scrape when you already have a URL and want structured content in one or more formats.

### Preferred SDK method

`client.scrape(url, **options)` → `Document` (`firecrawl.v2.types`)

### Simple Example

```py theme={null}
doc = client.scrape("https://docs.firecrawl.dev", formats=["markdown"])
```

### Complex Example

```py theme={null}
doc = client.scrape(
    "https://example.com/pricing",
    formats=[
        "markdown",
        "links",
        {"type": "json", "prompt": "Extract plan names and prices."},
        {"type": "screenshot", "full_page": True, "quality": 80, "viewport": {"width": 1280, "height": 720}},
        {"type": "changeTracking", "modes": ["git-diff"], "tag": "pricing"},
        {"type": "attributes", "selectors": [{"selector": "a", "attribute": "href"}]},
    ],
    headers={"User-Agent": "FirecrawlDocsBot/1.0"},
    only_main_content=True,
    wait_for=1000,
    parsers=[{"type": "pdf", "mode": "auto", "max_pages": 5}],
    actions=[
        {"type": "click", "selector": "#accept"},
        {"type": "wait", "milliseconds": 750},
        {"type": "scrape"},
    ],
    location={"country": "US", "languages": ["en-US"]},
    remove_base64_images=True,
    fast_mode=True,
    block_ads=True,
    proxy="auto",
    max_age=86400000,
    store_in_cache=True,
    profile={"name": "docs-session", "save_changes": True},
)
```

### Parameters

* `url`
  * Type: str
  * Use when: you want to scrape a specific page.

* `formats`
  * Type: list of format strings or format objects
  * Use when: you want multiple output formats.
  * Confirmed format strings:
    * `"markdown"`: markdown content
    * `"html"`: cleaned HTML
    * `"rawHtml"` or `"raw_html"`: raw HTML
    * `"links"`: page links
    * `"images"`: image URLs
    * `"screenshot"`: screenshot output
    * `"summary"`: summary output
    * `"changeTracking"` or `"change_tracking"`: change tracking output
    * `"attributes"`: attribute extraction
    * `"branding"`: branding profile output
    * `"audio"`: audio extraction
    * `"video"`: video extraction
  * Object-only format types:
    * `{"type": "json", ...}`: JSON extraction. Use an object, not the plain string `"json"`.
    * `{"type": "question", "question": "..."}`: question-answer output.
    * `{"type": "highlights", "query": "..."}`: relevant source-text output.
  * Format object fields:
    * `type`: one of the format strings above, or `"json"`, `"question"`, or `"highlights"` for object-only formats
    * `question`: for `type: "question"`
    * `query`: for `type: "highlights"`
    * `prompt`: optional for `type: "json"`
    * `schema`: JSON schema for `type: "json"` or for change tracking JSON mode
    * `modes`: array of `"git-diff"` or `"json"` for `type: "changeTracking"`
    * `tag`: change tracking tag for `type: "changeTracking"`
    * `full_page`, `quality`, `viewport`: screenshot options for `type: "screenshot"`
    * `selectors`: array of `{selector, attribute}` for `type: "attributes"`

* `headers`
  * Type: dict of str to str
  * Use when: you need custom request headers.

* `include_tags`
  * Type: list of str
  * Use when: you want to include only specific HTML tags.

* `exclude_tags`
  * Type: list of str
  * Use when: you want to exclude specific HTML tags.

* `only_main_content`
  * Type: bool
  * Use when: you want to strip nav, footer, and other boilerplate.

* `timeout`
  * Type: int
  * Use when: you need a timeout in milliseconds.

* `wait_for`
  * Type: int
  * Use when: you need to wait for the page to render (milliseconds).

* `mobile`
  * Type: bool
  * Use when: you want a mobile viewport.

* `parsers`
  * Type: list of parser names or parser objects
  * Use when: you need file parsing controls (for example PDF parsing).
  * Confirmed values:
    * `"pdf"`: enable PDF parsing
    * `{ "type": "pdf", "mode": "fast" | "auto" | "ocr", "max_pages": int }`: PDF parser options

* `actions`
  * Type: list of action objects
  * Use when: you need lightweight pre-scrape actions.
  * Confirmed action types:
    * `wait`: `milliseconds` or `selector` required
    * `screenshot`: `full_page`, `quality`, `viewport` optional
    * `click`: `selector` required
    * `write`: `text` required (click to focus the input first)
    * `press`: `key` required
    * `scroll`: `direction` (`up` or `down`) required, `selector` optional
    * `scrape`: no additional fields
    * `executeJavascript`: `script` required
    * `pdf`: `format` (A0, A1, A2, A3, A4, A5, A6, Letter, Legal, Tabloid, Ledger), `landscape`, `scale` optional

* `location`
  * Type: dict with `country` and `languages`
  * Use when: you need geo or language-aware scraping.

* `skip_tls_verification`
  * Type: bool
  * Use when: you need to skip TLS verification.

* `remove_base64_images`
  * Type: bool
  * Use when: you want to drop base64 images from markdown output.

* `fast_mode`
  * Type: bool
  * Use when: you want faster scrapes with reduced fidelity.

* `block_ads`
  * Type: bool
  * Use when: you want ad and cookie popup blocking.

* `proxy`
  * Type: str
  * Use when: you need proxy control.
  * Confirmed values: `"basic"`, `"stealth"`, `"enhanced"`, `"auto"`

* `max_age`
  * Type: int
  * Use when: you want cached data up to a maximum age (milliseconds).

* `min_age`
  * Type: int
  * Use when: you want cached data only if it is at least this old (milliseconds).
  * Notes: only on `ScrapeOptions` passed as `scrape_options` to `client.search`. The `client.scrape` method does not accept `min_age`.

* `store_in_cache`
  * Type: bool
  * Use when: you want Firecrawl to cache the result.

* `profile`
  * Type: dict with `name` and optional `save_changes` or `saveChanges`
  * Use when: you want a persistent browser profile shared across scrapes and interactions.

## Interact

### Why use it

Use interact when a page requires browser actions or code execution after a scrape starts.

### Preferred SDK method

`client.interact(job_id, code=None, *, prompt=None, language="node", timeout=None)`

`prompt` is keyword-only: call `client.interact(job_id, prompt="...")` or `client.interact(job_id, code="...")`. At least one of `code` or `prompt` must be non-empty (validated in `firecrawl/v2/methods/scrape.py`).

### Simple Example

```py theme={null}
doc = client.scrape("https://example.com", formats=["markdown"])
job_id = doc.metadata.scrape_id if doc.metadata else None
if not job_id:
    raise RuntimeError("Missing scrape_id from scrape response")

result = client.interact(job_id, prompt="Click the pricing tab and summarize the plans.")
```

### Complex Example

```py theme={null}
result = client.interact(
    "<scrapeJobId>",
    code="print(await page.title())",
    language="python",
    timeout=60,
)
```

### Stop session

`client.stop_interaction(job_id)` ends the scrape-bound browser session. Returns `BrowserDeleteResponse` (`success`, optional `session_duration_ms`, `credits_billed`, `error`).

### Parameters

* `job_id`
  * Type: str
  * Use when: you have a scrape job ID from `document.metadata.scrape_id`.

* `code`
  * Type: str
  * Use when: you want to run code in the browser session (optional if `prompt` is set).

* `prompt`
  * Type: str
  * Use when: you want the browser agent to follow a natural-language instruction (optional if `code` is set).

* `language`
  * Type: str
  * Use when: you need a specific runtime.
  * Confirmed values: `"python"`, `"node"`, `"bash"`
  * Notes: defaults to `"node"` in the SDK.

* `timeout`
  * Type: int
  * Use when: you need an execution timeout in seconds.

### Return value

Returns `BrowserExecuteResponse`: `success`, optional `live_view_url`, `interactive_live_view_url`, `output`, `stdout`, `result`, `stderr`, `exit_code`, `killed`, `error` (API camelCase is normalized to snake\_case on the model).

## Ask (Agentic Debugging)

### Why use it

Use ask when a Firecrawl API call fails or returns unexpected results. The AI support agent diagnoses the issue, proposes fix parameters, and optionally validates the fix against the live API. Typical latency: 15-30 seconds.

### Preferred SDK method

Not yet available in the Python SDK. Use `requests` or `httpx` directly.

### Simple Example

```py theme={null}
import requests

response = requests.post(
    "https://api.firecrawl.dev/v2/support/ask",
    headers={
        "Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={"question": "my scrape returned empty markdown for https://example.com"},
)
result = response.json()
print(result["answer"])
print(result["fixParameters"])
```

### Complex Example

```py theme={null}
import requests

response = requests.post(
    "https://api.firecrawl.dev/v2/support/ask",
    headers={
        "Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "question": "crawl returned 3 pages but I expected 50",
        "rationale": "user is on their third failed crawl attempt today",
        "context": {"userPlan": "standard", "retryCount": 3},
    },
)
result = response.json()
```

### Agent retry pattern

```py theme={null}
from firecrawl import Firecrawl
import requests, os

client = Firecrawl(api_key=os.environ["FIRECRAWL_API_KEY"])

doc = client.scrape("https://example.com/pricing", formats=["markdown"])

if not doc.markdown or len(doc.markdown) < 100:
    diagnosis = requests.post(
        "https://api.firecrawl.dev/v2/support/ask",
        headers={
            "Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "question": f"scrape returned only {len(doc.markdown or '')} chars for https://example.com/pricing",
        },
    ).json()

    if diagnosis.get("fixParameters"):
        doc = client.scrape(
            "https://example.com/pricing",
            formats=["markdown"],
            **diagnosis["fixParameters"],
        )
```

### Parameters

* `question`
  * Type: str (required, 1-8000 chars)
  * Use when: you need to describe the issue.

* `rationale`
  * Type: str (1-2000 chars)
  * Use when: you are an AI agent calling on behalf of a user.

* `context`
  * Type: dict (free-form)
  * Use when: you want to pass metadata into the debugging prompt.

## Docs Search

### Why use it

Use docs-search to look up Firecrawl documentation.

### Simple Example

```py theme={null}
import requests

FIRECRAWL_API_KEY = "fc-YOUR_API_KEY"

response = requests.post(
    "https://api.firecrawl.dev/v2/support/docs-search",
    headers={"Authorization": f"Bearer {FIRECRAWL_API_KEY}"},
    json={"question": "how do I verify webhook signatures?"},
)
print(response.json()["answer"])
```

### Parameters

* `question`
  * Type: str (required, 1-8000 chars)
  * Use when: you need a docs-grounded answer.

## Notes

* Deprecated aliases: `scrape_execute` → `interact`; `stop_interactive_browser` and `delete_scrape_browser` → `stop_interaction`.
* The top-level `Firecrawl` client exposes v2 methods directly; v1 remains under `client.v1`.
* The bundled v2 OpenAPI snippet for `POST /v2/scrape/{jobId}/interact` may only document `code`; the Python SDK and server accept either `code` or `prompt` for this endpoint.

## Source Of Truth

* `firecrawl/apps/python-sdk/pyproject.toml`
* `firecrawl/apps/python-sdk/firecrawl/__init__.py`
* `firecrawl/apps/python-sdk/firecrawl/client.py`
* `firecrawl/apps/python-sdk/firecrawl/v2/client.py`
* `firecrawl/apps/python-sdk/firecrawl/v2/types.py`
* `firecrawl/apps/python-sdk/firecrawl/v2/methods/search.py`
* `firecrawl/apps/python-sdk/firecrawl/v2/methods/scrape.py`
* `firecrawl/apps/python-sdk/firecrawl/v2/utils/validation.py`
* `firecrawl-docs/api-reference/v2-openapi.json`
