> ## Documentation Index
> Fetch the complete documentation index at: https://docs.firecrawl.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Java Source of Truth

> Canonical Firecrawl Java source of truth for agents using key endpoints like search, scrape, and interact.

This page is generated from the SDK source and the v2 OpenAPI spec.

## Install

Maven:

```xml theme={null}
<dependency>
  <groupId>com.firecrawl</groupId>
  <artifactId>firecrawl-java</artifactId>
  <version>1.2.0</version>
</dependency>
```

Gradle:

```gradle theme={null}
implementation("com.firecrawl:firecrawl-java:1.2.0")
```

## Authenticate

```java theme={null}
import com.firecrawl.client.FirecrawlClient;

FirecrawlClient client = FirecrawlClient.builder()
    .apiKey(System.getenv("FIRECRAWL_API_KEY"))
    .build();
```

## When To Use What

* `search`: use when you start with a query and need discovery.
* `scrape`: use when you already have a URL and want page content.
* `interact`: use when the page needs clicks, forms, or post-scrape browser actions.
* `support/ask`: use when a Firecrawl API call fails or returns unexpected results and you need a diagnosis.
* `support/docs-search`: use when you need to look up Firecrawl documentation.

## Search

### Why use it

Use search to discover relevant pages from a query, then pick URLs to scrape or interact with. You can constrain results to a site with `site:`, for example `site:docs.firecrawl.dev crawl webhooks`.
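
As a minimal sketch of the `site:` convention described above, the helper below builds a domain-restricted query string. `SiteQuery` and `restrictTo` are hypothetical names used for illustration; they are not part of the Firecrawl SDK.

```java
final class SiteQuery {
    private SiteQuery() {}

    /** Prefixes the terms with "site:<domain>"; a null or blank domain leaves the terms unrestricted. */
    static String restrictTo(String domain, String terms) {
        String cleanTerms = terms == null ? "" : terms.trim();
        if (domain == null || domain.isBlank()) {
            return cleanTerms;
        }
        return ("site:" + domain.trim() + " " + cleanTerms).trim();
    }

    public static void main(String[] args) {
        // Prints: site:docs.firecrawl.dev crawl webhooks
        System.out.println(restrictTo("docs.firecrawl.dev", "crawl webhooks"));
    }
}
```

The returned string can be passed as the `query` argument of `client.search(query)`.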

### Preferred SDK methods

* `client.search(query)`
* `client.search(query, options)`

### Return value

`search` returns `SearchData`. Read result buckets with `getWeb()`, `getNews()`, and `getImages()` (each is `List<Map<String, Object>>` and may be null). Do not treat `SearchData` as a directly iterable list of hits.
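
Because each bucket may be null, it can help to normalize a bucket before iterating. The sketch below takes one bucket (for example the value returned by `getWeb()`) and collects string URLs from it. The `"url"` key is an assumption about the shape of each hit map, and `SearchBuckets` is a hypothetical helper, not an SDK class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SearchBuckets {
    private SearchBuckets() {}

    /** Collects string "url" values from a bucket; a null bucket yields an empty list. */
    static List<String> urls(List<Map<String, Object>> bucket) {
        List<String> out = new ArrayList<>();
        if (bucket == null) {
            return out;
        }
        for (Map<String, Object> hit : bucket) {
            if (hit == null) {
                continue;
            }
            Object url = hit.get("url"); // assumed key name; check the actual hit shape
            if (url instanceof String) {
                out.add((String) url);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> web = List.of(
            Map.of("url", "https://docs.firecrawl.dev"),
            Map.of("title", "hit without a url")
        );
        // Prints: [https://docs.firecrawl.dev]
        System.out.println(urls(web));
    }
}
```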

### Simple Example

```java theme={null}
import com.firecrawl.models.SearchData;
import java.util.List;
import java.util.Map;

SearchData results = client.search("site:docs.firecrawl.dev webhook retries");
List<Map<String, Object>> web = results.getWeb();
```

### Complex Example

```java theme={null}
import com.firecrawl.models.JsonFormat;
import com.firecrawl.models.ScrapeOptions;
import com.firecrawl.models.SearchData;
import com.firecrawl.models.SearchOptions;
import java.util.List;

SearchOptions options = SearchOptions.builder()
    .sources(List.of("web", "news"))
    .categories(List.of("research"))
    .limit(10)
    .tbs("qdr:m")
    .location("San Francisco,California,United States")
    .ignoreInvalidURLs(true)
    .timeout(60000)
    .scrapeOptions(
        ScrapeOptions.builder()
            .formats(List.of(
                "markdown",
                "links",
                JsonFormat.builder().prompt("Extract title and key endpoints.").build()
            ))
            .onlyMainContent(true)
            .waitFor(1000)
            .build()
    )
    .build();

SearchData results = client.search("site:docs.firecrawl.dev crawl webhooks", options);
```

### Parameters

* `query`
  * Type: String
  * Use when: always; this is the text to search for.
  * Notes: use `site:example.com` to limit results to a domain.

* `options.sources`
  * Type: List of source strings or maps
  * Use when: you want to control which sources are searched.
  * Confirmed values:
    * `"web"`: web index results
    * `"news"`: news results
    * `"images"`: image results
    * `{type: "web" | "news" | "images"}`: typed source map form

* `options.categories`
  * Type: List of category strings or maps
  * Use when: you want to filter results by category.
  * Confirmed values:
    * `"github"`: GitHub-focused results
    * `"research"`: research and academic results
    * `"pdf"`: PDF-focused results
    * `{type: "github" | "research" | "pdf"}`: typed category map form

* `options.limit`
  * Type: Integer
  * Use when: you want to cap results.

* `options.tbs`
  * Type: String
  * Use when: you need a time-based filter (for example `qdr:d`, `qdr:w`, `sbd:1,qdr:m`).

* `options.location`
  * Type: String
  * Use when: you want localized results.

* `options.ignoreInvalidURLs`
  * Type: Boolean
  * Use when: you want to drop URLs that cannot be scraped by other endpoints.

* `options.timeout`
  * Type: Integer
  * Use when: you need a request timeout in milliseconds.

* `options.scrapeOptions`
  * Type: `ScrapeOptions`
  * Use when: you want to scrape each search result (see Scrape parameters for fields).

* `options.integration`
  * Type: String
  * Use when: the API expects an integration identifier on the request.
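
The typed map form of `sources` and `categories` listed above can be built with plain `java.util` collections. This is a sketch only: `SearchFilters` is a hypothetical helper, the allowed values are copied from the confirmed values in this section, and whether the builder accepts this exact list type should be checked against `SearchOptions.java`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SearchFilters {
    static final Set<String> SOURCES = Set.of("web", "news", "images");
    static final Set<String> CATEGORIES = Set.of("github", "research", "pdf");

    private SearchFilters() {}

    /** Wraps each value as {type: value}, rejecting values outside the allowed set. */
    static List<Map<String, Object>> typed(Set<String> allowed, String... values) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (String value : values) {
            if (!allowed.contains(value)) {
                throw new IllegalArgumentException("Unsupported value: " + value);
            }
            out.add(Map.of("type", value));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [{type=web}, {type=news}]
        System.out.println(typed(SOURCES, "web", "news"));
    }
}
```

Validating locally surfaces a typo such as `"image"` before the request is sent.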

## Scrape

### Why use it

Use scrape when you already have a URL and want structured content in one or more formats.

### Preferred SDK methods

* `client.scrape(url)`
* `client.scrape(url, options)`

### Return value

`scrape` returns `Document`. Typical getters include `getMarkdown()`, `getHtml()`, `getRawHtml()`, `getJson()`, `getMetadata()`, `getLinks()`, `getAudio()`, `getVideo()`, and additional fields when the corresponding formats are requested.

### Simple Example

```java theme={null}
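import com.firecrawl.models.Document;
import com.firecrawl.models.ScrapeOptions;
import java.util.List;

Document doc = client.scrape(
    "https://docs.firecrawl.dev",
    ScrapeOptions.builder().formats(List.of("markdown")).build()
);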
```

### Complex Example

```java theme={null}
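import com.firecrawl.models.Document;
import com.firecrawl.models.JsonFormat;
import com.firecrawl.models.LocationConfig;
import com.firecrawl.models.ScrapeOptions;
import java.util.List;
import java.util.Map;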

List<Map<String, Object>> actions = List.of(
    Map.of("type", "click", "selector", "#accept"),
    Map.of("type", "wait", "milliseconds", 750),
    Map.of("type", "scrape")
);

LocationConfig location = LocationConfig.builder()
    .country("US")
    .languages(List.of("en-US"))
    .build();

ScrapeOptions options = ScrapeOptions.builder()
    .formats(List.of(
        "markdown",
        "links",
        JsonFormat.builder().prompt("Extract plan names and prices.").build(),
        Map.of("type", "screenshot", "fullPage", true, "quality", 80)
    ))
    .onlyMainContent(true)
    .waitFor(1000)
    .parsers(List.of(Map.of("type", "pdf", "maxPages", 5)))
    .actions(actions)
    .location(location)
    .removeBase64Images(true)
    .blockAds(true)
    .proxy("auto")
    .maxAge(86400000L)
    .storeInCache(true)
    .build();

Document doc = client.scrape("https://example.com/pricing", options);
```
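
The example above passes `maxAge` and `waitFor` as raw millisecond literals. A small sketch using `java.time.Duration` can make those values easier to read; `MillisOptions` is a hypothetical helper, and rejecting negative durations is a local sanity check rather than a documented API rule.

```java
import java.time.Duration;

final class MillisOptions {
    private MillisOptions() {}

    /** Converts a duration to the long millisecond value used by maxAge. */
    static long maxAge(Duration duration) {
        requireNonNegative(duration);
        return duration.toMillis();
    }

    /** Converts a duration to the int millisecond value used by waitFor and timeout. */
    static int waitFor(Duration duration) {
        requireNonNegative(duration);
        return Math.toIntExact(duration.toMillis()); // throws ArithmeticException on overflow
    }

    private static void requireNonNegative(Duration duration) {
        if (duration.isNegative()) {
            throw new IllegalArgumentException("Duration must not be negative");
        }
    }

    public static void main(String[] args) {
        // Prints: 86400000
        System.out.println(maxAge(Duration.ofDays(1)));
        // Prints: 1000
        System.out.println(waitFor(Duration.ofSeconds(1)));
    }
}
```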

### Parameters

* `url`
  * Type: String
  * Use when: you want to scrape a specific page.

* `options.formats`
  * Type: List of format strings or format maps
  * Use when: you want multiple output formats.
  * Confirmed format strings:
    * `"markdown"`: markdown content
    * `"html"`: cleaned HTML
    * `"rawHtml"`: raw HTML
    * `"links"`: page links
    * `"images"`: image URLs
    * `"screenshot"`: screenshot output
    * `"summary"`: summary output
    * `"changeTracking"`: change tracking output
    * `"json"`: JSON extraction
    * `"attributes"`: attribute extraction
    * `"branding"`: branding profile output
    * `"audio"`: audio extraction
    * `"video"`: video extraction
  * Format object fields:
    * `type`: one of the format strings above
    * `prompt`, `schema`: JSON extraction options for `type: "json"`
    * `modes`, `schema`, `prompt`, `tag`: change tracking options for `type: "changeTracking"`
    * `fullPage`, `quality`, `viewport`: screenshot options for `type: "screenshot"`
    * `selectors`: array of `{selector, attribute}` for `type: "attributes"`

* `options.headers`
  * Type: `Map<String, String>`
  * Use when: you need custom request headers.

* `options.includeTags`
  * Type: `List<String>`
  * Use when: you want to include only specific HTML tags.

* `options.excludeTags`
  * Type: `List<String>`
  * Use when: you want to exclude specific HTML tags.

* `options.onlyMainContent`
  * Type: Boolean
  * Use when: you want to strip nav, footer, and other boilerplate.

* `options.timeout`
  * Type: Integer
  * Use when: you need a timeout in milliseconds.

* `options.waitFor`
  * Type: Integer
  * Use when: you need to wait for the page to render (milliseconds).

* `options.mobile`
  * Type: Boolean
  * Use when: you want a mobile viewport.

* `options.parsers`
  * Type: `List<Object>`
  * Use when: you need file parsing controls.
  * Confirmed values:
    * `"pdf"`
    * `{type: "pdf", maxPages: number}`

* `options.actions`
  * Type: `List<Map<String, Object>>`
  * Use when: you need lightweight pre-scrape actions.
  * Confirmed action types:
    * `wait`: `milliseconds` or `selector` required
    * `screenshot`: `fullPage`, `quality`, `viewport` optional
    * `click`: `selector` required
    * `write`: `text` required (click to focus the input first)
    * `press`: `key` required
    * `scroll`: `direction` (`up` or `down`) required, `selector` optional
    * `scrape`: no additional fields
    * `executeJavascript`: `script` required
    * `pdf`: `format` (A0, A1, A2, A3, A4, A5, A6, Letter, Legal, Tabloid, Ledger), `landscape`, `scale` optional

* `options.location`
  * Type: `LocationConfig`
  * Use when: you need geo or language-aware scraping.

* `options.skipTlsVerification`
  * Type: Boolean
  * Use when: you need to skip TLS verification.

* `options.removeBase64Images`
  * Type: Boolean
  * Use when: you want to drop base64 images from markdown output.

* `options.blockAds`
  * Type: Boolean
  * Use when: you want ad and cookie popup blocking.

* `options.proxy`
  * Type: String
  * Use when: you need proxy control.
  * Confirmed values: `"basic"`, `"stealth"`, `"enhanced"`, `"auto"`, or a custom proxy URL string

* `options.maxAge`
  * Type: Long
  * Use when: you will accept a cached result that is no older than this many milliseconds.

* `options.storeInCache`
  * Type: Boolean
  * Use when: you want Firecrawl to cache the result.

* `options.integration`
  * Type: String
  * Use when: the API expects an integration identifier on the request.
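
The action maps listed under `options.actions` can be assembled with small factory methods that enforce the required fields. The following is a sketch under stated assumptions: `ScrapeActions` is a hypothetical helper, not an SDK class, and the field names are taken from the confirmed action types above.

```java
import java.util.List;
import java.util.Map;

final class ScrapeActions {
    private ScrapeActions() {}

    static Map<String, Object> click(String selector) {
        return Map.of("type", "click", "selector", required(selector, "selector"));
    }

    static Map<String, Object> write(String text) {
        return Map.of("type", "write", "text", required(text, "text"));
    }

    static Map<String, Object> press(String key) {
        return Map.of("type", "press", "key", required(key, "key"));
    }

    static Map<String, Object> waitMillis(int milliseconds) {
        if (milliseconds < 0) { // local sanity check, not a documented API rule
            throw new IllegalArgumentException("milliseconds must not be negative");
        }
        return Map.of("type", "wait", "milliseconds", milliseconds);
    }

    static Map<String, Object> scroll(String direction) {
        if (!"up".equals(direction) && !"down".equals(direction)) {
            throw new IllegalArgumentException("direction must be up or down");
        }
        return Map.of("type", "scroll", "direction", direction);
    }

    static Map<String, Object> scrape() {
        return Map.of("type", "scrape");
    }

    private static String required(String value, String field) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(field + " is required");
        }
        return value;
    }

    public static void main(String[] args) {
        // Click to focus the input first, then write, as noted for the write action.
        List<Map<String, Object>> actions = List.of(
            click("#search"), write("firecrawl"), press("Enter"), waitMillis(750), scrape()
        );
        System.out.println(actions.size()); // Prints: 5
    }
}
```

The resulting list has the `List<Map<String, Object>>` shape that `options.actions` is documented to take. The `#search` selector is a placeholder.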

## Interact

### Why use it

Use interact when a page requires browser actions or code execution after a scrape starts.

### Preferred SDK methods

* `client.interact(jobId, code)`: uses the default language `node` and the API default execution timeout.
* `client.interact(jobId, code, language, timeout)`: `timeout` is in seconds (1–300); pass null to omit it and use the API default (30 seconds).
* `client.interact(jobId, code, language, timeout, origin)`: the optional `origin` string (request attribution) is sent only when non-null.

### Simple Example

```java theme={null}
import com.firecrawl.models.BrowserExecuteResponse;

BrowserExecuteResponse result = client.interact(
    "<scrapeJobId>",
    "console.log(await page.title());",
    "node",
    60
);
```

### Complex Example

```java theme={null}
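// Selectors are placeholders; `page` is the Playwright page used in the simple example.
String code = String.join("\n",
    "await page.click('#accept');",
    "await page.waitForSelector('h1');",
    "console.log(await page.title());"
);

BrowserExecuteResponse result = client.interact("<scrapeJobId>", code, "node", 120);
if (Boolean.TRUE.equals(result.getKilled())) {
    System.err.println("Execution hit the 120-second timeout");
}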
```

### Stop interactive browser

End the scrape-bound browser session when finished.

**Preferred SDK method:** `client.stopInteractiveBrowser(jobId)`

```java theme={null}
import com.firecrawl.models.BrowserDeleteResponse;

BrowserDeleteResponse stopped = client.stopInteractiveBrowser("<scrapeJobId>");
```

### Interact response (`BrowserExecuteResponse`)

* `isSuccess()`: boolean
* `getStdout()`, `getStderr()`, `getResult()`, `getError()`: String (may be null)
* `getExitCode()`: Integer (may be null)
* `getKilled()`: Boolean (may be null); true when execution was stopped due to timeout
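
Since several of these getters may return null, a null-safe check avoids unboxing errors. The sketch below classifies an outcome from the values of `isSuccess()`, `getExitCode()`, and `getKilled()`. `InteractOutcome` is a hypothetical helper, and treating a non-zero exit code as a failure is an assumption rather than documented behavior.

```java
final class InteractOutcome {
    enum Status { OK, TIMED_OUT, FAILED }

    private InteractOutcome() {}

    /** Classifies a response from its success flag, nullable exit code, and nullable killed flag. */
    static Status classify(boolean success, Integer exitCode, Boolean killed) {
        if (Boolean.TRUE.equals(killed)) { // null-safe: a null killed flag is treated as false
            return Status.TIMED_OUT;
        }
        if (!success) {
            return Status.FAILED;
        }
        if (exitCode != null && exitCode != 0) { // assumption: non-zero exit code means failure
            return Status.FAILED;
        }
        return Status.OK;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, 0, null));     // Prints: OK
        System.out.println(classify(false, null, true)); // Prints: TIMED_OUT
    }
}
```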

### Stop response (`BrowserDeleteResponse`)

* `isSuccess()`: boolean
* `getSessionDurationMs()`: Long (may be null)
* `getCreditsBilled()`: Integer (may be null)
* `getError()`: String (may be null)

### Parameters

* `jobId`
  * Type: String
  * Use when: you have a scrape job ID.

* `code`
  * Type: String
  * Use when: you want to run code in the browser session.

* `language`
  * Type: String
  * Use when: you need a specific runtime.
  * Confirmed values: `"python"`, `"node"`, `"bash"`
  * Notes: defaults to `"node"` in the SDK if null.

* `timeout`
  * Type: Integer
  * Use when: you need an execution timeout in seconds (1–300). Null omits the field and uses the API default.

* `origin`
  * Type: String
  * Use when: you need an optional origin label on the request. Prefer omitting unless your integration requires it.
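
The `language` and `timeout` constraints above can be checked locally before calling `interact`. This is a minimal sketch: `InteractArgs` is a hypothetical helper, the allowed languages and the 1–300 second range are taken from this section, and the `node` default mirrors the SDK behavior described in the notes for `language`.

```java
import java.util.Set;

final class InteractArgs {
    static final Set<String> LANGUAGES = Set.of("python", "node", "bash");

    private InteractArgs() {}

    /** Returns "node" for null, otherwise the language if it is one of the confirmed values. */
    static String language(String language) {
        if (language == null) {
            return "node";
        }
        if (!LANGUAGES.contains(language)) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        return language;
    }

    /** Returns null unchanged (API default applies), otherwise a timeout within 1 to 300 seconds. */
    static Integer timeoutSeconds(Integer timeout) {
        if (timeout == null) {
            return null;
        }
        if (timeout < 1 || timeout > 300) {
            throw new IllegalArgumentException("timeout must be between 1 and 300 seconds");
        }
        return timeout;
    }

    public static void main(String[] args) {
        System.out.println(language(null));     // Prints: node
        System.out.println(timeoutSeconds(60)); // Prints: 60
    }
}
```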

## Notes

* Deprecated aliases: `scrapeExecute` → `interact`, `deleteScrapeBrowser` → `stopInteractiveBrowser` (and the corresponding `*Async` helpers).
* The Java SDK exposes code-based interactions only: there is no `prompt` parameter on `interact` (unlike some other language SDKs).

## Source Of Truth

* `firecrawl/apps/java-sdk/build.gradle.kts`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/client/FirecrawlClient.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/SearchOptions.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/ScrapeOptions.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/SearchData.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/JsonFormat.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/LocationConfig.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/Document.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/BrowserExecuteResponse.java`
* `firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/BrowserDeleteResponse.java`
* `firecrawl-docs/api-reference/v2-openapi.json`
