Firecrawl Java Source of Truth

This file is the canonical public source of truth for agents. It is generated from the Firecrawl Java SDK source and the v2 OpenAPI spec.

Install

Maven:
<dependency>
  <groupId>com.firecrawl</groupId>
  <artifactId>firecrawl-java</artifactId>
  <version>1.2.0</version>
</dependency>

Gradle:
implementation("com.firecrawl:firecrawl-java:1.2.0")

Authenticate

import com.firecrawl.client.FirecrawlClient;

FirecrawlClient client = FirecrawlClient.builder()
    .apiKey(System.getenv("FIRECRAWL_API_KEY"))
    .build();
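
System.getenv returns null when the variable is unset, and what the builder does with a null key is not specified here. A minimal sketch of a fail-fast check that could run before building the client follows. ApiKeys and require are hypothetical names, not part of the Firecrawl SDK, and the key value in main is a placeholder.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the Firecrawl SDK.
final class ApiKeys {
    private ApiKeys() {}

    /** Returns the trimmed value of the named variable, or throws if it is missing or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(name + " is not set");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Simulated environment with a placeholder key; real code would pass System.getenv().
        Map<String, String> env = Map.of("FIRECRAWL_API_KEY", "fc-example-key");
        String apiKey = require(env, "FIRECRAWL_API_KEY");
        System.out.println("API key loaded: " + !apiKey.isEmpty());
    }
}
```

The returned string can then be passed to .apiKey(...) in place of the direct System.getenv call.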

When To Use What

  • search: start from a query to discover relevant URLs, including site: constraints.
  • scrape: fetch and extract content from a known URL.
  • interact: run browser actions or code when a page needs clicks, forms, or post-scrape steps.

Search

Why use it

Use search when you need discovery from a query and want Firecrawl to return URLs (and optionally scrape each result). You can constrain results to a domain with the standard site: operator, for example site:docs.firecrawl.dev crawl webhooks.
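
A site-constrained query is an ordinary string, so it can be assembled before calling search. The sketch below is one way to do that. SiteQuery and build are hypothetical names, not SDK methods, and the whitespace check is an assumption about what counts as a usable domain.

```java
// Hypothetical helper for illustration; not part of the Firecrawl SDK.
final class SiteQuery {
    private SiteQuery() {}

    /** Builds "site:<domain> <terms>", or just "site:<domain>" when terms is blank. */
    static String build(String domain, String terms) {
        String d = domain.trim();
        String t = terms.trim();
        // A domain with embedded whitespace would split into separate query terms.
        if (d.isEmpty() || d.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("domain must be a single non-empty token");
        }
        return t.isEmpty() ? "site:" + d : "site:" + d + " " + t;
    }

    public static void main(String[] args) {
        // prints: site:docs.firecrawl.dev crawl webhooks
        System.out.println(build("docs.firecrawl.dev", "crawl webhooks"));
    }
}
```

The resulting string is what would be passed as the query argument.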

Preferred SDK method

client.search(query, options)

Simple Example

SearchData results = client.search("site:docs.firecrawl.dev webhook retries");

Complex Example

import java.util.List;

import com.firecrawl.models.SearchOptions;
import com.firecrawl.models.ScrapeOptions;
import com.firecrawl.models.JsonFormat;

SearchOptions options = SearchOptions.builder()
    .sources(List.of("web", "news"))
    .categories(List.of("research"))
    .limit(10)
    .tbs("qdr:m")
    .location("San Francisco,California,United States")
    .ignoreInvalidURLs(true)
    .timeout(60000)
    .scrapeOptions(
        ScrapeOptions.builder()
            .formats(List.of(
                "markdown",
                "links",
                JsonFormat.builder().prompt("Extract title and key endpoints.").build()
            ))
            .onlyMainContent(true)
            .waitFor(1000)
            .build()
    )
    .build();

SearchData results = client.search("site:docs.firecrawl.dev crawl webhooks", options);

Parameters

| Parameter | Type / Options | Description |
| --- | --- | --- |
| query | String | Search query. Use site:domain to restrict results to a site. |
| options.sources | List<Object> ("web", "news", "images" or {type: "web"} maps) | Sources to search. |
| options.categories | List<Object> ("github", "research", "pdf") | Categories to filter results. |
| options.limit | Integer | Maximum number of results to return. |
| options.tbs | String | Time-based search filter (for example qdr:d, qdr:w, sbd:1,qdr:m). |
| options.location | String | Location string for search results. |
| options.ignoreInvalidURLs | Boolean | Exclude invalid URLs that other Firecrawl endpoints cannot scrape. |
| options.timeout | Integer | Timeout in milliseconds. |
| options.scrapeOptions | ScrapeOptions | Scrape options applied to each result (same fields as scrape). |
| options.integration | String | Internal only. Reserved for Firecrawl instrumentation. |

Scrape

Why use it

Use scrape when you already have a URL and want Firecrawl to return its content in one or more formats, such as markdown, links, JSON, or a screenshot.

Preferred SDK method

client.scrape(url, options)

Simple Example

Document doc = client.scrape(
    "https://docs.firecrawl.dev",
    ScrapeOptions.builder()
        .formats(List.of("markdown"))
        .build()
);

Complex Example

import java.util.List;
import java.util.Map;

import com.firecrawl.models.LocationConfig;
import com.firecrawl.models.ScrapeOptions;

ScrapeOptions options = ScrapeOptions.builder()
    .formats(List.of(
        "markdown",
        "links",
        Map.of("type", "json", "prompt", "Extract plan names and prices."),
        Map.of("type", "screenshot", "fullPage", true, "quality", 80, "viewport", Map.of("width", 1280, "height", 720)),
        Map.of("type", "changeTracking", "modes", List.of("git-diff"), "tag", "pricing")
    ))
    .headers(Map.of("User-Agent", "FirecrawlDocsBot/1.0"))
    .includeTags(List.of("main", "article"))
    .excludeTags(List.of("nav", "footer"))
    .onlyMainContent(true)
    .waitFor(1000)
    .mobile(false)
    .parsers(List.of(Map.of("type", "pdf", "mode", "auto", "maxPages", 5)))
    .actions(List.of(
        Map.of("type", "click", "selector", "#accept"),
        Map.of("type", "write", "text", "firecrawl"),
        Map.of("type", "press", "key", "Enter"),
        Map.of("type", "wait", "milliseconds", 1500),
        Map.of("type", "scrape")
    ))
    .location(LocationConfig.builder().country("US").languages(List.of("en-US")).build())
    .skipTlsVerification(true) // disables TLS certificate checks; enable only for hosts you trust
    .removeBase64Images(true)
    .blockAds(true)
    .proxy("auto")
    .maxAge(86400000L)
    .storeInCache(true)
    .build();

Document doc = client.scrape("https://example.com/pricing", options);
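
The action entries in the example above are plain maps keyed by "type", so small factory methods can keep the keys consistent and catch typos at compile time. This is a sketch only. The Actions class and its method names are hypothetical, and the map keys are copied from the example rather than from a verified schema.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helpers for illustration; not part of the Firecrawl SDK.
final class Actions {
    private Actions() {}

    static Map<String, Object> click(String selector) {
        return Map.of("type", "click", "selector", selector);
    }

    static Map<String, Object> write(String text) {
        return Map.of("type", "write", "text", text);
    }

    static Map<String, Object> press(String key) {
        return Map.of("type", "press", "key", key);
    }

    static Map<String, Object> waitMs(int milliseconds) {
        if (milliseconds < 0) {
            throw new IllegalArgumentException("milliseconds must be >= 0");
        }
        return Map.of("type", "wait", "milliseconds", milliseconds);
    }

    static Map<String, Object> scrape() {
        return Map.of("type", "scrape");
    }

    public static void main(String[] args) {
        // Same sequence as the example: accept, type, submit, wait, scrape.
        List<Map<String, Object>> actions = List.of(
            click("#accept"), write("firecrawl"), press("Enter"), waitMs(1500), scrape());
        System.out.println(actions.size()); // prints: 5
    }
}
```

A list built this way has the List<Map<String, Object>> shape that the actions option is documented to take.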

Parameters

| Parameter | Type / Options | Description |
| --- | --- | --- |
| url | String | URL to scrape. |
| options.formats | List<Object> | Output formats to include. Strings or format objects such as {type: "json", prompt, schema} or {type: "screenshot", fullPage, quality, viewport}. |
| options.headers | Map<String, String> | Custom headers. |
| options.includeTags | List<String> | Tags to include. |
| options.excludeTags | List<String> | Tags to exclude. |
| options.onlyMainContent | Boolean | Keep only the main content. |
| options.timeout | Integer | Timeout in milliseconds. |
| options.waitFor | Integer | Wait time in milliseconds before scraping. |
| options.mobile | Boolean | Emulate a mobile device. |
| options.parsers | List<Object> | Parser configuration (PDF options supported). |
| options.actions | List<Map<String, Object>> | Browser actions to run before scraping. |
| options.location | LocationConfig | Location settings for the request. |
| options.skipTlsVerification | Boolean | Skip TLS certificate verification. |
| options.removeBase64Images | Boolean | Strip base64 images from markdown output. |
| options.blockAds | Boolean | Enable ad and cookie popup blocking. |
| options.proxy | String ("basic", "stealth", "enhanced", "auto", or a custom URL) | Proxy mode. |
| options.maxAge | Long | Prefer cached data up to this age. |
| options.storeInCache | Boolean | Store results in cache. |
| options.integration | String | Internal only. Reserved for Firecrawl instrumentation. |
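
Several scrape options take raw millisecond counts, which are easy to misread as literals. The sketch below derives them from java.time.Duration instead. Millis is a hypothetical helper, not part of the SDK. The table states milliseconds for timeout and waitFor; treating maxAge as milliseconds is an assumption, based on the example value 86400000L equalling one day in milliseconds.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of the Firecrawl SDK.
final class Millis {
    private Millis() {}

    /** For Integer-typed options such as timeout and waitFor; throws if the value overflows int. */
    static int toInt(Duration d) {
        return Math.toIntExact(toLong(d));
    }

    /** For Long-typed options such as maxAge (unit assumed to be milliseconds). */
    static long toLong(Duration d) {
        if (d.isNegative()) {
            throw new IllegalArgumentException("duration must not be negative");
        }
        return d.toMillis();
    }

    public static void main(String[] args) {
        System.out.println(toInt(Duration.ofSeconds(1))); // prints: 1000
        System.out.println(toLong(Duration.ofDays(1)));   // prints: 86400000
    }
}
```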

Interact

Why use it

Use interact when a page requires browser actions or code execution after a scrape starts (clicks, forms, or scripted steps).

Preferred SDK method

client.interact(jobId, code, language, timeout, origin)

Simple Example

Document doc = client.scrape("https://example.com");
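Object scrapeId = doc.getMetadata() != null ? doc.getMetadata().get("scrapeId") : null;
// String.valueOf on a null Object yields the text "null", so convert only a non-null value.
String jobId = scrapeId != null ? scrapeId.toString() : null;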
if (jobId == null || jobId.isBlank()) {
    throw new IllegalStateException("Missing scrapeId from scrape response");
}

BrowserExecuteResponse result = client.interact(jobId, "console.log(document.title)");
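
One detail worth care when reading scrapeId out of a metadata map: String.valueOf applied to a null Object returns the four-character string "null", which is neither null nor blank and so slips past an isBlank check. The sketch below isolates the lookup. ScrapeIds is a hypothetical helper, and it assumes the metadata behaves like a Map<String, Object>.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of the Firecrawl SDK.
final class ScrapeIds {
    private ScrapeIds() {}

    /** Returns the scrapeId as a non-blank string, or empty when it is absent, null, or blank. */
    static Optional<String> fromMetadata(Map<String, Object> metadata) {
        if (metadata == null) {
            return Optional.empty();
        }
        Object raw = metadata.get("scrapeId");
        if (raw == null) {
            return Optional.empty();
        }
        String id = raw.toString().trim();
        return id.isEmpty() ? Optional.empty() : Optional.of(id);
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("scrapeId", null);
        System.out.println(fromMetadata(metadata).isPresent()); // prints: false

        metadata.put("scrapeId", "abc-123"); // placeholder ID
        System.out.println(fromMetadata(metadata).orElse("missing")); // prints: abc-123
    }
}
```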

Complex Example

BrowserExecuteResponse result = client.interact(
    "<scrapeJobId>",
    "console.log(await page.title());",
    "node",
    60,
    null
);

Parameters

| Parameter | Type / Options | Description |
| --- | --- | --- |
| jobId | String | Scrape job ID from document.metadata.scrapeId. |
| code | String | Code to execute in the browser session. |
| language | String ("python", "node", "bash") | Execution language. Defaults to node when omitted. |
| timeout | Integer | Execution timeout in seconds (1-300). |
| origin | String | Internal only. Reserved for Firecrawl instrumentation. |
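
Unlike the search and scrape timeouts, the interact timeout is given in seconds and limited to 1-300. A sketch of a range check applied before the call follows. InteractTimeouts is a hypothetical helper, and whether the SDK performs the same validation itself is not stated here.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of the Firecrawl SDK.
final class InteractTimeouts {
    static final int MIN_SECONDS = 1;
    static final int MAX_SECONDS = 300;

    private InteractTimeouts() {}

    /** Converts a duration to whole seconds (fraction dropped) and enforces the documented range. */
    static int toSeconds(Duration d) {
        long seconds = d.getSeconds();
        if (seconds < MIN_SECONDS || seconds > MAX_SECONDS) {
            throw new IllegalArgumentException(
                "timeout must be between " + MIN_SECONDS + " and " + MAX_SECONDS + " seconds");
        }
        return (int) seconds;
    }

    public static void main(String[] args) {
        System.out.println(toSeconds(Duration.ofMinutes(1))); // prints: 60
    }
}
```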

Notes

  • The Java SDK interact API only supports code. Other SDKs expose prompt as a convenience.
  • Deprecated aliases: scrapeExecute maps to interact, and deleteScrapeBrowser maps to stopInteractiveBrowser.

Source Of Truth

  • firecrawl/apps/java-sdk/build.gradle.kts
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/client/FirecrawlClient.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/SearchOptions.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/ScrapeOptions.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/LocationConfig.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/JsonFormat.java
  • firecrawl/apps/java-sdk/src/main/java/com/firecrawl/models/Document.java
  • firecrawl-docs/api-reference/v2-openapi.json