Learn how to use Firecrawl with CrewAI
FIRECRAWL_API_KEY
).crewai[tools]
package:api_key
: Optional. Specifies Firecrawl API key. Defaults is the FIRECRAWL_API_KEY
environment variable.url
: The base URL to start crawling from.page_options
: Optional.
onlyMainContent
: Optional. Only return the main content of the page excluding headers, navs, footers, etc.includeHtml
: Optional. Include the raw HTML content of the page. Will output a html key in the response.crawler_options
: Optional. Options for controlling the crawling behavior.
includes
: Optional. URL patterns to include in the crawl.exclude
: Optional. URL patterns to exclude from the crawl.generateImgAltText
: Optional. Generate alt text for images using LLMs (requires a paid plan).returnOnlyUrls
: Optional. If true, returns only the URLs as a list in the crawl status. Note: the response will be a list of URLs inside the data, not a list of documents.maxDepth
: Optional. Maximum depth to crawl. Depth 1 is the base URL, depth 2 includes the base URL and its direct children, and so on.mode
: Optional. The crawling mode to use. Fast mode crawls 4x faster on websites without a sitemap but may not be as accurate and shouldn’t be used on heavily JavaScript-rendered websites.limit
: Optional. Maximum number of pages to crawl.timeout
: Optional. Timeout in milliseconds for the crawling operation.api_key
: Optional. Specifies Firecrawl API key. Defaults is the FIRECRAWL_API_KEY
environment variable.url
: The URL to scrape.page_options
: Optional.
onlyMainContent
: Optional. Only return the main content of the page excluding headers, navs, footers, etc.includeHtml
: Optional. Include the raw HTML content of the page. Will output a html key in the response.extractor_options
: Optional. Options for LLM-based extraction of structured information from the page content
mode
: The extraction mode to use, currently supports ‘llm-extraction’extractionPrompt
: Optional. A prompt describing what information to extract from the pageextractionSchema
: Optional. The schema for the data to be extractedtimeout
: Optional. Timeout in milliseconds for the requestapi_key
: Optional. Specifies Firecrawl API key. Defaults is the FIRECRAWL_API_KEY
environment variable.query
: The search query string to be used for searching.page_options
: Optional. Options for result formatting.
onlyMainContent
: Optional. Only return the main content of the page excluding headers, navs, footers, etc.includeHtml
: Optional. Include the raw HTML content of the page. Will output a html key in the response.fetchPageContent
: Optional. Fetch the full content of the page.search_options
: Optional. Options for controlling the crawling behavior.
limit
: Optional. Maximum number of pages to crawl.