SDKs
Llamaindex
Firecrawl integrates with LlamaIndex as a document reader.
Installation
pip install firecrawl-py
Usage
Using FireCrawl to Gather an Entire Website
from llama_index.readers.web import FireCrawlWebReader
# Initialize FireCrawlWebReader to crawl a website
firecrawl_reader = FireCrawlWebReader(
api_key="<your_api_key>", # Replace with your actual API key from https://www.firecrawl.dev/
mode="scrape", # Choose between "crawl" and "scrape" for single page scraping
params={"additional": "parameters"} # Optional additional parameters
)
# Load documents from a single page URL
documents = firecrawl_reader.load_data(url="http://paulgraham.com/")
index = SummaryIndex.from_documents(documents)
# Set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))
Using FireCrawl to Gather a Single Page
from llama_index.readers.web import FireCrawlWebReader
# Initialize the FireCrawlWebReader with your API key and desired mode
firecrawl_reader = FireCrawlWebReader(
api_key="<your_api_key>", # Replace with your actual API key from https://www.firecrawl.dev/
mode="scrape", # Choose between "crawl" and "scrape"
params={"additional": "parameters"} # Optional additional parameters
)
# Load documents from a specified URL
documents = firecrawl_reader.load_data(url="http://paulgraham.com/worked.html")
index = SummaryIndex.from_documents(documents)
# Set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))