Note: this example uses the v0 version of the Firecrawl API. To follow along, install version 0.0.20 of the Python SDK or version 0.0.36 of the Node SDK.
Setup
Install our Python dependencies, including groq and firecrawl-py.
Getting your Groq and Firecrawl API Keys
To use Groq and Firecrawl, you will need to get your API keys. You can get your Groq API key from here and your Firecrawl API key from here.
Load website with Firecrawl
To get all the data from a web page in the cleanest possible format, we will use Firecrawl. It handles bypassing JS-blocked websites, extracting the main content, and outputting it in an LLM-readable format for increased accuracy. We will scrape a website URL with Firecrawl and set a pageOptions parameter so that only the main content of the page is extracted (onlyMainContent: True), excluding the navs, footers, etc.
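The scrape step described above could look roughly like the sketch below. It assumes the v0 Python SDK (firecrawl-py 0.0.20), where FirecrawlApp.scrape_url accepts a params dictionary. The helper names build_scrape_params and scrape_page, the FIRECRAWL_API_KEY environment variable, and the example URL are illustrative choices, not part of the original guide.

```python
import os


def build_scrape_params(only_main_content: bool = True) -> dict:
    """Build a v0-style params dict (hypothetical helper).

    pageOptions.onlyMainContent asks Firecrawl to drop navs, footers, etc.
    """
    return {"pageOptions": {"onlyMainContent": only_main_content}}


def scrape_page(url: str, api_key: str) -> dict:
    """Scrape one page with Firecrawl (assumes firecrawl-py==0.0.20)."""
    # Imported lazily so build_scrape_params works without the SDK installed.
    from firecrawl import FirecrawlApp

    app = FirecrawlApp(api_key=api_key)
    return app.scrape_url(url, params=build_scrape_params())


if __name__ == "__main__":
    # Assumption: the key is stored in an environment variable of this name.
    api_key = os.environ.get("FIRECRAWL_API_KEY")
    if api_key:
        # Placeholder URL; replace it with the page you want to scrape.
        page_content = scrape_page("https://example.com", api_key)
        print(page_content)
    else:
        print("Set FIRECRAWL_API_KEY to run the scrape.")
```

Reading the key from the environment keeps it out of the source file. Passing it directly as a string to FirecrawlApp would work just as well for a quick experiment.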