# Build with AI
Source: https://docs.firecrawl.dev/ai-onboarding

Everything you need to onboard your AI agent to Firecrawl.

If you're developing with AI, Firecrawl offers several resources to improve your experience.

Firecrawl ships with **skills**: self-contained knowledge packs that AI coding agents discover and use automatically. One install command gives agents three complete skill segments: CLI skills for live web work, build skills for integrating Firecrawl into application code, and workflow skills for producing repeatable deliverables. Agents like Claude Code, Cursor, Antigravity, and OpenCode can self-onboard with a single command, with no human setup required once the API key exists.

* [Prerequisite: Create an API Key](#prerequisite-create-an-api-key)
* [Skills + CLI](#skills-cli)
* [Using Firecrawl as a Tool](#using-firecrawl-as-a-tool)
* [Agentic Debugging](#agentic-debugging)
* [Firecrawl MCP Server](#firecrawl-mcp-server)
* [Firecrawl Docs for Agents](#firecrawl-docs-for-agents)
* [Quick Start Guides](#quick-start-guides)
* [Agent Harnesses](#agent-harnesses)
* [SDKs](#sdks)

## Prerequisite: Create an API Key

Currently, we require a human to create a Firecrawl account. Once you have an account, you'll need to [create an API key](https://www.firecrawl.dev/app/api-keys). With an API key, your agent can handle the rest: installing the skills, authenticating the CLI, wiring up MCP, and making calls on your behalf.

Sign up and grab an API key to start using Firecrawl.

## Skills + CLI

The [Firecrawl CLI](/sdks/cli) lets your agent search, scrape, interact, crawl, map, extract, and run agent jobs from the terminal. It's built for humans, AI agents, and CI/CD pipelines.

The Firecrawl **skills** are self-contained knowledge packs that AI coding agents like Claude Code, Antigravity, and OpenCode discover and use automatically.
A single install command sets up everything (the CLI tools for live web work, the build skills for integrating Firecrawl into application code, and the workflow skills for producing repeatable deliverables):

```bash
npx -y firecrawl-cli@latest init --all --browser
```

* `--all` installs every Firecrawl skill segment (CLI, build, workflows) to every detected AI coding agent on the machine
* `--browser` opens the browser for Firecrawl authentication automatically

After install, verify everything is working:

```bash
firecrawl --status
firecrawl scrape "https://firecrawl.dev"
```

To reinstall or scope to a specific agent later:

```bash
firecrawl setup skills      # CLI + build skills
firecrawl setup workflows   # workflow skills
```

### What the install gives you

The install sets up three categories of skills that cover every way an agent uses Firecrawl. Each segment lives in its own repo so it can evolve independently:

* [`firecrawl/cli`](https://github.com/firecrawl/cli): CLI skills for live web work
* [`firecrawl/skills`](https://github.com/firecrawl/skills): build skills for app integration
* [`firecrawl/firecrawl-workflows`](https://github.com/firecrawl/firecrawl-workflows): workflow skills for repeatable deliverables

**CLI skills** for live web work during an agent session:

| Skill                | Purpose                                           |
| -------------------- | ------------------------------------------------- |
| `firecrawl/cli`      | Overall CLI command workflow                      |
| `firecrawl-search`   | Search the web and discover pages                 |
| `firecrawl-scrape`   | Extract clean content from a known URL            |
| `firecrawl-interact` | Interact with scraped pages using prompts or code |
| `firecrawl-crawl`    | Bulk-extract content from an entire site          |
| `firecrawl-map`      | Discover all URLs on a domain                     |
| `firecrawl-agent`    | Run autonomous web data gathering with a job      |

**Build skills** for integrating Firecrawl into application code:

| Skill                        | Purpose                                              |
| ---------------------------- | ---------------------------------------------------- |
| `firecrawl-build`            | Choose the right Firecrawl endpoint for your product |
| `firecrawl-build-onboarding` | Auth and project setup                               |
| `firecrawl-build-scrape`     | Implement scraping in app code                       |
| `firecrawl-build-search`     | Implement search in app code                         |
| `firecrawl-build-interact`   | Implement page interaction in app code               |
| `firecrawl-build-crawl`      | Implement crawling in app code                       |
| `firecrawl-build-map`        | Implement URL discovery in app code                  |
| `firecrawl-build-parse`      | Implement document parsing in app code               |

**Workflow skills**, outcome-focused skills that produce a concrete deliverable from Firecrawl web data:

| Skill                            | Outcome                                                               |
| -------------------------------- | --------------------------------------------------------------------- |
| `firecrawl-workflows`            | Umbrella skill for choosing the right workflow                        |
| `firecrawl-deep-research`        | Multi-source sourced research reports                                 |
| `firecrawl-seo-audit`            | Site maps, on-page SEO checks, SERP comparison, and prioritized fixes |
| `firecrawl-lead-research`        | Pre-meeting company and person intelligence briefs                    |
| `firecrawl-lead-gen`             | Prospect list generation from databases and directories               |
| `firecrawl-qa`                   | Live-site QA reports with issues and reproduction steps               |
| `firecrawl-competitive-intel`    | Recurring pricing, feature, and changelog monitoring                  |
| `firecrawl-market-research`      | Market, financial, earnings, and industry research                    |
| `firecrawl-research-papers`      | Literature reviews from papers, PDFs, and whitepapers                 |
| `firecrawl-company-directories`  | Directory extraction into structured company lists                    |
| `firecrawl-dashboard-reporting`  | Metrics extraction from dashboards and internal web tools             |
| `firecrawl-knowledge-base`       | LLM-ready reference docs, RAG chunks, training data, or docs mirrors  |
| `firecrawl-knowledge-ingest`     | Auth-gated or JS-heavy docs portal ingestion                          |
| `firecrawl-demo-walkthrough`     | Product flow walkthroughs and UX teardown reports                     |
| `firecrawl-shop`                 | Product research and shopping recommendations                         |
| `firecrawl-website-design-clone` | Extract a website's design system into an agent-ready `DESIGN.md`     |

### Choose your path

All three skill categories use the same install. The difference is what happens next:

Use this when you need web data during your current session: searching the web, scraping known URLs, interacting with scraped pages, crawling docs, mapping a site, or running an agent job.

The default flow:

1. Start with **search** when you need discovery
2. Move to **scrape** when you have a URL
3. Use **interact** when the scraped page needs follow-up actions
4. Use **map** or **crawl** when you need many URLs or pages
5. Use **agent** when the task is open-ended and needs autonomous discovery

```bash
# Search the web
firecrawl search "best open-source web crawlers"

# Scrape a page into clean markdown
firecrawl scrape https://docs.firecrawl.dev

# Crawl a whole site
firecrawl crawl https://docs.firecrawl.dev
```

Use this when you're building an application, agent, or workflow that calls the Firecrawl API from code. The build skills help with picking the right endpoint, wiring up the SDK, and running a smoke test.

The agent answers one key question (*what should Firecrawl do in the product?*) and the build skills route to `/search`, `/scrape`, `/interact`, `/parse`, `/crawl`, `/map`, or `/agent` accordingly.

Use this when the goal is a finished artifact (a research report, SEO audit, QA report, lead list, knowledge base, competitive intel digest, or a cloned design system) rather than raw web data or product code.

Workflow skills infer from context first and only ask short clarifying questions when an input would block the work. They also call out independently parallelizable units so sub-agents can fan out across competitors, pages, or sources.
Pick a workflow directly, or let the umbrella `firecrawl-workflows` skill route the request:

```bash
# Multi-source research brief on a topic
"Use firecrawl-deep-research to write a brief on AI agent frameworks"

# Pre-meeting intelligence for a sales call
"Use firecrawl-lead-research to brief me on stripe.com before my 3pm call"

# On-page SEO audit with prioritized fixes
"Use firecrawl-seo-audit on https://example.com"

# Clone a site's design system into DESIGN.md
"Use firecrawl-website-design-clone on https://linear.app"
```

If you prefer not to install anything, agents can call the Firecrawl REST API directly. Set the API key and hit the endpoints:

* `POST https://api.firecrawl.dev/v2/search`: discover pages by query
* `POST https://api.firecrawl.dev/v2/scrape`: extract clean markdown from a URL
* `POST https://api.firecrawl.dev/v2/interact`: interact with a scraped page
* `POST https://api.firecrawl.dev/v2/crawl`: bulk-extract an entire site
* `POST https://api.firecrawl.dev/v2/map`: discover URLs on a domain
* `POST https://api.firecrawl.dev/v2/agent`: run autonomous web data gathering

Auth header: `Authorization: Bearer fc-YOUR_API_KEY`

The full onboarding definition is available at [`firecrawl.dev/agent-onboarding/SKILL.md`](https://www.firecrawl.dev/agent-onboarding/SKILL.md), and agents can fetch it directly for self-onboarding.

* Live web work during an agent session: search, scrape, interact, map, crawl, and run agent jobs from the terminal.
* Integrate Firecrawl into application code: pick the right endpoint, wire up the SDK, and ship a verified integration.
* Produce repeatable deliverables: research briefs, SEO audits, QA reports, lead lists, knowledge bases, and design clones.

## Using Firecrawl as a Tool

Firecrawl gives agents five core tools for working with the web. Each tool maps to an API endpoint and a CLI command. Agents pick the right tool based on what they need:

Start here when you don't have a URL yet.
Search returns relevant web pages for a natural-language query, with optional full-page content included in the results.

```bash
# CLI
firecrawl search "latest OpenAI API pricing"
```

```bash
# REST API
curl -X POST https://api.firecrawl.dev/v2/search \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "latest OpenAI API pricing"}'
```

**When to use:** Research tasks, finding documentation, competitive analysis, answering questions that require up-to-date web information.

Use this when you already have a URL and need clean, LLM-ready content. Scrape converts any web page into markdown, HTML, or structured data, handling JavaScript rendering, anti-bot measures, and messy HTML automatically.

```bash
# CLI
firecrawl scrape https://docs.stripe.com/api/charges
```

```bash
# REST API
curl -X POST https://api.firecrawl.dev/v2/scrape \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.stripe.com/api/charges"}'
```

**When to use:** Reading documentation, extracting article content, pulling data from a known page, converting web pages to context for LLMs.

Crawl recursively follows links from a starting URL and scrapes every page it finds. Use it when you need content from an entire site or documentation set, not just a single page.

```bash
# CLI
firecrawl crawl https://docs.firecrawl.dev --limit 50
```

```bash
# REST API
curl -X POST https://api.firecrawl.dev/v2/crawl \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.firecrawl.dev", "limit": 50}'
```

**When to use:** Ingesting full documentation sites, building knowledge bases, migrating content, training data collection.

Map rapidly discovers every indexed URL on a domain without scraping the content.
Use it when you need to understand a site's structure or find specific pages before scraping them.

```bash
# CLI
firecrawl map https://docs.firecrawl.dev
```

```bash
# REST API
curl -X POST https://api.firecrawl.dev/v2/map \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.firecrawl.dev"}'
```

**When to use:** Site audits, finding specific pages on a large site, understanding site structure before a targeted crawl.

Interact lets agents continue from a scrape using prompts or code. Use it when a scraped page requires clicks, form fills, navigation, or follow-up extraction.

```bash
# CLI
firecrawl scrape https://example.com
firecrawl interact "Click the pricing tab and extract the plan names"
```

```bash
# REST API
curl -X POST https://api.firecrawl.dev/v2/interact \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scrapeId": "scrape-id-from-scrape", "prompt": "Click the pricing tab and extract the plan names"}'
```

**When to use:** Continuing from a scrape, navigating dynamic pages, filling forms, and extracting data after page actions.

### How agents chain tools together

Most agent workflows combine multiple tools. A typical pattern:

1. **Search** to find relevant pages → get a list of URLs
2. **Scrape** the most relevant URLs → get clean content
3. **Interact** when the scraped page needs follow-up actions
4. **Agent** when the task needs autonomous discovery or structured multi-page extraction

For bulk work, agents use **Map** to discover URLs first, then **Crawl** or selectively **Scrape** the pages they need.

## Agentic Debugging

When a Firecrawl call fails or returns unexpected results, your agent doesn't have to escalate to a human.
The [`/support/ask`](/api-reference/endpoint/ask) endpoint is an AI support agent built for **agent-to-agent** communication. It diagnoses issues with your jobs, account, and API usage, then returns a verified answer with machine-readable fix parameters your agent can apply directly.

Wire it into your agent's error-handling flow so it can self-recover from scraping failures, crawl issues, and configuration problems, typically in 15–30 seconds with no human in the loop.

### How it works

1. **Your agent describes the problem**: a natural-language question describing the issue.
2. **The support agent investigates**: it inspects job logs, account state, documentation, and source code.
3. **The support agent validates**: when possible, it tests a fix against the live Firecrawl API (e.g., retrying a scrape with adjusted parameters).
4. **Your agent gets a verified answer**: a prose `answer`, machine-readable `fixParameters` to apply directly, and `validation` results showing whether the fix was tested.

### Example

Send a question, plus an optional `rationale` to give the support agent context about what your end user is trying to accomplish:

```bash
curl -X POST https://api.firecrawl.dev/v2/support/ask \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "my crawl returned 3 pages but I expected 50",
    "rationale": "user is on their third failed crawl attempt today"
  }'
```

The response includes an `answer`, a `confidence` rating, optional `fixParameters` (e.g., `{"waitFor": 5000}`) your agent can pass to the next call, and `validation` showing whether the fix was tested against the live API.

The [`/support/ask` reference](/api-reference/endpoint/ask) has the full request and response schema, including status codes and the feedback envelope returned when the agent gets stuck.

## Firecrawl MCP Server

MCP is an open protocol that standardizes how applications provide context to LLMs. Among other benefits, it gives LLMs tools to act on your behalf.
Our [MCP server](https://github.com/firecrawl/firecrawl-mcp-server) is open-source and covers our full API surface: search, scrape, interact, crawl, map, extract, and agent.

Use the remote hosted URL:

```
https://mcp.firecrawl.dev/{FIRECRAWL_API_KEY}/v2/mcp
```

Or add the local server to any MCP client:

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-YOUR-API-KEY"
      }
    }
  }
}
```

See the [MCP Server](/mcp-server) page for installation instructions for Cursor, Claude Desktop, Windsurf, VS Code, and more.

## Firecrawl Docs for Agents

You can give your agent current Firecrawl docs in a context-aware way. Agents can self-onboard by pulling these resources directly, with no human wiring required.

Every page has a markdown version. Append `.md` to any docs URL, or use the page action menu to copy the page as markdown.

```
Docs for this page: https://docs.firecrawl.dev/ai-onboarding.md
```

Give your agent all of our docs in a single file.

```
Here are the Firecrawl docs: https://docs.firecrawl.dev/llms-full.txt
```

A shorter index is also available at `https://docs.firecrawl.dev/llms.txt`.

For a structured approach using MCP tools, connect the Firecrawl MCP server in any MCP client (Cursor, Claude Code, Claude Desktop, Windsurf). See the [MCP Server](/mcp-server) page for install commands.

Every page includes a contextual action menu (copy, view as markdown, open in ChatGPT, open in Claude) so agents and humans can move pages between tools in one click.

## Quick Start Guides

Drop-in quickstarts for the stacks agents build on most often. Point your agent at any of these to scaffold a working Firecrawl integration end-to-end.

Prefer to let Cursor drive? Install the Firecrawl MCP server in Cursor with one click and start prompting.

* Server-side JavaScript and TypeScript with the Firecrawl Node SDK.
* Scrape, search, and crawl from Next.js route handlers and server actions.
* Use Firecrawl from scripts, notebooks, and backend services.
* Build async Python APIs that search, scrape, and extract.
* Run Firecrawl at the edge with Workers.
* Call Firecrawl from Vercel serverless functions.
* Invoke Firecrawl from Lambda handlers.
* Use Firecrawl inside Supabase Deno runtime.
* Idiomatic Go SDK for search, scrape, and crawl.
* Typed Rust SDK for Firecrawl.
* Add Firecrawl to Laravel apps via the PHP SDK.
* Drop Firecrawl into Ruby on Rails.

See the full list of quickstarts (Express, NestJS, Fastify, Hono, Bun, Remix, Nuxt, SvelteKit, Astro, Mastra, Django, Flask, Elixir, Java, Spring Boot, .NET, ASP.NET Core, and more) in the left sidebar.

## Agent Harnesses

Firecrawl works with the runtimes and frameworks agents actually live inside: coding agents, agent SDKs, and model aggregators. Most coding harnesses can auto-discover the Firecrawl skills via `npx -y firecrawl-cli@latest init --all --browser`; the rest call Firecrawl as a tool over MCP or the REST API.

* Anthropic's CLI: set up Firecrawl MCP in Claude Code.
* IDE agent: one-click install Firecrawl MCP in Cursor.
* Wire Firecrawl MCP into OpenCode.
* Wire Firecrawl MCP into OpenAI Codex CLI.
* Pair any OpenRouter model with Firecrawl web tools.
* Wire Firecrawl MCP into Sourcegraph Amp.
* Agentic IDE: set up Firecrawl MCP in Windsurf.
* Add Firecrawl MCP to Google's agentic IDE.
* Wire Firecrawl MCP into Google Gemini CLI.
* Use Firecrawl as a tool with Hermes models.
* Firecrawl tools inside Microsoft AutoGen multi-agent teams.

## SDKs

Official, typed SDKs covering the full Firecrawl API surface. Point your agent at the language matching your stack.

Firecrawl also has first-class SDK bindings for the major LLM SDKs and agent frameworks. See [LLM SDKs and Frameworks](/developer-guides/llm-sdks-and-frameworks/openai) for OpenAI, Anthropic, Gemini, Google ADK, Vercel AI SDK, LangChain, LangGraph, LlamaIndex, Mastra, and ElevenAgents.
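As a closing sketch for this page, the search-then-scrape chain described under "How agents chain tools together" might look roughly like this when an agent calls the REST API directly from TypeScript. The endpoint URLs and the `Authorization` header are taken from this page; the response shapes (`data.web[].url` for search) and the helper names `buildRequest`, `pickUrls`, `FetchLike`, and `searchThenScrape` are assumptions made for illustration, so check the API reference before relying on them.

```typescript
// Minimal sketch: chain Search -> Scrape over the Firecrawl REST API.
// ASSUMPTION: the search response looks like { data: { web: [{ url }] } }.
// All helper and type names here are hypothetical, not part of any SDK.

const API_BASE = "https://api.firecrawl.dev/v2";

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Narrow fetch-like signature so the sketch needs no DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build a POST request for an endpoint such as "search" or "scrape".
function buildRequest(endpoint: string, payload: object, apiKey: string): HttpRequest {
  return {
    url: `${API_BASE}/${endpoint}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

// Pull up to `max` URLs out of a search response (assumed shape, see above).
function pickUrls(searchResponse: unknown, max: number): string[] {
  const web = (searchResponse as { data?: { web?: Array<{ url?: unknown }> } })?.data?.web;
  if (!Array.isArray(web)) return [];
  const urls: string[] = [];
  for (const hit of web) {
    if (typeof hit?.url === "string") urls.push(hit.url);
  }
  return urls.slice(0, max);
}

// Send one request and fail loudly on a non-2xx status.
async function post(fetchImpl: FetchLike, req: HttpRequest): Promise<unknown> {
  const res = await fetchImpl(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) throw new Error(`Firecrawl request failed: ${res.status} ${req.url}`);
  return res.json();
}

// Step 1: search for pages. Step 2: scrape the top results in parallel.
async function searchThenScrape(
  query: string,
  apiKey: string,
  fetchImpl: FetchLike,
  max = 3
): Promise<unknown[]> {
  const found = await post(fetchImpl, buildRequest("search", { query }, apiKey));
  const urls = pickUrls(found, max);
  return Promise.all(
    urls.map((url) => post(fetchImpl, buildRequest("scrape", { url }, apiKey)))
  );
}
```

In a runtime with a global `fetch`, a call such as `searchThenScrape("latest OpenAI API pricing", apiKey, fetch)` should work. Passing the fetch implementation in as a parameter is a design choice that keeps the request-building logic testable without network access.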
# Scraping Amazon
Source: https://docs.firecrawl.dev/developer-guides/common-sites/amazon

Extract product data, prices, and reviews from Amazon using Firecrawl

Amazon is one of the most scraped e-commerce sites. This guide shows you how to extract product data, pricing, reviews, and search results using Firecrawl.

## Setup

```bash
npm install @mendable/firecrawl-js zod
```

## Overview

When scraping Amazon, you'll typically want to:

* Extract product information (title, price, availability)
* Get customer reviews and ratings
* Monitor price changes
* Search for products programmatically
* Track competitor listings

## Scrape with JSON Mode

Extract structured product data using Zod schemas.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

// Define Zod schema
const ProductSchema = z.object({
  title: z.string(),
  price: z.string(),
  rating: z.number(),
  availability: z.string(),
  features: z.array(z.string())
});

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://www.amazon.com/dp/B0DZZWMB2L', {
  formats: [{
    type: 'json',
    schema: z.toJSONSchema(ProductSchema)
  }],
});

// Parse and validate with Zod
const jsonData = typeof result.json === 'string' ? JSON.parse(result.json) : result.json;
const validated = ProductSchema.parse(jsonData);

console.log('✅ Validated product data:');
console.log(validated);
```

## Search

Find products on Amazon.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const searchResult = await firecrawl.search('gaming laptop site:amazon.com', {
  limit: 10,
  sources: [{ type: 'web' }], // { type: 'news' }, { type: 'images' }
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(searchResult);
```

## Scrape

Scrape a single Amazon product page.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://www.amazon.com/ASUS-ROG-Strix-Gaming-Laptop/dp/B0DZZWMB2L', {
  formats: ['markdown'], // e.g. 'html', 'links', etc.
  onlyMainContent: true
});

console.log(result);
```

## Map

Discover all available URLs on Amazon product or category pages. Note: Map returns URLs only, without content.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const mapResult = await firecrawl.map('https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics');

console.log(mapResult.links); // Returns array of URLs without content
```

## Crawl

Crawl multiple pages from an Amazon category or search results page.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const crawlResult = await firecrawl.crawl('https://www.amazon.com/s?k=mechanical+keyboards', {
  limit: 10,
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(crawlResult.data);
```

## Batch Scrape

Scrape multiple Amazon product URLs simultaneously.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// Wait for completion
const job = await firecrawl.batchScrape(
  [
    'https://www.amazon.com/ASUS-ROG-Strix-Gaming-Laptop/dp/B0DZZWMB2L',
    'https://www.amazon.com/Razer-Blade-Gaming-Laptop-Lightweight/dp/B0FP47DNFQ',
    'https://www.amazon.com/HP-2025-Omen-Gaming-Laptop/dp/B0FL4RMGSH'
  ],
  {
    options: { formats: ['markdown'] },
    pollInterval: 2,
    timeout: 120
  }
);

console.log(job.status, job.completed, job.total);
console.log(job);
```

# Scraping Etsy
Source: https://docs.firecrawl.dev/developer-guides/common-sites/etsy

Extract handmade products, shop data, and pricing from Etsy marketplace

Etsy is a global marketplace for unique and creative goods. This guide shows you how to extract product listings, shop information, reviews, and trending items using Firecrawl.

## Setup

```bash
npm install @mendable/firecrawl-js zod
```

## Overview

When scraping Etsy, you'll typically want to:

* Extract product listings and variations
* Get shop information and ratings
* Monitor trending items and categories
* Track pricing and sales data
* Extract customer reviews

## Scrape with JSON Mode

Extract structured listing data using Zod schemas.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

// Define Zod schema
const ListingSchema = z.object({
  title: z.string(),
  price: z.string(),
  shopName: z.string(),
  rating: z.number()
});

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://www.etsy.com/listing/1844315896/handmade-925-sterling-silver-jewelry-set', {
  formats: [{
    type: 'json',
    schema: z.toJSONSchema(ListingSchema)
  }],
});

// Parse and validate with Zod
const jsonData = typeof result.json === 'string' ? JSON.parse(result.json) : result.json;
const validated = ListingSchema.parse(jsonData);

console.log('✅ Validated listing data:');
console.log(validated);
```

## Search

Find products on the Etsy marketplace.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const searchResult = await firecrawl.search('handmade jewelry site:etsy.com', {
  limit: 10,
  sources: [{ type: 'web' }], // { type: 'news' }, { type: 'images' }
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(searchResult);
```

## Scrape

Scrape a single Etsy product listing.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://www.etsy.com/listing/1844315896/handmade-925-sterling-silver-jewelry-set', {
  formats: ['markdown'], // e.g. 'html', 'links', etc.
  onlyMainContent: true
});

console.log(result);
```

## Map

Discover all available URLs in an Etsy shop or category. Note: Map returns URLs only, without content.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const mapResult = await firecrawl.map('https://www.etsy.com/shop/YourShopName');

console.log(mapResult.links); // Returns array of URLs without content
```

## Crawl

Crawl multiple pages from an Etsy shop or category.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const crawlResult = await firecrawl.crawl('https://www.etsy.com/c/jewelry', {
  limit: 10,
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(crawlResult.data);
```

## Batch Scrape

Scrape multiple Etsy listing URLs simultaneously.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// Wait for completion
const job = await firecrawl.batchScrape(
  [
    'https://www.etsy.com/listing/1844315896/handmade-925-sterling-silver-jewelry-set',
    'https://www.etsy.com/market/handmade_jewelry',
    'https://www.etsy.com/market/jewelry_handmade'
  ],
  {
    options: { formats: ['markdown'] },
    pollInterval: 2,
    timeout: 120
  }
);

console.log(job.status, job.completed, job.total);
console.log(job);
```

# Scraping GitHub
Source: https://docs.firecrawl.dev/developer-guides/common-sites/github

Learn how to scrape GitHub using Firecrawl's core features

Learn how to use Firecrawl's core features to scrape GitHub repositories, issues, and documentation.

## Setup

```bash
npm install @mendable/firecrawl-js zod
```

## Scrape with JSON Mode

Extract structured data from repositories using Zod schemas.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://github.com/firecrawl/firecrawl', {
  formats: [{
    type: 'json',
    schema: z.object({
      name: z.string(),
      description: z.string(),
      stars: z.number(),
      forks: z.number(),
      language: z.string(),
      topics: z.array(z.string())
    })
  }]
});

console.log(result.json);
```

## Search

Find repositories, issues, or documentation on GitHub.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const searchResult = await firecrawl.search('machine learning site:github.com', {
  limit: 10,
  sources: [{ type: 'web' }], // { type: 'news' }, { type: 'images' }
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(searchResult);
```

## Scrape

Scrape a single GitHub page: a repository, issue, or file.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://github.com/firecrawl/firecrawl', {
  formats: ['markdown'] // e.g. 'html', 'links', etc.
});

console.log(result);
```

## Map

Discover all available URLs in a repository or documentation site. Note: Map returns URLs only, without content.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const mapResult = await firecrawl.map('https://github.com/vercel/next.js/tree/canary/docs');

console.log(mapResult.links); // Returns array of URLs without content
```

## Crawl

Crawl multiple pages from a repository or documentation.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const crawlResult = await firecrawl.crawl('https://github.com/facebook/react/wiki', {
  limit: 10,
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(crawlResult.data);
```

## Batch Scrape

Scrape multiple GitHub URLs simultaneously.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// Wait for completion
const job = await firecrawl.batchScrape(
  [
    'https://github.com/vercel/next.js',
    'https://github.com/facebook/react',
    'https://github.com/microsoft/typescript'
  ],
  {
    options: { formats: ['markdown'] },
    pollInterval: 2,
    timeout: 120
  }
);

console.log(job.status, job.completed, job.total);
console.log(job);
```

## Batch Scrape with JSON Mode

Extract structured data from multiple repositories at once.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// Wait for completion
const job = await firecrawl.batchScrape(
  [
    'https://github.com/vercel/next.js',
    'https://github.com/facebook/react'
  ],
  {
    options: {
      formats: [{
        type: 'json',
        schema: z.object({
          name: z.string(),
          description: z.string(),
          stars: z.number(),
          language: z.string()
        })
      }]
    },
    pollInterval: 2,
    timeout: 120
  }
);

console.log(job.status, job.completed, job.total);
console.log(job);
```

# Scraping Wikipedia
Source: https://docs.firecrawl.dev/developer-guides/common-sites/wikipedia

Extract articles, infoboxes, and build knowledge graphs from Wikipedia

Learn how to effectively scrape Wikipedia for research, knowledge extraction, and building AI applications.

## Setup

```bash
npm install @mendable/firecrawl-js zod
```

## Use Cases

* Research automation and fact-checking
* Building knowledge graphs
* Multi-language content extraction
* Educational content aggregation
* Entity information extraction

## Scrape with JSON Mode

Extract structured data from Wikipedia articles using Zod schemas.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://en.wikipedia.org/wiki/JavaScript', {
  formats: [{
    type: 'json',
    schema: z.object({
      name: z.string(),
      creator: z.string(),
      firstAppeared: z.string(),
      typingDiscipline: z.string(),
      website: z.string()
    })
  }]
});

console.log(result.json);
```

## Search

Find articles on Wikipedia.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const searchResult = await firecrawl.search('quantum computing site:en.wikipedia.org', {
  limit: 10,
  sources: [{ type: 'web' }], // { type: 'news' }, { type: 'images' }
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(searchResult);
```

## Scrape

Scrape a single Wikipedia article.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const result = await firecrawl.scrape('https://en.wikipedia.org/wiki/Artificial_intelligence', {
  formats: ['markdown'], // e.g. 'html', 'links', etc.
  onlyMainContent: true
});

console.log(result);
```

## Map

Discover all available URLs in a Wikipedia portal or category. Note: Map returns URLs only, without content.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const mapResult = await firecrawl.map('https://en.wikipedia.org/wiki/Portal:Computer_science');

console.log(mapResult.links); // Returns array of URLs without content
```

## Crawl

Crawl multiple pages from Wikipedia portals or categories.

```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

const crawlResult = await firecrawl.crawl('https://en.wikipedia.org/wiki/Portal:Artificial_intelligence', {
  limit: 10,
  scrapeOptions: {
    formats: ['markdown']
  }
});

console.log(crawlResult.data);
```

## Batch Scrape

Scrape multiple Wikipedia URLs simultaneously.
```typescript
import FirecrawlApp from '@mendable/firecrawl-js';

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// Wait for completion
const job = await firecrawl.batchScrape(
  [
    'https://en.wikipedia.org/wiki/Machine_learning',
    'https://en.wikipedia.org/wiki/Artificial_intelligence',
    'https://en.wikipedia.org/wiki/Deep_learning'
  ],
  {
    options: { formats: ['markdown'] },
    pollInterval: 2,
    timeout: 120
  }
);

console.log(job.status, job.completed, job.total);
console.log(job);
```

# Building an AI Research Assistant with Firecrawl and AI SDK
Source: https://docs.firecrawl.dev/developer-guides/cookbooks/ai-research-assistant-cookbook

Build a complete AI-powered research assistant with web scraping and search capabilities

Build a complete AI-powered research assistant that can scrape websites and search the web to answer questions. The assistant automatically decides when to use web scraping or search tools to gather information, then provides comprehensive answers based on collected data.