
Prerequisites

  • A Remix project
  • A Firecrawl API key (get one for free)

Install the SDK

npm install @mendable/firecrawl-js
Add your API key to .env:
FIRECRAWL_API_KEY=fc-YOUR-API-KEY
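
If the key is missing from .env, the SDK calls in the routes below fail only at request time. A minimal sketch of a fail-fast check follows; `requireEnv` is a hypothetical helper written for this guide, not part of the Firecrawl SDK or Remix.

```typescript
// Hypothetical helper (not part of the Firecrawl SDK): return a required
// environment variable, or throw a descriptive error if it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in a route module (server-side only):
// const apiKey = requireEnv(process.env, "FIRECRAWL_API_KEY");
// const firecrawl = new Firecrawl({ apiKey });
```

Passing the environment object in explicitly keeps the helper easy to test without touching the real process environment.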

Search the web

Use Firecrawl in an action to handle form submissions. Create app/routes/search.tsx:
import { json, type ActionFunctionArgs } from "@remix-run/node";
import { Form, useActionData } from "@remix-run/react";
import Firecrawl from "@mendable/firecrawl-js";

const firecrawl = new Firecrawl({ apiKey: process.env.FIRECRAWL_API_KEY });

export async function action({ request }: ActionFunctionArgs) {
  const formData = await request.formData();
  const query = formData.get("query") as string;
  const results = await firecrawl.search(query, { limit: 5 });
  return json({ results: (results.web || []).map((r) => ({ title: r.title, url: r.url })) });
}

export default function SearchPage() {
  const data = useActionData<typeof action>();

  return (
    <div>
      <Form method="post">
        <input name="query" placeholder="Search the web..." />
        <button type="submit">Search</button>
      </Form>
      {data?.results?.map((r, i) => (
        <div key={i}>
          <a href={r.url}>{r.title}</a>
        </div>
      ))}
    </div>
  );
}
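
The action above casts the form value straight to a string, so an empty or missing field would still reach the API. A minimal sketch of input normalization follows; `parseQuery` and its 200-character limit are assumptions made for illustration, not Firecrawl requirements.

```typescript
// Hypothetical helper: normalize a raw form value into a usable search query.
// Returns null when the value is missing, not a string, blank, or too long.
function parseQuery(raw: unknown, maxLength = 200): string | null {
  if (typeof raw !== "string") return null;
  // Trim the ends and collapse runs of whitespace into single spaces.
  const query = raw.trim().replace(/\s+/g, " ");
  if (query.length === 0 || query.length > maxLength) return null;
  return query;
}

// Usage inside the action:
// const query = parseQuery(formData.get("query"));
// if (query === null) { /* skip the search and render an empty state */ }
```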

Scrape a page

Use Firecrawl in a loader to fetch data at request time. Create app/routes/scrape.tsx:
import { json, type LoaderFunctionArgs } from "@remix-run/node";
import { useLoaderData } from "@remix-run/react";
import Firecrawl from "@mendable/firecrawl-js";

const firecrawl = new Firecrawl({ apiKey: process.env.FIRECRAWL_API_KEY });

export async function loader({ request }: LoaderFunctionArgs) {
  const url = new URL(request.url);
  const target = url.searchParams.get("url");
  if (!target) return json({ markdown: null });

  const result = await firecrawl.scrape(target);
  return json({ markdown: result.markdown });
}

export default function ScrapePage() {
  const { markdown } = useLoaderData<typeof loader>();

  return (
    <div>
      <h1>Scraped Content</h1>
      {markdown ? <pre>{markdown}</pre> : <p>Pass ?url= to scrape a page</p>}
    </div>
  );
}
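
Because the loader takes the target from the ?url= query parameter, it can help to reject anything that is not an http(s) URL before scraping. A minimal sketch, assuming a hypothetical `parseTargetUrl` helper built on the standard `URL` class:

```typescript
// Hypothetical helper: accept only absolute http(s) URLs.
// Returns the normalized URL string, or null if the input is unusable.
function parseTargetUrl(raw: string | null): string | null {
  if (!raw) return null;
  try {
    const parsed = new URL(raw);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return null;
    }
    return parsed.toString();
  } catch (e) {
    // new URL() throws on input it cannot parse as an absolute URL.
    return null;
  }
}

// Usage inside the loader:
// const target = parseTargetUrl(url.searchParams.get("url"));
// if (!target) return json({ markdown: null });
```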

Interact with a page

Use interact to control a live browser session: click buttons, fill forms, and extract dynamic content. Create app/routes/interact.tsx:
import { json, type ActionFunctionArgs } from "@remix-run/node";
import { Form, useActionData } from "@remix-run/react";
import Firecrawl from "@mendable/firecrawl-js";

const firecrawl = new Firecrawl({ apiKey: process.env.FIRECRAWL_API_KEY });

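export async function action({ request }: ActionFunctionArgs) {
  const formData = await request.formData();
  const url = formData.get("url");
  // Reject a missing or non-string URL before calling the API.
  if (typeof url !== "string" || !url) {
    return json({ output: null }, { status: 400 });
  }

  // Scrape first; the scrape ID identifies the browser session to interact with.
  const result = await firecrawl.scrape(url, { formats: ['markdown'] });
  const scrapeId = result.metadata?.scrapeId;
  if (!scrapeId) throw new Error("Scrape did not return a scrapeId");

  try {
    await firecrawl.interact(scrapeId, { prompt: 'Search for iPhone 16 Pro Max' });
    const response = await firecrawl.interact(scrapeId, { prompt: 'Click on the first result and tell me the price' });
    return json({ output: response.output });
  } finally {
    // Always stop the session, even if an interaction fails.
    await firecrawl.stopInteraction(scrapeId);
  }
}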

export default function InteractPage() {
  const data = useActionData<typeof action>();

  return (
    <div>
      <Form method="post">
        <input name="url" placeholder="URL to interact with..." />
        <button type="submit">Interact</button>
      </Form>
      {data?.output && <pre>{data.output}</pre>}
    </div>
  );
}
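
The action above sends two prompts one after the other to the same session, so the second step builds on the first. A minimal sketch of the same pattern for any number of prompts follows; `runPromptsInOrder` is a hypothetical helper, and the `send` callback stands in for whatever call actually delivers a prompt.

```typescript
// Hypothetical helper: send prompts one at a time, in order, and return the
// result of the last one (or undefined if the list is empty).
async function runPromptsInOrder<T>(
  prompts: string[],
  send: (prompt: string) => Promise<T>,
): Promise<T | undefined> {
  let last: T | undefined;
  for (const prompt of prompts) {
    // Await each call before sending the next so later steps see earlier state.
    last = await send(prompt);
  }
  return last;
}

// Usage inside the action (assumes scrapeId is already known):
// const response = await runPromptsInOrder(
//   ["Search for iPhone 16 Pro Max", "Click on the first result and tell me the price"],
//   (prompt) => firecrawl.interact(scrapeId, { prompt }),
// );
```

A plain loop with await is used instead of Promise.all because the prompts must not run concurrently.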

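Next steps

Scrape docs

All scrape options, including formats, actions, and proxies

Search docs

Search the web and get full page content

Interact docs

Click, fill forms, and extract dynamic content

Node SDK reference

Full SDK reference covering crawl, map, batch scrape, and more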