Prerequisites

Search the web

Firecrawl can be used from Ruby through its REST API, using the standard library's net/http.
require "net/http"
require "json"
require "uri"

api_key = ENV.fetch("FIRECRAWL_API_KEY")

uri = URI("https://api.firecrawl.dev/v2/search")
request = Net::HTTP::Post.new(uri)
request["Authorization"] = "Bearer #{api_key}"
request["Content-Type"] = "application/json"
request.body = { query: "firecrawl web scraping", limit: 5 }.to_json

response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }
results = JSON.parse(response.body)

results["data"]["web"].each do |result|
  puts "#{result['title']} - #{result['url']}"
end
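
The search request above sets the same headers that the later examples repeat. A minimal sketch of how that setup could be factored out is shown here. The names `FIRECRAWL_BASE_URL`, `build_firecrawl_post`, and `send_firecrawl_request` are hypothetical helpers written for this page, not part of any Firecrawl library, and raising on a non-2xx status is a design choice of the sketch rather than documented API behavior.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical constant: the base URL used by the examples on this page.
FIRECRAWL_BASE_URL = "https://api.firecrawl.dev/v2"

# Hypothetical helper: builds an authenticated JSON POST request
# for an endpoint path such as "search" or "scrape". Nothing is sent here.
def build_firecrawl_post(path, payload, api_key)
  uri = URI("#{FIRECRAWL_BASE_URL}/#{path}")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"] = "application/json"
  request.body = payload.to_json
  request
end

# Hypothetical helper: sends a request built above and parses the JSON body.
# Net::HTTPResponse#value raises an HTTP error unless the status is 2xx.
def send_firecrawl_request(request)
  uri = request.uri
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  response.value
  JSON.parse(response.body)
end
```

With these in place, the search call could be written as `send_firecrawl_request(build_firecrawl_post("search", { query: "firecrawl web scraping", limit: 5 }, api_key))`.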

Scrape a page

uri = URI("https://api.firecrawl.dev/v2/scrape")
request = Net::HTTP::Post.new(uri)
request["Authorization"] = "Bearer #{api_key}"
request["Content-Type"] = "application/json"
request.body = { url: "https://example.com" }.to_json

response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }
data = JSON.parse(response.body)

puts data.dig("data", "markdown")

Example response:

{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
    "metadata": {
      "title": "Example Domain",
      "sourceURL": "https://example.com"
    }
  }
}
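
The response shown above can be checked before its content is used. The following is a sketch only: `extract_markdown` is a hypothetical helper, and it assumes that an unsuccessful response either lacks a truthy `success` field or lacks `data.markdown`. The page does not show an error response, so that shape is an assumption.

```ruby
require "json"

# Hypothetical helper: returns the markdown from a parsed scrape response.
# Assumption: a failed scrape has a missing or false "success" field.
def extract_markdown(parsed)
  raise ArgumentError, "scrape was not successful" unless parsed["success"]

  markdown = parsed.dig("data", "markdown")
  raise ArgumentError, "response contains no markdown" if markdown.nil?

  markdown
end

# The sample response from this page. The single-quoted heredoc keeps
# the \n escapes literal so that JSON.parse can decode them.
sample = JSON.parse(<<~'JSON')
  {
    "success": true,
    "data": {
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
      "metadata": {
        "title": "Example Domain",
        "sourceURL": "https://example.com"
      }
    }
  }
JSON

puts extract_markdown(sample)
puts sample.dig("data", "metadata", "title")
```

Using `dig` for the nested fields avoids a `NoMethodError` when an intermediate key is absent.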

Interact with a page

Scrape a page first, then continue working with it through natural-language prompts.
uri = URI("https://api.firecrawl.dev/v2/scrape")
request = Net::HTTP::Post.new(uri)
request["Authorization"] = "Bearer #{api_key}"
request["Content-Type"] = "application/json"
request.body = { url: "https://www.amazon.com", formats: ["markdown"] }.to_json

response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }
scrape_id = JSON.parse(response.body).dig("data", "metadata", "scrapeId")

interact_uri = URI("https://api.firecrawl.dev/v2/scrape/#{scrape_id}/interact")
interact_req = Net::HTTP::Post.new(interact_uri)
interact_req["Authorization"] = "Bearer #{api_key}"
interact_req["Content-Type"] = "application/json"
interact_req.body = { prompt: "Search for iPhone 16 Pro Max" }.to_json

interact_resp = Net::HTTP.start(interact_uri.hostname, interact_uri.port, use_ssl: true) { |http| http.request(interact_req) }
puts JSON.parse(interact_resp.body)

# Stop the session
delete_uri = URI("https://api.firecrawl.dev/v2/scrape/#{scrape_id}/interact")
delete_req = Net::HTTP::Delete.new(delete_uri)
delete_req["Authorization"] = "Bearer #{api_key}"
Net::HTTP.start(delete_uri.hostname, delete_uri.port, use_ssl: true) { |http| http.request(delete_req) }
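
In the example above, the session is only stopped if every earlier line succeeds, and a missing `scrapeId` would be interpolated into the URL as an empty string. The sketch below shows one way to guard against both. The helpers `interact_uri_for`, `with_interact_session`, and `delete_session` are hypothetical names introduced for this page, and passing the stop step in as a callable is a design choice of the sketch, not something the API requires.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds the interact URI and rejects a missing ID,
# which would otherwise produce ".../scrape//interact".
def interact_uri_for(scrape_id)
  raise ArgumentError, "scrape_id is missing" if scrape_id.nil? || scrape_id.to_s.empty?

  URI("https://api.firecrawl.dev/v2/scrape/#{scrape_id}/interact")
end

# Hypothetical helper: yields the interact URI to the block, then always
# calls `stop` with that URI, even if the block raises.
def with_interact_session(scrape_id, stop:)
  uri = interact_uri_for(scrape_id) # fails here, before any session work
  begin
    yield uri
  ensure
    stop.call(uri)
  end
end

# Hypothetical helper: sends the same DELETE request as the example above.
def delete_session(uri, api_key)
  request = Net::HTTP::Delete.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }
end
```

A call could then look like `with_interact_session(scrape_id, stop: ->(uri) { delete_session(uri, api_key) }) { |uri| ... }`, with the interact POST request sent inside the block.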

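Next steps

Scrape docs

All scrape options, including formats, actions, and proxies

Search docs

Search the web and get full page content

API reference

Complete REST API documentation

Interact docs

Click, fill out forms, and extract dynamic content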