Introducing /parse
The /parse endpoint converts local or non-public documents into clean, LLM-ready data. Upload file bytes via multipart/form-data and get back Markdown, JSON, HTML, links, images, or a summary, with reading order and tables preserved.
- Turn PDF, DOCX, XLSX, HTML, and more into Markdown or structured JSON
- Up to 5x faster parsing via a Rust-based engine
- Files up to 50 MB per request
- Zero Data Retention support
When to use /parse
Use /parse when the source document is a local file or not publicly accessible by URL. If you have a public URL that points to a document, prefer /scrape — it auto-detects the file type from the extension or content type and parses it the same way.
| Source | Endpoint |
|---|---|
| Public URL to a document (e.g. https://example.com/report.pdf) | POST /scrape |
| Local file or non-public bytes (PDF, DOCX, XLSX, HTML, …) | POST /parse |
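The routing rule in the table can be sketched as a small helper. This is only an illustration: `choose_endpoint` is a hypothetical name, and a string check cannot tell whether a URL is actually reachable without credentials, so a URL behind a login would still need to be downloaded and sent to /parse.

```python
from urllib.parse import urlparse


def choose_endpoint(source: str) -> str:
    """Pick the endpoint from the table for a source string (hypothetical helper)."""
    parsed = urlparse(source)
    # Strings with an http(s) scheme and a host are treated as URLs;
    # anything else is assumed to be a local file path.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "POST /scrape"
    return "POST /parse"


print(choose_endpoint("https://example.com/report.pdf"))
print(choose_endpoint("./reports/q3.pdf"))
```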
Parsing
/parse endpoint
Uploads a file and returns the parsed content. The request is multipart/form-data with a required file part and an optional options JSON part.
Supported extensions: .html, .htm, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls.
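A client can reject unsupported files before uploading. The sketch below checks the extension list above and the 50 MB limit. `check_upload` is a hypothetical helper, the case-insensitive extension match is an assumption, and so is reading 50 MB as 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {
    ".html", ".htm", ".pdf", ".docx", ".doc", ".odt", ".rtf", ".xlsx", ".xls",
}
# Assumption: "50 MB" is interpreted as 50 MiB; the exact byte limit is not documented here.
MAX_FILE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would not be accepted (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")


check_upload("report.pdf", 2_000_000)  # passes silently
```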
Usage
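As a rough sketch of the request shape described above, the following builds a multipart/form-data body with a file part and an optional options part using only the Python standard library. The base URL, the Bearer authorization header, and the function name `build_parse_request` are assumptions made for illustration; check the API reference for the actual host and authentication scheme.

```python
import json
import urllib.request
import uuid


def build_parse_request(base_url, api_key, filename, file_bytes, options=None):
    """Build (but do not send) a multipart POST to /parse. Hypothetical helper."""
    boundary = uuid.uuid4().hex
    body = bytearray()

    def add_part(headers: str, payload: bytes) -> None:
        # Each part: boundary line, part headers, blank line, payload.
        body.extend(f"--{boundary}\r\n{headers}\r\n\r\n".encode("utf-8"))
        body.extend(payload)
        body.extend(b"\r\n")

    # Required part: the raw file bytes under the field name "file".
    add_part(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream",
        file_bytes,
    )
    # Optional part: options serialized as a JSON string under "options".
    if options is not None:
        add_part(
            'Content-Disposition: form-data; name="options"',
            json.dumps(options).encode("utf-8"),
        )
    body.extend(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url=f"{base_url}/parse",
        data=bytes(body),
        method="POST",
        headers={
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder host and key; nothing is sent over the network here.
req = build_parse_request(
    "https://api.example.com", "YOUR_API_KEY",
    "report.pdf", b"%PDF-1.7 ...", {"formats": ["markdown"]},
)
```

Passing the returned request to `urllib.request.urlopen` would send it; the response would then be decoded as JSON.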
Response
SDKs return the document object directly. cURL returns the JSON payload.
Options
/parse accepts a subset of scrape options under the options field. Common settings:
- formats: Array of output formats. Defaults to ["markdown"]. Supported: markdown, html, rawHtml, links, images, summary, and json (with a schema or prompt).
- onlyMainContent: Only return the main content of the document. Defaults to true.
- includeTags / excludeTags: Tag-level inclusion or exclusion (HTML inputs).
- timeout: Request timeout in milliseconds. Defaults to 30000, max 300000.
- parsers: File-parser controls. For PDFs, set { "type": "pdf", "mode": "fast" | "auto" | "ocr", "maxPages": <int> }.
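The settings above can be assembled and checked client-side before they are sent as the options part. This is a minimal sketch: `build_options` is a hypothetical helper, it covers only the plain string formats (json needs a schema or prompt and is left out), and the server may apply validation rules beyond the ones shown.

```python
import json

# String formats listed above; "json" is omitted because it needs a schema or prompt.
SIMPLE_FORMATS = {"markdown", "html", "rawHtml", "links", "images", "summary"}
DEFAULT_TIMEOUT_MS = 30000
MAX_TIMEOUT_MS = 300000


def build_options(formats=None, only_main_content=True, timeout_ms=DEFAULT_TIMEOUT_MS):
    """Return the options payload as a JSON string (hypothetical helper)."""
    formats = list(formats) if formats else ["markdown"]
    unknown = [f for f in formats if f not in SIMPLE_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    if not 0 < timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_MS} ms")
    return json.dumps({
        "formats": formats,
        "onlyMainContent": only_main_content,
        "timeout": timeout_ms,
    })


options_json = build_options(formats=["markdown", "links"], timeout_ms=60000)
```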
/parse does not support browser-only options like actions, waitFor, location, mobile, or change tracking.
PDF parser modes
- fast: text-only extraction, fastest path.
- auto (default): text-first with OCR fallback for image-only pages.
- ocr: OCR every page; use for scanned documents.
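To illustrate the three modes, the sketch below builds a PDF parser entry with the type, mode, and maxPages keys. `pdf_parser` is a hypothetical helper, and wrapping the entry in a list under parsers is an assumption about how the field is passed.

```python
PDF_MODES = ("fast", "auto", "ocr")


def pdf_parser(mode="auto", max_pages=None):
    """Build a PDF parser entry for the options payload (hypothetical helper)."""
    if mode not in PDF_MODES:
        raise ValueError(f"mode must be one of {PDF_MODES}")
    entry = {"type": "pdf", "mode": mode}
    if max_pages is not None:
        if max_pages < 1:
            raise ValueError("max_pages must be a positive integer")
        entry["maxPages"] = max_pages
    return entry


# Scanned document: OCR every page, but stop after the first 20 pages.
# Assumption: "parsers" takes a list of parser entries.
options = {"parsers": [pdf_parser("ocr", max_pages=20)]}
```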
Structured JSON output
Pass a JSON schema or prompt to extract structured data directly from the document.
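A possible options payload for structured extraction is sketched below. The invoice schema and the `json_format` helper are made up for illustration, and expressing the json format as an object with type plus schema or prompt keys is an assumption; confirm the exact shape against the API reference.

```python
import json

# Illustrative JSON Schema; the field names are invented for this example.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}


def json_format(schema=None, prompt=None):
    """Build a json format entry; at least one of schema or prompt is needed."""
    if schema is None and prompt is None:
        raise ValueError("json output needs a schema or a prompt")
    entry = {"type": "json"}  # assumption: object form of the json format
    if schema is not None:
        entry["schema"] = schema
    if prompt is not None:
        entry["prompt"] = prompt
    return entry


# JSON string to send as the options part of the multipart request.
options_json = json.dumps({"formats": [json_format(schema=invoice_schema)]})
```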
Considerations
- Maximum file size is 50 MB per request.
- Parsing very large or scanned PDFs in ocr mode may take longer; increase timeout or use maxPages to bound the work.
- For batches of files, call /parse per file in parallel; there is no batch upload variant.
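Since there is no batch upload variant, a batch can be handled by fanning out one call per file. The sketch below uses a thread pool and takes the per-file call as a parameter, so it makes no assumption about how the request is sent. `parse_many` and `parse_one` are hypothetical names, and the worker count should be chosen to respect whatever rate limits apply to your account.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def parse_many(paths, parse_one, max_workers=4):
    """Call parse_one(path) for each path in parallel.

    Returns (results, errors): two dicts keyed by path, so one failed
    file does not discard the results of the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(parse_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going; record the failure per file
                errors[path] = exc
    return results, errors


def fake_parse(path):
    # Stand-in for a real /parse call, used only to demonstrate the fan-out.
    if path.endswith(".png"):
        raise ValueError("unsupported extension")
    return f"parsed:{path}"


results, errors = parse_many(["a.pdf", "b.docx", "c.png"], fake_parse)
```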

