> ## Documentation Index
> Fetch the complete documentation index at: https://docs.firecrawl.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Parse

Upload a local or non-public document and convert it into clean, LLM-ready data. `/parse` accepts file bytes via `multipart/form-data` and returns Markdown, JSON, HTML, links, images, or a summary — with reading order and tables preserved.

* Turn PDF, DOCX, XLSX, HTML, and more into Markdown or structured JSON
* Up to **5x faster** parsing via a Rust-based engine
* Files up to **50 MB** per request
* Zero Data Retention support

## When to use `/parse`

Use `/parse` when the source document is **a local file** or **not publicly accessible by URL**. If you have a public URL that points to a document, prefer [`/scrape`](/api-reference/endpoint/scrape) — it auto-detects the file type from the extension or content type and parses it the same way.

| Source                                                           | Endpoint                                         |
| ---------------------------------------------------------------- | ------------------------------------------------ |
| Public URL to a document (e.g. `https://example.com/report.pdf`) | [`POST /scrape`](/api-reference/endpoint/scrape) |
| Local file or non-public bytes (PDF, DOCX, XLSX, HTML, …)        | `POST /parse` (this endpoint)                    |

<Tip>
  **Using Firecrawl through MCP?** Use `firecrawl_parse` for local files. Local MCP can read the file directly when configured with `FIRECRAWL_API_URL`. Remote hosted MCP returns a short-lived upload command first, then parses the returned `uploadRef`. Public document URLs should still use `/scrape`.
</Tip>


## OpenAPI

````yaml api-reference/v2-openapi.json POST /parse
openapi: 3.0.0
info:
  title: Firecrawl API
  version: v2
  description: >-
    API for interacting with Firecrawl services to perform web scraping and
    crawling tasks.
  contact:
    name: Firecrawl Support
    url: https://firecrawl.dev/support
    email: support@firecrawl.dev
servers:
  - url: https://api.firecrawl.dev/v2
security:
  - bearerAuth: []
paths:
  /parse:
    post:
      tags:
        - Scraping
      summary: Upload and parse a file
      operationId: parseFile
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                file:
                  type: string
                  format: binary
                  description: >-
                    The file bytes to parse. Supported extensions: .html, .htm,
                    .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls.
                options:
                  $ref: '#/components/schemas/ParseOptions'
              required:
                - file
            encoding:
              options:
                contentType: application/json
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ScrapeResponse'
        '400':
          description: Bad request
          content:
            application/json:
              schema:
                type: object
                properties:
                  success:
                    type: boolean
                    example: false
                  code:
                    type: string
                    example: BAD_REQUEST
                  error:
                    type: string
                    example: Invalid multipart form-data request.
        '402':
          description: Payment required
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
                    example: Payment required to access this resource.
        '429':
          description: Too many requests
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
                    example: >-
                      Request rate limit exceeded. Please wait and try again
                      later.
        '500':
          description: Server error
          content:
            application/json:
              schema:
                type: object
                properties:
                  success:
                    type: boolean
                    example: false
                  code:
                    type: string
                    example: UNKNOWN_ERROR
                  error:
                    type: string
                    example: An unexpected error occurred on the server.
      security:
        - bearerAuth: []
components:
  schemas:
    ParseOptions:
      type: object
      description: Optional parse options sent as JSON in the multipart `options` field.
      properties:
        formats:
          $ref: '#/components/schemas/ParseFormats'
        onlyMainContent:
          type: boolean
          description: >-
            Only return the main content of the page excluding headers, navs,
            footers, etc.
          default: true
        includeTags:
          type: array
          items:
            type: string
          description: Tags to include in the output.
        excludeTags:
          type: array
          items:
            type: string
          description: Tags to exclude from the output.
        headers:
          type: object
          description: Headers to send when additional network requests are required.
        timeout:
          type: integer
          description: >-
            Timeout in milliseconds for the request. Default is 30000 (30
            seconds). Maximum is 300000 (300 seconds).
          default: 30000
          maximum: 300000
        parsers:
          type: array
          description: >-
            Controls file parser behavior when relevant (for example PDF parser
            mode).
          items:
            oneOf:
              - type: object
                properties:
                  type:
                    type: string
                    enum:
                      - pdf
                  mode:
                    type: string
                    enum:
                      - fast
                      - auto
                      - ocr
                    default: auto
                    description: >-
                      PDF parsing mode. "fast": text-only extraction. "auto":
                      text-first with OCR fallback. "ocr": OCR on every page.
                  maxPages:
                    type: integer
                    minimum: 1
                    maximum: 10000
                    description: Maximum number of pages to parse from the PDF.
                required:
                  - type
                additionalProperties: false
          default:
            - pdf
        skipTlsVerification:
          type: boolean
          description: Skip TLS certificate verification when making requests.
          default: true
        removeBase64Images:
          type: boolean
          description: >-
            Remove base64-encoded images from output and keep alt text
            placeholders.
          default: true
        blockAds:
          type: boolean
          description: Enable ad and cookie popup blocking.
          default: true
        redactPII:
          oneOf:
            - type: boolean
            - $ref: '#/components/schemas/RedactPIIOptions'
          default: false
          description: >-
            Redact personally identifiable information from returned markdown.
            Pass `true` to use defaults, or an object to tune mode, entities,
            and replacement style.
        proxy:
          type: string
          enum:
            - basic
            - auto
          description: >-
            Proxy mode for parse uploads. `/parse` supports only `basic` and
            `auto`.
        origin:
          type: string
          description: Origin identifier for analytics and logging.
          default: api
        integration:
          type: string
          nullable: true
          description: Optional integration identifier.
        zeroDataRetention:
          type: boolean
          default: false
          description: >-
            If true, this will enable zero data retention for this parse. To
            enable this feature, please contact help@firecrawl.dev
    ScrapeResponse:
      type: object
      properties:
        success:
          type: boolean
        data:
          type: object
          properties:
            markdown:
              type: string
            summary:
              type: string
              nullable: true
              description: Summary of the page if `summary` is in `formats`
            html:
              type: string
              nullable: true
              description: >-
                Cleaned HTML of the page if `html` is in `formats`. Removes
                `<script>`, `<style>`, `<noscript>`, `<meta>`, and `<head>`
                tags; converts relative URLs to absolute; resolves responsive
                image `srcset` to the largest version. Respects
                `onlyMainContent`, `includeTags`, and `excludeTags` filters.
            rawHtml:
              type: string
              nullable: true
              description: >-
                The exact, unmodified HTML as received from the page if
                `rawHtml` is in `formats`. No cleaning or filtering is applied.
            screenshot:
              type: string
              nullable: true
              description: >-
                Screenshot of the page if `screenshot` is in `formats`.
                Screenshots expire after 24 hours and can no longer be
                downloaded.
            audio:
              type: string
              nullable: true
              description: >-
                Signed URL to the extracted MP3 audio file if `audio` is in
                `formats`. The signed URL expires after 1 hour.
            video:
              type: string
              nullable: true
              description: >-
                Signed URL to the extracted video file if `video` is in
                `formats`. The signed URL expires after 1 hour.
            answer:
              type: string
              nullable: true
              description: >-
                Natural-language answer to the question supplied via the
                `question` format. Only present if a `question` format object
                was included in `formats`.
            highlights:
              type: string
              nullable: true
              description: >-
                Relevant source text selected by the `highlights` format. Only
                present if a `highlights` format object was included in
                `formats`.
            links:
              type: array
              items:
                type: string
              description: List of links on the page if `links` is in `formats`
            actions:
              type: object
              nullable: true
              description: >-
                Results of the actions specified in the `actions` parameter.
                Only present if the `actions` parameter was provided in the
                request
              properties:
                screenshots:
                  type: array
                  description: >-
                    Screenshot URLs, in the same order as the screenshot actions
                    provided.
                  items:
                    type: string
                    format: url
                scrapes:
                  type: array
                  description: >-
                    Scrape contents, in the same order as the scrape actions
                    provided.
                  items:
                    type: object
                    properties:
                      url:
                        type: string
                      html:
                        type: string
                javascriptReturns:
                  type: array
                  description: >-
                    JavaScript return values, in the same order as the
                    executeJavascript actions provided.
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      value: {}
                pdfs:
                  type: array
                  description: >-
                    PDFs generated, in the same order as the pdf actions
                    provided.
                  items:
                    type: string
            metadata:
              type: object
              properties:
                title:
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  description: >-
                    Title extracted from the page, can be a string or array of
                    strings
                description:
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  description: >-
                    Description extracted from the page, can be a string or
                    array of strings
                language:
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  nullable: true
                  description: >-
                    Language extracted from the page, can be a string or array
                    of strings
                sourceURL:
                  type: string
                  format: uri
                  description: >-
                    The original URL that was requested. May differ from the
                    page's final URL if redirects occurred.
                url:
                  type: string
                  format: uri
                  description: >-
                    The final URL of the page after all redirects have been
                    followed.
                keywords:
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  description: >-
                    Keywords extracted from the page, can be a string or array
                    of strings
                ogLocaleAlternate:
                  type: array
                  items:
                    type: string
                  description: Alternative locales for the page
                '<any other metadata> ':
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  description: >-
                    Other metadata extracted from HTML, can be a string or array
                    of strings
                statusCode:
                  type: integer
                  description: The status code of the page
                contentType:
                  type: string
                  description: >-
                    The content type (MIME type) of the page, e.g. text/html,
                    application/pdf
                error:
                  type: string
                  nullable: true
                  description: The error message of the page
                concurrencyLimited:
                  type: boolean
                  description: >-
                    Whether this scrape was throttled due to team concurrency
                    limits
                concurrencyQueueDurationMs:
                  type: number
                  description: >-
                    Time in milliseconds the request waited in the concurrency
                    queue. Only present when concurrencyLimited is true.
            warning:
              type: string
              nullable: true
              description: >-
                Can be displayed when using LLM Extraction. Warning message will
                let you know any issues with the extraction.
            changeTracking:
              type: object
              nullable: true
              description: >-
                Change tracking information if `changeTracking` is in `formats`.
                Only present when the `changeTracking` format is requested.
              properties:
                previousScrapeAt:
                  type: string
                  format: date-time
                  nullable: true
                  description: >-
                    The timestamp of the previous scrape that the current page
                    is being compared against. Null if no previous scrape
                    exists.
                changeStatus:
                  type: string
                  enum:
                    - new
                    - same
                    - changed
                    - removed
                  description: >-
                    The result of the comparison between the two page versions.
                    'new' means this page did not exist before, 'same' means
                    content has not changed, 'changed' means content has
                    changed, 'removed' means the page was removed.
                visibility:
                  type: string
                  enum:
                    - visible
                    - hidden
                  description: >-
                    The visibility of the current page/URL. 'visible' means the
                    URL was discovered through an organic route (links or
                    sitemap), 'hidden' means the URL was discovered through
                    memory from previous crawls.
                diff:
                  type: string
                  nullable: true
                  description: >-
                    Git-style diff of changes when using 'git-diff' mode. Only
                    present when the mode is set to 'git-diff'.
                json:
                  type: object
                  nullable: true
                  description: >-
                    JSON comparison results when using 'json' mode. Only present
                    when the mode is set to 'json'. This will emit a list of all
                    the keys and their values from the `previous` and `current`
                    scrapes based on the type defined in the `schema`. Example
                    [here](/features/change-tracking)
            branding:
              type: object
              nullable: true
              description: >-
                Branding information extracted from the page if `branding` is in
                `formats`. Includes colors, fonts, typography, spacing,
                components, and more.
              properties:
                colorScheme:
                  type: string
                  enum:
                    - light
                    - dark
                  description: The detected color scheme of the page.
                logo:
                  type: string
                  nullable: true
                  description: URL of the primary logo.
                colors:
                  type: object
                  nullable: true
                  description: Brand colors extracted from the page.
                  properties:
                    primary:
                      type: string
                      description: Primary brand color (hex).
                    secondary:
                      type: string
                      description: Secondary brand color (hex).
                    accent:
                      type: string
                      description: Accent color (hex).
                    background:
                      type: string
                      description: Background color (hex).
                    textPrimary:
                      type: string
                      description: Primary text color (hex).
                    textSecondary:
                      type: string
                      description: Secondary text color (hex).
                    link:
                      type: string
                      description: Link color (hex).
                    success:
                      type: string
                      description: Success/positive color (hex).
                    warning:
                      type: string
                      description: Warning color (hex).
                    error:
                      type: string
                      description: Error/danger color (hex).
                fonts:
                  type: array
                  nullable: true
                  description: Array of font families used on the page.
                  items:
                    type: object
                    properties:
                      family:
                        type: string
                        description: Font family name.
                typography:
                  type: object
                  nullable: true
                  description: Detailed typography information.
                  properties:
                    fontFamilies:
                      type: object
                      description: Font families by role.
                      properties:
                        primary:
                          type: string
                          description: Primary font family.
                        heading:
                          type: string
                          description: Heading font family.
                        code:
                          type: string
                          description: Code/monospace font family.
                    fontSizes:
                      type: object
                      description: Font sizes for different text levels.
                      properties:
                        h1:
                          type: string
                        h2:
                          type: string
                        h3:
                          type: string
                        body:
                          type: string
                    fontWeights:
                      type: object
                      description: Font weight definitions.
                      properties:
                        light:
                          type: integer
                        regular:
                          type: integer
                        medium:
                          type: integer
                        bold:
                          type: integer
                    lineHeights:
                      type: object
                      description: Line height values for different text types.
                      properties:
                        heading:
                          type: string
                        body:
                          type: string
                spacing:
                  type: object
                  nullable: true
                  description: Spacing and layout information.
                  properties:
                    baseUnit:
                      type: integer
                      description: Base spacing unit in pixels.
                    borderRadius:
                      type: string
                      description: Default border radius.
                    padding:
                      type: object
                      description: Padding values.
                    margins:
                      type: object
                      description: Margin values.
                components:
                  type: object
                  nullable: true
                  description: UI component styles.
                  properties:
                    buttonPrimary:
                      type: object
                      description: Primary button styles.
                      properties:
                        background:
                          type: string
                        textColor:
                          type: string
                        borderRadius:
                          type: string
                    buttonSecondary:
                      type: object
                      description: Secondary button styles.
                      properties:
                        background:
                          type: string
                        textColor:
                          type: string
                        borderColor:
                          type: string
                        borderRadius:
                          type: string
                    input:
                      type: object
                      description: Input field styles.
                icons:
                  type: object
                  nullable: true
                  description: Icon style information.
                images:
                  type: object
                  nullable: true
                  description: Brand images.
                  properties:
                    logo:
                      type: string
                      description: Logo image URL.
                    favicon:
                      type: string
                      description: Favicon URL.
                    ogImage:
                      type: string
                      description: Open Graph image URL.
                animations:
                  type: object
                  nullable: true
                  description: Animation and transition settings.
                layout:
                  type: object
                  nullable: true
                  description: Layout configuration (grid, header/footer heights).
                personality:
                  type: object
                  nullable: true
                  description: Brand personality traits (tone, energy, target audience).
            product:
              type: object
              nullable: true
              description: >-
                Product information extracted from the page if `product` is in
                `formats`. Includes title, brand, category, description, and
                variants. Pricing, availability, and images live on each
                variant.
              properties:
                title:
                  type: string
                  description: The product title.
                brand:
                  type: string
                  description: The product brand or manufacturer.
                category:
                  type: string
                  description: >-
                    The product category, optionally as a breadcrumb path (e.g.
                    'Electronics > Audio > Headphones').
                url:
                  type: string
                  description: The canonical URL of the product page.
                description:
                  type: string
                  description: The product description.
                variants:
                  type: array
                  description: Product variants (e.g. different colors or sizes).
                  items:
                    type: object
                    properties:
                      id:
                        type: string
                        description: The variant identifier.
                      sku:
                        type: string
                        description: The variant SKU.
                      title:
                        type: string
                        description: The variant title.
                      values:
                        type: object
                        description: 'The variant option values (e.g. { "color": "Black" }).'
                        additionalProperties:
                          type: string
                      price:
                        type: object
                        description: The current price of the variant.
                        properties:
                          amount:
                            type: number
                            description: The numeric price amount.
                          currency:
                            type: string
                            description: The ISO 4217 currency code (e.g. 'USD').
                          formatted:
                            type: string
                            description: The price formatted for display (e.g. '$199.99').
                        required:
                          - amount
                      sale:
                        type: object
                        description: >-
                          Sale/discount information for the variant, present
                          when the variant is discounted.
                        properties:
                          originalPrice:
                            type: object
                            description: The original (pre-discount) price of the variant.
                            properties:
                              amount:
                                type: number
                                description: The numeric price amount.
                              currency:
                                type: string
                                description: The ISO 4217 currency code (e.g. 'USD').
                              formatted:
                                type: string
                                description: >-
                                  The price formatted for display (e.g.
                                  '$249.99').
                            required:
                              - amount
                        required:
                          - originalPrice
                      availability:
                        type: object
                        description: >-
                          The availability of the variant. Always present on a
                          variant.
                        properties:
                          inStock:
                            type: boolean
                            description: Whether the variant is in stock.
                          text:
                            type: string
                            description: >-
                              Human-readable availability text (e.g. 'In
                              Stock').
                        required:
                          - inStock
                      images:
                        type: array
                        description: Variant images.
                        items:
                          type: object
                          properties:
                            url:
                              type: string
                              description: Image URL.
                            alt:
                              type: string
                              description: Alternative text for the image.
                          required:
                            - url
                    required:
                      - availability
              required:
                - title
                - url
                - variants
            menu:
              type: object
              nullable: true
              description: >-
                Menu information extracted from the page if `menu` is in
                `formats`. Includes the merchant, currency, and a list of
                sections, where each section carries items with description,
                images, price, availability, dietary tags, calories, and option
                groups.
              properties:
                isMenu:
                  type: boolean
                  description: Whether the page was identified as a menu.
                confidence:
                  type: number
                  description: A confidence score between 0 and 1 for the menu extraction.
                merchant:
                  type: object
                  description: The merchant the menu belongs to.
                  properties:
                    name:
                      type: string
                      description: The merchant name.
                    type:
                      type: string
                      description: The merchant type (e.g. 'restaurant').
                  required:
                    - name
                currency:
                  type: string
                  description: >-
                    The ISO 4217 currency code for the menu (e.g. 'USD'),
                    reported only when the page sources it.
                sections:
                  type: array
                  description: Menu sections (e.g. 'Appetizers', 'Entrees').
                  items:
                    type: object
                    properties:
                      id:
                        type: string
                        description: The section identifier.
                      name:
                        type: string
                        description: The section name.
                      description:
                        type: string
                        nullable: true
                        description: The section description.
                      items:
                        type: array
                        description: The items in the section.
                        items:
                          type: object
                          properties:
                            id:
                              type: string
                              description: The item identifier.
                            name:
                              type: string
                              description: The item name.
                            description:
                              type: string
                              nullable: true
                              description: The item description.
                            images:
                              type: array
                              description: Item images.
                              items:
                                type: object
                                properties:
                                  url:
                                    type: string
                                    description: Image URL.
                                  alt:
                                    type: string
                                    nullable: true
                                    description: Alternative text for the image.
                                required:
                                  - url
                            price:
                              type: object
                              description: The price of the item.
                              properties:
                                amount:
                                  type: number
                                  description: The numeric price amount.
                                currency:
                                  type: string
                                  description: The ISO 4217 currency code (e.g. 'USD').
                                formatted:
                                  type: string
                                  description: >-
                                    The price formatted for display (e.g.
                                    '$7.99').
                              required:
                                - amount
                            availability:
                              type: object
                              description: The availability of the item.
                              properties:
                                inStock:
                                  type: boolean
                                  description: Whether the item is available.
                                text:
                                  type: string
                                  nullable: true
                                  description: Human-readable availability text.
                              required:
                                - inStock
                            dietary:
                              type: array
                              description: Dietary tags for the item (e.g. ['vegetarian']).
                              items:
                                type: string
                            calories:
                              type: number
                              nullable: true
                              description: The item's calorie count.
                            optionGroups:
                              type: array
                              description: Option/modifier groups for the item.
                              items:
                                type: object
                            identifiers:
                              type: object
                              description: Merchant-specific identifiers for the item.
                              properties:
                                merchantItemId:
                                  type: string
                                  description: The merchant's own item ID.
                            url:
                              type: string
                              nullable: true
                              description: The canonical URL of the item.
                            sourceUrl:
                              type: string
                              nullable: true
                              description: The URL the item was extracted from.
                          required:
                            - name
                    required:
                      - name
                      - items
                sourceUrl:
                  type: string
                  nullable: true
                  description: The URL the menu was extracted from.
              required:
                - isMenu
                - sections
    ParseFormats:
      type: array
      items:
        oneOf:
          - type: object
            title: Markdown
            properties:
              type:
                type: string
                enum:
                  - markdown
            required:
              - type
          - type: object
            title: Summary
            properties:
              type:
                type: string
                enum:
                  - summary
            required:
              - type
          - type: object
            title: HTML
            properties:
              type:
                type: string
                enum:
                  - html
            required:
              - type
          - type: object
            title: Raw HTML
            properties:
              type:
                type: string
                enum:
                  - rawHtml
            required:
              - type
          - type: object
            title: Links
            properties:
              type:
                type: string
                enum:
                  - links
            required:
              - type
          - type: object
            title: Images
            properties:
              type:
                type: string
                enum:
                  - images
            required:
              - type
          - type: object
            title: JSON
            properties:
              type:
                type: string
                enum:
                  - json
              schema:
                type: object
                description: >-
                  The schema to use for the JSON output. Must conform to [JSON
                  Schema](https://json-schema.org/).
              prompt:
                type: string
                description: The prompt to use for the JSON output
            required:
              - type
      description: >-
        Output formats supported for `/parse` uploads. Browser-rendering formats
        and change tracking are not supported.
      default:
        - markdown
    RedactPIIOptions:
      type: object
      description: Tuning options for PII redaction.
      properties:
        mode:
          type: string
          enum:
            - accurate
            - aggressive
            - fast
          default: accurate
          description: >-
            Redaction strategy. `accurate` is model-only and optimized for
            precision, `aggressive` increases recall with additional heuristics,
            and `fast` uses heuristics without the model call.
        entities:
          type: array
          description: >-
            Restrict redaction to these entity buckets. If omitted, all
            supported entities are redacted.
          items:
            $ref: '#/components/schemas/RedactPIIEntity'
        replaceStyle:
          type: string
          enum:
            - tag
            - mask
            - remove
          default: tag
          description: >-
            `tag` replaces spans with placeholders like `<EMAIL>`, `mask`
            replaces characters with `*`, and `remove` deletes the span text.
      additionalProperties: false
    RedactPIIEntity:
      type: string
      enum:
        - PERSON
        - EMAIL
        - PHONE
        - LOCATION
        - FINANCIAL
        - SECRET
      description: Public PII entity buckets supported by Firecrawl redaction.
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer

````