htmlToXlsx(filename, [html]) Last updated: 13. May 2026

API only function

Converts an HTML table to an Excel file (XLSX).

Parameters

Name Type Description
filename string

Specify output filename.

html (optional) string

A string containing a table to convert from HTML to an Excel file (XLSX).
If not specified, the current output buffer (using the write method) will be used as table content.

Returns

An Excel file (XLSX) in bytes.

Supports multiple worksheets, frozen headers (thead), colspan/rowspan merging, base64 images, inline styling (background-color, color, text-align, vertical-align, borders), text formatting (bold, italic, underline, hyperlinks), automatic numeric typing, column widths, row heights, text wrap control, font styling, Excel number-formats with automatic value coercion, and Excel tables (ListObjects) with auto-filter and banded rows.

Function signature

htmlToXlsx(filename, html?) — if html is omitted, the runtime output buffer is used as input.

Structural elements

  • <table title="..."> — sets the worksheet name. Without title, "Sheet1", "Sheet2", ... is used.
  • <thead> — rows in thead are automatically frozen in Excel (Freeze Panes).
  • <tr> — one row
  • <td> / <th> — one cell. <th> automatically gets bold text.
  • colspan / rowspan<td colspan="N"> / <td rowspan="M"> merges via Excel Merge. Subsequent rows automatically skip columns occupied by an ongoing rowspan.
  • <colgroup><col> — declares per-column properties (width, format) before the first <tr>. Use span="N" to apply the same column to multiple columns.

Cell content (priority)

  1. Image cell — ONE <img> with base64 data-URL and only whitespace otherwise
  2. Styling peel — content wrapped in <b>/<strong>/<i>/<em>/<u> and/or <a href> (combinable, nestable)
  3. Text cell — everything else is displayed as text

For text cells:

  • <br> / <br/> / <br /> (case-insensitive) → line break, WrapText is enabled automatically
  • Numeric strings → Excel numbers (sum/sort/filter works)
  • <th> → bold

Image support

  • Only base64-encoded data-URLs: data:image/<type>;base64,<data>
  • Formats: PNG, JPEG, GIF, BMP
  • HTTP/HTTPS is intentionally rejected
  • ONE image per cell, only whitespace otherwise
  • Size: width/height attributes → style="width:Xpx;height:Ypx" → natural pixel size
  • Row height adapts to the image automatically

Column widths

Column widths can be set on <col> (preferred — applies to entire column) or on the first <td>/<th> in a column (legacy fallback). Subsequent rows are ignored once the column is seeded. Cells with colspan > 1 never seed column widths (ambiguous distribution).

<table>
  <colgroup>
    <col style="width: 140px">
    <col style="min-width: 20; max-width: 60">
    <col width="30">
  </colgroup>
  ...
</table>
  • width — fixed column width (overrides auto-fit entirely)
  • min-width / max-width — clamps auto-fit between these bounds (default is 20/80 chars)
  • Units: px, pt, em/rem (1em = 16px), in, cm, mm, ch. Naked numbers in width="N" follow HTML 4.01 convention and mean characters; naked numbers in style="width: N" also mean characters in this column-width context. Percent and auto are ignored.
  • Conversion: 1 char ≈ 7px @ 96 DPI (Calibri 11pt), 1px = 0.75pt

Row heights

Set via <tr height="..."> or <tr style="height: ...">. Naked numbers in the height attribute are pixels (HTML convention). Image cells may still raise the row higher if their image is taller — we take the maximum so explicit heights are never lost.

<tr height="40"><td>...</td></tr>
<tr style="height: 30pt"><td>...</td></tr>

Text wrap control

Multi-line content (containing <br> or \n) automatically enables WrapText. Override via white-space:

  • white-space: normal / pre-wrap / pre-line / break-spaces — force wrap (useful for long single-line strings)
  • white-space: nowrap / pre — disable wrap

Font styling

Via inline style on <td>/<th>:

  • font-size — accepts pt, px, em/rem, etc. (1px = 0.75pt). E.g. font-size: 14pt or font-size: 18px.
  • font-family — first named family is used. Generic families (serif, sans-serif, monospace, cursive, fantasy, system-ui) are skipped — they have no Excel equivalent. E.g. font-family: "Consolas", monospace becomes Consolas.
  • font-weight: bold or font-weight: 600 (or higher) → Bold (same as <b>)
  • font-style: italic or font-style: oblique → Italic (same as <i>)

Border support

Via inline style on <td>/<th>:

  • border — shorthand: [width] [style] [color]
  • border-top / border-right / border-bottom / border-left — per-side shorthand
  • Width: 1px/thin, 2px/medium, 3px+/thick
  • Style: solid, dashed, dotted, double, none
  • Color: all CSS formats (hex, names, rgb())

<table border="1"> enables thin black border on all cells. Inline border overrides.

For merged cells (colspan/rowspan) the border is applied as outer perimeter — bottom/right edges are on the cells in the last row/column of the merge area.

Text styling

  • <b> / <strong> → Bold
  • <i> / <em> → Italic
  • <u> → Underline
  • <a href="..."> → Cell gets hyperlink
  • <th> → Bold (automatic)

Can be nested as long as the ENTIRE cell content is wrapped. Mixed content (e.g. Hello <b>world</b>) is displayed as literal HTML.

Cell styling via style attribute

  • background-color — cell background (on merged cells: entire range)
  • color — text color
  • text-align — left, center, right, justify, start, end
  • vertical-align — top, middle, center, bottom
  • white-space — see Text wrap control
  • font-size / font-family / font-weight / font-style — see Font styling
  • width / min-width / max-width — see Column widths
  • height — on <tr>, see Row heights
  • border + per-side — see border section

Cell value formatting (data-format)

Apply Excel number-formats and automatically coerce cell values to the matching CLR type. Set on <col> for column-wide format, or on <td>/<th> to override per cell.

<table>
  <colgroup>
    <col data-format="yyyy-mm-dd">
    <col data-format="#,##0.00">
    <col data-format="0%">
  </colgroup>
  <tr>
    <td>2026-05-13</td>          <!-- → DateTime, displayed as 2026-05-13 -->
    <td>1234.56</td>              <!-- → Number, displayed as 1,234.56 -->
    <td>50%</td>                  <!-- → 0.5, displayed as 50% -->
  </tr>
  <tr>
    <td data-format="dd.mm.yyyy">2026-06-15</td>  <!-- per-cell override -->
    <td>2,500</td>
    <td>0.25</td>                 <!-- → 0.25, displayed as 25% -->
  </tr>
</table>

Format syntax

Follows Excel's number-format syntax (Microsoft "Number format codes") — NOT .NET DateTime format strings:

  • yyyy — 4-digit year
  • mm — 2-digit month (lowercase, next to y or d). In .NET this would be MM.
  • dd — 2-digit day
  • hh — 2-digit hour (24h)
  • mm — 2-digit minute (when next to h or s)
  • ss — 2-digit second
  • 0 / # — digit placeholders (0 keeps trailing zeros, # omits)
  • . / , — decimal/thousands separator (display)
  • % — percent (Excel multiplies stored value by 100 for display)
  • @ — text placeholder (force string rendering)

Common patterns:

  • Date: yyyy-mm-dd, dd.mm.yyyy, mmm yyyy
  • DateTime: yyyy-mm-dd hh:mm:ss
  • Time: hh:mm:ss, hh:mm
  • Number: 0.00, #,##0.00, #,##0
  • Percent: 0%, 0.00%
  • Currency: "$"#,##0.00, #,##0.00" kr"
  • Text: @ (forces string rendering, preserves leading zeros, prevents numeric conversion)

Value coercion

When a data-format is present, the cell text is coerced to the matching CLR type before the format is applied:

  • Date format (contains y, d, h, or s): parses with InvariantCulture first (ISO 2026-05-13), then falls back to CurrentCulture (e.g. 13.05.2026 on nb-NO). Combined date+time strings are supported.
  • Percent format (contains %): value ending in % is divided by 100 (50%0.5). Values without % are assumed to already be ratios (0.5 → displays as 50%).
  • Number format (contains # or 0): parses with InvariantCulture, accepts thousands separators. Explicit number format overrides the leading-zero protection — <td data-format="0">001234</td> becomes the number 1234.
  • Text format (@): always written as string, preserving leading zeros and any digit-like input (typical for phone numbers, ZIP codes, customer IDs).

If the value cannot be coerced (e.g. "N/A" with date format), the cell falls back to plain text and the format is not applied.

Excel tables (data-table)

Convert a <table> into an Excel ListObject by adding the data-table attribute with a name. The resulting file gets auto-filter on the header row, banded row styling, and a named range usable in formulas.

<table data-table="Orders">
  <thead><tr><th>Date</th><th>Customer</th><th>Amount</th></tr></thead>
  <tbody>
    <tr><td>2026-05-13</td><td>Acme</td><td>1200</td></tr>
    <tr><td>2026-05-14</td><td>Globex</td><td>980</td></tr>
  </tbody>
</table>

Name rules

Excel enforces strict rules on table names. We sanitize lightly and skip table creation if the name cannot be made valid:

  • Must start with a letter or underscore — names starting with a digit are rejected.
  • Allowed characters: letters, digits, _, ..
  • Spaces are converted to underscores ("Sales Orders"Sales_Orders).
  • Reserved single letters C, R (and lowercase) are rejected — Excel uses them as row/column references.
  • Max length 255 characters (truncated silently).
  • Names must be unique across the workbook. Duplicates get an auto-incremented suffix (Orders, Orders2, Orders3, ...).

If the name is invalid, the worksheet is generated normally but without the ListObject — no exception is thrown.

Range

The table spans from A1 to the last filled cell. The first row is treated as the header row (typically your <thead>).

Compatibility

  • Combinable with data-format — column-level number formats are preserved when the table is created.
  • Combinable with frozen headers<thead> still freezes as before.
  • NOT compatible with colspan/rowspan — Excel tables forbid merged cells. If any merged range exists in the table area, table creation is silently skipped and you get a plain formatted range instead.

Automatic numeric typing (without data-format)

When no data-format is specified:

  • InvariantCulture: 1234, 3.14, -0.5 → number
  • Norwegian decimals (1,5) remain text — use 1.5 or specify data-format="#,##0.00"
  • Leading zeros are preserved as text (ZIP, phone, ID): "0001234" remains text — use data-format="@" or data-format="0" to override
  • Text with letters remains text

Known limitations

  • Images: only base64, only ONE per cell, only whitespace otherwise
  • Mixed text+markup is displayed as literal HTML
  • Column width is not automatically adjusted for image cells
  • Comma decimals are not treated as numbers (without explicit data-format)
  • Assumes <thead> comes before <tbody>
  • data-format with only the m token is ambiguous (month vs minute) — use mmm or combine with y/d/h/s to disambiguate
  • data-table is skipped if the range contains merged cells (colspan/rowspan) — Excel does not allow merged cells inside a ListObject

Example

Code example (JS)

JS is normal JavaScript either running in the browser or on the Docly™ server.
// Example: Table with column widths, row heights, data formats, styling and Excel table (data-table)
write('<table border="1" title="Sales" data-table="Sales">');
write('<colgroup>');
write('  <col style="width: 30">');
write('  <col data-format="yyyy-mm-dd" style="width: 16">');
write('  <col data-format="#,##0.00" style="width: 14">');
write('  <col data-format="0%" style="width: 10">');
write('</colgroup>');
write('<thead>');
write('<tr height="30" style="background-color:#4472C4;color:white">');
write('<th>Customer</th><th>Order date</th><th>Amount</th><th>Discount</th>');
write('</tr>');
write('</thead>');
write('<tbody>');
write('<tr><td>Acme Corp</td><td>2026-05-13</td><td>12500.50</td><td>10%</td></tr>');
write('<tr><td>Globex</td><td>2026-05-14</td><td>8750.00</td><td>0.05</td></tr>');
write('</tbody>');
write('</table>');

return docly.htmlToXlsx("sales.xlsx");