PNG compression — two passes that each do half the work

PNG compression isn't one algorithm — it's two stages stacked. First, the image is reduced to a palette of representative colours (the lossy step, optional). Second, the palette-indexed pixel grid is encoded with deflate using per-row prediction filters (the lossless step, mandatory). Each stage handles a different kind of redundancy. Skipping either one leaves a substantial size penalty.

Pass 1: pick a palette

Most images that suit PNG (screenshots, logos, UI mockups, line art) use far fewer than 16.7 million distinct colours, often fewer than 256. A truecolor PNG of such an image stores 24 bits (3 bytes) per pixel; the same image as a palette PNG stores 8 bits (1 byte) per pixel plus a palette of at most 768 bytes. For a 1920 × 1080 screenshot that's the difference between roughly 5.9 MiB and 2.0 MiB of raw pixel data before deflate runs.
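The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `raw_sizes` is made up for illustration, and it ignores the one filter byte per row and all chunk overhead.

```python
def raw_sizes(width: int, height: int, palette_entries: int = 256) -> tuple[int, int]:
    """Return (truecolor_bytes, palette_bytes) for uncompressed pixel data."""
    pixels = width * height
    truecolor = pixels * 3                       # 24-bit RGB: 3 bytes per pixel
    indexed = pixels * 1 + palette_entries * 3   # 1-byte index per pixel + RGB palette
    return truecolor, indexed


truecolor, indexed = raw_sizes(1920, 1080)
print(truecolor, indexed)  # 6220800 2074368
print(f"{truecolor / 2**20:.1f} MiB vs {indexed / 2**20:.1f} MiB")  # 5.9 MiB vs 2.0 MiB
```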

The pass-1 quantizer takes the full-colour pixel grid and the desired colour count (≤256). It finds a set of representative colours that minimises the perceived error against the original, then maps each pixel to the nearest palette entry. The output is two arrays: a palette of up to 256 entries (3 bytes of RGB per entry, so at most 768 bytes, plus optional per-entry alpha) and a width × height array of 8-bit indices into that palette.
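To make the input and output shapes concrete, here is a deliberately naive sketch. It is not the quantizer described above: it picks the most frequent colours as the palette (a "popularity" palette) and uses plain squared RGB distance, whereas a production quantizer minimises perceived error. The names `quantize` and `Pixel` are illustrative assumptions.

```python
from collections import Counter

Pixel = tuple[int, int, int]


def quantize(pixels: list[Pixel], max_colours: int = 256) -> tuple[list[Pixel], list[int]]:
    """Popularity quantizer: returns (palette, per-pixel palette indices)."""
    if not 1 <= max_colours <= 256:
        raise ValueError("max_colours must be between 1 and 256")
    # Assumption for illustration: the palette is simply the most frequent colours.
    palette = [colour for colour, _ in Counter(pixels).most_common(max_colours)]

    def nearest(p: Pixel) -> int:
        # Squared Euclidean distance in RGB; ties resolve to the lowest index.
        return min(range(len(palette)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, palette[i])))

    cache: dict[Pixel, int] = {}   # each distinct colour is matched only once
    indices = []
    for p in pixels:
        if p not in cache:
            cache[p] = nearest(p)
        indices.append(cache[p])
    return palette, indices


pixels = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 3 + [(250, 5, 5)]
palette, indices = quantize(pixels, max_colours=2)
print(palette)   # [(255, 0, 0), (0, 0, 255)]
print(indices)   # [0, 0, 0, 0, 0, 1, 1, 1, 0]
```

The last pixel, a slightly off red, is the lossy part: it is stored as index 0 and decodes as pure red.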

Quantization is lossy. Most pixels round to a palette colour very close to their original; rare colours that have no close palette entry shift visibly. Whether the shift is acceptable depends on the colour count and the content. We control colour count via SSIM-bounded search (see the SSIM binary search article).
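The details of that search live in the linked article, so the following is only a generic sketch of what a quality-bounded binary search over colour count could look like. It assumes a hypothetical `score(n)` callable that quantizes to `n` colours and returns a similarity score such as SSIM, and it assumes the score never decreases as `n` grows. Neither assumption comes from this article.

```python
from typing import Callable


def smallest_colour_count(score: Callable[[int], float], threshold: float,
                          lo: int = 2, hi: int = 256) -> int:
    """Smallest n in [lo, hi] with score(n) >= threshold.

    Assumes score is non-decreasing in n. Falls back to hi when even
    the largest palette misses the threshold.
    """
    if score(hi) < threshold:
        return hi
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) >= threshold:
            hi = mid        # mid is good enough; try fewer colours
        else:
            lo = mid + 1    # mid is too lossy; more colours are needed
    return lo


# Toy stand-in for a real quality metric: quality grows linearly with n.
print(smallest_colour_count(lambda n: n / 256, threshold=0.5))  # 128
```

A binary search needs about eight quantization runs to cover the range 2 to 256, against up to 255 for a linear scan.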

Pass 2: optimise the deflate

Once the image is palette-indexed (or kept truecolor for content that doesn't tolerate palette quantization), the second pass re-encodes the PNG file itself, trying several encodings of the same pixels.

PNG image data is a deflate-compressed stream of filtered pixel rows. Each row gets a 1-byte filter selector before its data; the filter decides whether the row is stored as raw bytes (None), as differences from the left neighbour (Sub), as differences from the row above (Up), or as differences from one of two mixed predictors (Average and Paeth). The right filter for each row depends on the content; the wrong one inflates the deflate output.
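The five filter types defined by the PNG specification (0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth) are small enough to write out. The sketch below filters a single row; the function names and the byte-at-a-time loop are illustrative, not taken from any particular encoder.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: a = left, b = above, c = upper-left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def filter_row(ftype: int, row: bytes, prev: bytes, bpp: int) -> bytes:
    """Apply PNG filter type 0-4 to one row.

    Returns the selector byte followed by the filtered data. prev is the
    unfiltered row above (all zeros for the first row); bpp is the number
    of bytes per pixel (1 for an 8-bit palette image, 3 for 8-bit RGB).
    """
    out = bytearray([ftype])
    for i, x in enumerate(row):
        a = row[i - bpp] if i >= bpp else 0     # left neighbour
        b = prev[i]                              # byte above
        c = prev[i - bpp] if i >= bpp else 0    # upper-left neighbour
        pred = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
        out.append((x - pred) & 0xFF)            # differences wrap modulo 256
    return bytes(out)


row = bytes([10, 20, 30, 40])
print(list(filter_row(1, row, bytes(4), 1)))  # [1, 10, 10, 10, 10]
print(list(filter_row(2, row, row, 1)))       # [2, 0, 0, 0, 0]
```

A horizontal gradient becomes a run of identical bytes under Sub, and a row that repeats the one above becomes all zeros under Up. Both are far easier for deflate to compress than the raw values.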

A naïve PNG encoder picks one filter and uses it for every row. A multi-pass optimiser tries all five filters per row, plus several deflate strategy variants (compression level, window size, lazy matching), and keeps the smallest output.
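The "try several variants and keep the smallest" idea can be sketched with Python's standard `zlib` module. This covers only the deflate half, on data that has already been filtered, and the particular set of levels and strategies is an assumption chosen for illustration. Python's `zlib` does not expose match-search tuning as a separate knob, so the compression level stands in for it here.

```python
import itertools
import zlib


def smallest_deflate(filtered: bytes) -> tuple[bytes, tuple[int, int]]:
    """Compress with several zlib settings; return the smallest stream and its settings."""
    levels = (6, 9)
    strategies = (zlib.Z_DEFAULT_STRATEGY, zlib.Z_FILTERED, zlib.Z_HUFFMAN_ONLY)
    best, best_cfg = None, None
    for level, strategy in itertools.product(levels, strategies):
        # wbits=15 is the 32 KiB window PNG decoders expect; memLevel=8 is zlib's default.
        comp = zlib.compressobj(level, zlib.DEFLATED, 15, 8, strategy)
        blob = comp.compress(filtered) + comp.flush()
        if best is None or len(blob) < len(best):
            best, best_cfg = blob, (level, strategy)
    return best, best_cfg


data = bytes(range(256)) * 64
blob, cfg = smallest_deflate(data)
assert zlib.decompress(blob) == data   # every candidate decodes to the same bytes
print(len(data), "->", len(blob), "using", cfg)
```

A full optimiser would wrap this in an outer loop over filter choices, which multiplies the work; that cost is why the step is worth skipping when it cannot pay off.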

The savings are 5–25% on synthetic content (screenshots, UI), with smaller gains on photographic content where most pixels carry random-looking high-frequency detail that no filter can predict effectively.

Pass 2 is always lossless

The second pass never alters pixel values. It only chooses different ways to encode the same pixel grid. The decoded image after pass 2 is bit-identical to the input that pass 1 produced. If pass 1 didn't run (the image stayed truecolor), pass 2 still works on the truecolor pixels and produces a bit-perfect truecolor PNG.

This is why the pipeline can advertise "lossless compression" even when palette quantization is happening. The lossless guarantee applies to the deflate stage; the palette stage is the one users opt into when they accept some colour shift in exchange for size.

When we skip pass 1

If the input is already a palette PNG with fewer than 256 colours, there's nothing to gain from re-quantizing. We detect this by reading the PNG's PLTE chunk and looking at its length: a PLTE under 768 bytes means a sub-256-colour palette, and we skip the expensive decode/encode entirely, returning the input bytes unchanged.
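A minimal sketch of that check, using only the chunk layout from the PNG specification (4-byte big-endian length, 4-byte type, data, 4-byte CRC). The function names are illustrative and CRCs are not verified.

```python
import struct
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def plte_entry_count(png: bytes) -> Optional[int]:
    """Return the number of palette entries, or None if no PLTE precedes IDAT."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        if ctype == b"PLTE":
            return length // 3           # 3 bytes (R, G, B) per entry
        if ctype in (b"IDAT", b"IEND"):
            return None                  # PLTE, when present, comes before IDAT
        pos += 12 + length               # length + type + data + CRC
    return None


def can_skip_quantization(png: bytes) -> bool:
    """True when the file already carries a palette of fewer than 256 colours."""
    count = plte_entry_count(png)
    return count is not None and count < 256   # PLTE data shorter than 768 bytes
```

One caveat: the PNG specification also allows a PLTE chunk in truecolor files as a suggested palette, so a stricter version would confirm colour type 3 in IHDR before trusting the chunk length.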

If the input is a truecolor PNG with millions of distinct pixel values (a photographic PNG), palette quantization usually produces noticeable colour shift. In that case the right answer is to switch formats — JPEG or WebP — not to keep wrestling with PNG. Our compressor honours the user's choice; if you uploaded a PNG and asked for a smaller PNG, we keep the truecolor and optimise pass 2 only.

Exit when savings are tiny

For some images, neither pass produces meaningful savings. Pre-optimised PNGs (already palette-quantized, already filter-optimised) can come out within 1–2% of the input size. Re-encoding such a file produces a "compressed" output that's almost the same as the input but written by a different tool — different metadata, different chunk ordering — for no benefit.

The compressor compares output to input size after both passes. If the savings are below a small threshold (~3%), we hand back the original file with a note that it's already well-compressed. This keeps file modification time, exact bytes, and any embedded metadata intact when re-compression has nothing to add.
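The decision itself is a one-line comparison. In this sketch the function name, the return shape, and the exact 3% default are assumptions based on the approximate threshold quoted above.

```python
def choose_output(original: bytes, compressed: bytes,
                  min_saving: float = 0.03) -> tuple[bytes, bool]:
    """Return (bytes_to_deliver, was_recompressed)."""
    if not original:
        return original, False
    saving = 1 - len(compressed) / len(original)
    if saving < min_saving:
        return original, False    # not worth it: hand back the untouched input
    return compressed, True


original = bytes(1000)
print(choose_output(original, bytes(980))[1])  # False (2% saving)
print(choose_output(original, bytes(900))[1])  # True (10% saving)
```

Returning the original object rather than a re-serialised copy is what keeps the bytes and metadata intact.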

What ends up in the output PNG

A typical compressed-PNG output contains:

PNG signature — 8 bytes.
IHDR chunk — width, height, bit depth, colour type. Bit depth is 8 (palette mode) or 8 per channel (truecolor).
PLTE chunk — palette table, present in palette mode.
tRNS chunk — per-palette-entry alpha, if any palette entry is non-opaque.
IDAT chunks — the deflate-compressed filtered row stream. Multiple chunks for very large images.
IEND chunk — end-of-file marker.

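We strip metadata-only ancillary chunks (tEXt, zTXt, tIME, eXIf, etc.) by default. None of these are required for the image to render correctly, and they add 1–10 KB without affecting visible content. tRNS is also classed as ancillary, but it is kept because it changes how the image renders.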
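Because every chunk is self-describing, stripping can be done without touching the pixel data. The sketch below copies chunks verbatim and drops only the four types named above. What the "etc." covers is not specified here, so the drop set is left as a parameter. Names are illustrative and CRCs are copied, not recomputed.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Only the chunk types named in the text; extend this set as needed.
STRIPPED = frozenset({b"tEXt", b"zTXt", b"tIME", b"eXIf"})


def strip_chunks(png: bytes, drop: frozenset = STRIPPED) -> bytes:
    """Return a copy of the PNG without the chunk types listed in drop."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    out = bytearray(PNG_SIGNATURE)
    pos = 8
    while pos + 12 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        end = pos + 12 + length          # length + type + data + CRC
        if ctype not in drop:
            out += png[pos:end]          # copy the chunk verbatim, CRC included
        pos = end
        if ctype == b"IEND":
            break
    return bytes(out)
```

Since IHDR, PLTE, tRNS and IDAT pass through byte for byte, the decoded image is unchanged.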