Gwtar is a new HTML archival format that resolves a trilemma: an archive should be static (self-contained), single-file, and efficient (lazy-loading), and earlier formats managed at most two of the three. It works by creating a polyglot file: an HTML+JavaScript header followed by a tarball of assets. The JavaScript calls `window.stop()` to halt the browser's initial download of the rest of the file, then fetches each asset on demand with an HTTP range request into the embedded tarball. This allows even gigabyte-sized web pages to be archived as a single file that downloads assets only as they are needed, without requiring special server software or raising future-compatibility concerns.
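
The layout described above can be illustrated with a small sketch. This is not the Gwtar implementation: the function names `build_polyglot` and `range_read` are hypothetical, the byte-range index is returned separately here rather than embedded in the header, and the HTTP range request is simulated with a byte slice instead of a real network call.

```python
import io
import tarfile


def build_polyglot(header_html: bytes, assets: dict) -> tuple:
    """Concatenate an HTML header and an uncompressed tarball of assets.

    Returns the combined bytes and an index mapping each asset name to its
    inclusive (start, end) byte range within the combined file.
    Hypothetical helper for illustration only.
    """
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        for name, data in assets.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    tar_bytes = tar_buf.getvalue()

    # Re-read the tarball to learn where each member's data begins, then
    # shift every offset by the header length to get whole-file offsets.
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            start = len(header_html) + member.offset_data
            index[member.name] = (start, start + member.size - 1)
    return header_html + tar_bytes, index


def range_read(blob: bytes, start: int, end: int) -> bytes:
    """Simulate an HTTP 'Range: bytes=start-end' request (end is inclusive)."""
    return blob[start:end + 1]


# Usage: only the bytes of the requested asset are read, not the whole file.
blob, index = build_polyglot(
    b"<html><script>/* loader would go here */</script></html>",
    {"style.css": b"body { margin: 0 }", "logo.png": b"\x89PNG fake bytes"},
)
start, end = index["logo.png"]
assert range_read(blob, start, end) == b"\x89PNG fake bytes"
```

Because the tarball is uncompressed, each asset occupies one contiguous byte span, which is what lets a plain range request retrieve it without any server-side unpacking.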

21 min read time. From gwern.net
Table of contents
Background, HTML Trilemma, Trisecting, Concatenated Archive Design, Metadata, IP, Further Work, Bibliography