Defuddle is a tool that extracts the main content from web pages by removing clutter such as comments, sidebars, and headers. It produces clean HTML suitable for HTML-to-Markdown conversion and was created for use with the Obsidian Web Clipper. Defuddle can serve as a replacement for Mozilla Readability, produces consistent output for elements such as footnotes, math, and code blocks, and extracts page metadata, including schema.org data. It is installed with npm; use in Node.js also requires JSDOM. The tool ships in core, full, and Node.js bundles and offers configurable options for parsing and content manipulation.

From github.com
