# html2md

[Contributors](#contributors-) [License](LICENSE) [GoDoc](http://godoc.org/github.com/suntong/html2md) [Go Report Card](https://goreportcard.com/report/github.com/suntong/html2md) [Build Status](https://github.com/suntong/html2md/actions/workflows/go-release-build.yml) [PoweredBy WireFrame](http://godoc.org/github.com/go-easygen/wireframe)

## TOC
- [html2md - HTML to Markdown converter](#html2md---html-to-markdown-converter)
- [Usage](#usage)
  - [$ html2md](#-html2md)
  - [Examples](#examples)
    - [Simplest form](#simplest-form)
    - [Using goquery](#using-goquery)
  - [The options and plugins](#the-options-and-plugins)
    - [Testing the new table plugins](#testing-the-new-table-plugins)
- [Credits](#credits)
  - [Credits](#credits-1)
  - [Similar Projects](#similar-projects)
- [Install Debian/Ubuntu package](#install-debianubuntu-package)
- [Download/install binaries](#downloadinstall-binaries)
  - [The binary executables](#the-binary-executables)
  - [Distro package](#distro-package)
  - [Debian package](#debian-package)
- [Install Source](#install-source)
- [Author](#author)
- [Contributors](#contributors-)

## html2md - HTML to Markdown converter

`html2md` uses https://github.com/JohannesKaufmann/html-to-markdown to convert HTML into Markdown. That library relies on an [HTML Parser](https://github.com/PuerkitoBio/goquery) and avoids `regexp` as much as possible, which prevents some [weird cases](https://stackoverflow.com/a/1732454) and makes it usable even when the input is totally unknown.

## Usage

### $ html2md

```sh
HTML to Markdown
Version 1.5.0 built on 2024-02-10
Copyright (C) 2020-2024, Tong Sun

HTML to Markdown converter on command line

Usage:
  html2md [Options...]

Options:

  -h, --help                        display help information
  -i, --in                         *The html/xml file to read from (or stdin)
  -d, --domain                      Domain of the web page, needed for links when --in is not url
  -s, --sel                         CSS/goquery selectors [=body]
  -x, --excl                        Excluding CSS/goquery selectors
      --xc                          Excluding all children nodes
  -v, --verbose                     Verbose mode (Multiple -v options increase the verbosity.)
      --opt-heading-style           Option HeadingStyle
      --opt-horizontal-rule         Option HorizontalRule
      --opt-bullet-list-marker      Option BulletListMarker
      --opt-code-block-style        Option CodeBlockStyle
      --opt-fence                   Option Fence
      --opt-em-delimiter            Option EmDelimiter
      --opt-strong-delimiter        Option StrongDelimiter
      --opt-link-style              Option LinkStyle
      --opt-link-reference-style    Option LinkReferenceStyle
      --opt-escape-mode             Option EscapeMode
      --plugin-br-to-newline        Plugin BrToNewline
  -A, --plugin-conf-attachment      Plugin ConfluenceAttachments
  -C, --plugin-conf-code            Plugin ConfluenceCodeBlock
  -F, --plugin-frontmatter          Plugin FrontMatter
  -G, --plugin-gfm                  Plugin GitHubFlavored
  -S, --plugin-strikethrough        Plugin Strikethrough
  -T, --plugin-table                Plugin Table
      --plugin-table-compat         Plugin TableCompat
  -L, --plugin-task-list            Plugin TaskListItems
  -V, --plugin-vimeo                Plugin VimeoEmbed
  -Y, --plugin-youtube              Plugin YoutubeEmbed
```

### Examples

#### Simplest form

```md
$ html2md -i https://github.com/suntong/html2md | head -3
[Skip to content](#start-of-content)

[Homepage](https://github.com/)
```

#### Using goquery

The most useful feature is passing a [goquery](https://github.com/PuerkitoBio/goquery) selection to filter for just the content you want.

```md
$ html2md -i https://github.com/JohannesKaufmann/html-to-markdown -s "div.my-3"
[go](http://github.com/topics/go "Topic: go") [html](http://github.com/topics/html "Topic: html") [markdown](http://github.com/topics/markdown "Topic: markdown") [golang](http://github.com/topics/golang "Topic: golang") [converter](http://github.com/topics/converter "Topic: converter") [html-to-markdown](http://github.com/topics/html-to-markdown "Topic: html-to-markdown") [goquery](http://github.com/topics/goquery "Topic: goquery")
```

### The options and plugins

Works as expected:

```sh
$ echo '<strong>Bold Text</strong>' | html2md -i
**Bold Text**

$ echo '<strong>Bold Text</strong>' | html2md -i --opt-strong-delimiter="__"
__Bold Text__

$ echo '
Lorem Ipsum:
' | ./html2md -i --plugin-youtube
Lorem Ipsum: [](https://www.youtube.com/watch?v=PifPVQOFyZI)
```

#### Testing the new table plugins

```sh
$ cat $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/input.html | html2md -i -T | head -6
| Firstname | Lastname | Age |
| --- | --- | --- |
| Jill | Smith | 50 |
| Eve | Jackson | 94 |
| Empty | | |
| End |

$ cat $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/input.html | html2md -i -T --domain example.com | diff -wU 1 $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/output.table.golden -
---
@@ -41 +41,2 @@
 | `var` | b | c |
\ No newline at end of file
+

$ cat $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/input.html | html2md -i --plugin-table-compat | head -6
Firstname · Lastname · Age

Jill · Smith · 50

Eve · Jackson · 94

$ cat $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/input.html | html2md -i --plugin-table-compat --domain example.com | diff -wU 1 $GOPATH/src/github.com/JohannesKaufmann/html-to-markdown/testdata/TestPlugins/table/output.tablecompat.golden -
---
@@ -41 +41,2 @@
 `var` · b · c
\ No newline at end of file
+
```

## Credits

### Credits

- [Johannes Kaufmann's html-to-markdown](https://github.com/JohannesKaufmann/html-to-markdown), which does the heavy lifting behind the scenes.

### Similar Projects

- [turndown (js)](https://github.com/domchristie/turndown), a very good library written in JavaScript.
- [lunny/html2md](https://github.com/lunny/html2md), which uses [regex instead of goquery](https://stackoverflow.com/a/1732454) and exhibits a few edge cases; these prompted the creation of `github.com/JohannesKaufmann/html-to-markdown`.
- [jaytaylor/html2text](https://github.com/jaytaylor/html2text), which converts to plain text rather than Markdown.

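The regex edge cases mentioned above can be seen in a small standard-library sketch. It is not taken from any of the listed projects: `stripTags` and its pattern are hypothetical, chosen only to show how a naive tag-matching regex is confused by a `>` inside an attribute value, something a real HTML parser handles.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern is a deliberately naive "match any tag" regex (hypothetical,
// for illustration only): it assumes a tag ends at the first '>'.
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// stripTags removes everything the naive pattern considers a tag.
func stripTags(html string) string {
	return tagPattern.ReplaceAllString(html, "")
}

func main() {
	// Simple markup: the regex happens to work.
	fmt.Printf("%q\n", stripTags(`<b>bold</b>`))

	// A '>' inside a quoted attribute ends the match too early,
	// so part of the opening tag leaks into the output.
	fmt.Printf("%q\n", stripTags(`<a title="a > b" href="/x">link</a>`))
}
```

The first call yields `bold`, while the second leaves ` b" href="/x">link` instead of `link`. Parsing the document into nodes first, as goquery does, sidesteps this class of problem.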
## Install Debian/Ubuntu package

    sudo apt install -y html2md

## Download/install binaries

- The latest binary executables are available as the result of the Continuous-Integration (CI) process.
- I.e., they are built automatically right from the source code at every git release by [GitHub Actions](https://docs.github.com/en/actions).
- There are two ways to get/install such binary executables:
  * Using the **binary executables** directly, or
  * Using **packages** for your distro

### The binary executables

- The latest binary executables are directly available under
  https://github.com/suntong/html2md/releases/latest
- Pick & choose the one that suits your OS and its architecture. E.g., for Linux, it would be the `html2md_verxx_linux_amd64.tar.gz` file.
- Binary executables are available for:
  * Linux
  * Mac OS (darwin)
  * Windows
- If your OS and its architecture are not in the download list, please let me know and I'll add it.
- Manual installation just means unpacking the archive and moving/copying the binary executable to somewhere in `PATH`. For example,

``` sh
tar -xvf html2md_*_linux_amd64.tar.gz
sudo mv -v html2md_*_linux_amd64/html2md /usr/local/bin/
rmdir -v html2md_*_linux_amd64
```

### Distro package

- [Packages available for Linux distros](https://cloudsmith.io/~suntong/repos/repo/packages/) are
  * [Alpine Linux](https://cloudsmith.io/~suntong/repos/repo/setup/#formats-alpine)
  * [Debian](https://cloudsmith.io/~suntong/repos/repo/setup/#formats-deb)
  * [RedHat](https://cloudsmith.io/~suntong/repos/repo/setup/#formats-rpm)

The repo setup instruction URLs are given above. For example, for [Debian](https://cloudsmith.io/~suntong/repos/repo/setup/#formats-deb):

### Debian package

```sh
curl -1sLf \
  'https://dl.cloudsmith.io/public/suntong/repo/setup.deb.sh' \
  | sudo -E bash

# That's it. You then can do your normal operations, like

sudo apt update
apt-cache policy html2md

sudo apt install -y html2md
```

## Install Source

To install from the source code instead:

```
go install github.com/suntong/html2md@latest
```

## Author

Tong SUN

_Powered by_ [**WireFrame**](https://github.com/go-easygen/wireframe) ([GoDoc](http://godoc.org/github.com/go-easygen/wireframe)), the _one-stop wire-framing solution_ for Go cli based projects, from _init_ to _deploy_.

## Contributors ✨

Thanks go to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

| Contributor | Contributions |
| --- | --- |
| suntong | 💻 🤔 🎨 🔣 ⚠️ 🐛 📖 📝 💡 ✅ 🔧 📦 👀 💬 🚧 🚇 |
| VPanteleev-S7 | 💻 🐛 📓 |
| itdoginfo | 🐛 📓 |
| somename123 | 🐛 🤔 📓 |
| vivook | 🐛 📓 |
| 097115 | 🐛 🤔 📓 |
| James Reynolds | 👀 📢 📓 |
| ImportTaste | 💻 🐛 📓 |