{{render "license/shields" . "License" "MIT"}}
{{template "badge/godoc" .}}
{{template "badge/goreport" .}}
{{template "badge/travis" .}}
# {{toc 5}}
# {{.Name}} - HTML to Markdown converter
`{{.Name}}` uses `github.com/JohannesKaufmann/html-to-markdown` to convert HTML into Markdown.
That library relies on an [HTML parser](https://github.com/PuerkitoBio/goquery) rather than `regexp` wherever possible, which avoids some [weird cases](https://stackoverflow.com/a/1732454) and makes it usable even when the input is totally unknown.
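
To illustrate the kind of weird cases a regexp-only approach runs into, here is a minimal sketch using only the Go standard library. `naiveBold` is a hypothetical helper written for this illustration; it is not part of `html-to-markdown` or any of the projects mentioned here.

```go
package main

import (
	"fmt"
	"regexp"
)

// strongRe matches only the simplest possible <strong> element:
// no attributes, no line breaks, no nesting.
var strongRe = regexp.MustCompile(`<strong>(.*?)</strong>`)

// naiveBold is a hypothetical regexp-only converter, for illustration only.
func naiveBold(html string) string {
	return strongRe.ReplaceAllString(html, "**$1**")
}

func main() {
	// The plain case works.
	fmt.Println(naiveBold(`<strong>Bold Text</strong>`))
	// prints: **Bold Text**

	// An attribute on the tag defeats the pattern; the input is left as-is.
	fmt.Println(naiveBold(`<strong class="x">Bold Text</strong>`))
	// prints: <strong class="x">Bold Text</strong>

	// Nested tags are paired up wrongly.
	fmt.Println(naiveBold(`<strong>a <strong>b</strong> c</strong>`))
	// prints: **a <strong>b** c</strong>
}
```

A real HTML parser builds a tree of elements first, so attributes and nesting do not need special handling in the conversion rules.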

# Usage
### $ {{exec "html2md" | color "sh"}}
# Examples
## Simplest form
```md
$ html2md -i https://github.com/suntong/html2md | head -3
[Skip to content](#start-of-content)
[Homepage](https://github.com/)
```
## Using goquery
The most useful feature is passing a [goquery](https://github.com/PuerkitoBio/goquery) selection with `-s` to keep only the content you want.
```md
$ {{.Name}} -i https://github.com/JohannesKaufmann/html-to-markdown -s "div.BorderGrid-row.hide-sm.hide-md > div"
```
## The options and plugins
Works as expected:
```sh
$ echo '<strong>Bold Text</strong>' | html2md -i
**Bold Text**
$ echo '<strong>Bold Text</strong>' | html2md -i --opt-strong-delimiter="__"
__Bold Text__
$ echo '<ul><li><input type=checkbox checked>Checked!</li><li><input type=checkbox>Check Me!</li></ul>' | html2md -i -G
- [x] Checked!
- [ ] Check Me!
$ echo 'Only <del>blue ones</del> <s> left</s>' | html2md -i --plugin-strikethrough
Only ~blue ones~ ~left~
```
# Debian package
A Debian package will be available once `github.com/JohannesKaufmann/html-to-markdown` has a release version.
# Install Source
To install the source code instead:
```
go get github.com/suntong/{{.Name}}
```
## Credits
- [Johannes Kaufmann's html-to-markdown](https://github.com/JohannesKaufmann/html-to-markdown), which does the heavy lifting behind the scenes.
## Similar Projects
- [turndown (js)](https://github.com/domchristie/turndown), a very good library written in JavaScript.
- [lunny/html2md](https://github.com/lunny/html2md), which uses [regex instead of goquery](https://stackoverflow.com/a/1732454) and so exhibits a few edge cases; those prompted `github.com/JohannesKaufmann/html-to-markdown`.
- [jaytaylor/html2text](https://github.com/jaytaylor/html2text), which converts to plain text rather than Markdown.
## Author(s) & Contributor(s)
Tong SUN

_Powered by_ [**WireFrame**](https://github.com/go-easygen/wireframe), the _one-stop wire-framing solution_ for Go CLI-based projects, from start to deploy.
All patches welcome.