Convert Docx to Markdown
I needed to convert a Docx file to Markdown, but Pandoc kept giving me this obnoxious error:
$ pandoc test.docx -o test.md
pandoc: Cannot decode byte '\xae': Data.Text.Encoding.Fusion.streamUtf8: Invalid UTF-8 stream
However, you can use unoconv as an intermediate step: convert the file to HTML first, then let Pandoc turn that HTML into Markdown.
$ unoconv --stdout -f html test.docx | pandoc -f html -t markdown -o test.md
On Ubuntu (and other Debian-based systems, I would imagine) you can get unoconv with a simple apt-get install unoconv.
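If you have a whole folder of Docx files, the same pipeline can be wrapped in a small shell function. This is only a rough sketch: the function names md_name and docx_to_md are made up for illustration, and it assumes both unoconv and pandoc are already on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: derive the Markdown file name from a .docx name,
# e.g. "test.docx" becomes "test.md".
md_name() {
    printf '%s\n' "${1%.docx}.md"
}

# Hypothetical helper: run the two-step conversion from above on one file.
# unoconv writes HTML to stdout; pandoc reads it and writes Markdown.
docx_to_md() {
    unoconv --stdout -f html "$1" | pandoc -f html -t markdown -o "$(md_name "$1")"
}

# Example use (commented out so that sourcing this file converts nothing):
# for f in *.docx; do docx_to_md "$f"; done
```

Quoting "$1" keeps file names with spaces in one piece, and each Markdown file ends up next to its Docx source.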