Convert a PDF file to HTML

I have converted a PDF to an HTML file, and I am then using the HTML Parser to read the file.

Unfortunately, the result is not good.

I used a Python library to convert the PDF file to the HTML file. You can find the files in the attachments.
Nestlé.txt (36.6 KB) Nestlé.xml (36.9 KB)

PS: I can’t upload PDF and HTML files in this post, so I changed the extensions to .txt and .xml.