Hi there,
So, what input are you supplying to the HTML Parser? Is it a URL, a file path to these files, or the HTML markup as a string?
There is a known issue with proper encoding detection in some cases, but I currently cannot debug this as I’m on the road.
I suggest you try the following: read the files as binary data and pass the binary object to the HTML Parser. That way the parser sees the original bytes and the encoding detection will work for sure. The KNIME File Handling Nodes include a node for that.
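As a side illustration outside KNIME, here is a minimal Python sketch of why binary input helps. It is not how Palladian detects encodings; the helper names `sniff_charset` and `decode_html` and the sample markup are made up for this example. The point is only that raw bytes still allow looking at the declared charset, while a string that was already decoded with the wrong charset is damaged before the parser ever sees it.

```python
import re


def sniff_charset(raw: bytes, default: str = "utf-8") -> str:
    """Hypothetical helper: return the charset declared in the markup, else `default`."""
    # Charset declarations are ASCII, so an ASCII view of the first bytes is enough.
    head = raw[:1024].decode("ascii", errors="replace")
    match = re.search(r'charset=["\']?([\w-]+)', head, flags=re.IGNORECASE)
    return match.group(1).lower() if match else default


def decode_html(raw: bytes) -> str:
    """Hypothetical helper: decode the bytes with the charset the document declares."""
    return raw.decode(sniff_charset(raw), errors="replace")


# Sample document stored as ISO-8859-1 (an assumption for this demo).
# For a real file you would get the bytes via pathlib.Path(...).read_bytes().
html = '<html><head><meta charset="iso-8859-1"></head><body>Grüße</body></html>'
raw = html.encode("iso-8859-1")

# Binary input: the declared charset can still be honoured.
from_bytes = decode_html(raw)

# String input that was decoded with a wrong guess: the umlauts are already lost.
from_wrong_string = raw.decode("utf-8", errors="replace")

print(from_bytes)
print(from_wrong_string)
```

Once the text has been decoded with the wrong charset, no later detection step can recover the original characters, which is why handing over the untouched bytes is the safer route.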
Please let me know if this helps!
– Philipp
PS: This is on our list for the upcoming Palladian update.