I am trying to parse German HTML pages with the Palladian Node HTMLParser.
The HTML files are already downloaded on my system.
Naturally, the German pages contain a lot of umlauts (ä, ö, ü).
The parser does not seem to read the documents as UTF-8.
Instead of the umlauts, I get other symbols such as question marks or squares.
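
To illustrate the kind of symptom I mean, here is a minimal sketch in plain Java. It does not use Palladian's API, and the class and method names are made up for this example. It assumes the cause is a charset mismatch between how the bytes were saved and how they are decoded, which I have not confirmed for my files.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Palladian.
public class UmlautDecodingDemo {

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String original = "\u00e4\u00f6\u00fc"; // the umlauts ä, ö, ü

        // Case 1: bytes saved as UTF-8 and decoded as UTF-8 round-trip cleanly.
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).equals(original));

        // Case 2: bytes saved as ISO-8859-1 but decoded as UTF-8 are malformed,
        // so Java substitutes the replacement character U+FFFD, which is
        // typically rendered as a question mark or a square.
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);
        String garbled = decode(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println(garbled.contains("\uFFFD"));

        // Case 3: UTF-8 bytes decoded as ISO-8859-1 turn each umlaut into
        // two unrelated characters instead of one.
        String doubled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(doubled.length());
    }
}
```

If this is what is happening, the fix would be to make the decoding charset match the charset the files were actually saved in, but I do not know where to set that for the parser node.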
Does anybody have a workaround for this?