Execute failed: ws.palladian.retrieval.parser.ParserException: org.xml.sax.SAXParseExceptionpublicId: -//W3C//DTD HTML 4.01 Transitional//EN; systemId: http://www.w3.org/TR/html4/loose.dtd; lineNumber: 31; columnNumber: 3; The declaration for the entity "HTML.Version" must end with '>'
This seems to be a problem with all Yahoo feeds, but also with some other feeds. Is there a way to avoid this problem?
I'm currently on holiday and cannot reproduce this issue, but from your description I assume that you are not getting an RSS/Atom feed but an HTML page. The URL you provided points to a valid feed, though.
I will have a look at the problem and get back to you.
I just tried the URL you provided and it works fine for me (sample attached). I suspect that for some reason you are getting an error page instead of the feed (maybe because of internal proxy issues, WLAN authentication, ...).
To debug this issue, it should already help to examine the result retrieved by the HttpRetriever.
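One way to check the retrieved result outside of KNIME is a small script that tells you whether the payload is a feed or an HTML page. The following is only a minimal sketch in Python, not part of Palladian; the function name `classify_payload` and the set of accepted root elements are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Root element names commonly used by RSS 2.0, Atom and RSS 1.0 (RDF) feeds.
FEED_ROOTS = {"rss", "feed", "RDF"}


def classify_payload(body: bytes) -> str:
    """Roughly classify a downloaded payload (hypothetical helper).

    Returns "feed", "html", "other-xml" or "not-xml".
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Typical for error pages or login pages that are not well-formed XML.
        return "not-xml"
    # Strip a namespace prefix such as "{http://www.w3.org/2005/Atom}".
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name in FEED_ROOTS:
        return "feed"
    if local_name.lower() == "html":
        return "html"
    return "other-xml"
```

If the payload you saved from the download step is classified as "html" or "not-xml", the parser error most likely comes from an error or login page rather than from the feed itself.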
This is the case for all feeds. (My first example was our company's webpage and I thought that this was an "external" resource, not intranet.)
Maybe you could also add this feature request for proxy authentication to your list for a future update, as already mentioned for the WebSearcher node.
Normally I define the network connections via File > Preferences > Network Connections, together with the authentication details (user/password). These details are then used by nodes like FileReader and XMLReader when accessing external resources.
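To verify independently of KNIME that a feed is reachable through an authenticating proxy, a short test with Python's standard library can help. This is only a sketch under assumptions: the helper name `build_proxy_opener` is made up, and the host, port and credentials are placeholders, not real settings.

```python
import urllib.request


def build_proxy_opener(proxy_host, proxy_port, user, password):
    """Build a urllib opener that sends requests through an HTTP proxy
    requiring basic authentication (hypothetical helper)."""
    proxy_url = f"http://{proxy_host}:{proxy_port}"
    # Route both http and https requests through the same proxy.
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # Register the credentials for the proxy, regardless of realm.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, proxy_url, user, password)
    auth_handler = urllib.request.ProxyBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(proxy_handler, auth_handler)


# Example usage (placeholder values, replace with your own settings):
# opener = build_proxy_opener("proxy.example.invalid", 8080, "user", "secret")
# body = opener.open("http://www.heise.de/newsticker/heise-top-atom.xml").read()
```

If the download succeeds this way but fails in the workflow, that would point towards the proxy credentials not being applied by the node.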
Unfortunately, I'm having the same problem using an identical workflow: Table Creator -> HttpRetriever -> FeedParser in KNIME 3.1.1.
Everything seems fine until the process hits the FeedParser node. The console shows me the following error: "Execute failed: Unexpected input type: MissingCell".
I'm trying to parse the RSS feed from http://www.heise.de/newsticker/heise-top-atom.xml.
The HttpRetriever returned a missing cell, which means the download was not successful. Have a look at the console output, which should show you an error message. You can also enable DEBUG logging to get a more detailed description and post the output here.