Extracting links from a text

If you want to use the HtmlParser from Palladian, you can apply the following workaround: convert the input column that holds the string to a binary cell, use that as the input for the HtmlParser, and then use the XPath nodes as usual.

(The simple reason the input to the HtmlParser needs to be binary is that strings are treated as file paths.)
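
For anyone who wants to try the same idea outside of KNIME, here is a minimal Python sketch. It is not how Palladian works internally, just an illustration using only the standard library: the text is handed over as raw bytes, decoded as markup rather than interpreted as a path, and the link targets are collected, roughly what an XPath query such as `//a/@href` would return. The names `LinkCollector` and `extract_links` are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_bytes, encoding="utf-8"):
    """Decode raw bytes as markup (never as a file path) and return all link targets."""
    collector = LinkCollector()
    collector.feed(html_bytes.decode(encoding, errors="replace"))
    collector.close()
    return collector.links


# The string column value, converted to bytes first (the "binary cell" step)
text = '<p>See <a href="https://example.org/a">A</a> and <a href="/b">B</a>.</p>'
print(extract_links(text.encode("utf-8")))  # ['https://example.org/a', '/b']
```

The UTF-8 default is an assumption; pass a different `encoding` if your text column uses another one.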

– Philipp
