Hi,
I have a problem regarding the generated output of my XPath node.
Basically, I want to collect all related categories from a Wikipedia article using an XPath query.
The relevant XHTML code (from an example article) looks as follows:
<div class="mw-normal-classcatlinks" id="mw-normal-catlinks">
  <a href="https://en.wikipedia.org/wiki/Help:Category" title="Help:Category">Categories</a>:
  <ul>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Berlin" title="Category:Berlin">Berlin</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:German_state_capitals" title="Category:German state capitals">German state capitals</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Capitals_in_Europe" title="Category:Capitals in Europe">Capitals in Europe</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:City-states" title="Category:City-states">City-states</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Members_of_the_Hanseatic_League" title="Category:Members of the Hanseatic League">Members of the Hanseatic League</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Populated_places_established_in_the_13th_century" title="Category:Populated places established in the 13th century">Populated places established in the 13th century</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Populated_places_established_in_1237" title="Category:Populated places established in 1237">Populated places established in 1237</a> </li>
  </ul>
</div>
I am using one of the XPath nodes (normal or deprecated) with the following query.
XPath:
//dns:div[@id="mw-normal-catlinks"]
or, for the deprecated node,
XPath (deprecated):
//xhtml:div[@id="mw-normal-catlinks"]
As a result I get a column containing the categories, which is nice, but unfortunately they are all run together into one single string, like this:
Categories: BerlinGerman state capitalsCapitals in EuropeCity-statesMembers of the Hanseatic LeaguePopulated places established in the 13th centuryPopulated places established in 1237
Is it possible to add a space, a comma, or some other separator between the category names? For example:
Categories: Berlin, German state capitals, Capitals in Europe, City-states, Members of the Hanseatic League, Populated places established in the 13th century, Populated places established in 1237
Or, even better, is it possible to get every category in its own cell, column-wise (catword1, catword2, and so on)?
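To make the desired result concrete, here is a minimal sketch outside of KNIME, using Python's standard xml.etree.ElementTree. It is only an illustration of the output I am after, not something I know the XPath node can do. The shortened snippet, the explicit xmlns declaration on the div, the "dns" prefix mapping, and the helper name extract_categories are all assumptions made for this example. The idea is to select each li/a element below the div separately instead of taking the text of the whole div.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the snippet above. The xmlns declaration is an assumption:
# on the real page the XHTML namespace would be declared on the root element.
SNIPPET = """
<div xmlns="http://www.w3.org/1999/xhtml" id="mw-normal-catlinks">
  <a href="https://en.wikipedia.org/wiki/Help:Category">Categories</a>:
  <ul>
    <li> <a href="https://en.wikipedia.org/wiki/Category:Berlin">Berlin</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:German_state_capitals">German state capitals</a> </li>
    <li> <a href="https://en.wikipedia.org/wiki/Category:City-states">City-states</a> </li>
  </ul>
</div>
"""

# Map the "dns" prefix used in the query to the XHTML namespace (assumed).
NS = {"dns": "http://www.w3.org/1999/xhtml"}


def extract_categories(xhtml):
    """Hypothetical helper: return one string per category link."""
    root = ET.fromstring(xhtml.strip())
    # Select every <a> directly inside an <li>, rather than the whole <div>,
    # so each category comes back as its own value.
    links = root.findall(".//dns:li/dns:a", NS)
    return [a.text.strip() for a in links if a.text]


categories = extract_categories(SNIPPET)
print(", ".join(categories))  # prints: Berlin, German state capitals, City-states
```

In XPath terms this would correspond to something like //dns:div[@id="mw-normal-catlinks"]//dns:li/dns:a, with each match ending up in its own cell.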
I appreciate any help!
Thanks,
Manu