Hello, I'm using the XPath node to parse an XML document, and I'm not sure how to handle the case where an XML element contains a variable number of child elements. I explored the DrugBank KNIME example workflow, since it reads the DrugBank XML file, but didn't find a matching example.
Consider this XML:
<DiseaseName>very bad disease</DiseaseName>
I can use an XPath expression to point to the SynonymList node for each disease, and I chose 'string' as the output type, but I get "ADAXD" as the SynonymList output in my table, with all the synonyms run together. It would be nice to have a way to specify a delimiter in the string creation...
As I know the number of synonyms, I could try a loop and read the specific element I want on each iteration, but I'm just discovering XPath and am not quite sure how to do this. Is there a simpler way? Any examples? (The DrugBank workflow seems to ignore this exact case, even though there is a Synonym tag in DrugBank that works the same way...)
The Extract Categories meta node of the DrugBank example workflow extracts all categories of a drug, which is similar to the synonym extraction. First, it extracts all categories using the Node-Set (Collection of XML cells) return type in the XPath node. The returned DataCell contains a set of XML cells, each holding one category string. Before the strings can be extracted with a second XPath node, the set cell must be ungrouped with the Ungroup node, so that each XML cell ends up in its own row.
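The same idea (select the repeated elements as a set, then either ungroup them into rows or join them with a delimiter) can be sketched outside KNIME in Python. This is only an illustration: the XML in the question is not shown in full, so the `Diseases`, `Disease` and `Synonym` tag names and the synonym values "AD" and "AXD" are assumptions guessed from the "ADAXD" output.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only DiseaseName and SynonymList appear in the
# question; every other tag name and the synonym values are assumptions.
XML = """
<Diseases>
  <Disease>
    <DiseaseName>very bad disease</DiseaseName>
    <SynonymList>
      <Synonym>AD</Synonym>
      <Synonym>AXD</Synonym>
    </SynonymList>
  </Disease>
</Diseases>
"""

root = ET.fromstring(XML)


def string_value(element):
    """String value of an element: all descendant text concatenated."""
    return "".join(text.strip() for text in element.itertext())


def synonyms_joined(root, delimiter="; "):
    """One row per disease, synonyms joined with a delimiter."""
    rows = []
    for disease in root.findall("Disease"):
        name = disease.findtext("DiseaseName")
        # Select the Synonym elements themselves, not their parent list.
        synonyms = [s.text for s in disease.findall("SynonymList/Synonym")]
        rows.append((name, delimiter.join(synonyms)))
    return rows


def synonyms_ungrouped(root):
    """One row per (disease, synonym) pair, like Node-Set plus Ungroup."""
    return [
        (disease.findtext("DiseaseName"), synonym.text)
        for disease in root.findall("Disease")
        for synonym in disease.findall("SynonymList/Synonym")
    ]


# Pointing at the parent list and asking for a string runs the values together.
concatenated = string_value(root.find("Disease/SynonymList"))
joined = synonyms_joined(root)
ungrouped = synonyms_ungrouped(root)

print(concatenated)
print(joined)
print(ungrouped)
```

Note that the path ends at `Synonym`, not at `SynonymList`: selecting the parent returns a single element, so there is nothing to ungroup.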
OK, thanks! My main problem was with my XPath statement: I was getting back XML one level too high, so the Ungroup node wasn't separating the synonyms into individual rows... Now it's working!
I recently discovered KNIME, which seems very useful and powerful for data analytics, so I decided to play with it and try to replicate the workflow and results of the example combining text and network mining (the white paper is available at: http://www.knime.org/files/knime_social_media_white_paper.pdf).
I downloaded the Slashdot XML data files and started to build a workflow. However, I'm a newbie and I have a problem related to Keith's.
More precisely, my problem is the following: I'm able to extract categories from one XML file using the XML Reader, XPath and Ungroup nodes, but when I try to extract the same categories from the whole set of XML files using the List Files and Iterate List of Files nodes, the collected results are not correct. I obtain the right number of rows (equal to the number of iterations), but it seems that the content of only one XML file is read on every iteration. So if I parse each row with an XPath node after the Iterate List of Files node, the resulting output table has identical rows.
I have attached my workflow file and 2 XML files.
Do you have any idea how to resolve my problem? What are the right options or basic settings to use in the "Variable Based File Reader"?
Thank you in advance.
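For comparison, the intended per-file behaviour can be sketched in Python: the file that gets parsed has to change with each iteration, otherwise every row repeats the content of the same document. In KNIME terms, one plausible cause (an assumption, not confirmed from the attached workflow) is that the reader inside the loop still points at a fixed file instead of taking its location from the loop's flow variable. The `article` and `category` tag names and the sample files below are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_categories(paths):
    """Return one (file name, category) row per category in each file."""
    rows = []
    for path in paths:
        # Parse the file that belongs to THIS iteration. Parsing a fixed
        # path here would yield the same rows on every iteration.
        root = ET.parse(str(path)).getroot()
        for category in root.iter("category"):
            rows.append((Path(path).name, category.text))
    return rows


# Hypothetical sample files; the real Slashdot XML layout may differ.
samples = {
    "a.xml": "<article><category>linux</category>"
             "<category>kernel</category></article>",
    "b.xml": "<article><category>science</category></article>",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, content in samples.items():
        (Path(tmp) / name).write_text(content, encoding="utf-8")
    # Sort so the files are processed in a predictable order.
    rows = extract_categories(sorted(Path(tmp).glob("*.xml")))

print(rows)
```

Each row carries the name of the file it came from, which makes it easy to check that every file was actually read.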