Problem with "Variable Based File Reader" or "iterate list of files" nodes

ChrisFr · September 18, 2013, 11:46am

Hi everyone,

I recently discovered KNIME, which seems very useful and powerful for data analytics, so I decided to play with it and try to replicate the workflow and results of the example combining text and network mining (the whitepaper is available here: http://www.knime.org/files/knime_social_media_white_paper.pdf).

I downloaded the Slashdot XML data files and started to build a workflow. However, I'm a newbie and I have a problem extracting information from a folder of XML files.

More precisely, my problem is the following: I'm able to extract categories from a single XML file using the XML Reader, XPath and Ungroup nodes. But when I try to extract the same categories from all the XML files in a folder, using the List Files and Iterate List of Files nodes, the collected results are not correct. I get the right number of rows (equal to the number of iterations, i.e. the number of files in my folder), but it seems that the content of only one XML file is read in every iteration. So if I parse each row with an XPath node after the Iterate List of Files node, the resulting output table has identical rows.

Attached are my workflow file and 2 XML files.

Do you have any idea how to solve my problem? What are the right options or basic settings to use in the "Variable Based File Reader"?

Thank you in advance and congrats to the KNIME developers !!! :)

Iris · September 18, 2013, 5:19pm

Hi Chris,

you were actually on the right track.

Inside the meta node "Iterate List of Files", put the XML Reader node. Open its configuration and select the second tab, "Flow Variables". There you can choose a flow variable for each of the dialog options. For "fileUrl" you need to select the variable "URL". (It is generated in each loop iteration by the Table Row To Variable Loop Start node.)
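
If it helps to see the idea outside of KNIME, here is a rough Python analogy (not KNIME code). It assumes the symptom comes from the reader using a file location fixed in its dialog instead of the per-iteration "URL" variable. The element name `category`, the file names and the two helper functions are made up for illustration only.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_categories(xml_path):
    """Return the text of every <category> element in one XML file."""
    root = ET.parse(xml_path).getroot()
    return [element.text for element in root.iter("category")]


def iterate_with_fixed_path(files, configured_path):
    # Analogy for a reader whose file location was typed into the dialog:
    # the loop runs once per file, but always reads the same file.
    return [read_categories(configured_path) for _ in files]


def iterate_with_loop_variable(files):
    # Analogy for binding "fileUrl" to the "URL" flow variable:
    # each iteration reads the file belonging to the current row.
    return [read_categories(url) for url in files]


# Two tiny, hypothetical XML files standing in for the real data.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.xml").write_text(
        "<story><category>linux</category><category>hardware</category></story>"
    )
    (folder / "b.xml").write_text(
        "<story><category>science</category></story>"
    )
    files = sorted(folder.glob("*.xml"))

    fixed = iterate_with_fixed_path(files, files[0])
    per_file = iterate_with_loop_variable(files)

print(fixed)     # same content repeated once per file
print(per_file)  # one distinct result per file
```

In the first call the number of rows matches the number of files but every row is identical, which is the behaviour described in the question. In the second call each row reflects its own file.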

Cheers, Iris

ChrisFr · September 18, 2013, 7:04pm

Thank you Iris for your advice, it works perfectly !!! :)

Where can I find this kind of explanation about the numerous settings of each node? The node descriptions don't provide this level of detail.

Iris · September 19, 2013, 1:17pm

Hm, we are currently starting a wiki, where this kind of information should be available in the future.

http://tech.knime.org/wiki/flow-variables

ChrisFr · September 22, 2013, 11:41am

Hi Iris,

Sorry for my late reply! It's a good idea to start a wiki. That way, maybe I will be able to build my workflow and use every node without needing to post trivial questions in the KNIME forum !!! ;)

Once again, thanks