Hello friends, I am creating my first workflow with KNIME. Now I’m at the point where I want to combine some process steps. As you can see in the picture, I have subprocesses which should become part of one complete workflow.
Step 1. Unzip XML Files
Step 2. Parse the XML to Excel or SQL (I know the process is not finished yet, but anyway)
Could you please tell me how you would do that?
My goal at the end is to create an automated workflow that unzips data and parses XML files.
In order to run List Files (Node 12) after Loop End (Node 10), connect them with a flow variable connection. Check this video for more on flow variables and how to do it.
Hey Ivan, I tried to create a flow variable and it works, but I can't really use it. Do you have an idea why I get this error: “WARN List Files “Location” does not exist or is not a directory”?
I created a global flow variable here in the workflow.
There are a couple of things to note here. First, glad you are trying different things and learning KNIME in the process.
Second, when I said to connect the nodes with a flow variable connection, I meant the red line connecting two nodes, not creating a flow variable (you do not need a flow variable in order to connect two nodes like this). Furthermore, the List Files node needs a directory path in order to list the files in that directory. Your flow variable has the value Location, which is not a directory, so that is where the error comes from. You do not need a (global) flow variable in this case.
Third, the output of Unzip Files already contains the .xml files you want to read into KNIME, so I would connect the Loop End node to a Table Row to Variable Loop Start node (or put XML Reader directly after Unzip Files and have only one loop!). If necessary, and it probably is in both cases, you should remove the rows that do not contain .xml files using a Row Filter node with a regex. Another option (to keep only .xml files) is to use the URL to File Path node and then a Nominal Value Row Filter node on the type column.
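For anyone who finds it easier to see the same logic outside KNIME, here is a rough Python sketch of "unzip, then keep only the .xml entries with a regex". It is not what the KNIME nodes do internally; the function name unzip_and_list_xml and the pattern are made up for illustration.

```python
import re
import zipfile
from pathlib import Path

# Hypothetical pattern: any entry name ending in .xml, case-insensitive.
XML_NAME = re.compile(r".*\.xml", re.IGNORECASE)


def unzip_and_list_xml(zip_path, target_dir):
    """Extract zip_path into target_dir and return the extracted .xml paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # This plays the role of the Row Filter step: drop everything
    # that is not an .xml file (directories, .txt files, and so on).
    return sorted(target / name for name in names if XML_NAME.fullmatch(name))
```

The returned list corresponds to the filtered table you would feed into the loop that reads each XML file.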
That was a lot. Give it a try and come back. If you still have big issues, I can create an example workflow for you.
I think this topic is solved and can be closed now (regarding the topic title it was first solved by @ipazin).
As requested by @PhilippBobo, the secondary issue (the XPath) was reviewed and solved in an online session.
The XPath format I already mentioned was the key.
When we want to select an element in an XML document by some attribute value, we have to put [@attribute='value'] (for a single condition) or [@attribute1='value1' and @attribute2='value2' and ...] (for multiple conditions) directly after the element name. That way, only the elements which have the desired attribute values are selected.
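To make the predicate syntax concrete, here is a small Python sketch on a made-up XML snippet (the catalog/book elements and the lang/format attributes are invented for illustration and are not from the workflow in this thread). One caveat: Python's built-in ElementTree only supports a subset of XPath and does not accept the `and` keyword inside a predicate, so the sketch chains two predicates instead, which selects the same elements. A full XPath 1.0 engine should accept the `and` form.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, only for demonstrating attribute predicates.
SAMPLE = """<catalog>
  <book lang="en" format="pdf"><title>A</title></book>
  <book lang="en" format="epub"><title>B</title></book>
  <book lang="de" format="pdf"><title>C</title></book>
</catalog>"""

root = ET.fromstring(SAMPLE)


def titles(path):
    """Return the <title> text of every element matched by path."""
    return [book.findtext("title") for book in root.findall(path)]


# Single condition: the predicate follows the element name.
single = titles("./book[@lang='en']")

# Two conditions, written as chained predicates because ElementTree
# does not support "and"; both must hold for an element to match.
both = titles("./book[@lang='en'][@format='pdf']")

print(single)  # ['A', 'B']
print(both)    # ['A']
```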