Populate new table inside loop

andrejz · May 19, 2020, 12:53pm

I am importing many Excel files with a more or less identical data structure. Inside the loop I modify the table columns so that the output is one table.
After the loop I also modify this table (add new columns and join with other tables). I think the workflow works well, but in the latest case I found that one join had duplicated some rows. The total number of rows must equal the sum of all rows in the Excel files. I found the error and resolved it, but …

What I want is to create a separate table inside the loop (group by some columns, then count and sum some values) and compare these values with the values in a table built with the same grouping at the end of the workflow. So I need the (temporary) values from the GroupBy node inside the loop to be stored in a separate table every time an Excel file is processed.

How can I do this?

Thank you
Andrej

AnotherFraudUser · May 19, 2020, 8:12pm

Hi @andrejz,

For the problem with the join, I think you would have to provide your workflow so that we can actually check what is wrong.
If true duplicates are created, you could use the GroupBy node on all columns (without aggregation) to make the rows distinct.
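
Outside KNIME, the idea of grouping on all columns without aggregation can be illustrated with a minimal Python sketch. It is only an analogy, not what the GroupBy node does internally; the function name `distinct_rows` and the row layout (a list of dicts) are assumptions made for this example.

```python
def distinct_rows(rows):
    """Keep the first occurrence of each row, comparing every column.

    Assumes each row is a dict with string column names and hashable values.
    """
    seen = set()
    unique = []
    for row in rows:
        # Use all columns as the grouping key, like grouping without aggregation.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Note that this only removes rows that are identical in every column. If the join duplicated rows that differ in some joined column, they would all be kept.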

For your second part, I do not really understand the problem (maybe you can clarify?).
You can use multiple GroupBy nodes in the same loop, so you could split the data path, make your comparison, and later combine the data paths again with a switch end.

andrejz · May 20, 2020, 6:51am

Thank you for the reply.

I solved the problem with the join (I had made a mistake in the configuration).
Based on the experience with this join, I want to build two additional tables and compare them when the workflow ends, to see whether there are more errors like the one I made with the join (that is, to check that certain properties, such as the number of rows, stay the same).

I have made a very simple workflow to demonstrate what I mean (it checks the number of Customer_id values and the sum of Total while importing the data and again at the end of the workflow).
One solution is to put the GroupBy node after the end of the loop. Then I compare the tables “Compare table 1 - alternative” and “Compare table 2”.
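
The comparison described here (count of Customer_id and sum of Total, once at import time and once at the end) can be sketched in plain Python as follows. This is a rough illustration under assumptions, not the workflow itself: the grouping column `source_file` and the function names `summarize` and `find_mismatches` are hypothetical; only `Customer_id` and `Total` come from the post.

```python
from collections import defaultdict


def summarize(rows, group_key="source_file"):
    """Per group: (number of Customer_id entries, sum of Total).

    `source_file` is a hypothetical grouping column used for illustration.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        group = row[group_key]
        if row["Customer_id"] is not None:
            counts[group] += 1
        totals[group] += row["Total"]
    return {group: (counts[group], totals[group]) for group in totals}


def find_mismatches(before, after, tol=1e-9):
    """Return the groups whose count or sum differs between two summaries."""
    mismatches = {}
    for group in sorted(set(before) | set(after)):
        b = before.get(group)
        a = after.get(group)
        if b is None or a is None or b[0] != a[0] or abs(b[1] - a[1]) > tol:
            mismatches[group] = (b, a)
    return mismatches
```

An empty result from `find_mismatches` means both summaries agree; a join that duplicates rows would show up as a higher count and a higher sum for the affected group.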

But how can this be done with the GroupBy node inside the loop? Or, more generally, how can I produce a table inside a loop from partial data that is produced in every iteration (like the “GroupBy - Compare table 1” node)?

Thank you

Test.knar (87.6 KB)

AnotherFraudUser · May 20, 2020, 12:26pm

Hi andrejz,

I think I now understand your problem.
I think you are looking for the Loop End (2 ports) node.

With it you can create two tables inside the loop and output both of them at the end.
Attached is your workflow, slightly changed:
Test.knar (59.1 KB)
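
As a rough analogy for what a loop end with two ports collects, here is a minimal Python sketch: one output accumulates the processed rows, the other accumulates one summary row per iteration. The function name `run_loop`, the `source_file`, `row_count` and `total_sum` columns, and the in-memory `files` mapping (a stand-in for reading the Excel files) are assumptions for illustration; only `Total` comes from the example workflow.

```python
def run_loop(files):
    """Process each file and collect two tables across all iterations.

    `files` maps a file name to a list of row dicts (stand-in for Excel reads).
    Returns (all_rows, summaries).
    """
    all_rows = []    # first output: the concatenated data
    summaries = []   # second output: one summary row per iteration
    for name, rows in files.items():
        # Per-iteration processing; here we only tag each row with its source.
        processed = [{**row, "source_file": name} for row in rows]
        all_rows.extend(processed)
        summaries.append({
            "source_file": name,
            "row_count": len(processed),
            "total_sum": sum(row["Total"] for row in processed),
        })
    return all_rows, summaries
```

The second table can then be compared with the same grouping computed at the end of the workflow.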

However, if you need more than 2 tables, I would suggest running the loop multiple times (not the prettiest solution, but most likely the easiest):
-> see workflow1

Hope this somewhat helps

andrejz · May 20, 2020, 12:56pm

I have made a few small changes, but you solved my problem.

Thank you

Test_solution.knar (57.3 KB)

AnotherFraudUser · May 20, 2020, 1:15pm

Great!
Thanks for sharing the final solution

andrejz · May 21, 2020, 5:39am

For the KNIME staff (developers) … If more than two tables are needed, maybe the “Loop End (2 ports)” node could be upgraded so that input ports can be added (with a matching output port added automatically), like in the “Concatenate” node.

ipazin · May 22, 2020, 2:57pm

Hi there @andrejz,

glad you made it.

Regarding a Loop End with dynamic ports, this has been requested in another topic as well.

Will check it and get back to you.

Br,
Ivan

system · May 29, 2020, 2:57pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.