Row by row comparison in the loop (iteration)

Hello knimers, new user here!

I’m looking for a solution for my PhD research, where I need to do some data wrangling, i.e. transform my dataset into a format suitable for graph/network analysis or, to be more precise, make links from the existing nodes.

Anyway, I’ve built a workflow that updates the “EndOfExport” column: whenever “X” is found, today’s date is inserted. In the second step, I need to get exactly the output shown in the attached example, and that is where I need your help.

That means I need to do a row-by-row comparison in a loop until the end of the table. Each iteration should start from the next row and compare that row with every row below it, down to the end of the table (top to bottom).

The comparison is based on CountryID, and I’m comparing:
1.) StartOfExport of the CountryID in row X with StartOfExport of the CountryID in row X+1, where I need to take the larger (later) date;

2.) EndOfExport of the CountryID in row X with EndOfExport of the CountryID in row X+1, where I need to take the smaller (earlier) date.
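For readers who want to see the pairing rule outside of KNIME, here is a minimal Python sketch of the comparison described above. It is not the workflow shared in this thread; the function name `overlap_links`, the tuple layout, and the sample dates are all hypothetical. The thread does not say what should happen when the later start falls after the earlier end, so the sketch keeps every pair and leaves that filtering to the reader.

```python
from datetime import date
from itertools import combinations


def overlap_links(rows):
    """Pair every row with each row below it (top to bottom).

    rows: list of (country_id, start_of_export, end_of_export) tuples.
    Returns (id_upper, id_lower, later_start, earlier_end) tuples.
    """
    links = []
    # combinations() yields each row paired only with the rows after it,
    # which mirrors "start from the next row until the end of the table".
    for (id_a, start_a, end_a), (id_b, start_b, end_b) in combinations(rows, 2):
        later_start = max(start_a, start_b)   # take the larger start date
        earlier_end = min(end_a, end_b)       # take the smaller end date
        links.append((id_a, id_b, later_start, earlier_end))
    return links


# Hypothetical sample rows, not taken from the attached files.
sample = [
    (44, date(2020, 1, 1), date(2020, 12, 31)),
    (43, date(2020, 6, 1), date(2021, 3, 31)),
    (64, date(2020, 3, 1), date(2020, 9, 30)),
]
for link in overlap_links(sample):
    print(link)
```

Because each pair is generated only once, this approach never produces the same two countries in both orders.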

Please look at Test - input - example.xlsx and Test - output - example.xlsx to get the general idea of the problem.

I’m working on short notice, so any advice or workflow would be very much appreciated!

Test.knwf (12.4 KB)
Test - input.xlsx (11.2 KB)
&
Test - input - example.xlsx (9.9 KB)
Test - output - example.xlsx (8.9 KB)

Hello @vonschultz666,

nice cat in your profile pic.

Here is my solution. I haven’t used loops: do you have a lot of data? If not, loops are not necessary.

Let me know if it can work for you.

NB: I made a small edit in the first part of the workflow and used the today() function inside a Column Expressions node.

Have a nice evening,
RB

I updated my workflow and added an alternative with a loop, so you can choose whichever suits you best without having to ask for the alternative and wait for my reply.

RB

Hi @lelloba - that is almost it! Thanks!

One more thing, though: I need to filter out the redundant (“same, but with columns inverted”) combinations of CountryID and CountryID_right, meaning:

Now I have the following output:

  • CountryID;CountryID_right
    44;43
    64;32
    43;44
    32;64

and I need only the following:

  • CountryID;CountryID_right
    44;43
    64;32
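As a rough illustration of the filter being asked for here, the sketch below treats each pair as unordered and keeps only its first occurrence. It is plain Python rather than a KNIME node or the R script mentioned later in the thread, and the function name `drop_inverted_pairs` is hypothetical.

```python
def drop_inverted_pairs(pairs):
    """Keep the first occurrence of each unordered (left, right) pair."""
    seen = set()
    kept = []
    for left, right in pairs:
        # A frozenset ignores order, so (44, 43) and (43, 44) share one key.
        key = frozenset((left, right))
        if key not in seen:
            seen.add(key)
            kept.append((left, right))
    return kept


pairs = [(44, 43), (64, 32), (43, 44), (32, 64)]
print(drop_inverted_pairs(pairs))  # [(44, 43), (64, 32)]
```

The original column order of the first occurrence is preserved, which matches the desired output listed above.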

Furthermore, the solution with the loop is definitely the better option performance-wise!

Btw, @lelloba, is there any way to get the last day of the current month without using this: Previous Month Dates – KNIME Hub?

I.e., we have this:

if (column("EndOfExport") == "X") {
    replace(column("EndOfExport"), "X", today())
} else {
    column("EndOfExport")
}

but we would like to get the last day of the current month instead.
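To make the date arithmetic concrete, here is a small Python sketch of one way to compute the last day of a month with the standard library. It does not show how to do this inside a Column Expressions node; the helper names `last_day_of_month` and `fill_end_of_export` are hypothetical and only mirror the if/else logic quoted above.

```python
import calendar
from datetime import date


def last_day_of_month(day):
    """Return the last calendar day of the month that contains `day`."""
    # monthrange() returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=days_in_month)


def fill_end_of_export(value, today=None):
    """Replace the placeholder "X" with the last day of the current month."""
    today = today or date.today()
    return last_day_of_month(today) if value == "X" else value


print(last_day_of_month(date(2024, 2, 10)))          # 2024-02-29
print(fill_end_of_export("X", date(2021, 4, 15)))    # 2021-04-30
```

Passing `today` explicitly is only there to make the example reproducible; without it, the function falls back to the current date.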

Hi @vonschultz666 ,

here are the edits you asked for. Same link, but updated workflow.

  1. I have added an R script to remove all duplicates. Tell me if it’s OK. It takes a while, but it seems to work fine.

  2. I have reused the nodes from this link (Previous Month Dates – KNIME Hub) to compute the last date of the current month.

RB
