I have a set of columns that have numbers as their names, from 0 to n (as the actual amount of columns might vary). They each represent values for a specific period of time (so column 0 is the first period, column 1 is the second, and so on).
What I want to do is to calculate, for each row, the absolute % difference between periods, in other words: |(n/n-1)-1|, ending with a table that will have a total o n-1 columns (ideally with a name like “period n vs period (n-1)” or something similar).
If I had a fixed number of columns, this could be done easily with Math Formula, and if the comparison was always vs a same column, the solution with Loop would also be fairly simple, but as the “base of comparison” is changing, I’m not sure how to tackle this.
Thanks a mill,
Would you please provide a sample dataset?
Absolutely, here it goes! File has a sheet with the mock up dataset and another sheet with how the output should look like (I’ve kept the formulas for your reference). The column headers of the output are more of a nice-to-have rather than a must, anything to resemble that is already useful.
Thanks a mil!
mock up dataset.xlsx (21.2 KB)
here is a solution with RecursiveLoop Nodes.
It works with different counts of columns but needs min 2 columns.
I hope it helps.
DifferenceVersusPrevoiusColumn.knwf (43.2 KB)
Well, again we have already a right solution by @Andrew_Steel,
Just in case you prefer less nodes:
n_vs_n-1.knwf (48.8 KB)
P.S. Although my workflow seems shorter, it takes longer to execute. Andrew’s workflow runs about 3~4 times faster!
Thank you both for your inputs! @Andrew_Steel I actually didn’t know about Recursive Loops (so many Loops to know!), it’ll sure help me in other flows. However, when running the flow for my data, I got the following error:
I forgot to mention that I might have cells in my dataset that are 0 or blank, and I think this is impacting the flow. Any thoughts on how to correct this?
@armingrudd your solution works regardless of the missing values, so I’ll be following with that!
Thank you all for your help!
For these kind of challenges there is Column Expressions node!
2019_05_16_Formula vs Previous Column.knwf (30.8 KB)
You are the true genius.
Thanks for this. Learned a lot.
Just to be an arch pupil: The current number of loops misses the last column (12 vs 11 in this case) because the end loop condition is “numberOfColumns() - 1” so it does not process the last column.
Of course this is no big deal. I just wanted to make Ivan’s great idea flawless.
Here is the modified expression which gives the desired output:
for (i = 2; i < numberOfColumns(); i++) arr=arrayAdd(arr,abs(column(i)/column(i-1)-1))
Great one Ivan.
How about taking a look at this blog post which went online a few hours ago:
Thanks, it’ll sure come in handy!
PS 2: Guys, on a side note, any recommendations on reads about expressions (such as the one in Column Expressions)? Also, any resources on how to use variables in the “Flow Variables” tab of the nodes, especially for Column Filters? I mean, I do know how to use flow variables, but I’m looking for something like the “Node Description” but for the options within a Node’s “Flow Variables” tab. For instance, I always have trouble to apply a variable to “excluded_names”
Hi there Guil,
Regarding flow variables I don’t think there is some written documents. The best info you can find will probably be on this Forum
Maybe this blog post can help you a bit to know more about the flow variables:
But as @ipazin said, this forum is always the best option to find answers to your questions about KNIME and beyond.
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.