Solutions to "Just KNIME It!" Challenge 14

alinebessa · April 27, 2022, 12:07pm

This thread is for posting solutions to “Just KNIME It!” Challenge 14. Feel free to link your solution from KNIME Hub as well!

Here is the challenge of the week: Just KNIME It! | KNIME

Have an idea for a challenge? We’d love to hear it! Feel free to write it here.

HansS · April 27, 2022, 1:17pm

That was fun to do.

My solution uses a Moving Aggregation node to compute a cumulative sum from top to bottom and another from bottom to top. Every row where either cumulative sum is 0 can then be excluded from the file.
See: Challenge 14
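
For readers without the workflow at hand, the two-directional cumulative sum can be sketched in Python. This is only an illustration of the idea, not the workflow itself: the function name `trim_zero_noise` is made up here, and the sketch assumes the values are non-negative numbers, so a cumulative sum of 0 can only mean that nothing but zeros has been seen so far.

```python
from itertools import accumulate


def trim_zero_noise(values):
    """Drop leading and trailing zeros, keeping zeros in the middle.

    Hypothetical helper; assumes non-negative values.
    """
    # Running sum from top to bottom: 0 until the first non-zero value.
    forward = list(accumulate(values))
    # Running sum from bottom to top: 0 after the last non-zero value.
    backward = list(accumulate(reversed(values)))[::-1]
    # Keep a row only if both running sums are non-zero at that row.
    return [v for v, f, b in zip(values, forward, backward) if f != 0 and b != 0]


print(trim_zero_noise([0, 0, 3, 0, 5, 0]))
```

Zeros between the first and last non-zero value survive because both running sums are already positive there.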

gr. Hans

alinebessa · April 27, 2022, 5:09pm

Whoah! This is very surprising to me!

gonhaddock · April 27, 2022, 6:27pm

Hello KNIMErs,

Here is my solution to #justknimeit-14 :

KNIME Hub > gonhaddock > Spaces > Just_KNIME_It > Just KNIME It _ Challenge 014

My first thought was the Moving Aggregation, similar to @HansS’s solution. Then I came up with an alternative based on ‘Missing Value’ nodes. The two approaches are similar.

BR

alinebessa · April 27, 2022, 6:49pm

You guys are killing it with the creativity today!

lelloba · April 27, 2022, 7:59pm

Hello everyone,

I struggled a little bit with lagging and stuff, so I decided to solve it using R.
I used a Table Creator node to modify the input table a little bit, just to be sure that a long run of zeros in the middle of the input is not mistaken for noise.

justknimeit - 14 - Raffaello Barri

Have a nice evening,
RB

martinmunch · April 27, 2022, 8:47pm

here’s my solution

My solution centers around calculating a running sum in both directions and removing rows where it is zero (0). The default RowIDs cannot be sorted numerically (obviously), so I had to add a Counter Generation node to sort on.

cheers

rfeigel · April 28, 2022, 1:01am

Here’s my solution. I included a second input table for testing.

REF Challenge 14.knwf (27.4 KB)

MEPivnenko · April 28, 2022, 8:40am

We could reinvent the wheel, but there is already a function to strip a string. So my solution is as simple as group > strip > collection/list > ungroup.

Vexatious_Outlier · April 28, 2022, 11:47am

My first solution was to use a lag column and a Moving Aggregation (for the cumulative sum) to identify groups. Then I used a GroupBy node to find the max group number, which I filtered out if it was full of zeros. I decided that was all overkill since I was only interested in the last group, so I switched to using a Moving Aggregation (again for the cumulative sum) to identify leading zeros and flipped the data to do it on either end. Looks like my solution is almost identical to HansS’.

Link to workflow

Vexatious_Outlier · April 28, 2022, 12:27pm

Uploaded my first version for completeness since I mentioned it.

alinebessa · April 28, 2022, 2:10pm

Super cool! I also had not thought about that!!!

badger101 · April 29, 2022, 3:29am

Here’s my go at this. Not attempting to provide an efficient way, but merely an alternative solution; a unique approach to solve an integer-based issue using KNIME Text Processing partly.

Knime_Challenge_14_Answer_by_Badger101.knwf (116.9 KB)

ersy · April 29, 2022, 8:19am

Hi everyone,
Here is my solution.
Tried to solve it without variables.

Thyme · April 29, 2022, 10:11am

Lots of different implementations that I’d never be able to come up with myself. Good job!

I did several variations. They all use the Moving Aggregation node for some reason:

The 2 topmost branches do the same thing with different nodes:
Step1: append a cumulative sum column and push its maximum value to a Flow Variable
Step2: use a row filtering node to filter the data set
Those sections can be mixed and matched from both branches.
We can also bypass the need for the column sum by doing a reverse column aggregation instead. The Moving Aggregation node requires the row number in this configuration.
More a proof-of-capability than anything useful, the Math Formulas IF-function can decide which rows to keep/filter. Needs a Row Filter to execute that decision.
Finally, the Node Golf version:
Step1: Java Snippet to find first and last row holding data; push it to Flow Variables
Step2: Row Filter to filter by rowIndex (Flow Variable controlled)
This is probably also the fastest to execute.


gonhaddock · April 29, 2022, 10:35pm

Looking at all these imaginative solutions gave me some ideas, so I have upgraded my challenge workflow from two to four optional approaches. There is tons of fun in this challenge; we could keep coming up with approaches forever.

Following @Thyme’s example, my four solutions:

Moving Aggregation cumulative computation.
Missing Value downwards (previous value) and upwards (next value) infill.
Conditional Loop End, computing stripped starting and trailing noise lengths.
String concatenate and Regex code to strip starting and ending zero values.
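
The second idea in this list, filling missing values downwards and upwards, can be sketched in Python roughly as follows. This is an assumption about how the fill is used rather than a copy of the workflow: zeros are first treated as missing, and a row counts as noise when either fill direction finds nothing to copy. The name `trim_by_fill` is invented for the example.

```python
def trim_by_fill(values):
    """Drop leading/trailing zeros via forward and backward fill.

    Hypothetical helper illustrating the 'Missing Value' idea.
    """
    # Treat zeros as missing values.
    masked = [v if v != 0 else None for v in values]

    # Downwards fill (previous value): leading zeros stay missing.
    prev_fill, last = [], None
    for v in masked:
        if v is not None:
            last = v
        prev_fill.append(last)

    # Upwards fill (next value): trailing zeros stay missing.
    next_fill, nxt = [], None
    for v in reversed(masked):
        if v is not None:
            nxt = v
        next_fill.append(nxt)
    next_fill.reverse()

    # Keep rows that received a value from both directions.
    return [
        v
        for v, p, n in zip(values, prev_fill, next_fill)
        if p is not None and n is not None
    ]


print(trim_by_fill([0, 0, 3, 0, 5, 0]))
```

Unlike a cumulative sum, this variant does not depend on the sign of the values, since it only checks whether a non-zero neighbour exists on each side.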

The last one was inspired by @MEPivnenko’s concatenation (I didn’t look into the code, to avoid spoilers); I then dug deeper into ‘non-greedy’ regex with some help from the forum and @ipazin’s solution.
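
A rough Python sketch of the concatenate-and-regex idea is shown below. It is not the regex from the workflows mentioned here: the comma delimiter, the integer-only input, and the name `strip_zero_tokens` are all assumptions made for illustration. The patterns match whole `0` tokens rather than single characters, so a value such as 10 or 100 is not truncated.

```python
import re


def strip_zero_tokens(values):
    """Strip leading/trailing zero entries via string concatenation.

    Hypothetical helper; assumes integer values and ',' as delimiter.
    """
    text = ",".join(str(v) for v in values)
    # Remove leading "0" tokens (each followed by a comma or end of string).
    text = re.sub(r"^(?:0(?:,|$))+", "", text)
    # Remove trailing ",0" tokens.
    text = re.sub(r"(?:,0)+$", "", text)
    # An empty string means every entry was zero (or the input was empty).
    return [int(tok) for tok in text.split(",")] if text else []


print(strip_zero_tokens([0, 0, 3, 0, 10, 0]))
```

A plain character-level strip of "0" and "," would also eat the trailing digit of 10, which is why the sketch anchors on full tokens.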

BR

AnilKS · May 1, 2022, 11:08am

My try at Challenge 14. It seemed easy at first, but once I tried it, it didn’t feel that way…

rfeigel · May 1, 2022, 9:59pm

My first submission only removes the first and last zeros. This one works correctly. Thanks to Ivan Pazin for the regex I used.

REF Challenge 14 Rev 1.knwf (50.6 KB)

alinebessa · May 3, 2022, 1:56pm

As always on Tuesdays, here’s our solution to last week’s challenge.

We did not modify it after seeing your super cool solutions just to add to the diversity here!

See you tomorrow for a new challenge!

myxiao · May 5, 2022, 12:02pm

In this workflow, the data rows are first labeled as 0 or non-0. The labels are then collected into a list, and a Column Expressions node finds the first and last index of the non-0 entries. These two indices are passed on as the flow variables “start” and “end”.

The workflow then loops through the table row by row, checks whether the current iteration number lies within the range defined by “start” and “end”, and writes the result to a new column “include”. Rows with a “false” value in the “include” column are filtered out, and the “include” column is removed afterwards.
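
The same steps can be sketched in a few lines of Python. This is only a minimal illustration of the logic described above, not an export of the workflow; the function name `filter_by_index_range` and the dictionary layout of the rows are made up for the example, while `start`, `end` and `include` mirror the names used in the post.

```python
def filter_by_index_range(values):
    """Keep only rows between the first and last non-zero value.

    Hypothetical helper mirroring the start/end/include steps.
    """
    # Indices of all non-zero rows.
    nonzero = [i for i, v in enumerate(values) if v != 0]
    if not nonzero:
        return []  # nothing but zeros: everything counts as noise
    start, end = nonzero[0], nonzero[-1]

    # Flag each row, as the loop does with the "include" column.
    rows = [
        {"value": v, "include": start <= i <= end}
        for i, v in enumerate(values)
    ]

    # Filter on the flag and drop the helper column again.
    return [row["value"] for row in rows if row["include"]]


print(filter_by_index_range([0, 0, 3, 0, 5, 0]))
```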