Help with rule-based row-filter - according to excel formula

Hello together,

I am really new to KNIME and looking forward to get some help from you. I have the problem, that I have 5 mio rows in CSV and want analyse them. Until now, I alway splitted the data into 1 mio parts. After that I proceed 5 times all the analysis with 1 mio rows, did a PIVOT table and summed the 5 parts together. It took up to 5-6 hours to do that. KNIME was the suggested idea to do it. But I don’t know enough to work with it, unfortunately.

In Excel I used a specific formula to qualify the data and to sort out some rows. Therefore it would be great if you could help me with the right rules.

What kind of rules do I need …

  • if cell contains “XYZ”, then it is not included
  • if cell contains “ABC” on position 3-6 in the cell, then it is not included
  • if cell contains “TEST” , then it is included

Many thanks in advance. Kind regards.

@DennisMB welcome to the KNIME forum. I have prepared this example on the KNIME hub. You can write a column expression testing a single column for your conditions (0 = a negative condition is met, 1 = the positive condition is met, 2= something else, maybe also a 1?)

Then for the fun of it I have built a system where you could apply the rules to all columns (you want). If we have one negative in a set of columns it is a negative overall and we would need at least one positive mark in all the columns. Other combinations are of course also possible.

The Multi Column node would allow for a combination of several conditions to be applied (true= keep it, false = drop it if a negative condition).

kn_forum_63233_simple_filter.knwf (38.4 KB)

1 Like

Hello @DennisMB
I’m beginner too, thus I’ve prepared a sample worflow using Row Filter nodes only, not Rule-based Row Filter nodes.

I’ve used just patterns with wildcards and apropriate node configurations like this one:

Please find the workflow attached.
Happy KNIMEing :slight_smile:
Multicolumn_multifilter.knwf (33.3 KB)

1 Like

Many thanks to you both. I made some experience with the rule based row-filter and with column expressions. I am not far away from reaching my goal. What I need is an expression with AND. Could you please have a look on my expression and tell me where I did a mistake?

My expression was:
if (and(column(“Testing”)=1, column(“A_Validity”)=31/12/2999, column(“B_Validity”)=31/12/2999 )) {1}
else {0}

Error Message is: “Invalid left hand side for assignment”

Kind regards

In column expression you need to use && instead of AND. I believe they are working on getting AND implemented as well for a later release of KNIME.

if (column("Testing") == 1 && column("A_Validity") == "31/12/2999" && column("B_Validity") == "31/12/2999" ) {
    1
} else {
    0
}
2 Likes

Oh that worked very well. Thank you. I didn’t know that I have to use == and && instead of = and AND. Additional question: How would the same formula look like in a rule-bases row filter?

The row filter has AND :slight_smile:

$Testing$ = "1"  AND $A_Validity$ = "31/12/2999" AND $B_Validity$ = "31/12/2999"  => 1
TRUE => 0
3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.