Row Filter with Multiple Criteria Question

auplinger · November 29, 2012, 7:23pm

I would like to use multiple criteria (some with wildcards) to filter a large table (Table A) with multiple columns. My current strategy is to use a Row Filter Node for each filter criterion. With 10 filter criteria, this means 10 Row Filter Nodes.

I would prefer to store the filter criteria (column, filter value, etc.) in a file/database. The filter criteria would then be applied to each row of Table A as it is processed. I have tried the following implementation:

XLS Reader (contains filter criteria) -> TableRow to Variable Loop Start -> Row Filter (data input is from Table A, Variable Inport to define filter criteria) -> Loop End

Issues:

Table A data is iterated over multiple times. Is there a way to process it one time only?
The output concatenates all of the iterations. For example, if the input data table has 20 rows and there are 10 filter criteria, I believe the output will have 10*20-(#rows filtered) rows. The desired output would have 20-(#rows filtered) rows.
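
To make the two issues concrete outside of KNIME, here is a minimal Python sketch. The table contents, column names, and criteria are hypothetical stand-ins, and `fnmatchcase` from the standard library is only an assumed approximation of Row Filter wildcard matching. It contrasts a single pass that checks every criterion per row with the loop-and-concatenate behaviour described above.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for Table A: each row is a dict of column -> value.
table_a = [
    {"id": 1, "name": "Cluster_01", "status": "ok"},
    {"id": 2, "name": "test_run", "status": "ok"},
    {"id": 3, "name": "Sample_A", "status": "failed"},
    {"id": 4, "name": "Sample_B", "status": "ok"},
]

# Hypothetical criteria table (what the XLS Reader would supply):
# (column, wildcard pattern). A row is removed if ANY criterion matches.
criteria = [
    ("name", "test*"),
    ("status", "failed"),
]


def matches_any(row, criteria):
    """Return True if the row matches at least one (column, pattern) pair."""
    return any(fnmatchcase(str(row[col]), pat) for col, pat in criteria)


# Single pass: every row is visited once and tested against all criteria.
kept = [row for row in table_a if not matches_any(row, criteria)]

# Loop-and-concatenate: each criterion filters the full table on its own,
# and the per-iteration outputs are appended to each other.
concatenated = []
for col, pat in criteria:
    concatenated.extend(r for r in table_a if not fnmatchcase(str(r[col]), pat))

print([r["id"] for r in kept])  # [1, 4]
print(len(concatenated))        # 6
```

With 4 rows and 2 criteria that each remove one row, the concatenated output has 2*4-2 = 6 rows, while the single pass yields the desired 4-2 = 2 rows.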

Any help/recommendations would be appreciated. Thanks in advance.

Andrew

richards99 · November 29, 2012, 7:49pm

I think you may need the Delegating Loop Start and End nodes, so that the filtered output from one iteration is passed back to the start for the next iteration.

It may be complex though, as you will still need your TableRow to Variable Loop Start node as well.
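
The delegating idea, where each iteration filters the previous iteration's output rather than the original table, can be sketched in Python as follows. The values and patterns are hypothetical, and `fnmatchcase` is only an assumed approximation of wildcard matching in the Row Filter.

```python
from fnmatch import fnmatchcase

rows = ["alpha", "test_1", "bla_2", "beta", "testing"]  # hypothetical column values
patterns = ["test*", "bla*"]                            # hypothetical exclusion patterns

# Start from the full table; after each iteration the filtered result
# is handed back as the input of the next iteration.
current = rows
for pat in patterns:
    current = [value for value in current if not fnmatchcase(value, pat)]

print(current)  # ['alpha', 'beta']
```

The table is still visited once per criterion, but it shrinks on every pass and the final output contains each surviving row exactly once.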

An alternative but inefficient way is to have the Row Filter node output only the filtered-out rows, then take the concatenated loop output and use a Reference Row Filter against the initial table.
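
A minimal Python sketch of this alternative, assuming hypothetical row IDs, values, and patterns: the loop collects only the rows each criterion would remove, and a final step drops every row of the initial table whose ID appears in that collection.

```python
from fnmatch import fnmatchcase

# Hypothetical initial table: row ID -> value of the filtered column.
table = {
    "Row0": "alpha",
    "Row1": "test_1",
    "Row2": "bla_2",
    "Row3": "beta",
    "Row4": "test_bla",
}
patterns = ["test*", "*bla*"]  # hypothetical criteria

# Step 1: per criterion, keep only the rows that WOULD be removed,
# and concatenate them across iterations (duplicates are possible).
removed = []
for pat in patterns:
    removed.extend(rid for rid, value in table.items() if fnmatchcase(value, pat))

# Step 2: reference filter in "exclude" mode against the initial table.
removed_ids = set(removed)
result = {rid: value for rid, value in table.items() if rid not in removed_ids}

print(removed)  # ['Row1', 'Row4', 'Row2', 'Row4']
print(result)   # {'Row0': 'alpha', 'Row3': 'beta'}
```

Row4 matches both patterns and therefore appears twice in the concatenated list, which does no harm because the final step only checks membership.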

Simon.

auplinger · November 30, 2012, 2:24am

Hi Simon,

Thanks for the response. My attempt to use the Delegating Loop Start Node was not successful. Screen shots of the example workflows are shown in the attached figures:

knime_workflow1.png - The Delegating Loop Start Node is excluded from the workflow

knime_workflow2.png - Delegating Loop Start Node is included in the workflow, halting the workflow and creating the following console message:

WARN WorkflowManager Unable to merge flow object stacks: Conflicting FlowObjects: <Loop Context (Head 4:42, Tail unassigned)> - iteration 0 vs. <Loop Context (Head 4:45, Tail unassigned)> - iteration 0 (loops not properly nested?)

I searched the forum and I believe the problem is that the loops are not nested correctly. I am not clear how to properly nest the loops to avoid this problem. Any tips/suggestions would be appreciated.

Thanks,

Andrew

Iris · November 30, 2012, 2:50pm

Hi Andrew,

I am attaching a workflow which filters a string multiple times based on a configuration table.

Hope this helps, Iris

multirowfilter.zip

richards99 · November 30, 2012, 3:33pm

Very nicely done Iris!

Simon.

auplinger · November 30, 2012, 10:02pm

Hi Iris and Simon,

The solution Iris created works superbly and was very easy to adapt to my workflow. I really appreciate your rapid response to my inquiry.

Thank you,

Andrew

8mm · June 6, 2013, 7:50pm

Hello everyone,

I know this is an old thread, but I have a closely related question: how do I include only those rows that match one of multiple criteria? I'm interested in a solution similar to Iris's, but instead of filtering out the rows containing "test" or "bla", I'd like to include them. For some reason I cannot make it work. I guess I need a break.

Any help is much appreciated.

Best regards

aborg · June 6, 2013, 10:02pm

If you can find the rows to exclude, you can use that table in conjunction with the Reference Row Filter to exclude those rows.

Cheers, gabor

8mm · June 7, 2013, 12:47pm

Hey aborg,

thanks for your quick response. The next challenge would then be to determine the rows where column x contains the values "a" or "b" or "c".

My inquiry is just based upon the workflow Iris posted in this thread. It works fine for me too, but it filters out all rows containing "a" or "b" or "c". What I'd like to have is the opposite: all rows containing either "a" or "b" or "c".
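
One possible reason the inversion is less simple than it looks, sketched in Python with hypothetical values: chaining exclusion filters one after another removes rows matching "a" or "b" or "c", but chaining inclusion filters the same way keeps only rows matching "a" and "b" and "c". Whether this is what happens in the workflow in question is an assumption.

```python
values = ["xa", "yb", "zc", "abc", "xyz"]  # hypothetical contents of column x
needles = ["a", "b", "c"]

# Desired result: keep a row if it contains ANY of the needles (logical OR).
include_any = [v for v in values if any(n in v for n in needles)]

# Chained inclusion: each pass filters the previous pass's output,
# so a row survives only if it contains ALL needles (logical AND).
chained = values
for n in needles:
    chained = [v for v in chained if n in v]

# Chained exclusion: a row survives only if it contains NONE of the needles.
excluded = values
for n in needles:
    excluded = [v for v in excluded if n not in v]

print(include_any)  # ['xa', 'yb', 'zc', 'abc']
print(chained)      # ['abc']
print(excluded)     # ['xyz']
```

The OR result is the complement of the chained exclusion result, which is why removing the excluded rows from the original table gives the rows to include.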

It must be pretty simple, but I couldn't make it work. Shame on me.

Best regards

aborg · June 7, 2013, 9:55pm

I may have misunderstood something, but would a workflow like the attached be suitable for you?

Cheers, gabor

filter.zip

8mm · June 8, 2013, 1:26pm

Hey aborg,

now I understand. Yes, that works fine, but it does not allow the use of wildcards, does it? I'd like to filter for " Cluster* ", which should return the entire table, but it creates an empty table, since the asterisk is not interpreted as a wildcard.
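
The difference between the two matching modes can be illustrated with a short Python sketch. The column values are hypothetical, and `fnmatchcase` only approximates wildcard handling; the point is that an exact-match lookup treats the asterisk as a literal character.

```python
from fnmatch import fnmatchcase

column = ["Cluster_1", "Cluster_2", "Cluster_10"]  # hypothetical column values
reference = {"Cluster*"}                           # reference table with one entry

# Exact matching: "Cluster*" is compared literally, so nothing matches.
exact = [v for v in column if v in reference]

# Wildcard matching: "*" stands for any run of characters.
wild = [v for v in column if fnmatchcase(v, "Cluster*")]

print(exact)  # []
print(wild)   # ['Cluster_1', 'Cluster_2', 'Cluster_10']
```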

Best regards

aborg · June 8, 2013, 2:59pm

That is true. In that case you should create the input table (the one where I used Table Creator) with a workflow similar to the one provided by Iris (using regular row filters). It might be worth using a GroupBy node on it to keep the table small by removing the duplicates. That way you will have all possible matching values you want to (reference) filter for.
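
A rough Python sketch of that three-step idea, with hypothetical values and patterns and with `fnmatchcase` standing in for the wildcard-capable row filters: collect the matching values, remove duplicates, then use the result as an exact-match reference.

```python
from fnmatch import fnmatchcase

column_x = ["Cluster_1", "Other", "Cluster_2", "Cluster_1", "test"]  # hypothetical data
patterns = ["Cluster*", "te?t"]                                      # hypothetical wildcards

# Step 1: wildcard filtering collects every matching value (duplicates included).
matched = [v for v in column_x for p in patterns if fnmatchcase(v, p)]

# Step 2: remove duplicates, comparable to grouping on the column.
reference = sorted(set(matched))

# Step 3: exact-match reference filter in "include" mode on the original column.
result = [v for v in column_x if v in reference]

print(reference)  # ['Cluster_1', 'Cluster_2', 'test']
print(result)     # ['Cluster_1', 'Cluster_2', 'Cluster_1', 'test']
```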

8mm · June 9, 2013, 1:01pm

Hey aborg,

yes, that'll work hopefully. Thanks for your help.

Best

unknown_user · July 4, 2013, 1:48pm

Thanks Iris!

I've been struggling to implement a recursive workflow with unnested (cross-nested?) loops using the Delegating Loop nodes, and your workflow was exactly what I needed.

A very clever solution!

(the other) Simon

Selster · September 20, 2020, 4:40am

I downloaded your multiple criteria workflow, but I suspect that because it has a .knime extension rather than .knwf, KNIME won't import it. Is there a way to convert it so that I can see it?
Thank you,
Steve

Iris · September 20, 2020, 8:24am

Hi @Selster

The workflow can still be imported by using File -> Import Workflow. Then you can just select the zip file.

That said, I made the workflow 7 years ago, and there are quite a few new features in KNIME which will make this easier. So if you want to start a new thread and explain what you want to achieve, we are happy to make you a more recent example.