Executing a row based filter based on changing column names

tomsmilovich · July 31, 2023, 7:07pm

I have a data set and workflow set up in a loop. The loop runs through two Excel sheets that do not have the same row count and column names. I want to filter out specific rows in each data set based on the column name for each iteration. I can’t seem to do that in the workflow because each result only shows me one iteration at a time. Is there a way to identify the column to perform the operation on by column position? Is there a way to designate the operation based on the two possible column names as options?

takbb · August 1, 2023, 7:58am

Hi @tomsmilovich , and welcome to the KNIME community.

I’m finding it a little difficult to visualise what you are doing. In particular, where does the column name come from in each iteration, and what type of loop are you using.

Perhaps if you could upload some small examples of the Excel sheets (with anything sensitive modified to dummy data) and an idea of the output you are trying to achieve, we can better assist. thanks.

tomsmilovich · August 1, 2023, 3:04pm

Thank you @takbb. I’ve included a sample of the output result within the Knime flow to help illustrate. I am trying to filter out the rows with the percentage sign. The columns in which such rows exist are highlighted in yellow. These two fields are coming from my source excel sheets. Because I am using a loop to read across my source excel sheets, the column names highlighted in yellow change since they are different in the source excel sheets. However, they are in the same position in each sheet. I am using a table row to variable loop start to have knime perform a set of manipulations across two excel sheets.
Annotation 2023-08-01 110429.PNG

Daniel_Weikert · August 1, 2023, 4:18pm

you could aggregate columns using column aggregator node and then search for the % symbol in the new column and if it’s there flag it and filter it out.
br

takbb · August 1, 2023, 6:10pm

Hi @tomsmilovich ,

If you really do want to find the column using its position, there are a few options available to you.

Using just non-scripting nodes you could for example use Extract Table Spec, and then a row filter on the Column Index with the required number, followed by a row to table variable to collect the Column Name would achieve it.

In the scripting nodes:

the Column Expressions node, the column() function can be passed an index from 0 upwards to return the column name, or can be passed an offset position from a given column name.

the Java Snippet can also return a column name, using the this.getColumnName( columnindex) method. For the fun of it, I wrapped the java call in a component

tomsmilovich · August 1, 2023, 7:09pm

Thank you for your help. I am unable to see the example you provided in my Knime workbench. How can I import the node?

takbb · August 1, 2023, 8:36pm

Hi @tomsmilovich , this is a “home-grown” component rather than a node so won’t appear directly in the list of KNIME nodes.

Click on the above link to open the component’s page on the KNIME Community Hub. If you are using the KNIME 5.1 Modern UI, you will need to have the KNIME hub open in a browser with you workflow also visible on your pc desktop, then drag and drop the icon on the hub page onto your KNIME workflow

If you are using the classic UI, you can open the KNIME hub panel alongside your workflow, search for the above component and then perform the same drag and drop of the icon from the webpage.

This component will just do the column name look up, but obviously doesn’t fulfil the remainder of your task. I think though that if your drop it into your workflow ahead of the row filter you should be able to then simply include the flow variable name to filter your column.

system · October 30, 2023, 8:36pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.