Rule base column filter

hello i am in a bit of a pickle i have a job where i did the entire Knime to then learn that each month because of the system we use the columns change position that causes the Knime to break, i can not use any rows as the header because the header in the middle of the document is in different rows so the only form im thinking on how to do it would be to just separate the document with another knime where it will only keep the columns i require

so as a example i will have a column where i have in one of its cells the string “monthly provisions:” keep that column if not remove it

is there any rule base column filters that can do that or even a code in the rule engine that would be able to do it or is that not possible?

Are the header names the same every time? If so, then perhaps you could identify the row number that contains the headers by combining all of the columns into a new column containing a unique list of the values in each row, and then filtering to the header column using a formula that looks for the header names. Start with Create Collection Column to aggregate all of the columns into a list.

If you want to do the test on the side, then I would start with the Counter Generation node, that way you could just reference back the row number of the headers to the main table once you have identified it so you can use it with the Row to Column Names.

Hi @Bleck , some visual sample would help understand a bit more the use case.

There are a few ways to do this, as long as there are some patterns that can be followed to identify/determine which columns you want to filter.

Quick possibilities: You can for example filter using regular expressions within the Column Filter node itself. You also have the possibility of extracting the column headers, and work with the headers: apply different rules or manipulation until you get to the results of what headers you want to keep, and pass the list of the headers as a variable to the Column Filter.


i made a replica of the document, its normally bigger but this explains the basics
so i need to work only on the columns from N to V unfortunately the columns of the document change every month some times is one to the left or one to the right so i cant use the header with the leters to work ok i tryed the header (vacation - charges - INSS RAT etc) unfortunately they are not in the same line that does not change that frequently and normally are on rows 4 and 5
now im trying to see if there is a way for me to separate the columns in front so i only have the Index on column N and columns P Q R S T U V
maybe some kind of column filter that keeps a column that has the word vacation in it sure that would already be perfect

thank you for your patience

Can you upload an excel doc or KNIME workflow that has the same table, but with dummy data that can be shared? I think my suggestion above solves this easily, but it is quicker and easier to show you rather than explain.

Also, are you saying that the Column names can be split between rows 4 & 5? Do those split column names always fall in row 4 & 5, or are you saying that they can occasionally be shifted up and down (such as rows 5 & 6)?

If they consistently hit the same row numbers, then I recommend adjusting the Reader node to make row 4 the 1st row read into KNIME. Then you can combine the 1st 2 rows using a Multi-Row formula, and promote it to the column header.

Multi Row Header.knwf (93.5 KB)

Here is an example I threw together of one way you might target an unknown header row.

test.xls (9.5 KB)
here is a good example

Hi @Bleck

See this wf filter_relevant_columns.knwf (83.7 KB) step one is to identify the column that contains “Monthly provision” (which can be any column in your input sheet). With the reference to that column name (as as flow variable) you can filter (step 2) all columns from that column onward. Step 3 is to create the column header. Does this help?


The final table:

gr. Hans

4 Likes

this actually goes above and beyond what i wanted now i just need to see the best way to implement on to my system but you are a life saver thank you very much

1 Like

Hi @Bleck Thank you. Nice to read that you like the solution. Happy KNIMEing

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.