How to conditionally remove all values from row, leaving the empty row with original RowID?

I need to filter row content based on a condition or a flag, but I need to leave an empty row with its original RowID.

Let’s say my table looks like this:

RowID   Val1   Val2   Val3   Val4   Valid
0       2      5      2      7      TRUE
1       5      4      12     2      FALSE
2       7      2      9      5      TRUE
3       1      11     4      9      TRUE
4       0      7      2      5      FALSE
5       3      5      7      9      TRUE

and I’d like to end up with such table:

RowID   Val1   Val2   Val3   Val4   Valid
0       2      5      2      7      TRUE
1                                   FALSE
2       7      2      9      5      TRUE
3       1      11     4      9      TRUE
4                                   FALSE
5       3      5      7      9      TRUE

The condition (VALID) column may or may not be preserved, it doesn’t matter.

1 Like

@torczyk in short: split rows by the true false column. Filter all columns from the false, Concat both again, all columns of the false should be missing now. Sort by row ID.

3 Likes

Hi @torczyk , @mlauber71 , here’s another way to do this without having to split and concatenate, and consequently no need to sort:
image

Here’s the workflow: Remove all values from columns keeping IDs.knwf (10.8 KB)

2 Likes

Hi @torczyk , @mlauber71 & @bruno29a

Complementary to @mlauber71 & @bruno29a solutions, here is another possible solution based only on a -Multi Manipulation (Multi Column) - node:

The string function used is the following:

substr( $$CURRENTCOLUMN$$, 0,
            toInt( indexOf( $Condition$, "TRUE") + 1) * 
            length( $$CURRENTCOLUMN$$))

@torczyk, you didn’t say if columns where of numerical type but since your expected solution shows blanks (instead of missing values), I’m guessing here that they are of type string. Otherwise, you could use the second solution of the workflow I uploaded which is suitable for numerical type columns and is using the following string function instead:

substr( string($$CURRENTCOLUMN$$), 0,
        toInt( indexOf( $Condition$, "TRUE") + 1) * 
        length( string($$CURRENTCOLUMN$$)))

20220220 Pikairos How to conditionally remove all values from row, leaving the empty row with original RowID.knwf (50.8 KB)

The trick implemented here is to use an indexOf() function as an -If-then-else- function (which by the way does not exist in the -String Manipulation- node). In other words,

if the sentence in column "Condition" is TRUE, then
    the string in the other columns are kept as they are,
otherwise,
    they are replaced by a 'zero length string', i.e. an empty string.

Hope it helps too.

Best,

Ael

3 Likes

Thank you for your suggestion. As far as I know, all data types in KNIME are nullable. My table has column of different types, including Integer, Double, Long, and String.

My pleasure. In this case, the second branch in the workflow I posted should be generic enough to do the job.

Good one @aworker .

@torczyk In my case, I assumed that all of the columns, except for the column Valid, are of type integer, hence why I used the Math Formula, and when I removed the values as per the rule, I made them become NULL/Missing since you can’t have “blank” (which is an empty string) in integer columns.

1 Like

Good one too @bruno29a

Definitely, @torczyk question leaves freedom on how to interpret the problem.

@bruno29a now as you know we say in french, @torczyk a l’embarras du choix (be spoiled by choice) :wink:

Best wishes to you all

Ael

@torczyk for what it is worth here is my suggestion :slight_smile:

3 Likes

Hello @torczyk
Here’s mine:

image

‘Condition’ from LEFT preserve FALSE; in the other hand, from RIGHT don’t preserve FALSE and keep Column sorting

BR

2 Likes

The last 2 answers are solving my problem, with one small “but”… they require adding “sortable” RowIDs (if we need to maintain row order), as default IDs are non-sortable.

Sometimes, supposedly simple things may become truly hard to be done in KNIME (or any other “graphical programming” tool).

2 Likes

@torczyk added two options to generate such IDs:

And indeed sortable RowIDs are a thing with KNIME. Maybe they allow IDs to have a long format when Integer will not do it. Or you might have to construct something elaborate with a string and taking the number of Rows into account …

2 Likes

I’m guessing by “default IDs”, you mean Knime’s RowID? I was not sure what to make of your RowID column when I saw your sample data. Based on the data and columns you provided, my assumption was that your RowID column was a column from your data, since it contained 0, 1, 2, etc… as opposed to Knime’s Row0, Row1, Row2, etc…, and I also assumed that it was of type integer, which is sortable. Knime’s RowID is indeed not sortable as you would want it to be (it’s sortable string-wise).

In the future, it might be better to provide a bit more complete details.

And in anticipation of this, whether you meant your own RowID or Knime’s RowID, that is why I provided a solution where you do not have to sort as it’s not moving any rows.

2 Likes

Unfortunately, I wasn’t authorized to supply a sample of original data. Finally, I went the path with a Rule-Based Row Splitter and a Column Filter.

Aside of that, I truly don’t like the way Joiner generates “new” row IDs, especially when it is joining on RowID.

Yes, that is often the case. But you can always provide dummy data, which you did, but the dummy data should still be in the original structure. That way, it gives an idea what kind of data we are dealing with.

And if you are reading from a csv for example, it’s best to provide the data as a csv. Likewise, if you are reading from an Excel, it’s best you provide the dummy data as an Excel file. Or if the data is the results of a Knime node, you can create the sample data in a Knime table to show the table structure and column types.

As I always say, please help us help you :slight_smile: :wink:

1 Like

Often with questions in this forum I see what @bruno29a also is mentioning that the key to success is to break down one’s problem into a reproducible abstract (though complete) one that can be shared. Which is some work and effort but also more helpful in the end and might be more robust.

I would not go as far as to say that is happening in your case :slight_smile: but more often than not I see businesses not caring about their data and processes and then hoping for some magic from tools like KNIME - which luckily they sometimes can provide.

Yes that can be challenging. If you need to preserve a special rowid there are ways but they are not that nice. Though it does make sense to think about what a Rowid represents for a (business) case.

I set up an example about rowid and joins recently:

2 Likes

Jus for the records,

Cleaning up the RowIDs :

image

Sometimes the solutions are in the box:
20220222_.knwf (29.9 KB)

BR

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.