String Manipulation

Halo,
Here is the form of my data


And I want to remove row0 - row3 and row 27. And still want to keep row4 to row26.
But do you know how to solve this?
I used the Row Filter Node, but it didn’t solve my problem.

Warm Regard,
Veni

Hello,

Please use the Rule-based Row Filter node and configure it with the expression:

" $$ROWID$$ MATCHES “Row0” OR $$ROWID$$ MATCHES “Row1” OR $$ROWID$$ MATCHES “Row2” OR $$ROWID$$ MATCHES “Row3” OR $$ROWID$$ MATCHES “Row27” =>TRUE "

and select redio button of Exclude TRUE matches . I attached an image for reference.

Output

3 Likes

Hi @veniapputrii ,

It would be better if you gave a title to your post that was somehow related to the actual problem. Your question appears neither related to “String Manipulation” and nor is there any evidence of there being a “bug”.

I have assumed you want to KEEP Row4. Your post had a contradiction as you included Row4 in both the rows to remove, and the rows to keep.

I am also going to make an assumption here that your actual question is how to remove the first X rows and the last Y rows, because what you have is a table with header and trailer rows that aren’t part of the actual data you are wanting to process. So…

Firstly, which version of KNIME are you using?

If you are using KNIME 5.1 onwards, then if the aim is the remove the first X rows and the last Y rows, your hands-down best node for this purpose is the Table Cropper. Specify the row number of the first row you want to keep, and counting back from the end (last row=1) tell it where you want to stop. In this case you want to keep rows starting at 5 and finishing at the second row from the bottom (2 counting backwards)

If you aren’t using KNIME 5.1 or later…

You could elect to KEEP rows based on the row number rather than RowId


If my assumption is incorrect, and you are just wanting to remove specific rows manually, then…

For your specific case you could use Row Filter as follows, with regular expressions:
Row[0-3]|Row27

Row[0-3]|Row27

The [0-3] part will work because they are covering single digit RowIDs.

You could also write this out as:
Row0|Row1|Row2|Row3|Row27

If they’d been multiple-digit ranges this would have been a lot more difficult to do with regular expressions.

For other continuous ranges, you could use Exclude by Row Number, eg with two row filters , One to exclude Row 28 (row numbers start at 1) and a second to exclude Row 1 to Row 4.

Note I did them in reverse order so I didn’t need to find out the new row number for Row27 after I’d deleted Row0-Row3.

But, all that being said, if you have a number of rows to manually delete Rule Based Row Filter, as @tqAkshay95 has said, is another option

Similar to @tqAkshay95 's post, you could also list your rules as:
$$ROWID$$ = “Row0” => TRUE
$$ROWID$$ = “Row1” => TRUE
$$ROWID$$ = “Row2” => TRUE
$$ROWID$$ = “Row3” => TRUE
$$ROWID$$ = “Row27” =>TRUE

You could also write this, again for the specific data (assuming you are removing first 4 rows and final row) as:

$$ROWINDEX$$ >= 5 AND $$ROWINDEX$$ <=27 => TRUE

Note that technically “=” is should be used in this case in preference to MATCHES since MATCHES is for Regular Expression matching, and whilst that isn’t an issue in this case, it could lead to unintended consequences if you happen to have any regex-command characters in the RowIDs such as “.”. For wildcard matching would use LIKE.

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.