Special formula needed for filtering

Hello guys,

Today I have a question that is probably very simple for most of you. I have a column that contains strings like these:

123ertr215
1254
ssdd4876
drbnn
254dfg

So the thing is, I need to filter out the rows that contain letters and keep only the rows that contain nothing but numbers. I was trying this formula, but it is not working: $Reference Document$ LIKE "[^a-z][A-Z]" => TRUE. Can anyone assist, please?

What about

Hi @GQRanalytics,
Check this example I’ve made for you. It shows three approaches, and there may be others too.

Feel free to download this and play with it.

Hope it helps,

Cheers.

Hi @eamendola I saw the String Manipulation Node in your suggestion above. I’m in the process of learning about Regex. Can I ask a question on how one can use the boolean flag command as you described? I tried it here:

Can you tell me why it’s giving me errors? Thanks!

Sure thing! I think it is just the syntax of your command.

The toBoolean function expects an expression inside that evaluates to either True or False, then it returns the column converted to a Boolean type. For example, in this case you can use it like this (in my example my column is called data):

The regexMatcher function tells you whether a value in your column matches the regex inside the double quotes. See the output? It’s a B type, which stands for Boolean. In my example I used regexMatcher without the toBoolean function, so my output was a plain string containing True or False; it wasn’t converted to the Boolean type:

Let me know if this clarifies your doubt.

PS: Check your regex expression, it should be .*[a-zA-Z].* with a dot before each wildcard (*)
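
To show why the leading and trailing .* matter, here is a rough Python sketch (not KNIME code). It assumes the matcher has to match the entire cell value, which is why a bare [a-zA-Z] would only match a one-letter string; re.fullmatch stands in for that assumed behaviour, and the contains_letter helper is just a name I made up for the illustration.

```python
import re

# Sample values from the original question.
values = ["123ertr215", "1254", "ssdd4876", "drbnn", "254dfg"]


def contains_letter(value: str) -> bool:
    # fullmatch requires the whole string to match the pattern, so the
    # leading and trailing ".*" absorb everything around the letter.
    return re.fullmatch(r".*[a-zA-Z].*", value) is not None


# One True/False flag per value, similar in spirit to a boolean flag column.
flags = {v: contains_letter(v) for v in values}

# Keeping only the rows whose flag is False leaves the values with no letters.
no_letters = [v for v in values if not contains_letter(v)]

print(flags)
print(no_letters)
```

Without the .* on both sides, the same whole-string match would fail for a value like 254dfg even though it clearly contains letters.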

Hi @GQRanalytics , you can do this in just 1 node, either the Rule-based Row Filter or the Row Filter.

Here’s an example with the Row Filter:
image

Input (same as your sample data):
image

Results:
image

And here’s how the Row Filter is set up (I highlighted the configurations that need to be set):


Expression used: ^[0-9]*$
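
If you want to sanity-check the pattern outside KNIME, here is a small Python sketch. It assumes Python's re module treats this simple pattern the same way the Row Filter does, and the variable names are only for illustration.

```python
import re

# Same sample data as in the question.
rows = ["123ertr215", "1254", "ssdd4876", "drbnn", "254dfg"]

# ^ and $ anchor the pattern to the start and end of the value,
# so only values made up entirely of digits can match.
DIGITS_ONLY = re.compile(r"^[0-9]*$")

kept = [r for r in rows if DIGITS_ONLY.match(r)]
print(kept)

# Note: "*" means "zero or more", so an empty string also matches.
# Using "+" instead would require at least one digit.
DIGITS_REQUIRED = re.compile(r"^[0-9]+$")
```

Whether empty cells should pass the filter depends on your data; if they should not, the + variant is probably the safer choice.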

Here’s the workflow: Filtering rows with regex.knwf (6.4 KB)

EDIT: Initially I actually used the same approach as @eamendola, which is to check for [a-zA-Z] and exclude the matches, but this does not catch other/special characters, such as “-” or “_”, etc.
For example, if you have 12_12, it would not be filtered out. I don’t know if it should be or not: 12_12 is not numbers only, but it does not contain letters either. So I changed the approach to check for numbers only instead.

If you really only want to keep the rows that contain no letters, then you can use this Row Filter instead (that’s what I had initially):

I added both options so you can choose whichever suits your needs:
image

I added this line for the input data:
image

Results for Numbers only:
image

Results for No letters:
image

Here’s the updated workflow: Filtering rows with regex.knwf (8.0 KB)
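
To make the difference between the two options concrete, here is a quick Python sketch (not part of the workflow). It assumes the extra input line is 12_12, as in the example above, and the list names are just for illustration.

```python
import re

# Sample data from the question plus the extra value with an underscore.
rows = ["123ertr215", "1254", "ssdd4876", "drbnn", "254dfg", "12_12"]

# "Numbers only": the whole value must consist of digits.
numbers_only = [r for r in rows if re.fullmatch(r"[0-9]*", r)]

# "No letters": exclude any value that contains at least one letter,
# which still lets through other characters such as "_" or "-".
no_letters = [r for r in rows if not re.search(r"[a-zA-Z]", r)]

print(numbers_only)
print(no_letters)
```

The two lists only differ for values like 12_12, so which option is right depends on what should happen to such values.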

@eamendola, your 3 options will not all give the same results. Your first workflow (String to Number) will give the same results as my “Numbers only” version, while your other 2 will give the same results as my “No letters” version.

Hi @badger101, there are a few things that are wrong in your expression.

First of all, you need to pass an expression to the function toBoolean(), and it has to be an expression that returns a value. What you are passing does not return anything.

Secondly, what you are passing is invalid. Since it’s not enclosed in quotes, KNIME will try to evaluate it as a math operator, and that is an invalid operation.

The other thing is, your expression is:
$column$ toBoolean(<something>)

I would not know how to explain this expression, and KNIME does not know how to interpret it either. If you did this instead, it would be valid (but obviously it might not be what you want to do):
toBoolean($column$)

If you showed us your error message, which you should do for future cases, we could show you why you are getting the error, as the message would give you a hint about what the problem is.

Thank you both! I played around with this regexMatcher command and I don’t have errors anymore. I will be attempting the Row Filter’s pattern matching next, just to experiment.

Thank you all, lots of answers, one of them helped me with the process!!
