Row Filter - Issues Filtering Rows with Special Character

Hi All,

Thank you all in advance for your help. This one has been bugging me for a while now and I can't figure it out.

Simply put, I need to filter out rows that have a parenthesis anywhere in a specific column.

It's a data table with a few hundred thousand rows and there's a lot of noise in the data.  There's a specific column that serves as the row identifier (let's call this column B). 

Column B has a long string, containing letters, numbers, and parentheses.  However, any row that contains any parenthesis in Column B is invalid and I want to remove it from the data set. 

 

My issue is that any rule I try to write treats the parenthesis as part of the syntax, not as the character I want to filter for.

I've used the Row Filter, Rule-Based Row Filter, Rule Engine and I can't figure it out. 

I'm at a loss.  Any suggestions?  I am not a coder at all so I have no base knowledge of code.  I just try and learn the syntax of the specific functions as I go along. 

Thanks!

Camilo

You'll need to do some testing, but I configured a Row Splitter to use pattern matching, checked the regular expression option, and excluded rows based on the following regex: .*\(.*|.*\).*

So I think this excludes a row if a ( or ) is found anywhere in the string. The backslash is an escape character, indicating that the parentheses are not part of the regex notation but the characters to match.
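
If it helps to see the pattern run outside KNIME, here is a rough Python sketch of the same idea. The sample values and the has_parenthesis helper are made up for illustration, and I'm assuming the node matches the pattern against the whole cell value, which is why the regex is wrapped in .* on both sides.

```python
import re

# Same pattern as above; the raw string (r"...") keeps the backslashes intact.
PATTERN = re.compile(r".*\(.*|.*\).*")

def has_parenthesis(value: str) -> bool:
    """Hypothetical helper: True if the whole value matches the pattern."""
    # fullmatch requires the entire string to match, not just a piece of it.
    return PATTERN.fullmatch(value) is not None

# Made-up sample values standing in for Column B.
rows = ["ABC123", "ABC(123)", "X)Y", "plain text"]

# Keep only the rows without any parenthesis.
kept = [r for r in rows if not has_parenthesis(r)]
print(kept)
```

Only the values without a ( or ) survive the filter, which is the behaviour the exclude setting is meant to give.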

 


Swebb, THANK YOU!!!

I'm not entirely clear how it worked, but it did.  To be specific, here's the argument I used (excluding true matches):

$Column B$ MATCHES ".*\(.*|.*\).*" => TRUE

Now, is there somewhere that clearly explains the syntax for asterisks, periods, backslashes, parentheses, etc.? I found that the node descriptions were too vague and lacking in examples.

Thanks again,

Camilo

 

 

http://en.wikipedia.org/wiki/Regular_expression 

http://www.regexr.com/

I think you can read the expression I gave as:

 

any character (the dot), repeated any number of times (the asterisk), then a ( character (the backslash tells the regex to treat it as a literal character), then any number of any characters after

OR

any character (the dot), repeated any number of times (the asterisk), then a ) character (the backslash tells the regex to treat it as a literal character), then any number of any characters after
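
To check each piece of that reading, here is a small Python sketch. It is only an illustration under the assumption that Python's regex rules match KNIME's for these basic tokens; the sample strings and the shorter character-class pattern are my own additions, not something from the node documentation.

```python
import re

# "." is any single character; "*" repeats the previous item zero or more times.
dot_star_matches_text = re.fullmatch(r".*", "anything at all") is not None
dot_star_matches_empty = re.fullmatch(r".*", "") is not None  # zero repeats is allowed

# An unescaped "(" opens a group, so on its own it is a syntax error.
try:
    re.compile("(")
    unescaped_compiles = True
except re.error:
    unescaped_compiles = False

# With the backslash, it is just the literal character.
escaped_matches = re.fullmatch(r"\(", "(") is not None

# A character class is a shorter way to write "( or )";
# parentheses need no escaping inside [ ].
original = re.compile(r".*\(.*|.*\).*")
shorter = re.compile(r".*[()].*")

samples = ["ABC123", "ABC(123)", "X)Y", "", "((", "no brackets here"]
patterns_agree = all(
    (original.fullmatch(s) is not None) == (shorter.fullmatch(s) is not None)
    for s in samples
)
```

On these samples the two patterns agree, so .*[()].* should be usable as a more compact alternative if you find it easier to read.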

 

Regex is what you want to read up on.

Cheers

Sam
