Problems with newlines using LIKE in Rule engine

biancio · January 29, 2015, 11:06am

Hi Knimers,

The attached workflow shows what seems to me a strange behaviour of the LIKE operator in the Rule Engine (in this case, the Rule-based Row Filter):

Let's read an XLS file with a single column, "Col0", containing strings, some of which include newlines.

If I filter with: $Col0$ LIKE "An example*" => TRUE

Only the rows without a newline after the "An example" substring evaluate to TRUE.

I need to apply removeChars($Col0$, "\n") in a String Manipulation node before filtering so that all rows starting with "An example" pass the filter; see the attached workflow and XLS file.

Is this the expected behaviour of the LIKE operator?

If so, is there a wildcard that matches everything, including newlines?

Best,

Marc

Marlin · January 29, 2015, 12:14pm

Hi Marc,

I don't know if this behaviour is "correct", but LIKE is deliberately simpler than the much more powerful MATCHES, which offers the full power of regular expressions. Using MATCHES with "An example.*" should work right out of the box, but if not, you can adjust a lot of things on the fly with flags.

Hope that helps.

biancio · January 29, 2015, 2:29pm

Hi Marlin,

Thanks for your feedback. Actually, I can reproduce the behaviour with MATCHES:

- $Col0$ MATCHES "An example.*" => TRUE gives 2 rows

- $Col0$ MATCHES "An example\n*.*" => TRUE gives 3 rows

In the Java regexp link you provided, it is stated that "The regular expression . matches any character except a line terminator unless the DOTALL flag is specified." That would explain the behaviour.

The DOTALL flag can be enabled inside a regexp by putting (?s) at the beginning, so that

- $Col0$ MATCHES "(?s)An example.*" => TRUE gives 3 rows
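For anyone who wants to check this outside KNIME, here is a minimal Java sketch of the same three patterns. It assumes MATCHES behaves like a full-string match in java.util.regex (as with Pattern.matches); that is my reading of the results above, not something confirmed from the KNIME sources. The sample strings are hypothetical stand-ins for the XLS column, chosen so that one value has a newline right after "An example".

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical sample values, not the data from the attached XLS file.
    static final String[] ROWS = {
        "An example",
        "An example of text",
        "An example\nwith a newline",
        "Something else"
    };

    // Counts the values that the regex matches in full (assumed MATCHES semantics).
    static int countMatches(String regex) {
        int count = 0;
        for (String row : ROWS) {
            if (Pattern.matches(regex, row)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // '.' stops at the line terminator, so the multi-line value is rejected.
        System.out.println(countMatches("An example.*"));       // prints 2
        // Explicitly allowing newlines right after the prefix accepts it.
        System.out.println(countMatches("An example\\n*.*"));   // prints 3
        // (?s) switches on DOTALL, so '.' also matches line terminators.
        System.out.println(countMatches("(?s)An example.*"));   // prints 3
    }
}
```

Note that the second pattern only helps when the newlines come directly after the prefix; (?s) covers newlines anywhere in the rest of the string.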

Still, I would be curious to know how to set this flag globally.

Marlin · January 29, 2015, 3:34pm

Oh, OK, then I didn't remember it correctly, sorry about that.

I don't think there's a way to set flags like this globally. Maybe within a Java Snippet, and only for its context... but that would probably be overkill for just a little flag.
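For what it's worth, in plain Java the flag can be passed once when the pattern is compiled, instead of writing (?s) inside the expression. The sketch below uses only the standard java.util.regex API; the class and method names are made up for illustration, and how you would wire this into a Java Snippet node is left open.

```java
import java.util.regex.Pattern;

class DotAllFlagDemo {
    // Compiled once with DOTALL, so '.' matches line terminators wherever this pattern is used.
    static final Pattern STARTS_WITH_EXAMPLE =
        Pattern.compile("An example.*", Pattern.DOTALL);

    // Hypothetical helper: true if the whole cell value matches the pattern.
    static boolean startsWithExample(String cell) {
        return STARTS_WITH_EXAMPLE.matcher(cell).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithExample("An example\nsecond line")); // prints true
        System.out.println(startsWithExample("No example"));              // prints false
    }
}
```

The flag still applies only to patterns compiled this way, so it is scoped to that code rather than global.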