match empty string

Hi,

I'm looking for a way to match the empty string in a Row Splitter.

With the Java Snippet Row Filter I can test for col.equals(""), but with a plain Row Splitter I can't find the right way.

Thanks

Are you sure they are empty strings and not nulls?

Yes, I'm sure: they match col.equals("").
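
To make the null versus empty-string distinction above concrete, here is a minimal plain-Java sketch. The class and method names are hypothetical, and it assumes a missing value reaches the code as null, which the thread does not confirm for any particular KNIME node.

```java
import java.util.Arrays;
import java.util.List;

class EmptyVsNull {
    // True only for a real zero-length string; null (assumed to stand for a
    // missing value) and whitespace-only strings do not count as empty.
    static boolean isEmptyString(String col) {
        // Calling equals on the literal avoids a NullPointerException,
        // which col.equals("") would throw when col is null.
        return "".equals(col);
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("abc", "", null, " ");
        for (String cell : cells) {
            System.out.println("[" + cell + "] empty=" + isEmptyString(cell));
        }
    }
}
```

If col.equals("") runs without an exception and returns true, the value is a zero-length string and not null.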

What about splitting the other way around (exclude<->include)? I think the wildcard pattern ?* matches everything that has at least one character, so that would be the opposite condition.

Hi. You can use the regular expression (^?) to match only empty strings. ^ matches the beginning of the string and ? matches the end. I found I had to enclose them in parentheses in the Row Splitter dialog.

Don

Thanks

Hi,

I was trying to match empty strings (not missing values, but strings of length 0) with the Row Filter node and could not manage it until I found this post. The regex ^? does indeed do the job, but, AFAIK, it is not a valid expression, as ? is a quantifier and there is nothing to apply it to… The proper character for matching the end of the string is $. Therefore, the standard expression to match an empty string is ^$, which is what I tried, to no avail.

Can someone comment on this, please?

Hi @mpenalver, I'm trying to remember what led me to this regex. You are correct about the regex. The reason your correct regex doesn't work might be explained better here: "Regex ^$ doesn't match empty string".

Thank you for the link, dnaki. Whatever the underlying reason is, the node does not have the behaviour a regular user would expect.

The standard way to match multiline empty strings, \A\z, does in fact work. A great reference on regex: https://books.google.com/books/about/Regular_Expressions.html?id=adTOwAEACAAJ.
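
For anyone who wants to compare the three patterns discussed so far outside of KNIME, here is a small Java sketch using java.util.regex. It rests on assumptions: the thread does not say which flags the Row Filter or Row Splitter uses when compiling a pattern, so Pattern.MULTILINE is tried here only as one possible setting, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class EmptyStringAnchors {
    // Whole-input match of `regex` against `input`, compiled with the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // The three candidates from the thread, each tested on the empty string.
        String[] patterns = {"^$", "^?", "\\A\\z"};
        for (String p : patterns) {
            System.out.println(p
                + "  default=" + matches(p, 0, "")
                + "  MULTILINE=" + matches(p, Pattern.MULTILINE, ""));
        }
    }
}
```

\A and \z anchor to the start and end of the whole input and are not affected by MULTILINE, which is a plausible reason they behave more predictably than ^ and $ here. The printed results show what your own Java runtime does with each pattern.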

KNIME, please fix that.

Depending on the use case, you could use the String Manipulation node's toNull() function, which converts an empty string to null/missing, and then filter or split on missing values. Of course, this only works if empty strings and missing values are equivalent for your use case, or if you have already dealt with the actual missing values beforehand.

I use this approach regularly, and I think it's much more understandable than a regex.
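
The idea behind this approach can be sketched in plain Java as follows. This is not the KNIME API: the toNull helper below is a hypothetical stand-in that only mimics the described behaviour (empty string becomes null), and splitOnMissing is an illustrative name for the filter/split step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToNullSplit {
    // Hypothetical stand-in for the toNull() idea: an empty string becomes null,
    // anything else is returned unchanged.
    static String toNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }

    // Index 0 holds the cells with content, index 1 the cells that are
    // missing after conversion (originally null or empty).
    static List<List<String>> splitOnMissing(List<String> column) {
        List<String> present = new ArrayList<>();
        List<String> missing = new ArrayList<>();
        for (String cell : column) {
            if (toNull(cell) == null) {
                missing.add(cell);
            } else {
                present.add(cell);
            }
        }
        return Arrays.asList(present, missing);
    }

    public static void main(String[] args) {
        List<List<String>> parts = splitOnMissing(Arrays.asList("a", "", null, "b"));
        System.out.println("with content: " + parts.get(0));
        System.out.println("missing or empty: " + parts.get(1));
    }
}
```

Note that the second list mixes original nulls and empty strings, which is exactly the caveat mentioned above: after conversion the two can no longer be told apart.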

EDIT: And if you want a regex, what works is checking for content instead of checking for no content:

^.+$

This separates empty strings from strings with content, as .+ requires at least one character to match.
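
A minimal Java sketch of the same "content versus no content" test, for checking the pattern outside of KNIME. The class and method names are hypothetical, the pattern is compiled without any flags (an assumption, since the node's own flags are not stated in the thread), and null is treated as "no content".

```java
import java.util.regex.Pattern;

class ContentSplit {
    // ^.+$ needs at least one character, so it can never match "".
    static final Pattern HAS_CONTENT = Pattern.compile("^.+$");

    // True when the whole cell consists of one or more characters.
    static boolean hasContent(String s) {
        return s != null && HAS_CONTENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] cells = {"abc", "", " "};
        for (String cell : cells) {
            System.out.println("[" + cell + "] hasContent=" + hasContent(cell));
        }
    }
}
```

A cell holding only a space counts as content here, and cells containing line breaks depend on the flags in use, so both cases are worth testing separately if they can occur in your data.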
