Column Expression on two different laptops gives two different results

Hello,
I am running an identical workflow on an identical data set on two different laptops. I have the following in my Column Expressions node:
if (isMissing(column("Status")))
{"Error"}
else if (regexMatcher(column("Status"),"Valid.*") == 'True')
{"Valid"}
else
{"Invalid"}

On my laptop, it declares the Status as ‘Valid’; on the other machine it declares the status as ‘Invalid’. This is the only node that interprets the Status.

The Column Expressions node is installed on both machines. Could it be something about the regex itself? What else should I look at to see why the identical workflow produces different results? (I made sure we are running the same data.)

Please help

How strange. Are both machines running the same version of KNIME (and which version)? Can you make a pared-down workflow of a few nodes and a few rows of data, export that workflow (make sure to export it with data), and attach it to your reply?

I have updated my KNIME, but it is at 3.7.5, while the other machine has the new version 4.0.
Could it be that Column Expressions are breaking in v4.0?

Anything is possible, but there are currently no regressions concerning this node reported in JIRA, so this is an unknown problem. Can you provide a test case workflow to examine?

It is a big and complex workflow; the node in question is node 142, at the bottom.
Run the first row only. The Status column should say ‘Valid’, not ‘Invalid’.

Commodity Tax Validator - QST.knwf (119.9 KB)

Yes, I feared you probably had a large, complex workflow, and you do :-)

Your problem in node 142 is that you’re treating the return value of regexMatcher as a String when it is actually a boolean.
This: if (regexMatcher(column("Status"),"Valid.*")== 'True')
should be this: if (regexMatcher(column("Status"),"Valid.*"))
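
To illustrate why the string comparison fails, here is a minimal sketch in plain JavaScript, run outside KNIME. It assumes the node's expressions follow JavaScript comparison rules and that regexMatcher returns a real boolean and matches the whole value; the `regexMatcher`, `classify`, and `classifyBuggy` functions below are hypothetical stand-ins, not the node's actual implementation.

```javascript
// Hypothetical stand-in for the node's regexMatcher (assumption: it returns
// a boolean and requires the whole input to match the pattern).
function regexMatcher(input, pattern) {
  return new RegExp("^(?:" + pattern + ")$").test(input);
}

// Corrected form: the boolean result is used directly as the condition.
function classify(status) {
  if (status === null || status === undefined) return "Error";
  if (regexMatcher(status, "Valid.*")) return "Valid";
  return "Invalid";
}

// Original form: a boolean loosely compared with the string 'True'.
// Both sides are converted to numbers (true becomes 1, 'True' becomes NaN),
// so the comparison is never true and every non-missing row is "Invalid".
function classifyBuggy(status) {
  if (status === null || status === undefined) return "Error";
  if (regexMatcher(status, "Valid.*") == 'True') return "Valid";
  return "Invalid";
}

console.log(classify("Valid - OK"));      // Valid
console.log(classifyBuggy("Valid - OK")); // Invalid
```

Under these assumptions, the buggy form can only return "Valid" if the function hands back the string 'True' rather than a boolean, which would be consistent with different versions behaving differently.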

Haven’t tried it yet, but will.
However, why does it work on my computer without the change?

I’m surprised it works in 3.7.2 (I assume you meant 3.7.2; there was no 3.7 version after that). Perhaps some sugar was once added to the node so that people who didn’t check the method’s return type would still get working code?

(I verified that the fix I suggested works in 4.0.0; definitely let me know if you cannot get it to work.)

Trying in 3.7.2, I cannot get regexMatcher to work correctly if I evaluate it as a String (i.e., if I do ... == 'True').

Hi there!

Not sure how it can work correctly if you evaluate it as a string…

Anyway, I propose using the TRUE() and FALSE() functions in the Column Expressions node for evaluating Boolean values ;)

Br,
Ivan

Correction: I was working with version 3.6.2, and it did work there. It is what it is.
Nevertheless, the change suggested by quaeler works in 3.6.2 as well. I have not been able to validate it in v4.0 yet. Will confirm later today.

@IrynaK

I have actually been struggling with this same problem for some time. Regex works fine in Column Expressions in 3.7. But when I take the same regex and rebuild the workflow in 3.6.2, it does not work at all. I have been racking my brain to understand why. I don’t know if the regex engine changed or if it’s something else. I am working on building an example workflow to demonstrate. But simply, if I run this expression:

if (regexMatcher(column("Workday Req Number"), "^\w{2}-\d{8}$"))
'0';
else
'1';

on a data set of string values:

JR-62201212
JR-6107
JR-6102
JR-6048
JR-13100
JR-13075
JR-13065
JR-13055
JR-13007
JR-12960
JR-12939
JR-12937
JR-12916
JR-12911
JR-12854
JR-12849
JR-12765
JR-12762
JR-12696
JR-12636
JR-12606
JR-12565

It won’t work. Yet when I run the EXACT same expression in 3.7 and 4.0 it works no problem.
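
For reference, the pattern's expected behavior can be checked outside KNIME. The sketch below is plain JavaScript rather than the node's own API: `flag` is a hypothetical helper, and it assumes the pattern is meant to match the whole value. It uses a regex literal, so the backslashes need no extra escaping.

```javascript
// Pattern from the expression above: two word characters, a hyphen,
// then exactly eight digits, anchored at both ends.
const pattern = /^\w{2}-\d{8}$/;

// Hypothetical helper mirroring the expression:
// '0' when the value matches, '1' otherwise.
function flag(value) {
  return pattern.test(value) ? '0' : '1';
}

// A few values taken from the data set above.
const sample = ["JR-62201212", "JR-6107", "JR-13100", "JR-12565"];
const flags = sample.map(flag);
console.log(flags); // [ '0', '1', '1', '1' ]
```

Only the first value has eight digits after the hyphen, so it is the only one expected to produce '0'.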

Yes, this was broken in earlier versions and fixed in 3.7.1; it is listed as AP-11019 here.

@quaeler

Thank you for the link! That makes total sense now!
