[BUG] Join Node in Mac. KNIME version 5.1.2. OS Mac Ventura 13.6

Hi,

I am using KNIME 5.1.2 on macOS. I have noticed some instability in the calculations, and this concerns me.

This is just one issue I spotted when using the join node.

I am trying to do a simple join of two lists of strings. The first list is from a KNIME Table Creator node and the other one is from Excel. But it seems the KNIME join node is not able to match the two lists.

Could someone please look into this?

KNIME Join on Mac Version 5.1.2.knwf (12.3 KB)
Book2.xlsx (10.5 KB)

The second issue I have with the Mac version:

I have a table with two columns of integers that I process in a KNIME Python node. Even if I cast the columns to int within the Python code, the resulting KNIME table returns these two columns as the Long data type. This never happened in the Windows version but is happening in the Mac version.

Please help.

Regards,

Hi @KNIMEuser23421 ,

Thanks for reporting this. This does sound very concerning, as one expects the nodes to behave the same on all OSes.

Luckily, I can reproduce the join issue on Windows as well. It looks like the entries in the Table Creator have an invisible trailing character, which may have snuck in during a copy-paste operation.

For the Python issue, it would be great to have a small reproduction workflow to make sure we’re talking about the same issue. :slight_smile:

Kind regards
Marvin


Thanks for the reply.

Regarding the Python problem on Mac, please see these files and screenshots.

KNIME Python Int Type example.knwf (9.1 KB)
Sample.csv (1.1 MB)

Regarding the join node issue: if I can spot that invisible trailing character, then I am OK, since I can modify the strings to match. But the issue now is that we can't spot the difference… hmmm

Hi @KNIMEuser23421,

Thanks for the super quick update. This is indeed odd. I can confirm this behavior on a Mac and interestingly this does not occur on a Windows machine.

I’ll forward this to our development team to investigate.

On the whitespaces… I guess they are tricky to spot. One can notice by moving the cursor to the very right and then deleting a character, but it may be simpler to either do this outside of KNIME, or if you want to automate this use e.g. the String Manipulation node (specifically the strip method in the node). We’re also introducing a nicer String Cleaner node with KNIME 5.2. :slight_smile:
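To make the invisible character visible, a few lines of Python can help. This is only a minimal sketch: the helper names `describe_chars` and `first_difference` are made up for illustration, and the trailing no-break space in the sample is an assumption, since the actual character in the workflow was not identified here.

```python
import unicodedata


def describe_chars(text):
    """Return (repr, code point, Unicode name) for every character in text."""
    return [
        (repr(ch), f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def first_difference(a, b):
    """Return the index where a and b first differ, or None if they are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One string is a prefix of the other; they differ where the shorter ends.
        return min(len(a), len(b))
    return None


left = "ABC123"
right = "ABC123\u00a0"  # hypothetical value with a trailing no-break space

print(repr(right))                    # repr() escapes non-printable characters
print(first_difference(left, right))  # position of the first mismatch
print(describe_chars(right)[-1])      # details of the last character
```

Printing `repr()` of a cell value is often enough; the code point and Unicode name tell you what exactly needs to be stripped.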

Kind regards
Marvin


Thanks Marvin! Really appreciated !!!

Hi @KNIMEuser23421,

It looks like the integer mapping is a quirk of Python:

In Python, the int type is not necessarily a 32 bit integer and may hence be represented by a different data type in Java.

If you want to make sure you’re working with 32 bit integers (which would also be integers in Java), you can use the numpy.int32 type.
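As a rough illustration, assuming the columns are handled as a pandas DataFrame (the sample data and column names are made up, and the KNIME input/output calls are left out so the snippet runs on its own):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the table coming into the Python node.
df = pd.DataFrame({"a": ["1", "2", "3"], "b": ["10", "20", "30"]})

# astype(int) uses NumPy's default integer type. Its width depends on the
# platform and NumPy version, so it can be 64 bit on one machine and
# 32 bit on another.
as_default = df.astype(int)

# An explicit 32-bit type gives the same result everywhere.
as_int32 = df.astype(np.int32)

print(as_default.dtypes)
print(as_int32.dtypes)
```

Note that int32 only holds values from -2147483648 to 2147483647, so this is safe only if the data fits in that range.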

Kind regards
Marvin


Hi Marvin,

Thanks for this. Yes, indeed. Before, I was using just astype(int), which by default on Mac uses int64. Now I am using int32 instead of int, so problem solved!!

Thanks!!


The join is still giving me trouble.

On Mac, if I generate unique values using the Value Counter node, copy them into a Table Creator node and do the join, it works.

However, if I copy from the Value Counter node → Excel → Table Creator node, the join does not work even though the strings look identical.

Please help me investigate this. As it stands, users are not able to copy and paste from Excel freely.

Hi @KNIMEuser23421 ,

I presume this will again be a case where some special characters sneak in. This is a general problem when copying (auto-)formatted data. Without knowing the data it is hard to suggest a concrete remedy, but here are a few general things that can help:

  • Pasting via Ctrl + Shift + V (I presume Command + Shift + V on Mac) will paste clipboard content without formatting.
  • Investigating the special/invisible characters with an editor. These could e.g. be line endings, which one could configure or disable in the tool (here probably Excel) one is copying into or from.
  • Performing a more resilient match, e.g. counting entries as equal even if they differ slightly, or only checking that e.g. the first x characters match. I haven’t tested this workflow, but it looks like something that could be a pointer in the right direction:

And again, the String Manipulation node or the more convenient String Cleaner node (coming with KNIME 5.2 later this year) may be able to remove any undesired symbols.
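For illustration only, here is roughly what such a clean-up could look like in plain Python before matching. The function name `clean_key` and the sample values are hypothetical, and which characters actually come along when copying from Excel is an assumption:

```python
import unicodedata


def clean_key(text):
    """Normalize a join key so that invisible differences no longer matter."""
    # NFKC folds compatibility characters, e.g. a no-break space becomes a space.
    normalized = unicodedata.normalize("NFKC", text)
    # Drop control (Cc) and format (Cf) characters such as carriage returns
    # or zero-width spaces. Note that this also removes tabs inside the key.
    visible = "".join(
        ch for ch in normalized if unicodedata.category(ch) not in ("Cc", "Cf")
    )
    return visible.strip()


# Hypothetical keys: pasted values carry invisible extras, the Excel ones do not.
table_creator = ["A-001\u200b", "A-002\r", "A-003\u00a0"]
excel = ["A-001", "A-002", "A-003"]

raw_matches = set(table_creator) & set(excel)
clean_matches = {clean_key(s) for s in table_creator} & {clean_key(s) for s in excel}

print(sorted(raw_matches))
print(sorted(clean_matches))
```

Cleaning both sides the same way before the join keeps the match exact, which is usually safer than a fuzzy comparison.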

Kind regards
Marvin
