Remove "NULL" string values from data table

Greetings,

I have a data table where instead of blank or missing values there is the string value of “NULL”. I know that I can use the string manipulator or rule engine to replace the string NULL values and make them blank but this would be very tedious as this issue plagues my entire data table (which is over 30 columns). Is there anything I can do to transform all the instances of a NULL string value using one tool or method?

Thanks,
Chim

Hey @chimdee,

Here’s an example using the String Manipulation (Multi Column) node to perform a string manipulation task across multiple columns at once:

data normalization.knwf (6.7 KB)

image

You can use manual or wildcard selection inside the String Manipulation (Multi Column) node to pick which columns are processed. Alternatively, you could add a Column Filter node and utilize the “Type Selection” option to only process String columns (as an example).

Cheers,

@sjporter

2 Likes

Thank you so much for the response. I was not aware of the String Manipulator (multi column) tool. I have tried using the example that you posted but I keep getting the following error. Any advice you could provide to help with this error would be appreciated.

ERROR String Manipulation (Multi Column) 3:9 Execute failed: Invalid type identifier for variable in line 1: n

Hey @chimdee,

It looks like the node is complaining about the data type. Try wrapping $$CURRENTCOLUMN$$ with string() like so and see if it resolves the error:

toEmpty(strip(replace(string($$CURRENTCOLUMN$$), "NULL", "")))

Just keep in mind the formula above might convert a numeric column into a string column, in which case there are nodes available to convert it back.

Cheers,

@sjporter

3 Likes

Yes, I cleared the CurrentColumn reference and reinsert it and that did the trick! Thank you so much for your help!

3 Likes

No problem :slight_smile: I’m glad we were able to find a quick resolution.

Cheers,

@sjporter

Hi Again,

I’m running into an error that I can’t figure out how to clear. When I run this String Manipulation (Multi Column) tool in my workflow I get the following error:

ERROR at line 66
The method replace(java.lang.String, java.lang.String, java.lang.String) in the type org.knime.base.node.preproc.stringmanipulation.manipulator.ReplaceManipulator is not applicable for the arguments (java.lang.Integer, java.lang.String, java.lang.String)
Line : 65 public java.lang.String internalEvaluate() throws Abort {
Line : 66 return toEmpty(strip(replace(__variable_0, “NULL”,"")));

Any ideas of what I’m doing wrong? How I can clear this?

Best,

Chim

Hi @chimdee , did you convert the column to string by wrapping it with the STRING() function as @sjporter suggested?

If you did already and still getting this error, can you share the workflow and data?

2 Likes

That was it. I didn’t remove the non String columns. Silly me smh. Thank you so much!

1 Like

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.