Missing Value for different datatype columns

Hi everyone,
I have a table with about 100 columns of various different data types, e.g. numeric, string, datetime… and want to replace all rows valued N/A, NULL, Missing by missing value (which is denoted as ‘?’ in KNIME or generated by missingValue().

Could you please give me some hints how to do so?
Thank you so much
Hanh

Hi @HanhDo,

Not sure I can follow, but I think you can use Missing Value node for handling missing values.
I can help you more if you give me your data or example.

GL,
Mehrdad

Hi @mehrdad_bgh
so my original table is like this with “Null”, “Missing”, “N/A”, “NA” are mistakenly used to denote missing value (?).

Name DOB TimeStamp Values Business Areas LINK
A 18/12/1992 3/11/2021 9:19 40493.05 21600 https://www.nytimes.com/
B NA 3/11/2021 9:19 41250.435 21601 Missing
C 11/12/1993 3/11/2021 9:19 40493.89 21602 https://www.nytimes.com/
D 22/1/1994 N/A NULL ? ?

Now I want all those to be converted to something like this while keeping the data type of columns unchanged.
? is the red question mark in KNIME also for ‘missing value’

Name DOB TimeStamp Values Business Areas LINK
A 18/12/1992 3/11/2021 9:19 40493.05 21600 https://www.nytimes.com/
B ? 3/11/2021 9:19 41250.435 21601 ?
C 11/12/1993 3/11/2021 9:19 40493.89 21602 https://www.nytimes.com/
D 22/1/1994 ? ? ? ?

Thank you

Hello @HanhDo,

considering you have multiple column types in your data set I would go with unpivoting, apply logic (Rule Engine), Pivoting approach.
See here workflow example:

Br,
Ivan

1 Like

hi @ipazin
Thank you for your recommendation, but then how can I replace those values with the red ? - which is also a “missing value” in KNIME?
I noticed that missingValue() function in Column expression node can convert value to this ? but then I need to specify data type for each column and it is kinda impossible for 100 columns of various data types :frowning:

Capture

Hanh

Hello @HanhDo,

Rule Engine outputs missing value if no rule matched cell value so following expression will work:

NOT $ColumnValues$ IN ("NA","N/A") => $ColumnValues$

ColumnValues is the name of column with all values after unpivoting and you can add other patterns in IN clause.

However now thinking only in string columns you can have “NA”, “N/A”, “Missing” and other patterns so maybe you are able to use String Manipulation (Multi Column) node together with following function:
toNull( regexReplace( toEmpty(string($$CURRENTCOLUMN$$)), "NA|N/A|Missing" , "") )

Br,
Ivan

3 Likes

awn, ok, I got it, Thankss so muchhh @ipazin <3
Best
Hanh

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.