Replacing ? with 1 and empty column with 1 and rest all with 0s with Column Expression

Helo Knime community,

With column expression, I want to write a code, in such a away to convert all “?” and empty rows with 1
and rest with all 0s

Can anybody help me out, how to write the expression. thankyou

Hi @Muhammad44 , is there any reason why you want/need to use the Column Expressions?

You can use the Rule Engine node to do this:

MISSING $PostalCode$ => 0
TRUE => 1
3 Likes

@bruno29a yeah because in rule engine, the code will be longer for 18 columns whereas in column expression, I have to copied the code with add expressions. I have write the code something like this, but doesnt work
if (column(“City”) == 1)
{
missingValue()
}
else
{
column(“City”)
}

@bruno29a there are 18 columns so I want to do it for all columns, and there are rows in “address and city column” which doesnt have ? but are empty, so want to change for that as well.

Could you attach a sample of your data? I’m not sure how you have blank value (are those spaces?). Is so, perhaps transform all missing values to spaces (blank values) and then replace all blank values with 0. Then replace all non-0 with 1. I’m currently investigating the most efficient way to do this while I await your sample. Thanks

Agree with @victor_palacios . Have you tried missing value node before? br

missing or empty replacement.knar.knwf (16.7 KB)

1 Like

Hi @Muhammad44 , the Column Expressions would not help for the fact that you have 18 columns to process. You would still need to create 1 expression for each column.

Your best bet is probably to use a Loop like what @victor_palacios is suggesting, or without a loop, using the String Manipulation (Multi Column) and the Missing Value node.

Also, I noticed in the title that you actually want the missing values to be replaced with 1, not 0. So actually my original expression should have been the other way:

MISSING $PostalCode$ => 1
TRUE => 0

So, here’s a solution that handles multiple columns and without loop:
image

Input data:
image

Results:
image

I processed the columns Address, City, Country, PostalCode and County. The empty cells of these columns have been replaced by 1, and the non-empty ones by 0 as per your title.

Just explaining a bit what I did:

I highlighted 4 items to look at in the String Manipulation (Multi Column):

  1. You select the columns that you want to process
  2. I used the expression toInt(substr(join("0",$$CURRENTCOLUMN$$), 0, 1)) (more on this further down)
  3. I uncheck the box “Insert missing values as null” (more on this further down)
  4. I replace the values of the columns

When you uncheck the box “Insert missing values as null”, any join() that you do with an empty cell will result in an empty result, so based on this, I join “0” to all the cells of the selected columns, and the “0” is added at the beginning. This is what this part of the expression does join("0",$$CURRENTCOLUMN$$), and this is what the modified data would look like:
image

The empty cells remain empty, and the non-empty cells now have “0” as first character. Similarly, doing a substr() on an empty cell will result in an empty result, meaning that I can “safely” extract the first character of all cells using substr(), which will extract the “0” of the non-empty cells, and keep the empty cells empty. This is what this expression does substr(join("0",$$CURRENTCOLUMN$$), 0, 1)

Now, you can see that I also converted the results to Int (Integer) with the toInt() function. The reason for this is to be able to use the Missing Value node in a more general way, where I can make the node apply the change to all Int column.

This is what the data looks like before doing the Missing Value:
image

I can now apply the replacement of missing values with “1” by applying the modification to all Int columns like this:
image

Of course, you also have the option of choosing each column in the Missing Value node, in which case you would not need to convert the columns to Int:

This way, of course, takes a bit longer to set up.

Here’s the workflow: Replace missing values with 1 and non-missing with 0.knwf (9.3 KB)

3 Likes

@bruno29a , Wow! Those are some really good workarounds to help automate this task. Thanks!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.