Missing Value Identification

Hello All,

I am trying to do the following and I got stuck
It would be great if any of you could help me.

Column A | Column B
a | 123
a | 234
a | 345
b | 123
b | 234
b | 345
c | 234
c | 345
d | 345

Let us say we have a table as above and I am trying to get a output table as below.

Column A | Missing Value Count | Missing Value
a | 0 |
b | 0 |
c | 1 | 123
d | 2 | 123, 234

I was able to get the missing value counts from column aggregator by first using groupby (column A) then split collection column of column B.

Could you please help? Thank you.

Hi @yjoung1,

I don 't get your question. You wrote that you were able to get something. So what is it that you are asking for then?

Best
Mike

Sorry if it were difficult to understand.

I was able to get the Missing Value Count, but couldn’t get the “Missing Value” as the column aggregator only has Missing Value Count option.
I have tried using concatenate option and manipulate string afterwards, but not successful as replacechar doesn’t recognize the string in ONE cell one by one.
e.g., replacechar(“123, 234, 345”, “123, 234”, “”) -> my expected result is “345”, but it returns (“5”)

So I assume column aggregator is not the way to go. Perhaps, there is another way around.

Hi,

Instead of using column aggregator use the “GroupBy” node.

1 Like

Thank you for the input.

Unfortunately, “GroupBy” also does not recognize what the missing values are.
It only come with “Missing Value Count”.

Hi,

Try to recognize the missing values with “Rule engine” node.

In your first post there is no missing data … can you upload some (dummy) data.

Thank you

1 Like

Thank you for being patient with me.

here is the dummy data that looks better than the original post.
image

After “Group By”, I get this (here I grouped by column A and returned a list)
image

After this I splitted the column so that I can count the number of missing values using column aggregator.
image

Now the question is how do I identify the “?” of “c” and “d” and display like this.
image

Hi,

Here is one idea/solution … maybe is not the shortest and will not cover all your cases but you can work on it.

Missing.knwf (35.6 KB)

Hope this helps you.

1 Like

@yjoung1

The issue people are having with your question is that a “Missing value” in KNIME has special meaning. “Missing value” in KNIME means null or “no value”.

However you are not interested in missing values as per knime definition, right? You have a set of values in column B and want to see which of these values never occur with value in column A.

If that is true please see my convoluted solution. The advantage over andrejz is that the values don’t need to be hard coded. it will work with any values. However the exact column names will need to be adjusted in the node configurations.

Missing Values forum.knwf (35.2 KB)

The logic is:

  1. Determine list of available values (groupby with no group)
  2. Join the available values back to each group (group loop start, outer join)
    Values that are missing will have a ? (Missing value as per KNIME definition) in column1. We can now group and filter by that
  3. In case there are no missing values (empty table switch) we simply “join” back original values with concatenate node and hard code the results (no missing values and missing value count = 0).
  4. rename result columns

image

4 Likes

Here are three examples how to count missing values. The GroupBy is a perfect match due to “Missing value count” aggregation method.

Best
Mike

He doesn’t actually want to count what KNIME calls Missing values. Read his post (or mine) again.

1 Like

Hello there!

Really a bit of confusion made with terminology used :smiley:

Here is my solution playing with Pivoting and Unpivoting node.
Missing_ipazin.knwf (31.7 KB)

Welcome to KNIME Community @yjoung1!

Br,
Ivan

1 Like

Thank you everyone who have contributed for this post!

I wasn’t really able to come up with a WORD that best describes “Missing Value” other than what KNIME refers to as “Missing Value”.

I was able to solve my problem with the temporary solution from @andrejz.
Thank you.

But I would like to give a credit to @beginner who have given the solution that works the best in similar cases.

I am always fascinated by how people solve a problem is such a different way!
Some have similar logic as me, some don’t.

Thank you again and take care!

2 Likes