Var1 and var2 should be consistent 1 to 1. So whenever you have a value in var1 you should have the correspondant value in Var2.
It’s more clear if you visualize.
The problem is that I have thousands of different values.
So I can’t create a rule engine for these specific cases, I need a rule that says something like: if Var2 is missing, then fill with the value of Var2 when same Var1.
Since the empty rows can be anyhere and a next/previous row logic wouldn’t not be very solid, I would opt to go for a group loop based on Var1. Then you can, per group, supplement the nulls with the most frequent value of Var2.
@RoyBatty296 ,
[Edit: Just saw @ArjenEX 's post. This is basically what is mentioned as the alternative)
If you always have at least one correct value for each item, and you are able to order them by item, with the “present” value appearing first in each case then you could just use the Missing Value node by itself and specify “previous value” as the population method
Great approach as well. Indeed falls with the proper sorting to make sure that all nulls are listed below the available values per group in case the first value of the group is null.
I would go for dictionary table creation (using GroupBy or Duplicate Row Filter node) followed by Cell Replacer to bring values in new column and finish it with Column Filter node.
Approach with loop and missing value could be slow in case of large data volume and sort plus missing value approach assumes there is no group with all missing values.