weighting cases

Kees_Schippers · January 26, 2017, 9:37am

Hi,

Is there a way to weight cases based upon the values of a column?

Thanks,

Kees

Iris · January 26, 2017, 9:40am

Hi Kees,

in which scenario do you want to weight? In a learning algorithm?

Best, Iris

Kees_Schippers · January 28, 2017, 2:46pm

Hi Iris,

I would use it for instance for making a dataset better resemble the population it was drawn from.

Sometimes a dataset comes with an attribute for such use.

Compare with SPSS: 'weight cases by variable'.

Thanks,

Kees

Geo · January 28, 2017, 3:17pm

Sounds like sampling weights. The equivalent would be to replicate the observations according to the weight column.

For example, a weight of 2 would be the same as having the observation twice in the dataset. This could be achieved with One Row to Many, or Concatenate (together with a loop), or Cross-joiner.

Obviously this requires integer weights. For non-integer weights, multiply all of them by a common factor (e.g. 10, 100 or 1000) so that they become integers.
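
Outside of KNIME, the replication idea can be sketched in a few lines of plain Python. This is only an illustration of the approach described above, not a KNIME node configuration; the function name `replicate_by_weight` and the parameters `weight_key` and `scale` are hypothetical.

```python
def replicate_by_weight(rows, weight_key="weight", scale=1):
    """Repeat each row round(weight * scale) times.

    rows is a list of dicts. The names weight_key and scale are
    hypothetical and chosen for this sketch only.
    """
    expanded = []
    for row in rows:
        scaled = row[weight_key] * scale
        count = round(scaled)
        # Refuse weights that are still non-integer (or negative) after scaling.
        if count < 0 or abs(scaled - count) > 1e-9:
            raise ValueError(
                f"weight {row[weight_key]!r} is not a non-negative "
                f"integer after scaling by {scale}"
            )
        # Copy the row so the replicated rows are independent objects.
        expanded.extend(dict(row) for _ in range(count))
    return expanded


cases = [
    {"id": "a", "weight": 1.5},
    {"id": "b", "weight": 0.5},
    {"id": "c", "weight": 2.0},
]
# Multiplying by 2 turns the weights 1.5, 0.5, 2.0 into 3, 1, 4.
expanded = replicate_by_weight(cases, scale=2)
print([row["id"] for row in expanded])
```

Note that scaling keeps the relative proportions between cases but inflates the number of rows, so anything that depends on n (for example standard errors) will be affected.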

Kees_Schippers · February 2, 2017, 4:59pm

Thanks for your feedback Geo.

Adding cases is not a sufficient solution for me.

The weight is not necessarily an integer, and I would prefer my n not to increase.

Perhaps R or Python can help.

Thanks, Kees
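
As a rough idea of what a Python-based route could look like, here is a minimal sketch that applies non-integer weights directly in the statistic instead of adding rows. It assumes plain lists of values and weights; the helper names `normalize_weights` and `weighted_mean` are made up for this example, and the rescaling step (weights summing to n) is one common convention, not necessarily what any particular tool does.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of cases."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w * n / total for w in weights]


def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w). No rows are added."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


values = [10.0, 20.0, 30.0]
weights = [0.5, 1.0, 2.5]  # non-integer weights are fine here

scaled = normalize_weights(weights)
print(sum(scaled))                     # equals the number of cases
print(weighted_mean(values, weights))  # (5 + 20 + 75) / 4
```

The weighted mean is unchanged by rescaling the weights, so the normalization only matters for statistics that depend on the total weight.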

marisamurton · October 13, 2020, 11:27am

Hi, I have the same problem: I want to add weighting to a model but would rather not add rows. I read a subsequent post that mentioned an R snippet. I don’t know R, so I wondered whether a more user-friendly alternative had been created in the 2 years following that post?

MattiasMarder · March 14, 2022, 4:20am

Hi,
Are there any updates to this thread? Is there now an option to weight data points in the loss functions used by KNIME?
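
To make concrete what weighting data points in a loss function means, here is a small plain-Python sketch of a weighted log loss. It does not show any KNIME functionality; the function name `weighted_log_loss` and the example numbers are assumptions for illustration. It also checks that an integer weight of 2 gives the same loss as having that observation twice in the data.

```python
import math


def weighted_log_loss(y_true, p_pred, weights=None):
    """Binary log loss where each case contributes in proportion to its weight."""
    if weights is None:
        weights = [1.0] * len(y_true)
    total = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        # Per-case log loss, multiplied by the case weight.
        total += w * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # Divide by the total weight rather than the row count.
    return total / sum(weights)


y = [1, 0, 1]
p = [0.9, 0.2, 0.6]
w = [2, 1, 1]

weighted = weighted_log_loss(y, p, w)
# Same data with the first case duplicated and no weights.
replicated = weighted_log_loss([1, 1, 0, 1], [0.9, 0.9, 0.2, 0.6])
print(math.isclose(weighted, replicated))
```

Unlike row replication, this form also accepts non-integer weights and leaves the number of rows unchanged. In Python, many scikit-learn estimators accept a `sample_weight` argument in `fit` that serves this purpose.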

goodvirus · March 14, 2022, 6:02am

Hi @MattiasMarder,

maybe check out this topic about weights: misclassification costs to adjust for rare outcome - #6 by goodvirus

Best regards,

Paul