weighting cases


Is there a way to weight cases based upon the values of a column?



Hi Kees,

In which scenario do you want to apply the weights? In a learning algorithm?

Best, Iris

Hi Iris,

I would use it, for instance, to make a dataset better resemble the population it was drawn from.

Sometimes a dataset comes with an attribute for such use.

Compare with SPSS: 'weight cases by variable'.




Sounds like sampling weights. The equivalent would be to replicate the observations according to the weight column.

For example, a weight of 2 would be the same as having the observation twice in the dataset. This could be achieved with One Row to Many, with Concatenate (together with a loop), or with Cross Joiner.

Obviously this requires integer weights. For non-integer weights, multiply all weights by the same power of ten (10, 100, 1000, etc.) so that they become integers. Note that this multiplies the number of rows by the same factor.
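
As a rough illustration of the replication idea outside of KNIME nodes, here is a minimal Python sketch. The function name `replicate_by_weight`, the `scale` parameter, and the sample data are made up for this example; it assumes the weights become whole numbers after multiplying by `scale`.

```python
def replicate_by_weight(rows, weights, scale=10):
    """Repeat each row round(weight * scale) times.

    `scale` is a hypothetical parameter: the power of ten used to
    turn fractional weights into integer copy counts.
    """
    expanded = []
    for row, weight in zip(rows, weights):
        copies = round(weight * scale)
        expanded.extend([row] * copies)
    return expanded


# Illustrative data only: three observations with fractional weights.
rows = ["a", "b", "c"]
weights = [0.5, 1.0, 2.5]

expanded = replicate_by_weight(rows, weights, scale=10)
print(len(expanded))        # 40 rows instead of 3
print(expanded.count("c"))  # 25 copies of the heaviest row
```

The relative proportions of the rows match the weights, but the row count grows by roughly the scale factor, which matters for anything that depends on n.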

Thanks for your feedback Geo.

Adding cases is not a sufficient solution for me.

The weight is not necessarily an integer, and I would prefer that my n not increase.

Perhaps R or Python can help.
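
For what it is worth, a minimal sketch of how this might look in plain Python, assuming the goal is a weighted statistic rather than extra rows. The function names and the sample data are hypothetical. Rescaling the weights so they sum to the number of rows keeps the effective n unchanged.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of rows."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    n = len(weights)
    return [w * n / total for w in weights]


def weighted_mean(values, weights):
    """Mean of `values` where each value counts in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Illustrative data only.
ages = [20, 40, 60]
weights = [0.5, 1.0, 2.5]

norm = normalize_weights(weights)
print(sum(norm))                  # 3.0, the same as the number of rows
print(weighted_mean(ages, norm))  # 50.0, versus an unweighted mean of 40.0
```

The weighted mean is the same whether the raw or the normalized weights are used, since only the relative sizes of the weights matter.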

Thanks, Kees


Hi, I have the same problem: I want to add weighting to a model but would rather not add rows. I read a subsequent post that mentioned an R snippet. I don’t know R, so I wondered whether a more user-friendly alternative has been created in the 2 years following that post?

Any updates to this thread? Is there now an option to weight data points in the loss functions used by KNIME?
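
To make concrete what weighting data points in a loss function means, here is a small plain-Python sketch of a weighted binary log loss. This is not a KNIME API; the function name `weighted_log_loss` and the data are assumptions for illustration only. Each row's contribution to the loss is multiplied by its weight, so non-integer weights work and no rows are added.

```python
import math


def weighted_log_loss(y_true, p_pred, weights, eps=1e-15):
    """Binary log loss where each row's term is scaled by its weight."""
    total = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)


# Illustrative data only: two rows, the first one weighted double.
loss_weighted = weighted_log_loss([1, 0], [0.9, 0.2], [2, 1])

# Same data with the first row physically duplicated and unit weights.
loss_replicated = weighted_log_loss([1, 1, 0], [0.9, 0.9, 0.2], [1, 1, 1])

print(abs(loss_weighted - loss_replicated) < 1e-12)  # True
```

With an integer weight, the result matches what row replication would give, which is why the two approaches are often described as equivalent.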

Hi @MattiasMarder,

Maybe check this topic about weights: misclassification costs to adjust for rare outcome - #6 by goodvirus

Best regards,