Weighting of cases based on number of observations per case for PCA and t-SNE

Dear all,

My question comes from image analysis and concerns how to treat the data resulting from it. I have two groups of images in which I segmented a different number of objects (e.g. image 1: 30 objects, image 2: 150 objects, etc.). For each object I obtained a list of variables (e.g. area, perimeter, circularity, etc.).
I would like to run PCA and t-SNE on these data. However, I think that the number of instances from each individual image will influence the result of the analysis, so I would like to assign a weight to each of these instances before doing the PCA.

How would you proceed?

Thank you for your help!

Hi @lstimmer,

I assume the object statistics are expected to differ from image to image; otherwise a weighting would not be necessary, right?

I’m not all that familiar with PCA/t-SNE, but would it make sense to do the analysis on each image individually and then compare the results? If the results are very similar for each image, you could argue that a weighting is not necessary at all and just use the whole dataset as it is.
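To make the per-image comparison a bit more concrete, here is a minimal Python sketch. It is only one possible way to compare results: it standardises the features of each image, takes the first principal axis, and reports the absolute cosine between the axes of two images (1 means the same direction, 0 means orthogonal). The function names `first_axis` and `axis_similarity` and the synthetic data are made up for illustration, not taken from any library or from the actual dataset.

```python
import numpy as np

def first_axis(X):
    """Return the first principal axis of X after standardising each feature."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return vt[0]

def axis_similarity(X_a, X_b):
    """Absolute cosine between the first principal axes of two images."""
    # The sign of a principal axis is arbitrary, hence the absolute value.
    return float(abs(first_axis(X_a) @ first_axis(X_b)))

# Synthetic stand-in for two images with 30 and 150 objects and 3 features each.
rng = np.random.default_rng(1)
loadings = np.array([[1.0, 0.8, -0.5]])
image_a = rng.normal(size=(30, 1)) @ loadings + 0.1 * rng.normal(size=(30, 3))
image_b = rng.normal(size=(150, 1)) @ loadings + 0.1 * rng.normal(size=(150, 3))

similarity = axis_similarity(image_a, image_b)
print(f"first-axis similarity: {similarity:.3f}")
```

If the similarity is close to 1 for all pairs of images, that would support pooling the objects without weighting; this only looks at the first axis, so it is a rough check rather than a full comparison.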

Other than these 2 cents, I would suggest using Python (maybe this repository?) to do the weighted PCA via a “Python Script” node from within KNIME, or R if you are more familiar with that, since so far there are no native KNIME nodes that perform a weighted analysis for either method.
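As a rough idea of what such a script could look like, here is a minimal NumPy sketch of a weighted PCA. It is not the code from the repository mentioned above. It assumes one particular weighting scheme, namely that each object gets the weight 1 / (number of objects in its image), so every image contributes equally in total. The names `image_balanced_weights` and `weighted_pca`, and the random example data, are hypothetical.

```python
import numpy as np

def image_balanced_weights(image_ids):
    """Weight each object by 1 / (objects in its image), normalised to sum to 1."""
    image_ids = np.asarray(image_ids)
    _, inverse, counts = np.unique(image_ids, return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]
    return w / w.sum()

def weighted_pca(X, weights, n_components=2, standardize=True):
    """PCA via eigendecomposition of the weighted covariance matrix."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    Xc = X - w @ X                        # subtract the weighted mean
    if standardize:
        Xc = Xc / np.sqrt(w @ Xc**2)      # divide by the weighted standard deviation
    cov = (Xc * w[:, None]).T @ Xc        # weighted covariance (features x features)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order].T      # one principal axis per row
    scores = Xc @ components.T            # object coordinates on those axes
    return scores, components, eigvals[order]

# Example with the object counts from the question: 30 and 150 objects.
rng = np.random.default_rng(0)
image_ids = np.repeat(["img1", "img2"], [30, 150])
X = rng.normal(size=(180, 3))             # placeholder for area, perimeter, circularity

w = image_balanced_weights(image_ids)
scores, components, variances = weighted_pca(X, w)
```

Standardising is switched on by default here because variables such as area and circularity live on very different scales. As far as I know, the usual t-SNE implementations do not accept per-sample weights, so for t-SNE a different approach (for example drawing the same number of objects from each image) would be needed.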

Best regards,


Hi LukasS,

Sorry for the late response, I was not able to reply earlier. Thank you for your answer.
Yes, the statistics differ between images, so we do indeed need weighting. The main question is to compare treatment groups (each with several individuals/images) and unfortunately (as usual in biology) the individual responses vary a lot…

Yes, this is a tricky case; we will have to build an R script for the weighted PCA.

Have a nice day,

