If I remove all zero rows from the training set (I'm dealing with a multiclass problem), will performance improve?

Hello @Rubik,

what do you mean by all zero rows?
Do these rows only contain zero values and are there many of them?
If those rows make up a large part of your dataset, then removing them could allow your model to “focus” more on your other rows.
Note, however, that this will not necessarily improve accuracy.
A decision tree, for example, can easily separate those zero rows from the rest (although the tree might become very deep if no pruning is applied). Removing them will therefore only give you a different, likely smaller, tree, while the accuracy on the non-zero rows should stay about the same.
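
If you want to check this on your own data, here is a rough sketch of the comparison. It assumes Python with NumPy and scikit-learn, which may not be the tool you are using, and the data is synthetic and purely hypothetical (three well-separated classes plus 200 all-zero rows with arbitrary labels). It counts the all-zero rows, trains one decision tree with them and one without, and scores both on the non-zero test rows only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical data: 300 informative rows in 3 classes plus 200 all-zero rows.
n_classes, n_per_class, n_features = 3, 100, 5
X_nonzero = np.vstack([
    rng.normal(loc=3.0 * (c + 1), scale=1.0, size=(n_per_class, n_features))
    for c in range(n_classes)
])
y_nonzero = np.repeat(np.arange(n_classes), n_per_class)
X_zero = np.zeros((200, n_features))
y_zero = rng.integers(0, n_classes, size=200)  # arbitrary labels (assumption)

X = np.vstack([X_nonzero, X_zero])
y = np.concatenate([y_nonzero, y_zero])

# A row counts as "all zero" only if every feature in it is exactly 0.
zero_mask = (X == 0).all(axis=1)
print(f"all-zero rows: {zero_mask.sum()} of {len(X)} ({zero_mask.mean():.0%})")

# Shuffle, then hold out 150 rows; evaluate on the non-zero ones only.
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:350], idx[350:]
X_train, y_train = X[train_idx], y[train_idx]
test_nonzero = ~zero_mask[test_idx]
X_test, y_test = X[test_idx][test_nonzero], y[test_idx][test_nonzero]

# Train once on everything and once with the all-zero rows dropped.
keep = ~zero_mask[train_idx]
tree_all = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
tree_filtered = DecisionTreeClassifier(random_state=0).fit(
    X_train[keep], y_train[keep]
)

acc_all = tree_all.score(X_test, y_test)
acc_filtered = tree_filtered.score(X_test, y_test)
print(f"with zero rows:    leaves={tree_all.get_n_leaves()}, "
      f"accuracy on non-zero rows={acc_all:.3f}")
print(f"without zero rows: leaves={tree_filtered.get_n_leaves()}, "
      f"accuracy on non-zero rows={acc_filtered:.3f}")
```

The two accuracies are not guaranteed to be identical, since the zero rows can shift where the tree places its splits, so treat the result as a check on your data rather than a rule. For other model types the effect of the zero rows may be larger.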

Kind regards,

Adrian
