XGBoost Predictor

For unbalanced data, you might want to look at this article and at the hints from KNIME team members in previous threads, especially those concerning SMOTE.

I then added another balancing attempt using R and the ROSE algorithm, although I am a little wary about using it. You might want to consider not fully balancing your dataset, but instead bringing the minority group up to 10% or so, and then looking at AUC and other metrics rather than just the Scorer, which would count everything above 0.5 as a success.
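To make the "bring the minority to about 10% and look at AUC" idea concrete, here is a minimal plain-Python sketch. It is only an illustration, not what the KNIME nodes or ROSE do internally. The function names `downsample_majority` and `roc_auc` are made up for this example, the label coding (1 = minority, 0 = majority) is an assumption, and random undersampling of the majority class is just one simple way to reach the target share.

```python
import random

def downsample_majority(labels, minority_share=0.10, seed=0):
    """Return row indices to keep so that label 1 (assumed minority) makes up
    roughly `minority_share` of the kept rows. All minority rows are kept."""
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Number of majority rows that leaves the minority at the target share.
    n_major = int(round(len(minority) * (1 - minority_share) / minority_share))
    n_major = min(n_major, len(majority))
    rng = random.Random(seed)
    return sorted(minority + rng.sample(majority, n_major))

def roc_auc(labels, scores):
    """AUC as the probability that a random positive gets a higher score
    than a random negative (ties count half). Pairwise, so only for small data."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1% minority, and a model whose scores never exceed 0.5.
labels = [1] * 20 + [0] * 1980
scores = [0.3] * 20 + [0.1] * 1980

keep = downsample_majority(labels, minority_share=0.10)
share = sum(labels[i] for i in keep) / len(keep)

# Accuracy at the 0.5 cut-off looks fine although no positive is ever predicted,
# while AUC shows that the scores do separate the two classes.
accuracy_at_half = sum((s > 0.5) == (y == 1) for y, s in zip(labels, scores)) / len(labels)
auc = roc_auc(labels, scores)
```

In this toy case the 0.5 threshold never flags a single minority row, yet the ranking of the scores is perfect. That is the reason to check AUC (or to move the threshold) instead of relying only on the default scorer output.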

Another thing you could try is to use some of the H2O nodes, which offer balancing settings:

Also, you might see what H2O AutoML would do with your data and whether it could come up with some solutions. It also allows for balancing, although I have never tried it:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/balance_classes.html
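For anyone who wants to try this outside of the KNIME nodes, a rough sketch with the H2O Python package could look like the following. It assumes the `h2o` package is installed and a local H2O instance can be started. The helper names `automl_settings` and `run_automl`, the runtime limit, and the seed are placeholders chosen for this example, and I have not tuned or benchmarked any of it.

```python
def automl_settings(max_runtime_secs=300, seed=1):
    """Settings passed to H2OAutoML; the values are illustrative only."""
    return {
        "max_runtime_secs": max_runtime_secs,  # time budget for the whole run
        "balance_classes": True,               # the option described on the linked page
        "seed": seed,
    }

def run_automl(csv_path, target):
    """Train H2O AutoML on a CSV file and return the leaderboard.
    `csv_path` and `target` are supplied by the caller (hypothetical inputs)."""
    # Imported here so the settings helper can be used without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    frame = h2o.import_file(csv_path)
    # The target must be categorical for a classification task.
    frame[target] = frame[target].asfactor()

    aml = H2OAutoML(**automl_settings())
    aml.train(y=target, training_frame=frame)
    return aml.leaderboard
```

You would call something like `run_automl("my_data.csv", "target_column")` with your own file and column name, then compare the leaderboard metrics with and without `balance_classes` to see whether the balancing actually helps on your data.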


Imbalanced Data : How to handle Imbalanced Classification Problems

SMOTE Hints from KNIME Team members

Try ROSE algorithm
