In the exam, less data will be provided: about 10 attributes [columns] and about 1000 data objects [rows]. Some attributes are encoded slightly differently.
However, the objective is the same: try to identify households that live in poverty and need social welfare assistance.
I’m stuck and don’t know exactly how to start.
Basically, my confusion is about which nodes exactly to use and which model to pick (KNN, Decision Tree, etc.).
Also, when and how to use the training and test data?
If someone could give some hints on which nodes and models to use for data preparation, modeling, evaluation and deployment, that would be great.
Any help is highly appreciated!
Then there is this (free) book covering a lot of topics about machine learning.
Then, since the task is from a Kaggle competition, it might make sense to read the discussions about the case and see what ideas and possible problems other people have noticed.
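On the question of when and how to use training and test data, here is a minimal sketch of the usual pattern: split the rows once, fit the model on the training part only, and score it on the held-out test part. It is plain Python rather than a workflow of nodes, and everything in it is an assumption for illustration: the data is synthetic (not the real household data), the two features `rooms_per_person` and `years_schooling` are made up, and the 70/30 split and `k=5` are arbitrary choices.

```python
import random
from collections import Counter


def make_synthetic_rows(n=1000, seed=0):
    """Build fake (features, label) rows; label 1 = 'poor', 0 = 'not poor'.

    Hypothetical stand-in for the real data: both features are invented.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        poor = rng.random() < 0.3
        rooms_per_person = rng.gauss(0.5 if poor else 1.5, 0.3)
        years_schooling = rng.gauss(5 if poor else 11, 2)
        rows.append(((rooms_per_person, years_schooling), int(poor)))
    return rows


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=5):
    """Return the majority label among the k nearest training rows.

    Uses squared Euclidean distance on the raw features; with real data
    the features would normally be normalized first.
    """
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rows = make_synthetic_rows()
train, test = train_test_split(rows)      # the model never sees `test` while fitting
test_accuracy = accuracy(train, test, k=5)
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2f}")
```

The same shape carries over to other models: only the prediction step changes, while the split and the scoring on unseen rows stay the same. If the nodes in question are KNIME nodes, this roughly corresponds to a Partitioning node, a learner/predictor pair, and a Scorer node.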
If you want to explore further topics about machine learning (also with examples) you can take a look here: