Exam based on a Kaggle case study

Hi Everyone,

I’ll have an exam based on this case study: Costa Rican Household Poverty Level Prediction | Kaggle

In the exam, less data will be provided: about 10 attributes [columns] and about 1000 data objects [rows]. Some attributes are encoded slightly differently.

However, the objective is the same: try to identify households that live in poverty and need social welfare assistance.

I’m stuck and don’t know exactly how to start.
Basically, my confusion is about which nodes exactly to use and which model (KNN, Decision Tree, etc.).
Also, when and how should I use the training and test data?
If someone could give some hints on which nodes and models to use for data preparation, evaluation, modeling and deployment, that would be great.
Any help is highly appreciated!

@Everfresh000 welcome to the KNIME forum. As a basic remark: since this is an exam the topics should be covered in the course. But anyway…

If you want to start learning KNIME and machine learning, I would recommend taking these two courses on the KNIME Learning platform (https://knime.learnupon.com/):

  • [L1-DS] KNIME Analytics Platform for Data Scientists: Basics
  • [L2-DS] KNIME Analytics Platform for Data Scientists: Advanced

(Examples and workflows for courses are provided here: knime/Education – Courses – KNIME Hub)

Then there is this (free) book covering a lot of topics about machine learning

Then, since the task is from a Kaggle competition, it might make sense to read the discussions about the case and see what ideas and possible problems with the task other people have noticed.
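
On the question of when to use training and test data: the usual pattern is to fit the model on the training rows only and keep the test rows aside for scoring. Below is a minimal sketch in plain Python. It is not a KNIME workflow and not an official solution to the exam or the Kaggle task. The toy data, the column meanings (rooms, household members, years of schooling), the labeling rule and all function names are made up for illustration. In KNIME the same steps would roughly correspond to a Partitioning node, a learner/predictor pair (or the K Nearest Neighbor node), and a Scorer node.

```python
import random
from collections import Counter


def make_toy_households(n, seed=0):
    """Create synthetic (features, label) rows. Illustrative only, not the Kaggle data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rooms = rng.randint(1, 8)
        members = rng.randint(1, 10)
        years_school = rng.randint(0, 16)
        # Made-up rule so that the label can be learned from the features.
        poor = 1 if (members / rooms > 1.5 and years_school < 9) else 0
        rows.append(([rooms, members, years_school], poor))
    return rows


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k=5):
    """Predict the majority label among the k nearest training rows."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Share of test rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


rows = make_toy_households(1000)
train, test = train_test_split(rows)   # the model only ever sees `train`
acc = accuracy(train, test)            # `test` is used for evaluation only
print(f"train={len(train)} test={len(test)} accuracy={acc:.2f}")
```

With real data you would replace `make_toy_households` with the provided table. For a distance-based model such as kNN it is usually worth normalizing the columns first, so that no single attribute dominates the distance.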

If you want to explore further topics about machine learning (also with examples) you can take a look here:

