How to work on building a classifier?

I am new to KNIME. For a school assignment I need to build several different classifiers to predict something. I watched the tutorial video but I am still confused. In this task, I need to predict whether X has a salary > 50K, using a ROC curve. I tried different nodes such as Decision Tree and Random Forest, but I think I missed something, because the workflow does not produce any predictions. It would be great if somebody could have a look at my workflow or give me some guidance on how to approach the task.
ass3_predication.knwf (47.1 KB)

Hi @gameboy and welcome to the forum.

There are two common difficulties in producing a valid ROC curve - you must be sure to generate probabilities in the preceding predictor node (otherwise the ROC Curve doesn’t have anything useful to plot), and you must configure the ROC Curve itself properly.
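Since you mention you are also free to use Python, here is a rough sketch of why the probabilities matter. It is plain Python, not anything KNIME runs internally, and the data and the helper names `roc_points` and `auc` are made up for illustration. A ROC curve gets one point per distinct score threshold, so hard 0/1 predictions collapse the curve to a single interior point.

```python
def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate),
    one per distinct score threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank rows from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every row tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 1 means salary > 50K, 0 means <= 50K.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
# What a predictor gives you when it outputs only the winning class.
hard_predictions = [1 if p >= 0.5 else 0 for p in probabilities]

curve_from_probs = roc_points(labels, probabilities)
curve_from_labels = roc_points(labels, hard_predictions)

print(len(curve_from_probs))   # 9 points: a real curve
print(len(curve_from_labels))  # 3 points: just the two corners and one dot
```

With only class labels the "curve" is two straight segments through one point, which is why the predictor node has to output the class probabilities for the plot to be informative.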

Check out a sample workflow on the Hub here:

If you’re still having trouble, could you upload your training data as well?

And if you don’t mind my asking, what class is this for? We’re always interested to know which university programs are actively teaching KNIME :slight_smile:

Hi, thanks for replying. I think the problem is in my data preprocessing, and I am not sure how to generate the predictions, as my model does not predict anything. I read in my training data in the first step.
It is a data analytics fundamentals class, and we can choose any method, such as R or Python.

It’s hard for me to tell without your data, so I loaded my own data into your workflow. In the Column Filter and Random Forest Learner nodes I noticed a lot of columns in the red exclude box, which removes them from your analysis. Maybe this is your problem?

As I said before, if you upload your training data maybe we can get a better picture of what is going on. Otherwise it’s just guessing :slight_smile:

It seems you are using the census income dataset. If you want to learn more about KNIME and especially machine learning I compiled a few links in this entry - the dataset is also used in one example.

Also, in the next entry @ipazin mentions the free Udemy course on KNIME, which also has a machine learning part:

Then if you are free to use any method like R or Python you could still use KNIME as a wrapper and to evaluate the results in a standardized way.

