ML techniques: which one can I use to predict sales in a particular country?

giad · November 17, 2020, 3:31pm

Hi!
I have a theoretical question:
I need to predict the purchases for each country. I have a dataset of past data with these columns:

customer id
purchase data
country of purchasing
type of product purchased

Which is the best model to implement in KNIME?
I have tried random forest, but when I set the target column of the Random Forest Learner to country, I get an error:
Execute failed: The target column does not have possible values assigned. Most likely it has too many different distinct values (learning an ID column?) Fix it by preprocessing the table using a “Domain Calculator”.

What does this mean?
Is random forest not the right model for what I want to predict?
Many thanks

Giad

mlauber71 · November 17, 2020, 8:15pm

@giad it does not sound like you have very much data to base a prediction on. In general, if you want to predict the amount of purchases (money, volume, etc.), this would be a regression problem, but I am not sure you have set up your problem sufficiently yet.
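To make the regression framing concrete, here is a minimal Python sketch (outside KNIME) that turns transaction rows into a numeric target, the number of purchases per country per month, and fits a straight-line trend for each country. The rows, the month column, and the function names are all made up for illustration; in KNIME the aggregation step would typically be done with a GroupBy node before a regression learner.

```python
from collections import defaultdict

# Hypothetical rows: (customer_id, purchase_month, country, product_type)
rows = [
    (1, 1, "IT", "A"), (2, 1, "IT", "B"), (3, 1, "DE", "A"),
    (1, 2, "IT", "A"), (2, 2, "IT", "A"), (4, 2, "IT", "B"),
    (3, 2, "DE", "B"), (5, 2, "DE", "A"),
    (1, 3, "IT", "B"), (2, 3, "IT", "A"), (4, 3, "IT", "A"), (6, 3, "IT", "B"),
    (3, 3, "DE", "A"), (5, 3, "DE", "B"), (7, 3, "DE", "A"),
]

def monthly_counts(rows):
    """Aggregate raw rows into {country: {month: number_of_purchases}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for _customer, month, country, _product in rows:
        counts[country][month] += 1
    return counts

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def forecast_next_month(rows):
    """Predict next month's purchase count per country from a linear trend."""
    forecasts = {}
    for country, by_month in monthly_counts(rows).items():
        months = sorted(by_month)
        slope, intercept = fit_line(months, [by_month[m] for m in months])
        forecasts[country] = slope * (months[-1] + 1) + intercept
    return forecasts

print(forecast_next_month(rows))
```

The point of the sketch is the reshaping: the target is a number (a count per country and period), not the country label itself. With only a handful of months per country, any such forecast is very rough.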

If you want to read about (regression) models, I have compiled a collection, and there are also several resources for learning about machine learning.

giad · November 20, 2020, 8:47am

Many thanks
And what about clustering algorithms? Which nodes could I use in KNIME to implement them?

Thanks again
Giad

ipazin · November 20, 2020, 11:49am

Hello @giad,

from what I see, the error message suggests using the Domain Calculator node, as you probably don’t have the domain calculated for the Country column. Still, this doesn’t seem to be the way to go if you are predicting sales per country. As @mlauber71 said, you have a regression problem and thus can’t really use a classification algorithm (the Random Forest Learner you tried) for it.
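As a rough illustration of what the error message is about: a nominal column's "domain" is its list of possible values, and a classification learner needs that list for the target column. The Python sketch below mimics the idea outside KNIME. The function name and the cap of 60 distinct values are assumptions made for illustration, not KNIME's actual implementation.

```python
def possible_values(column, max_values=60):
    """Return the sorted distinct values of a nominal column, or None
    when there are more than max_values of them (the situation the
    error message hints at for ID-like columns)."""
    distinct = set()
    for value in column:
        distinct.add(value)
        if len(distinct) > max_values:
            return None
    return sorted(distinct)

# Hypothetical columns
countries = ["IT", "DE", "IT", "FR", "DE"]
customer_ids = [f"C{i}" for i in range(1000)]

print(possible_values(countries))     # a short list, usable as a class target
print(possible_values(customer_ids))  # None, too many distinct values
```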

Regarding clustering, there are DBSCAN, Hierarchical Clustering, k-Means and others. Typing “clustering” into the Node Repository search will give you more nodes, while on KNIME Hub you can see workflow examples.
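To give a feel for what a k-Means node does, here is a minimal plain-Python sketch of the algorithm. The data points are made up (think of them as two hypothetical per-customer features, such as number of purchases and number of distinct product types), and taking the first k points as initial centers is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on tuples of numbers.
    Initial centers are simply the first k points."""
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers

# Hypothetical customers: (number of purchases, distinct product types)
points = [(1, 1), (9, 8), (2, 1), (1, 2), (8, 9), (9, 9)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Unlike the regression setup, clustering has no target column: it only groups similar rows, so it answers a different question than "how much will be sold per country".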

Br,
Ivan

mlauber71 · November 20, 2020, 12:14pm

As @ipazin said, KNIME offers several such algorithms, including some generic ones. You might want to take a look at this example to see what the concept of clustering is about:

There is also a big collection of algorithms from the WEKA implementation in KNIME.

But I would strongly advise you to think about what kind of problem you want to solve and to familiarize yourself with the concepts behind the various machine learning techniques.
