Difference between supervised learning and a simple algorithm

AdrienR · December 16, 2019, 3:11pm

Hello everybody,

I would like to build a supervised learning workflow in KNIME, but I still don’t see the difference between a supervised learning model and a simple algorithm made of multiple “if” statements. Let me explain with a binary classifier example:

Suppose you have weather data such as temperature, humidity and wind. The two classes are YES, I play outside, or NO, I don’t play outside.
You give the rule on a sample of your data for training, and then you apply it to the whole data set.

How is it different from an algorithm that is:
If(Cloudy=no)
if(windy=no)
if(temperature >15°)
then YES
else NO

Thank you for your responses,

Best regards,

Adrien,

Snowy · December 16, 2019, 3:35pm

I think it’s easy to look at the output of a machine learning model and say, “oh, that’s just a series of if statements”. The key to understanding is knowing how the machine learning model came up with those results. That typically involves some form of mathematical minimization or maximization to get the optimum result. (E.g. how do you know that 15 is the correct number in if(temperature > 15) = YES?)

Additionally, hand-written if statements work well in your example because most of the variables are categorical. However, they don’t work so nicely when you’re trying to calculate an odds ratio, or when you have to pick the cut-off values for continuous variables yourself.

Hope that helps… Feel free to ask clarifying questions!

AdrienR · December 16, 2019, 3:44pm

Thank you Snowy for your answer.

I’m studying applied math, so I see exactly what you mean about machine learning being a kind of function optimization. But for a supervised learning model, the data scientist has to give the basic rules to the model. In the example, “temperature over 15” is one of the basic rules in the training data. So what I understand is that I give rules to the model and it reproduces those rules on the data set, and I don’t see the difference from an if statement.

Thank you,

Adrien,

Snowy · December 16, 2019, 3:54pm

For a supervised learning model, the data scientist does not give the rules. The data scientist only provides the dependent output, e.g. did the person go outside or not. (Typically denoted as a 1 or 0.) The supervised model then takes all other information (e.g. the exact temperature, the exact wind speed, whether or not it was raining, etc.) to create the “rules” or “breakpoints” you see in the output, through mathematical optimization.
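As a minimal sketch of that idea, assuming a made-up temperature data set and a single-threshold rule (both are illustrative assumptions, not what KNIME or any particular learner does internally), the code below receives only labeled rows and searches for the cut that misclassifies the fewest of them. The name `learn_threshold` and the data are hypothetical.

```python
# Hypothetical training data: (temperature in °C, played outside? 1 = yes, 0 = no).
# Only the labels are supplied; no threshold is given to the learner.
samples = [(5, 0), (9, 0), (12, 0), (14, 0), (16, 1), (18, 1), (21, 1), (25, 1)]


def learn_threshold(data):
    """Return (cut, errors) for the cut that misclassifies the fewest rows.

    The learned rule has the form: predict 1 if temperature > cut else 0.
    """
    temps = sorted({t for t, _ in data})
    # Candidate cuts: midpoints between consecutive distinct temperatures.
    candidates = [(a + b) / 2 for a, b in zip(temps, temps[1:])]
    best_cut, best_errors = None, len(data) + 1
    for cut in candidates:
        # Count rows where the rule's prediction disagrees with the label.
        errors = sum((t > cut) != bool(label) for t, label in data)
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors


threshold, errors = learn_threshold(samples)
print(f"learned rule: if temperature > {threshold} then YES else NO "
      f"({errors} training errors)")
```

The resulting rule looks like a hand-written if statement, but the number in it came from the labeled data rather than from the person writing the code. Real learners such as decision trees use other criteria (for example impurity measures) and handle many variables at once.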

To help clarify further, the “supervised” part of a supervised model simply means the user provides the dependent variable. (In this model, whether or not the user went outside.) There is also such a thing as an “unsupervised” model, where the user simply provides the data, and the computer figures everything else out. (A lot of clustering algorithms are unsupervised.)
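For contrast, here is a rough sketch of the unsupervised case, assuming the same made-up temperatures but with no YES/NO labels at all. It is a bare-bones one-dimensional 2-means clustering; the function name `two_means` and the choice of the minimum and maximum as starting centers are assumptions made for illustration.

```python
# Hypothetical temperatures in °C; note there are no labels this time.
temps = [5, 9, 12, 14, 16, 18, 21, 25]


def two_means(values, iterations=10):
    """Split 1-D values into two groups with a simple 2-means loop."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        # Assign every value to its nearest center (ties go to the low group).
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of the values assigned to it.
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low_group, high_group


low_group, high_group = two_means(temps)
print("cluster A:", low_group)
print("cluster B:", high_group)
```

The algorithm finds two groups of similar temperatures, but nothing tells it that one group means “play outside”. That meaning only exists when someone supplies the dependent variable, which is the supervised setting.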

AdrienR · December 16, 2019, 3:59pm

OK, I just understood the point! Thank you very much for your explanations!

Adrien,
