I’m implementing some artificial intelligence algorithms in KNIME (linear regression, MLP, random forest and so on).
An example of the workflow I use is this:
I have a problem with the Denormalizer node: after denormalization, the predicted values do not seem to be in the range of the original values. For example, if the value to predict is 1733, the predicted value after denormalization is 0.173… You can see the results in the picture above:
I expected the predicted values to be in the same range as the original values. It seems that either normalization or denormalization is not working correctly!
Can someone help me understand whether there is an error or a problem?
Thanks.
Maybe the following workflow helps to clarify how normalization/denormalization should be applied to get coherent predicted values with respect to expected values:
Thank you guys for your replies. @Daniel_Weikert I normalized only the target variable. I have only two variables: timestamp and the target variable y. @aworker it is no problem that the linked regression example is in French. The proposed approach seems similar to mine.
I am attaching a simple workflow similar to the first one. If you can run it, please let me know if you see any problems. Test_normalization.knwf (33.1 KB)
Thank you for your suggestion! @aworker your solution solves the problem. I would like some explanation, though: you create a dummy variable Prediction y, both y and Prediction y are normalized, and then this dummy variable is removed before the Learner. I don’t understand the reason for this. Can you explain?
Thanks.
As mentioned by @Daniel_Weikert, the predicted y variable does not need to be normalized. However, if you want to normalize it as in your case, both the y variable and its predicted values “prediction (y)” need to be denormalized later. For that, the -Denormalizer- node needs to know about a variable “prediction (y)”, and the only way to achieve this is to first create a fake “prediction (y)” variable within the same range of values as the y variable. This is why the -Math Formula- node is needed. However, this fake “prediction (y)” variable should not be propagated to the -Linear Regression Learner- node, and this is why the -Column Splitter- is needed.
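To make the arithmetic behind this concrete, here is a minimal sketch in plain Python. It is not KNIME code and does not reproduce how the nodes are implemented. It assumes min-max scaling to [0, 1], and the helper names (`fit_min_max`, `normalize`, `denormalize`) and all numbers are made up for illustration. The point is only that the scaling parameters are stored per column name, so the prediction column can be denormalized only if parameters exist under its name.

```python
def fit_min_max(columns):
    """Store (min, max) per column name, playing the role of a normalization model."""
    return {name: (min(vals), max(vals)) for name, vals in columns.items()}


def normalize(values, lo, hi):
    """Scale values from [lo, hi] to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Scale values from [0, 1] back to [lo, hi]."""
    return [v * (hi - lo) + lo for v in values]


# Hypothetical target values in their original range.
y = [1200.0, 1733.0, 2100.0, 1500.0]

# The dummy "Prediction (y)" column is a copy of y, so the stored
# parameters for it are identical to those of y.
params = fit_min_max({"y": y, "Prediction (y)": list(y)})

# Only y is passed on to the learner (the dummy column is split off).
y_norm = normalize(y, *params["y"])

# Stand-in for the predictor output in normalized space (made-up values).
pred_norm = [0.02, 0.58, 0.97, 0.35]

# Because parameters exist under the name "Prediction (y)", the
# predictions can be mapped back to the original range of y.
pred = denormalize(pred_norm, *params["Prediction (y)"])
print(pred)
```

Without the dummy column there would be no entry for “Prediction (y)” in `params`, so the predictions would stay in the [0, 1] range, which matches the symptom described in the question.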
As said before, normalization is not usually needed for predicted variables but if for any reason, you really want to do it as in your case, this is a possible solution.