Multi polynomial regression model variables

Hi KNIME Members,

I am struggling with my model. I have the following datasets: Data.csv, temperature.csv and holiday.csv. I have uploaded them in this zip file: Three datasets.zip (64.1 KB)

Dataset 1 (“data”): Contains the electricity load for every 30 minutes in the years 2014 and 2015. Each row corresponds to the data of one day. Columns 1, 2 and 3 indicate the year, month and day of the measurements, respectively. The remaining columns show the electricity loads in 30-minute blocks on that day. More specifically, T1 denotes the load of period 00:00-00:30, T2 00:30-01:00, and so on, and T48 is the load of the last 30 minutes of the day, i.e. period 23:30-00:00.

Dataset 2 (“Temperature”): Contains the average daily temperatures from 2012 to In the data file, the first three columns contain the information of year, month, and day, respectively. The last column indicates the average daily temperature in °C.

Dataset 3 (“Holidays”): Contains dates (in year-month-day format) of public holidays of years 2014, 2015 and 2016.

We used the following KNIME model: KNIME_Model.knwf (99.9 KB)

The dataset coming out of the Column Aggregator (Data 2014-2015 w/ max elec. value) also has a column with the value 0 or 1, depending on whether the date is a holiday or not. We use a polynomial regression learner and predictor. In this learner we want to include the variables holiday and temperature to predict the maximum electricity load per day. The problem is that we want a second-degree polynomial regression learner that includes these two variables, but we get the following error: ERROR Polynomial Regression Learner 0:56 Execute failed: Index: 2, Size: 2.

How can we overcome this error and still include both variables?

Is it best to include only the temperature variable and then multiply the predicted value for a holiday by a certain (known) factor? Or is it better to split the data into a holiday and a non-holiday dataset and use two separate learners to arrive at the predicted values? Or do you have another suggestion on how to solve this problem?
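To make the single-model option concrete, here is a minimal sketch outside KNIME, in Python with NumPy. It fits the maximum daily load as a second-degree polynomial in temperature plus a plain linear holiday term. The data is synthetic and made up purely for illustration (it is not taken from the attached files), and the names `design_matrix`, `max_load` and the coefficient values are assumptions, not anything the KNIME node does internally.

```python
import numpy as np

# Synthetic stand-in data (NOT the attached datasets): one row per day.
rng = np.random.default_rng(0)
n = 200
temperature = rng.uniform(-5.0, 30.0, size=n)      # average daily temperature in °C
holiday = (rng.random(n) < 0.1).astype(float)      # 1.0 on a holiday, else 0.0

# Made-up "true" relationship: quadratic in temperature, fixed holiday offset.
max_load = (50.0 - 1.2 * temperature + 0.04 * temperature**2
            - 5.0 * holiday + rng.normal(0.0, 0.5, size=n))


def design_matrix(temperature, holiday):
    """Columns: intercept, T, T^2 and the holiday indicator (linear only)."""
    return np.column_stack([
        np.ones_like(temperature),
        temperature,
        temperature**2,
        holiday,
    ])


X = design_matrix(temperature, holiday)

# Ordinary least squares fit of all four coefficients in one model.
coef, *_ = np.linalg.lstsq(X, max_load, rcond=None)
predicted = X @ coef

print("intercept, T, T^2, holiday coefficients:", coef)
```

In this form the holiday effect is an additive shift of the curve. Splitting into two separate learners would instead allow the whole temperature curve to differ between holidays and non-holidays, at the cost of fitting the holiday model on far fewer rows.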

Thank you in advance!

Hello @StudentEt,

this is a weird one; it seems the Polynomial Regression Learner can’t deal with 0-1 variables.
We’ll look into this problem, but in the meantime you could try converting the holiday feature to a string, so that it is treated as a nominal feature rather than a numerical one.

Please let me know if this helps.

Cheers,
Adrian


Is it true that there is no way to add two variables to the Polynomial Regression Learner when the degree is 2? The model does run when we change the degree to 1.

Sorry, I have to apologize: the workaround I described does not work because the Polynomial Regression Learner currently doesn’t support nominal columns.
The degree should not make a difference in your case.
There are cases where too high a degree can be problematic, e.g. if you have only very few rows.

An alternative workaround is to not use 1 as the indicator for the presence of a holiday but instead a number close to 1, e.g. 0.99. This can be achieved, for example, with the Math Formula node.
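As a side note on what a squared holiday term contributes, here is a small Python/NumPy sketch. It does not reproduce the KNIME node or explain its error. It only builds a generic degree-2 design matrix (intercept, T, T², h, h²) from a few made-up rows and checks its rank for both codings of the indicator. The helper name `degree2_design` and the sample values are assumptions for illustration.

```python
import numpy as np


def degree2_design(temperature, holiday):
    """Generic degree-2 design matrix without cross terms (illustrative only)."""
    return np.column_stack([
        np.ones_like(temperature),
        temperature,
        temperature**2,
        holiday,
        holiday**2,
    ])


# Made-up sample rows, not taken from the attached datasets.
temperature = np.array([3.0, 8.5, 12.0, 17.5, 21.0, 25.5])
holiday_01 = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 0.0])   # original 0/1 coding
holiday_099 = holiday_01 * 0.99                          # workaround coding: 0/0.99

rank_01 = np.linalg.matrix_rank(degree2_design(temperature, holiday_01))
rank_099 = np.linalg.matrix_rank(degree2_design(temperature, holiday_099))

print("rank with 0/1 coding:   ", rank_01)
print("rank with 0/0.99 coding:", rank_099)
```

For an indicator that only takes the values 0 and a, the squared column is just a times the original column, so in both codings the matrix has five columns but rank 4. In other words, recoding to 0.99 may get the node to execute, but the holiday variable still effectively enters the fit as a single linear term.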

Sorry for the inconvenience, we’ll look into this weird bug.

Cheers,
Adrian


Hello @StudentEt,

this was fixed with KNIME version 4.1.3.

Br,
Ivan

