How to predict data for a future year X


Can someone help me find some KNIME nodes or a workflow to predict GDP data for a future year?
I have Excel data with the GDP of 2 countries for each year from 1960 to 2017. So far I can only use a line chart to visualize the result, along with other nodes like GroupBy and some operations.

I am a new KNIME user, so I would appreciate any ideas about a workflow for predicting data in a future year.

Thank you

I used an example from the KNIME Example server *1) to set up a workflow that demonstrates how this could be done. I am not yet satisfied with the graphics…

The principle is that you take a window of historical GDP data (here: Spain) covering e.g. 5-10 years and hold back the last 2 (or 1 or 3) years as unknown values to be predicted.
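To make the windowing idea concrete outside of KNIME, here is a minimal Python sketch. The function name `make_lag_table`, the parameters `window` and `horizon`, and the numbers are all made up for illustration; they are not taken from the workflow. It is roughly the kind of table you get from lagging a column.

```python
def make_lag_table(values, window=5, horizon=1):
    """Turn a yearly series into rows of `window` past values plus one target.

    `horizon` is how many years ahead the target lies (1 = the next year).
    """
    rows = []
    for end in range(window, len(values) - horizon + 1):
        features = values[end - window:end]   # the past `window` years
        target = values[end + horizon - 1]    # the year to be predicted
        rows.append((features, target))
    return rows


# Made-up numbers, only to show the shape of the table (not real GDP data).
series = [10.0, 11.0, 12.5, 13.0, 14.2, 15.0, 15.9, 17.1]
table = make_lag_table(series, window=5, horizon=1)

# Hold back the last 2 rows as the "unknown" years to be predicted.
train_rows, test_rows = table[:-2], table[-2:]

for features, target in table:
    print(features, "->", target)
```

A model is then trained on `train_rows` and judged on `test_rows`, which it has never seen.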

I am not sure you could use that to predict as far as the year 2048; that seems like a very far stretch. The models rely on historical GDP data from the preceding years.

XGBoost gives the best prediction for 2016 and 2017, but the results are still not very good. One idea would be to add the relative growth from year to year, or to adjust for inflation, since the model currently uses absolute numbers.
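As a rough sketch of the relative-growth idea (plain Python, with hypothetical function names and illustrative values, not part of the workflow): the series of absolute numbers is turned into year-over-year growth rates, and predicted rates can be turned back into levels afterwards.

```python
def relative_growth(values):
    """Year-over-year growth: (this year - last year) / last year."""
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]


def levels_from_growth(start, growth):
    """Rebuild absolute values from a starting level and growth rates."""
    levels = [start]
    for g in growth:
        levels.append(levels[-1] * (1.0 + g))
    return levels


gdp = [100.0, 110.0, 99.0]            # illustrative values, not real GDP
growth = relative_growth(gdp)          # one value fewer than the input
rebuilt = levels_from_growth(gdp[0], growth)

print(growth)
print(rebuilt)
```

Growth rates stay in a similar range over the decades, while absolute GDP keeps rising, which can make the rates an easier input for a model.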

cluster ^= GDP, i.e. "cluster" in the workflow corresponds to GDP

*1) knime://EXAMPLES/04_Analytics/07_Time_Series/02_Example_for_Predicting_Time_Series

kn_example_gdp_prediction.knar (3.2 MB)


Thank you very much, I will try to understand that workflow.

You can keep this file if you want:
GDP Beta.xlsx (10.9 KB)


The workflow is one possible adaptation to your question. Please be aware that it is very unlikely to yield any great insights. My assumption would be that GDP development is much more complicated than a simple line of numbers. Then again, since nations do not change that much, a model using the last 5 years might not be that far off, provided nothing dramatic happens, like the world economic crisis. One could think of models that take such events into account.

So this is more of an exercise in how a simple time series model might work. If the task is just to use the numbers, you might normalise the data or combine several series. Since the economies of Spain and Morocco would be close to each other, you might benefit from using both of them.
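A minimal sketch of what normalising and combining could look like, in plain Python. Min-max scaling is just one possible choice of normalisation, the helper names are hypothetical, and the numbers are invented rather than taken from the attached files.

```python
def min_max(values):
    """Scale a series to the range 0..1 (assumes the values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def lag_rows(values, window):
    """Rows of `window` past values followed by the next value as target."""
    return [(values[i - window:i], values[i]) for i in range(window, len(values))]


# Invented numbers on very different scales, standing in for two countries.
series_by_country = {
    "Spain": [200.0, 220.0, 260.0, 300.0, 400.0],
    "Morocco": [20.0, 22.0, 25.0, 30.0, 40.0],
}

# Normalise each country on its own, then pool the rows into one training table.
pooled = []
for country, values in series_by_country.items():
    pooled.extend(lag_rows(min_max(values), window=3))

print(len(pooled), "training rows from", len(series_by_country), "countries")
```

After scaling, both series live on the same 0..1 range, so one model can learn from the rows of both countries.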

It would also be interesting to see whether really long time series actually benefit a model, since Spain has undergone some major changes since 1960 (democracy, joining the EU, the world economic crisis), or whether it is better to just use the last 5 years and continue the line.

If you want a really long prediction into the future, in theory you could use a model to predict the next year and then use that prediction as an input for the following one. It might be interesting to see where that leaves you in the year 2048. But be aware: the simple model starts deviating from the truth within one year, so the errors might multiply.
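The feed-the-prediction-back-in idea could be sketched like this in Python. Note that `predict_next` is only a stand-in (it continues the average growth of the window) and is an assumption for illustration; it is not the XGBoost model from the workflow, and the history values are invented.

```python
def predict_next(window):
    """Stand-in model: continue the window with its average yearly growth factor."""
    factors = [cur / prev for prev, cur in zip(window, window[1:])]
    avg_factor = sum(factors) / len(factors)
    return window[-1] * avg_factor


def forecast_recursive(history, steps, window=5):
    """Predict `steps` years ahead, feeding each prediction back in as input."""
    values = list(history)
    for _ in range(steps):
        values.append(predict_next(values[-window:]))
    return values[len(history):]


# Invented history growing by 2% per year.
history = [100.0, 102.0, 104.04, 106.1208, 108.243216]
future = forecast_recursive(history, steps=3)
print(future)
```

From the second step on, the inputs already contain predicted values, which is why any error made early on is carried into every later year.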

Thank you for the data. I took the liberty of downloading the GDP numbers from the World Bank site; the workflow for preparing them is also in the KNAR file (m_001_gdp_prepare_data). You might check whether they match your data. There are many more interesting figures there, and it might be worth putting them into the model as well.

GDP Spain

GDP Morocco

Data for Morocco:
API_MAR_DS2_en_excel_v2_10225588.xls (1.5 MB)


Yes, absolutely right: to predict, we would need data related to the development of GDP, such as the state of the economy, crises, democracy (the critical factors that can influence GDP), …
You have convinced me on this point …

Frankly, it’s a small business intelligence project. There are many tools like Power BI, RapidMiner, …, but I chose KNIME, and the goal of the project was to visualize charts, perform operations on this Excel data, and make predictions.

Thank you once again, you really helped and guided me. I have only been using KNIME for 3 weeks. I have become familiar with the nodes for operations, visualization and processing, and I searched YouTube for prediction models, but I did not understand them because of the nature of my Excel data, which is organized around a yearly input.

I will try, as you said, to predict the GDP for 2019-2020, using each past year as input.
