Hi everyone,
I am going to analyze a large dataset of 2 million rows and 7 columns (variables) (this is the dataset). Since it is too large to open in MS Excel, I opened it with KNIME.
One of my first questions is how to sum the listenings (streams) for each region (Italy, France, UK, etc.). I thought of two ways to proceed:
1) Split the original .csv file by region, obtaining 53 new .csv files (one per region in the original dataset), and then do the analysis in Excel.
or, if I want to work on the original table in KNIME:
2) Sum the streams per region (country), so that the stream count resets every time the region changes.
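To make the goal of option 2 concrete, here is a minimal Python sketch of the per-region sum I am after. The column names `Region` and `Streams` and the sample values are assumptions for illustration only; the real file may use different headers.

```python
import csv
import io
from collections import defaultdict


def sum_streams_by_region(csv_file, region_col="Region", streams_col="Streams"):
    """Return {region: total streams}, reading the CSV one row at a time."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Each region keeps its own running total, so no manual reset is needed.
        totals[row[region_col]] += int(row[streams_col])
    return dict(totals)


# Tiny made-up sample standing in for the real file (hypothetical values).
sample = io.StringIO(
    "Region,Streams\n"
    "Italy,100\n"
    "France,50\n"
    "Italy,25\n"
    "UK,10\n"
)
totals = sum_streams_by_region(sample)
print(totals)
```

Because the rows are read one at a time, the same function should also cope with the full 2-million-row file if it is passed an opened file instead of the sample, without splitting it into 53 pieces first.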
I don't know how to proceed or which nodes (operators) in KNIME can do this, and I don't want to split the dataset manually! I would be grateful if someone could help me with this work, which is part of my final dissertation.
Thank you very much
Kind regards
Alberto