I have very large time series on a Hadoop cluster. They are measurements with a sample rate of, e.g., 100 Hz. I'd like to retrieve these time series through Hive at a lower sample rate, such as 1 Hz. So I'd like to get every 100th row, and this has to happen on the cluster because the whole time series is too large to download and downsample locally.
Do you have a suggestion for how I can do this in KNIME?
I see that there are suggestions for an SQL query on Stack Overflow, but I'm not good at SQL, and I also have the problem that I don't see the RowID column in my DB Query node. Should I use the ROW_NUMBER function?
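
To make the question concrete, this is roughly what I imagine the ROW_NUMBER approach would look like. It is only a sketch under assumptions: the table `measurements` and the columns `ts` and `value` are made-up names, and the snippet runs the query with Python's built-in sqlite3 on a small in-memory table instead of Hive, so whether the same SELECT works unchanged in the DB Query node is something I can't confirm.

```python
import sqlite3

# Hypothetical table and column names: measurements(ts, value).
# ROW_NUMBER() numbers the rows in timestamp order; keeping the rows where
# (rn - 1) % step == 0 keeps the 1st, (step+1)th, (2*step+1)th, ... row.
DOWNSAMPLE_SQL = """
SELECT ts, value
FROM (
    SELECT ts, value,
           ROW_NUMBER() OVER (ORDER BY ts) AS rn
    FROM measurements
) numbered
WHERE (rn - 1) % {step} = 0
ORDER BY ts
"""


def downsample(conn, step=100):
    """Return every `step`-th row of the measurements table, ordered by ts."""
    query = DOWNSAMPLE_SQL.format(step=int(step))
    return conn.execute(query).fetchall()


def make_demo_db(n_rows=1000, rate_hz=100):
    """Build a small in-memory stand-in for the real table (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (ts REAL, value REAL)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [(i / rate_hz, float(i)) for i in range(n_rows)],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for row in downsample(demo, step=100):
        print(row)
```

One thing I'm unsure about: as far as I understand, a ROW_NUMBER with a global ORDER BY may force Hive to sort the whole table in a single reducer, which could be slow for very large series.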