How to group time series?

ellboy · June 14, 2017, 2:58pm

I measure the current consumption of some node to identify the actions performed by this node. As the start point I use the measurements for this node without any actions and try to find the distance between the idle state and states with some actions. The data set I have so far:

?Time (s)	Voltage_nV	distance
2017-06-12T10:30:27.769	214685,93	25746,106
2017-06-12T10:30:30.701	5609723,62	5420783,796
2017-06-12T10:30:30.710	1325231,79	1136291,966
2017-06-12T10:30:30.721	518620,671	329680,847
2017-06-12T10:30:30.731	274424,737	85484,913
2017-06-12T10:30:30.741	265276,375	76336,551
2017-06-12T10:30:30.751	215287,108	26347,284
2017-06-12T10:30:30.762	228865,892	39926,068
2017-06-12T10:30:30.771	212372,701	23432,877
2017-06-12T10:30:30.802	201107,146	12167,322

2017-06-12T10:30:40.700	6843576,34	6654636,516
2017-06-12T10:30:40.710	1292846,59	1103906,766
2017-06-12T10:30:40.720	511380,396	322440,572
2017-06-12T10:30:40.730	344932,474	155992,65
2017-06-12T10:30:40.740	247776,864	58837,04
2017-06-12T10:30:40.753	256324,049	67384,225
2017-06-12T10:30:40.763	247071,134	58131,31
2017-06-12T10:30:40.774	219312,388	30372,564
2017-06-12T10:30:40.784	191854,231	2914,407
2017-06-12T10:30:40.804	208046,833	19107,009

As one can see some rows have very close timestamps . How can I group rows which are closer then 1 sec together? I will calculate the mean value of each group for classification and further processing.

RolandBurger · June 15, 2017, 5:04pm

Hi ellboy,

First, you need to convert your Timestamps to the KNIME Date and Time type using a String to Date/Time node.

Next, use a Date Field Extractor and a Time Field Extraxctor node to get one column for year, month, day, hour, minute, second.

Finally, you can use GroupBy with all these date/time columns as groups and aggregate the remaining columns for the mean.

Cheers,

Roland

ellboy · June 16, 2017, 11:05am

Hi RolandBurger,

actions are not synchronized to the time. That means they cannot be grouped just by the timestamp, because some ot them could start and end in adjacent seconds. What I need is to find which samples are close to each other with some threshold.

Regards,

ellboy

arbe · June 18, 2017, 10:55am

Hi ellboy,

In order to derive a new column suiting your grouping requirements, I recommend to sort by time, lag the time column by one, calculate the time difference between both, leave only the desired values marking a new block and filling the now missing values by taking over from previous row.

See attached workflow for this solution.

Best regards

Arne

forum.knwf

Iris · December 16, 2021, 8:05am

14 posts were split to a new topic: Changing Timeseries from 10 minute to hourly