I have a problem with dynamic ranges in the GroupBy Node. I’m analyzing time series data. With a Rule Engine Node I mark some points of interest. The marks are randomized due to human interaction. What I want to group is the event “to P” and aggregate the duration of this event by summing a column coming out of the Date&Time Difference Node and the Distance to Number Node.
To be able to group time series data I found the Extract Date/Time Fields Node in the KNIME forum. Since my time series data has a time step of 10 ms, I thought grouping by minutes and seconds would handle my GroupBy problem, but it doesn’t.
Because of the randomized user interaction, the marks I have to analyze can fall between two seconds, as shown in the next screenshot.
The seconds value changes from 54 to 55. This means that my result looks like this:
The correct result I need is the sum of the “Second” column: 0.28 + 0.09 = 0.37. Increasing the granularity doesn’t solve the problem; then I get the sum of every single row.
Does anyone have an idea how to configure the GroupBy Node to get the sum of the “Second” column each time the “DSM…” column has the mark “to P” (or “to D”, “to R”, etc.)?
Hi @Brotfahrer, I don’t think I fully understand how you want your groupings to work, but maybe that’s because I can’t see the wider data set.
In your first screenshot, you are taking results every 10ms, and so, as you have said, the “seconds” value of your data will in some cases differ.
But what I am unsure of is, ignoring KNIME for the moment, what the actual basis of your groupings is supposed to be.
From your second screenshot it looks like you want data grouped if the difference in seconds between data points is 1. But what I don’t understand is: if you are taking readings every 10ms, then you might have data showing readings of:
row DSM Hour Min Sec
1 to N 13 51 52
2 to P 13 51 50
3 to P 13 51 54
4 to P 13 51 55
5 to P 13 51 56
So what would your grouping be then, since row 4 is within a second of both row 3 and row 5, although rows 3 and 5 wouldn’t be considered grouped with each other because they are 2 seconds apart?
Maybe your data set doesn’t work like this, but I think I would need to understand better what actually defines a grouping before I can assist with how to do it with KNIME.
Sorry if I am simply mis-reading or misunderstanding what you have said.
As you can see, I have “ups” and “downs” in my data. The time of these events varies.
If I configure the GroupBy Node only by the “DSM” column, I get the whole sum of time for my “to P” event, but I need it for every “up”, like in the diagram: t_1 = x sec, t_2 = y sec and t_3 = z sec.
Thanks for the additional explanation @Brotfahrer .
So if you had your data sorted in ascending time order, is it reasonable to say that every continuous block of “To P” rows, with no other events in between would be considered a single “group” irrespective of the elapsed time/duration between the rows?
Yes, that’s correct. The data is sorted in ascending time order (in the first screenshot the column t[s] has the date and time as a string in the raw data). I need the delta time for each single “group”.
So I think then that a small additional piece of work is needed to mark the groups based on “event change”.
In the attachment there are two approaches. One uses a Java Snippet node with some minimal code to note the change of event and give each change a “sequence” number.
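For reference, the core logic of such a snippet is roughly the following (a minimal standalone Java sketch of the change-detection idea, with hypothetical event values; it is not the exact code from the attachment, and inside a real Java Snippet node the input/output column bindings depend on your configuration):

```java
import java.util.Arrays;
import java.util.List;

public class EventSequenceDemo {
    public static void main(String[] args) {
        // Hypothetical event column, sorted in ascending time order.
        List<String> dsm = Arrays.asList("to N", "to P", "to P", "to P", "to D", "to P", "to P");

        int sequence = 0;       // running group counter
        String previous = null; // event value of the previous row

        for (String current : dsm) {
            // Start a new group whenever the event value changes
            // (the first row always starts a group).
            if (previous == null || !current.equals(previous)) {
                sequence++;
            }
            previous = current;
            // In the real snippet this value would be written to an output
            // column, e.g. "Sequence", instead of being printed.
            System.out.println(current + " -> group " + sequence);
        }
    }
}
```

Grouping on that sequence column (optionally together with “DSM”) in the GroupBy node and summing your duration column then gives one sum per contiguous “to P” block.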
The other does the same but uses a combination of Lag Column, Rule Engine and Missing Value nodes to achieve the same result (namely, to mark each grouping with a unique sequence number). This sequence number is then what you can use in the GroupBy node.
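If you prefer the no-code route, the configuration could look something like this (an assumption about one possible setup, not necessarily the one in the attachment; it assumes the Lag Column node has appended the previous row’s event as a column named “DSM(-1)”). The Rule Engine rules would then be along the lines of:

```
// First row: no previous value yet, so it starts a group
MISSING $DSM(-1)$ => $$ROWINDEX$$
// Any change of event compared to the previous row starts a new group
NOT $DSM$ = $DSM(-1)$ => $$ROWINDEX$$
```

Rows that match no rule get a missing value in the appended column; filling those with the “Previous Value” option of the Missing Value node then gives every row in a contiguous block the same number, which is what you would group on.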