Splitting table into sections using timestamp

Hello everyone,

I am very new to KNIME, and I would like to ask for some help for my problem:

Let's say I have a table with a list of simulated transactions with timestamp, and with the Time Difference Node I could calculate the time difference between each row (i.e. Row 2-Row 1, Row 3-Row 2 and so on)

Transaction Timestamp Time Difference in second
smallest unit is also set to second
A 21.04.2017 10:04:59.929 ? (because there is no Row 0)
B 21.04.2017 10:04:59.931 1
C 21.04.2017 10:04:59.933 1
D 21.04.2017 10:06:06.497 67
A 21.04.2017 10:07:14.652 69
B 21.04.2017 10:07:14.653 1
C 21.04.2017 10:07:14.656 1

Based on the time difference, I would like to split the table into sections / subtables to isolate the transactions which occur almost at the same time from those which occur later. The table should be splitted where the time difference is more than 60 seconds. The above table would be splitted in the following 3 sections:

A 21.04.2017 10:04:59.929 ?
B 21.04.2017 10:04:59.931 1
C 21.04.2017 10:04:59.933 1
D 21.04.2017 10:06:06.497 67
A 21.04.2017 10:07:14.652 69
B 21.04.2017 10:07:14.653 1
C 21.04.2017 10:07:14.656 1

Afterwards, each section would be further labeled with different events:
(Since the transactions are simulated, I know which section belongs to which event. For e.g., the first 12 sections are Event 1 and 2 occuring alternatively, i.e. Event 1, Event 2, Event 1, Event 2 and so on. Afterwards, Event 3 and 4 would occur alternatively, each for 6 times. A loop node might be useful here to assign the event to the sections?) 

A 21.04.2017 10:04:59.929 ? Event 1
B 21.04.2017 10:04:59.931 1 Event 1
C 21.04.2017 10:04:59.933 1 Event 1
D 21.04.2017 10:06:06.497 67 Event 2
A 21.04.2017 10:07:14.652 69 Event 1
B 21.04.2017 10:07:14.653 1 Event 1
C 21.04.2017 10:07:14.656 1 Event 1

The transactions will be converted into a string sequence based on the timestamp: (I would say that the GroupBy Node would be one of the suitable approaches)

A B C Event 1
D Event 2
A B C Event 1

Sequences which belong to the same event be combined into a table in the next step:

Event 1

Sequence
A B C
A B C

Event 2

Sequence
D

The question is: How can I implement this approach? How can I split the table based on the time difference? It seems like I should use a loop node, and the main table will be further splitted down loop after loop. Could someone please assist me? Sorry for the long post, but thanks a lot :)

Hi,

I would suggest using a Java Snippet. At least for your first part which generates the Events. 

Afterwards, you can use a groupby node to get the Sequences.

If you provide me with a minimal start workflow, I can make you the example.

Best, Iris 

Hi Iris,

thank you for your reply. I have attached a sample workflow with this reply.

Since I am reading my data from a databank, I created some sample data for you. There are three tables in the workflow:

1. "Sample Raw Data" is the data to be processed

2. "Sample Raw Data with Events assigned" is a table with Events assigned manually, just to be used as a comparison / for checking

3. "Table at goal" is the table to be achieved

In real case I have more than 26 transaction types, that's why I have changed the transaction types to numbers (Transaction 1, Transaction 2, Transaction 3...), and the events to alphabets (Event A, Event B, Event C...)

I don't really have any experience in Java, if you could help me out, I would be very thankful :)

Thanks!

Regards,

Eugene