I am very new to KNIME, and I would like to ask for some help for my problem:
Let's say I have a table with a list of simulated transactions with timestamp, and with the Time Difference Node I could calculate the time difference between each row (i.e. Row 2-Row 1, Row 3-Row 2 and so on)
Transaction
Timestamp
Time Difference in second
smallest unit is also set to second
A
21.04.2017 10:04:59.929
? (because there is no Row 0)
B
21.04.2017 10:04:59.931
1
C
21.04.2017 10:04:59.933
1
D
21.04.2017 10:06:06.497
67
A
21.04.2017 10:07:14.652
69
B
21.04.2017 10:07:14.653
1
C
21.04.2017 10:07:14.656
1
Based on the time difference, I would like to split the table into sections / subtables to isolate the transactions which occur almost at the same time from those which occur later. The table should be splitted where the time difference is more than 60 seconds. The above table would be splitted in the following 3 sections:
A
21.04.2017 10:04:59.929
?
B
21.04.2017 10:04:59.931
1
C
21.04.2017 10:04:59.933
1
D
21.04.2017 10:06:06.497
67
A
21.04.2017 10:07:14.652
69
B
21.04.2017 10:07:14.653
1
C
21.04.2017 10:07:14.656
1
Afterwards, each section would be further labeled with different events:
(Since the transactions are simulated, I know which section belongs to which event. For e.g., the first 12 sections are Event 1 and 2 occuring alternatively, i.e. Event 1, Event 2, Event 1, Event 2 and so on. Afterwards, Event 3 and 4 would occur alternatively, each for 6 times. A loop node might be useful here to assign the event to the sections?)
A
21.04.2017 10:04:59.929
?
Event 1
B
21.04.2017 10:04:59.931
1
Event 1
C
21.04.2017 10:04:59.933
1
Event 1
D
21.04.2017 10:06:06.497
67
Event 2
A
21.04.2017 10:07:14.652
69
Event 1
B
21.04.2017 10:07:14.653
1
Event 1
C
21.04.2017 10:07:14.656
1
Event 1
The transactions will be converted into a string sequence based on the timestamp: (I would say that the GroupBy Node would be one of the suitable approaches)
A B C
Event 1
D
Event 2
A B C
Event 1
Sequences which belong to the same event be combined into a table in the next step:
Event 1
Sequence
A B C
A B C
Event 2
Sequence
D
The question is: How can I implement this approach? How can I split the table based on the time difference? It seems like I should use a loop node, and the main table will be further splitted down loop after loop. Could someone please assist me? Sorry for the long post, but thanks a lot :)
thank you for your reply. I have attached a sample workflow with this reply.
Since I am reading my data from a databank, I created some sample data for you. There are three tables in the workflow:
1. "Sample Raw Data" is the data to be processed
2. "Sample Raw Data with Events assigned" is a table with Events assigned manually, just to be used as a comparison / for checking
3. "Table at goal" is the table to be achieved
In real case I have more than 26 transaction types, that's why I have changed the transaction types to numbers (Transaction 1, Transaction 2, Transaction 3...), and the events to alphabets (Event A, Event B, Event C...)
I don't really have any experience in Java, if you could help me out, I would be very thankful :)