Hi @dev_python , welcome to the community!
The basis for the way I solved this (fingers-crossed it works!) was to work out what data was going to be useful, and what was “noise”. I put together a sample data set as follows:
I’m assuming that you can already get any “NaN” values to be represented as 1 in your input table, so this was my starting point.
So to find your “buy and sell” prices, we need to find the groupings and discard rows that don’t give us anything. I start by finding a way to “simplify” the meaning of the numbers. So any value of 1 in the price column tells me it is Not Invested, and any other value tells me it IS invested. So I created an “Invested” column containing a flag of 1 for invested and 0 for not invested. In this case I used integers rather than booleans but you could use whatever.

Having established whether a row was invested, I wanted to know if the rows either side of it were invested. We need this to establish start and end of a period.
If we number our rows (RowNum), and make sure they are in ascending date order, then we can use a Lag Column node to tell us the number of the previous row (RowNum - 1). Using a pair of Cell Replacers, we can join between RowNum and RowNum -1, once in one direction (for previous row) and then in the other direction (for next row) to return the Invested flag for the prev and next rows. A small tidy up was required to handle the FIRST and FINAL rows of the data set. It is assumed that in these cases the imaginary row prior to the first row would not be invested, and neither would the row AFTER the final row. This then ensures they are treated as a BUY and SELL respectively.


After that, we can then use a rule that determines if a particular invested row is the beginning, end or somewhere in the middle of the particular block. We aren’t interested in the ones in the middle so we want to discard them. Here then we want to keep only rows where it is invested and either the row before, or the row after wasn’t invested.

This should then result in us having only alternate rows which are start and end of investment blocks. (Except that I then realised there was an edge-case that if an investment were both bought and sold on the same date, this wouldn’t be the case, so I inserted a couple of nodes just prior to the discard that found any “isolated” investments and duplicated them so that they would form a start-end pair of same date and price.

The final part was then to group the rows into pairs. Using the rowindex was the simplest way I found to do this so the odd row number in each pair was using as a “block number”. This could then be grouped with a groupby on “Block” returning min, max dates, and first and last prices for each block.


I hope it works, or at least gives some ideas for your own data set. Attached is my workflow, and do let me know if you spot any errors, or need any further assistance
KNIME_time_series_chunks.knwf (39.6 KB)