Cumulative Binning a Variable Integer Array per Row

I'm a bit new to KNIME, so I hope this isn't a duplicate or too trivial.

My dataset has a variable-length unsigned integer array in each row, and I would like to count its values into cumulative bins. I don't think I can do this with the built-in nodes, but before I resort to a Java Snippet I was hoping someone has a better idea.

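Here is a sample dataset; RowID and IntArray are the input columns. Each bin counts the array values up to and including its upper bound (Bin9 counts values <= 9, Bin19 counts values <= 19, and so on), so the last bin's value always equals the array length.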

+-------+---------------+------+-------+-------+--------+
| RowID |   IntArray    | Bin9 | Bin19 | Bin29 | Bin30+ |
+-------+---------------+------+-------+-------+--------+
|     1 | [8,20,25,120] |    1 |     1 |     3 |      4 |
|     2 | [27]          |    0 |     0 |     1 |      1 |
|     3 | []            |    0 |     0 |     0 |      0 |
+-------+---------------+------+-------+-------+--------+

The real dataset would have around 60 bins, so this is quite annoying to do in a Java snippet.
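For reference, the counting itself does not have to grow with the number of bins if the upper bounds are generated rather than written out by hand. The following is only a minimal sketch in plain Java, assuming equal-width bins of 10 as in the sample table (the real bin edges are not given in this thread); the class and method names are made up for illustration, and wiring it to the input and output columns of a Java Snippet node is left out.

```java
import java.util.Arrays;

class CumulativeBins {

    /** Builds upper bounds width-1, 2*width-1, ... (assumed equal-width bins). */
    static int[] upperBounds(int boundedBins, int width) {
        int[] bounds = new int[boundedBins];
        for (int i = 0; i < boundedBins; i++) {
            bounds[i] = (i + 1) * width - 1;
        }
        return bounds;
    }

    /**
     * Returns one count per bound (values <= bound) plus a final entry
     * for the open-ended last bin, which always equals values.length.
     */
    static int[] cumulativeCounts(int[] values, int[] bounds) {
        int[] counts = new int[bounds.length + 1];
        for (int i = 0; i < bounds.length; i++) {
            for (int v : values) {
                if (v <= bounds[i]) {
                    counts[i]++;
                }
            }
        }
        counts[bounds.length] = values.length;
        return counts;
    }

    public static void main(String[] args) {
        int[] bounds = upperBounds(3, 10); // {9, 19, 29} as in the sample table
        // Row 1 of the sample table: prints [1, 1, 3, 4]
        System.out.println(Arrays.toString(
                cumulativeCounts(new int[] {8, 20, 25, 120}, bounds)));
        // Row 2 of the sample table: prints [0, 0, 1, 1]
        System.out.println(Arrays.toString(
                cumulativeCounts(new int[] {27}, bounds)));
    }
}
```

Under the same assumption, 60 bins would be `upperBounds(59, 10)` plus the open-ended last bin; unevenly spaced edges could be passed in as a hand-written `int[]` instead.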

Any tips are welcome! Thanks!

Hi Patricia,

No, this is not at all trivial; it is actually a pretty nice use case. I cannot give you a ready-to-use solution, but I can give you some pointers. First I would ungroup the IntArray into multiple rows (several per rowID, one for each array element). This was the simple part :)

Now for the binning: you can use the dictionary binner, but you would need to generate the dictionary first, maybe using the autobinner and some preprocessing afterwards.

I hope this will get you started. Let me know if you have any further questions.

Best, Iris

Hi Iris, thanks for the answer!

I ended up doing this in a not-so-nice Java Snippet that outputs an array indexed by the bin. And I managed to convince the receiver of the data to accept mutually exclusive bins instead of cumulative ones!
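The actual snippet was not posted, so the following is only a rough sketch of what "an array indexed by the bin" could look like in plain Java. It assumes non-negative values and equal-width bins with an open-ended last bin; the class name, method names, and the bin width of 10 are illustrative assumptions, not the original code. A running sum is included to show that cumulative counts can still be derived from the exclusive ones if they are ever needed.

```java
class ExclusiveBins {

    /**
     * counts[i] = number of values that fall into bin i only.
     * Bin i covers [i*binWidth, (i+1)*binWidth - 1]; the last bin is open-ended.
     * Assumes non-negative values (the arrays in the question are unsigned).
     */
    static int[] exclusiveCounts(int[] values, int binWidth, int binCount) {
        int[] counts = new int[binCount];
        for (int v : values) {
            // Integer division picks the bin; clamp large values into the last bin.
            int bin = Math.min(v / binWidth, binCount - 1);
            counts[bin]++;
        }
        return counts;
    }

    /** Running sum: turns exclusive counts into cumulative ones. */
    static int[] toCumulative(int[] exclusive) {
        int[] cumulative = new int[exclusive.length];
        int running = 0;
        for (int i = 0; i < exclusive.length; i++) {
            running += exclusive[i];
            cumulative[i] = running;
        }
        return cumulative;
    }

    public static void main(String[] args) {
        int[] exclusive = exclusiveCounts(new int[] {8, 20, 25, 120}, 10, 4);
        // Exclusive counts for row 1 of the sample table: prints [1, 0, 2, 1]
        System.out.println(java.util.Arrays.toString(exclusive));
        // Cumulative counts, matching the table: prints [1, 1, 3, 4]
        System.out.println(java.util.Arrays.toString(toCumulative(exclusive)));
    }
}
```

With this layout each value is touched once regardless of the number of bins, so going from 4 to 60 bins only changes the `binCount` argument.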

It's good that KNIME has Java as a fall-back solution, although I feel like I'm "falling back" too often ;)