I don't know what to make of this latest problem I encountered. I created a simple class called "Distribution" into which I wanted to load column data from the InPort. I then used a RowIterator to iterate over each row and extract the data. Everything looked fine until the performance of the RowIterator suddenly plummeted to a ridiculously slow pace (though it still produced the correct output). Out of pure frustration, I came up with this test code:
// Iterate down InPort[1] -- 10,000 rows takes 2 seconds
RowIterator iterator_01 = inData[1].iterator();
while (iterator_01.hasNext()) {
    DataRow row = iterator_01.next();
}
m_distributions.put("MyDistribution", new Distribution());
// Iterate down InPort[1] again -- 10,000 rows now takes 5 minutes!!!
RowIterator iterator_02 = inData[1].iterator();
while (iterator_02.hasNext()) {
    DataRow row = iterator_02.next();
}
public final class Distribution {
    private String m_name;
    private int m_index;
    // ... more Strings, ints, doubles, and a single double[~10000]
}
[Actually it is not quite this simple, as the Distribution was the value in a HashMap.]
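To pin down which statement causes the slowdown, each pass can be timed separately. Below is a minimal sketch that runs outside KNIME, assuming plain Java collections as stand-ins for the table and the RowIterator. The names `IterationTiming` and `timeFullPass` are made up for illustration and are not part of any KNIME API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in harness: a plain List replaces the KNIME table,
// and java.util.Iterator replaces RowIterator.
final class IterationTiming {

    /** Walks the whole iterable once and returns {rowCount, elapsedNanos}. */
    static <T> long[] timeFullPass(Iterable<T> rows) {
        long start = System.nanoTime();
        long count = 0;
        Iterator<T> it = rows.iterator();
        while (it.hasNext()) {
            it.next();
            count++;
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { count, elapsed };
    }

    public static void main(String[] args) {
        // Assumed test data: 10,000 rows of two doubles each.
        List<double[]> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add(new double[] { i, i * 0.5 });
        }

        long[] first = timeFullPass(rows);
        // The suspect statement (for example the HashMap put) would go here.
        long[] second = timeFullPass(rows);

        System.out.printf("pass 1: %d rows in %.3f ms%n", first[0], first[1] / 1e6);
        System.out.printf("pass 2: %d rows in %.3f ms%n", second[0], second[1] / 1e6);
    }
}
```

Inside a node, the same `System.nanoTime()` bracketing around each while loop would show whether the second pass itself is slow or whether the time is spent somewhere between the two passes.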
After a lot of banging-head-against-wall I finally changed the class name from "Distribution" to "CustomerDistribution". Suddenly everything was back to normal! But I'm baffled - there don't seem to be any points of commonality between my class and KNIME. Is there something happening deep inside KNIME or Java that is saying "hey - I wonder if this 'Distribution' has something to do with that 'Distribution' - I'd better spend some time checking"?
While I'm here, I might as well raise a second question. I am doing a lot of column-wise data crunching, which is why I'm loading all of the data into my own Distribution structures. But I worry that this is not very KNIME-like as it means that I am largely ignoring the power of KNIME until I want to push my results to the OutPort. I tried to find samples of column-wise data manipulation in KNIME, but when I found that a lot of the statistics in KNIME are calculated in the same way (that is, first pull out all the data into a separate array and then work on it) I concluded that KNIME is only designed for row-wise data manipulation.
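The "pull the column out first, then work on it" pattern described above looks roughly like the sketch below. It is self-contained and uses an `Iterator<double[]>` as a stand-in for the row iterator, so it is not KNIME code. `ColumnExtraction`, `extractColumn` and `mean` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: rows arrive one at a time (as from a row iterator),
// and one column is copied into a primitive array for column-wise work.
final class ColumnExtraction {

    /** Copies a single column out of row-ordered data in one pass. */
    static double[] extractColumn(Iterator<double[]> rows, int columnIndex, int rowCount) {
        double[] column = new double[rowCount];
        int i = 0;
        while (rows.hasNext() && i < rowCount) {
            column[i++] = rows.next()[columnIndex];
        }
        // Trim the array if fewer rows arrived than expected.
        return i == rowCount ? column : Arrays.copyOf(column, i);
    }

    /** Example of a column-wise computation on the extracted array. */
    static double mean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        List<double[]> rows = Arrays.asList(
            new double[] { 1.0, 10.0 },
            new double[] { 2.0, 20.0 },
            new double[] { 3.0, 30.0 });

        double[] second = extractColumn(rows.iterator(), 1, rows.size());
        System.out.println(mean(second)); // prints 20.0
    }
}
```

The cost of this approach is one full pass over the rows plus memory for each extracted column, which is the trade-off behind the question that follows.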
Is KNIME also good at doing column-wise data manipulation?