Out of memory with mining nodes

Performance issues are hard to track down, so here is my collection of discussions and tips on what to do about them.

Could you be a bit more specific about the kind and size of data you are dealing with, and especially about the kind of model you are trying to use? (In my experience the Weka nodes are especially ‘hungry’.)

In general, if your machine is not powerful enough there is little choice but to upgrade the machine or reduce the data set. The latter can be done in a meaningful way through dimensionality reduction. Just to give a few hints:

  • remove highly correlated variables (that basically contain the same information)
  • remove variables with little variance
  • reduce dimensions through PCA (principal component analysis) or a similar technique
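
To make the first two hints concrete, here is a rough sketch in plain Python (not KNIME itself, just an illustration of the idea). The function name `reduce_columns`, the thresholds `min_variance` and `max_abs_corr`, and the toy data are all made up for this example; sensible thresholds depend on your data and on how the columns are scaled. PCA is left out, since in practice you would use a ready-made implementation for it.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sx * sy)


def reduce_columns(columns, min_variance=1e-3, max_abs_corr=0.95):
    """Return the names of the columns worth keeping.

    columns: dict mapping column name -> list of floats.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    # Hint 1: drop columns with (almost) no variance.
    # Variance depends on scale, so normalise first on real data.
    candidates = [
        name for name, values in columns.items()
        if pvariance(values) > min_variance
    ]

    # Hint 2: keep a column only if it is not highly correlated
    # with a column that has already been kept (greedy, order dependent).
    kept = []
    for name in candidates:
        if all(
            abs(pearson(columns[name], columns[other])) < max_abs_corr
            for other in kept
        ):
            kept.append(name)
    return kept


data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "a_copy": [2.0, 4.0, 6.0, 8.0, 10.1],      # nearly the same information as "a"
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],     # no variance at all
    "b": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = reduce_columns(data)
print(kept)  # ['a', 'b']
```

Fewer columns going into the learner usually means less memory needed for the model. Note that the greedy correlation filter keeps whichever of two correlated columns comes first, so the column order matters.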

KNIME performance

Process 900+ CSV files
