Null Pointer Exception from 'Gradient Boosted Trees Predictor' node

I'm tinkering with different features in my dataset and suddenly the GBT Predictor node started to throw NullPointerExceptions:

ERROR Gradient Boosted Trees Predictor 0:23       Execute failed: ("NullPointerException"): null
DEBUG Gradient Boosted Trees Predictor 0:23       Execute failed: ("NullPointerException"): null
java.lang.NullPointerException

There is no further information and no stack trace. Judging from a few experiments, the error always occurs on one specific row of the dataset, but I can't isolate that row because the dataset is too large. Any hints as to what might be wrong?

-- Philipp

Additional info: Restarting KNIME and re-running the node still gives the NPE, but this time there's at least the stack trace:

ERROR Gradient Boosted Trees Predictor 0:23       Execute failed: ("NullPointerException"): null
DEBUG Gradient Boosted Trees Predictor 0:23       Execute failed: ("NullPointerException"): null
java.lang.NullPointerException
	at org.knime.base.node.mine.treeensemble2.node.gradientboosting.predictor.classification.LKGradientBoostingPredictorCellFactory.getCells(LKGradientBoostingPredictorCellFactory.java:145)
	at org.knime.core.data.container.RearrangeColumnsTable.calcNewCellsForRow(RearrangeColumnsTable.java:503)
	at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:732)
	at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:1)
	at org.knime.core.util.MultiThreadWorker$ComputationTask$1.call(MultiThreadWorker.java:442)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)

FWIW: re-running the node afterwards still gives the NPE, but again without the stack trace.

Hi Philipp,

this definitely shouldn't happen. Could you provide a small example workflow that we can use to reproduce the problem?

Thanks for reporting!

Christian

Servus Christian,

thanks for the quick reply. Attached is a sample workflow, reduced as much as possible, that shows my issue.

In case it's relevant, here's my setup: macOS 10.12.3 with the most recent KNIME, 3.3.1.

Best,
Philipp

Hi Philipp,

the problem is caused by a NaN value in Row18244.
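
For anyone who needs to track down such rows in a large table themselves, here is a minimal sketch in plain Java. It is not KNIME code: the class name `NanRowFinder`, the method `rowsWithNaN`, and the `double[][]` representation of the table are all assumptions made for illustration. The one point it is meant to show is that NaN has to be detected with `Double.isNaN`, because a comparison such as `v == Double.NaN` is always false.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of KNIME: scans a table held as double[][].
final class NanRowFinder {

    // Returns the zero-based indices of all rows containing at least one NaN.
    static List<Integer> rowsWithNaN(double[][] rows) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            for (double v : rows[i]) {
                // NaN never compares equal to anything, itself included,
                // so Double.isNaN is the reliable check.
                if (Double.isNaN(v)) {
                    hits.add(i);
                    break; // one NaN is enough to flag the row
                }
            }
        }
        return hits;
    }
}
```

Calling `NanRowFinder.rowsWithNaN(data)` on the numeric columns of the input would return the positions of the offending rows, which can then be inspected or filtered.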

Until this bug is fixed, a simple workaround is to replace all NaNs with missing values (I know that sounds odd); the node should then work. Attached is a small workflow that demonstrates the workaround, using a Java Snippet for the replacement.
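
The core of the replacement can be sketched in plain Java as follows. This is a standalone illustration, not the contents of the attached workflow: the class `NanToMissing` and its methods are hypothetical names, and `null` stands in for a missing value. As far as I know, assigning `null` to a `Double` output field in a Java Snippet node yields a missing cell, but treat that as an assumption and check it against your KNIME version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; null stands in for a missing value.
final class NanToMissing {

    // Maps NaN to null and passes every other value through unchanged.
    // A value that is already missing (null) stays missing.
    static Double replaceNaN(Double value) {
        if (value == null || value.isNaN()) {
            return null;
        }
        return value;
    }

    // Applies the replacement to a whole column, returning a new list.
    static List<Double> replaceAll(List<Double> column) {
        List<Double> out = new ArrayList<>(column.size());
        for (Double v : column) {
            out.add(replaceNaN(v));
        }
        return out;
    }
}
```

Inside a Java Snippet, the body would presumably shrink to a single assignment along the lines of `out_value = replaceNaN(c_value);`, where `out_value` and `c_value` are placeholder names for the output and input fields.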

Note that this bug also affects the Simple Regression Tree and all Tree Ensemble and Random Forest nodes.

Cheers,

nemad

Hi nemad,

stupid me, I hadn't considered checking for NaNs at all. Thank you, this workaround is very helpful for now.

Best,
Philipp