DL4J deep learning with one-hot encoded variables

Is the ‘DL4J Feedforward Learner (Regression)’ node capable of using one-hot encoded variables to train and test on a continuously distributed target variable?

I can successfully use one-hot encoded variables (as integer columns) to train a model using the ‘Mining:Linear/Polynomial Regression:Linear Regression Learner’ node.
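For readers unfamiliar with the encoding, here is a minimal sketch (plain Python, not the KNIME node itself) of what one-hot encoding a DNA base column into integer columns looks like; the `one_hot` helper and the sample values are illustrative, not part of the workflow:

```python
# Hypothetical sketch: one-hot encode a DNA base into four 0/1 integer
# columns, which is the form a regression learner consumes.
BASES = ["A", "C", "G", "T"]

def one_hot(base):
    """Return a 4-element 0/1 list with a single 1 marking the base."""
    return [1 if base == b else 0 for b in BASES]

# Encode two base positions for one sample into a single feature row:
sample = ("A", "G")
row = one_hot(sample[0]) + one_hot(sample[1])
# row has 8 integer columns, exactly one 1 per position
```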

The same data matrix appears to process normally through the DL4J regression learner node, but the resulting predictions all collapse to a single constant value.

By way of example, I’m attaching a workflow that uses simulated data in which DNA base = A at either of two independent base positions has an additive effect on the ‘Value’ column (see the data figure). The typical linear regression branch of the workflow produces a logical result, but the deep learning branch does not.
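The following is a hypothetical re-creation of that kind of simulated data (the coefficients, noise level, and variable names are my assumptions, not the attached workflow's values); it shows why ordinary least squares recovers the additive effect without any tuning:

```python
import numpy as np

# Simulated data: base == "A" at either of two positions adds a fixed
# amount to 'Value'. All numbers here are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200
pos1_is_A = rng.integers(0, 2, n)   # indicator: base A at position 1
pos2_is_A = rng.integers(0, 2, n)   # indicator: base A at position 2
value = 1.0 + 2.0 * pos1_is_A + 2.0 * pos2_is_A + rng.normal(0, 0.1, n)

# Ordinary least squares recovers the intercept and both additive
# effects directly, which is why the Linear Regression Learner handles
# this data out of the box.
X = np.column_stack([np.ones(n), pos1_is_A, pos2_is_A])
coef, *_ = np.linalg.lstsq(X, value, rcond=None)
# coef is approximately [1.0, 2.0, 2.0]
```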

Any insights appreciated.

base_demo_two_bases_DL_v0.1.knwf (47.2 KB)

Hi longoka,

I had a look at your workflow. The parameters of the DL4J Regression Learner node were a bit off, in particular the batch size, the number of epochs, and the global learning rate. I also increased the number of neurons in your network. Unfortunately, the MLP model you are using is not as plug-and-play as the Linear Regression Learner, since there are many more parameters to tune. I attached a workflow with the changed parameters; using it, I get a similar score with the MLP as with the linear regression.
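To illustrate why those parameters matter, here is a minimal one-hidden-layer MLP in plain NumPy trained on the same kind of additive one-hot data. This is a sketch of the type of model the DL4J node fits, not its actual implementation, and the learning rate, epoch count, and layer size are illustrative assumptions; with reasonable values the network fits the additive signal instead of collapsing to a constant prediction:

```python
import numpy as np

# Toy data: two A-indicator columns with an additive effect on the target.
rng = np.random.default_rng(1)
n = 256
X = rng.integers(0, 2, size=(n, 2)).astype(float)
y = 1.0 + 2.0 * X[:, 0] + 2.0 * X[:, 1]

# One hidden tanh layer; sizes and rates are illustrative, not DL4J defaults.
hidden = 16
W1 = rng.normal(0, 0.5, (2, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.5, (hidden, 1)); b2 = np.zeros(1)
lr, epochs = 0.1, 2000     # the kinds of knobs that were off in the workflow

for _ in range(epochs):
    h = np.tanh(X @ W1 + b1)            # forward pass
    pred = (h @ W2 + b2).ravel()
    grad = (pred - y) / n               # gradient of MSE/2 w.r.t. pred
    # backpropagate through both layers
    gW2 = h.T @ grad[:, None]
    gb2 = grad.sum(keepdims=True)
    gh = grad[:, None] @ W2.T * (1 - h ** 2)
    gW1 = X.T @ gh
    gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((pred - y) ** 2)
# After training, predictions track the inputs rather than sitting at a
# single constant; with a too-small epoch count or an extreme learning
# rate, the output can instead stall near a constant value.
```

The key point is that, unlike closed-form least squares, this model only works when the optimization parameters are in a sensible range, which matches the behavior seen in the workflow.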

base_demo_two_bases_DL_v0.1.knwf (214.1 KB)



Thank you Dave; you really helped bridge my intuition from simpler regression problems to the current problem, and showed how to deal with it appropriately in KNIME’s DL4J integration.
