DL4J Integration - Autoencoder network

Hi all,

Could anyone share an example workflow implementing a stacked autoencoder network? I'm trying to build a toy proof of concept for anomaly detection based on the reconstruction error of each sample (probably doing lots of things wrong), but no luck so far...

Best regards, and many thanks for your help,

--

Jorge

 

 

Hi menuetto,

you can find a description of how to build a Deep Autoencoder in the DL4J documentation here. Basically, you would create the network as shown in the code example on the linked site and use a Pretraining Learner Node at the end. In the original Deep Autoencoder proposed by Hinton, the second half of the network is a mirrored version of the first half with shared weights. Unfortunately, this is not yet implemented in the DL4J library, so you would have to train the whole network and not just the first half. After training has finished, you can tick a checkbox in the Predictor Node to append an error for each predicted example, which you can use for anomaly detection.
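To make the "mirrored second half with shared weights" idea and the per-example error concrete, here is a minimal pure-Python sketch. It is not DL4J or KNIME code: the function names, the 4 -> 2 -> 4 layer sizes, the sigmoid activation, the mean-squared-error measure and the hand-picked weights are all assumptions chosen for illustration, and the network is untrained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(x, W, b_hidden):
    # hidden[j] = sigmoid(sum_i W[j][i] * x[i] + b_hidden[j])
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(W, b_hidden)]

def decode(h, W, b_visible):
    # Shared (tied) weights: the decoder reuses W transposed
    # instead of learning a separate matrix.
    n_visible = len(W[0])
    return [sigmoid(sum(W[j][i] * h[j] for j in range(len(h))) + b_visible[i])
            for i in range(n_visible)]

def reconstruction_error(x, W, b_hidden, b_visible):
    # Mean squared difference between an input and its reconstruction.
    x_hat = decode(encode(x, W, b_hidden), W, b_visible)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical, untrained 4 -> 2 -> 4 network with hand-picked weights.
W = [[0.5, -0.5, 0.25, 0.0],
     [0.0, 0.25, -0.5, 0.5]]
b_hidden = [0.0, 0.0]
b_visible = [0.0, 0.0, 0.0, 0.0]

samples = [[1, 0, 0, 0], [0, 1, 1, 0]]
# One error value per sample; after training, unusually large values
# would be the candidates for anomalies.
errors = [reconstruction_error(x, W, b_hidden, b_visible) for x in samples]
```

With tied weights only `W` and the two bias vectors need to be learned, which is why the text above says that without this feature the whole network has to be trained.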

Note: There may be a bug in the DL4J Autoencoder which always leads to an increasing error during training, so you may have run into that problem. We are looking into it.

I hope that helps and I'm sorry that you ran into some problems.

Best Regards

David

Hi David, thank you for your response!

 

I was afraid that I was doing something horribly wrong, but after reading your message I think that my workflow topology is aligned with your description (I will share it once I overcome my stage fright ;-), so at least now I'm reasonably sure that this is not the problem... However, it is very likely that I still have a lot of hyperparameter tuning to do (tons of things yet to learn about neural networks). I will update the thread with any findings.

Nevertheless, I will keep an eye on any news about the bug you mention, just in case, because my main problem is indeed getting the error measure reported by the Predictor Node down to reasonable levels: I'm using a highly sparse, binary-encoded 205-dimensional input and, after pretraining, the activations of the output layer are always around the center of the hypercube (and the error is very high)...
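For a sense of scale, the short sketch below compares two trivial reconstructions of a sparse binary vector under mean squared error. It assumes MSE as the error measure and 5 active features out of 205; neither figure comes from the workflow, they are only there to illustrate the effect.

```python
def mse(x, x_hat):
    # Mean squared error between an input and its reconstruction.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

n_features = 205
n_active = 5  # assumed number of ones; the post does not give this figure

x = [1.0] * n_active + [0.0] * (n_features - n_active)

center = [0.5] * n_features  # output stuck at the center of the hypercube
zeros = [0.0] * n_features   # trivial "nothing active" reconstruction

mse_center = mse(x, center)  # 0.25 for any binary input, since (x - 0.5)^2 = 0.25
mse_zeros = mse(x, zeros)    # equals the fraction of active features
```

Under these assumptions, an output sitting at 0.5 everywhere scores 0.25 whatever the input is, while simply predicting all zeros already scores `n_active / n_features`. A trained autoencoder on sparse data would need to beat that second number to be useful.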

Still working on it; I will share any progress :-)

Thank you for your help, and to all the KNIME team for the *superb* work!

Best regards,

--

Jorge

 

 

Hi all,

First things first, congratulations (and thanks a lot) to the development teams on v3.3, great work!!

Next, please excuse me if the questions below are too dumb (I'm in no way an ML expert!). Now, let's go...

I've still been struggling to assemble a proof of concept based on a deep autoencoder for anomaly detection (thank you for the info, David!), but unfortunately no luck yet...

Some thoughts on this after some hours of experimentation (note that the following points apply to the new DL4J nodes in KNIME v3.3):

- The pretraining phase seems to work fine (the loss value decreases with the number of epochs)

- After the FF Learner (Pretraining) node, I assume that another node will be needed in order to execute the finetuning, but...

- As far as I can see, the only node capable of doing that is the FF Learner (Classification) node ("do Finetuning" checkbox), but this one doesn't seem to fit the task well, as I'm trying to minimize the overall reconstruction error of the input features, not a classification error against a given label.

- Going further, and taking a peek at the source code, my impression is that the classification node is designed to work on multiclass (1-of-N) problems, not multi-label (M-of-N) ones.

So... 

Is it possible to build an autoencoder network in KNIME such that the reconstruction error (i.e. some difference measure between the network input and output for the training sample set) is minimised? Has anyone managed to do this?
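The 1-of-N versus M-of-N distinction raised in the list above can be illustrated with a small pure-Python sketch. It does not reflect how the KNIME or DL4J nodes are implemented; the softmax/sigmoid output layers, the loss functions and the example logits are assumptions used only to show why a 1-of-N output cannot reconstruct an input with several active features.

```python
import math

def softmax(logits):
    # 1-of-N output: the probabilities compete and always sum to 1.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # M-of-N output: each feature gets its own independent probability.
    return 1.0 / (1.0 + math.exp(-z))

def categorical_cross_entropy(target, probs):
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def binary_cross_entropy(target, probs):
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(target, probs)) / len(target)

target = [1, 0, 1, 0]            # two features active at once
logits = [6.0, -6.0, 6.0, -6.0]  # a network that "knows" the right answer

softmax_out = softmax(logits)               # the two active features share the mass
sigmoid_out = [sigmoid(z) for z in logits]  # each output can approach its 0/1 target

# With two active targets the softmax loss cannot drop below 2*ln(2),
# whereas the per-feature loss keeps shrinking as the logits grow.
loss_softmax = categorical_cross_entropy(target, softmax_out)
loss_sigmoid = binary_cross_entropy(target, sigmoid_out)
```

This is why a reconstruction objective for binary inputs is usually paired with independent per-feature outputs rather than a single 1-of-N output.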

Thank you for your time and support, 

 

--

Jorge

Hi,

I haven't found a way to solve this task using out-of-the-box KNIME nodes (I guess it could be done by writing a custom node or modifying one of the existing ones...), but I think I've finally found a simple way to do it using a Python Snippet talking to an H2O (http://www.h2o.ai) instance.

At first, I thought of using the bundled DL4J libraries from the Java Snippet node, but unfortunately it doesn't seem to allow operating on the entire table, as it is designed to work on a row-by-row basis.

I attach my first attempt at this, a toy workflow, in case someone out there finds it useful (DISCLAIMER: I'm in no way an ML expert). I also attach a screen capture showing the results I obtained. The "outlierness" of each data point is represented using a color/size scale. As you can see, the autoencoder reconstruction error converged after the 2nd epoch.
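The attached workflow is not reproduced here. As a rough idea of how per-sample reconstruction errors could be turned into outlier flags once a model has produced them, here is a minimal sketch. The function name, the median-plus-MAD rule, the factor of 5 and the error values are all made up for illustration and are not taken from the workflow or from H2O.

```python
import statistics

def flag_outliers(errors, k=5.0):
    # Robust threshold: median plus k times the median absolute deviation (MAD).
    # Both the rule and the default k are assumptions, not part of the workflow.
    median = statistics.median(errors)
    mad = statistics.median(abs(e - median) for e in errors)
    threshold = median + k * mad
    return [e > threshold for e in errors], threshold

# Made-up reconstruction errors: eleven ordinary samples and one poorly
# reconstructed one at the end.
errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010,
          0.012, 0.011, 0.009, 0.010, 0.011, 0.250]

flags, threshold = flag_outliers(errors)
outlier_indices = [i for i, is_outlier in enumerate(flags) if is_outlier]
```

A median-based rule is used here because a single very large error inflates the mean and standard deviation, which can hide the outlier it is supposed to reveal. If the MAD is zero, the threshold collapses to the median, so the rule would need a floor in practice.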

Best Regards,

--

Jorge

Hi menuetto,

your workflow looks very interesting. We plan to support that directly with the DL4J Nodes in the future. However, the bug in the DL4J Library regarding the Autoencoder slowed us down a bit in this regard. Therefore, it is very cool to hear that the Pretraining phase seems to be working now (thanks for your experiments).

Best,

David

Hi Jorge & David,

from reading a few of Jorge’s posts and some of mine, it feels like I am now where Jorge was three years ago (anomaly detection), but since he still claims today not to be an ML expert, I feel I need to state that I’m probably even less qualified :wink:

That said, I was wondering if there has been any progress in this area. I’ve seen there is an autoencoder node in the DL4J extension, but before I dig into this it would be great to understand whether the bug has been fixed in the meantime… and whether you know of any example workflows making use of that node.

Thanks,
Mark