Deep learning is used successfully in many data science applications, such as image processing, text processing, and fraud detection. KNIME offers an integration with the Keras libraries for deep learning, combining the codeless ease of use of KNIME Analytics Platform with the extensive coverage of deep learning paradigms that the Keras libraries provide.
Though codeless, implementing deep learning networks still requires orientation to distinguish between the learning paradigms, the feedforward multilayer architectures, the networks for sequential data, the encoding required for text data, the convolutional layers for image data, and so on.
This course will provide you with the basic orientation in the world of deep learning and the skills to assemble, train, and apply different deep learning modules.
This is an instructor-led course consisting of four 75-minute online sessions run by two KNIME data scientists. Each session has an exercise for you to complete at home. The course concludes with a 15- to 30-minute wrap-up session.
Course Content:
Session 1: Classic Neural Networks and Introduction to KNIME Deep Learning Extensions
Session 2: Deep Learning Case Studies and Different Design Components
Thank you to everyone who attended our deep learning course last week!
Please find below the answers to the open questions from our Q&A session:
Will I get the same results if I use a seed in the Keras Network Learner node?
The seed in the Keras Network Learner node ensures that, if the node is re-executed, the same subsets are used for the different epochs. That doesn’t automatically mean you will always get the same results, as there are more random variables involved when training a neural network that influence the trained network, e.g. the initialized weights of each layer.
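To illustrate this outside of KNIME: in plain Keras, even with a fixed seed for the data shuffling, the initial weights are drawn randomly unless they are seeded as well. A minimal sketch (the layer sizes and the seed value are arbitrary examples):

import tensorflow as tf
from tensorflow.keras import layers, models

# A global seed covers most sources of randomness in plain TensorFlow/Keras
# (weight initialization, dropout, shuffling). Without it, two training runs
# start from different weights even if the data batches are identical.
tf.random.set_seed(42)

model = models.Sequential([
    layers.Dense(16, activation="relu", input_shape=(10,)),  # weights are drawn after the seed is set
    layers.Dense(1, activation="sigmoid"),
])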
I have a GPU. How can I check whether or not it is used to train the network?
To do so you can check the GPU utilisation. On Windows this is possible via the Performance tab in the Task Manager. Another option is to use the command nvidia-smi in the console.
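If you want to check this from within the Python environment that KNIME uses, a quick sketch (assuming the TensorFlow backend) is:

import tensorflow as tf

# An empty list means TensorFlow does not see a GPU and will fall back to the CPU.
print(tf.config.list_physical_devices('GPU'))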
We are using word2vec embeddings of product descriptions for product classification.
The word embeddings are used as input parameters for a random forest. We started out with a bag-of-words vector, but word2vec improved the model accuracy significantly.
My reasoning is that the model performed better because the input vectors contain more information. The next logical step would be to enrich the vectors with context information to improve model performance even further.
My idea would be to replace the convolutional layer with transformer/attention layers (like those used in BERT) to create a word embedding with added contextual information.
Do you have any idea how this could be implemented in KNIME?
For this task I would use the TF2 integration of KNIME Analytics Platform and would probably download, maybe fine-tune, and apply a pretrained BERT model, optimally one pretrained on data similar to yours. A great resource for pre-trained models is the TensorFlow Hub.
These two example workflows show how a BERT model can be applied using the TF2 integration and some Python code.
Important: In addition to the BERT model you always need the corresponding preprocessing, which is also available on the TensorFlow Hub.
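As a rough sketch of what the Python code in the TF2 / DL Python nodes could look like (the TF Hub handles below are just examples and have to match each other; tensorflow_hub and tensorflow_text need to be installed in the Python environment):

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops required by the BERT preprocessing layer

# Example handles from the TensorFlow Hub; swap them for the BERT variant you want.
preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=False)  # set trainable=True if you want to fine-tune the encoder

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
encoder_outputs = encoder(preprocess(text_input))
# "pooled_output" gives one embedding per document, "sequence_output" one per token.
embedding = encoder_outputs["pooled_output"]

# The DL Python Network Creator node picks up the network from the variable output_network.
output_network = tf.keras.Model(text_input, embedding)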
I was able to get embeddings using the nodes from the first workflow (the Covid Dataset).
I would like to use a different BERT model, because to me it looks like the model used in the workflow is trained on Covid text data.
I cannot get it to work with different models from the TF Hub.
Another thing I'm struggling with is how to fine-tune the network with my own data and where to find the names of the different layers so I can choose the right output layer.
I would like to enrich the NLP classification model I talked about earlier with information from images.
The model uses word2vec embeddings as an input for a random forest classifier.
Instead of creating a separate CNN for image classification, I would like to add embeddings from the product images as additional inputs to the existing classification model.
The reason I would like to try this is that it is computationally very efficient and, with our data, we got results similar to those of more complex architectures.
Which layers would I need to create an “image2vec” model?
Are input, convolution, and dense layers enough, or do I need max pooling and flatten layers as well?
Also, the activation functions are giving me a small headache. ReLU creates a sparse vector with a lot of zeros. I would go with sigmoid, or are there more obvious choices?
I am very excited about this course. I already did the Session 1 exercise and have some questions from the process:
In the 02_Adult_Data_Classification problem, I realized that if I use the Tanh function for the output Keras ANN node, the results get messed up completely. This brings me to the realization that the data structure and value range determine the type of activation function needed. Again, I would love to have a reference book that helps me understand how to optimize the number of neurons in the hidden layers, the type of activation function, the number of layers needed, and the biases.
In our second session we will talk more about when to use which activation function in the output layer and which loss function, as these two settings depend on the problem at hand.
In the case of the exercise we had a binary classification problem, where the two classes are encoded as 0 and 1. In this case we can use the sigmoid activation function with one neuron in the output layer to train the network to predict the probability of the class encoded as 1. The recommended loss function in that case is binary cross entropy.
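In plain Keras, this setup corresponds roughly to the following sketch (the number of input features and the hidden layer size are placeholders, not the exact exercise settings):

from tensorflow import keras
from tensorflow.keras import layers

n_features = 13  # placeholder: the number of input columns after preprocessing

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(n_features,)),  # hidden layer, size chosen arbitrarily
    layers.Dense(1, activation="sigmoid"),                           # one neuron predicting P(class encoded as 1)
])
# Binary cross entropy is the matching loss for a sigmoid output and a 0/1 encoded target.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])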
Hi,
you are doing a great course. I have a question about data leakage & normalization. At task 3:01 I get the warning “Normalized value is out of bounds” with the 0-1 min-max normalization (nodes 3:112 & 3:86). The reason is probably that a value is being normalized that lies outside the training range. Can I just ignore the warning?
You are correct. Some observations in the test data set lie beyond the min and max of the training data set, resulting in normalized values below 0 or above 1. These are relatively rare occurrences, and for the most part the range of the test data is comparable to that of the training data set. Since many activation functions do not place bounds on their input and can take any real number, I wouldn’t worry too much about this warning.
Hi Satoru,
Thanks for your explanation. I have some further questions connected to the Session 2 exercise:
Why do we need the normalization, and if we skip it, does it have a significant effect on the results?
Is the performance of an autoencoder for detecting anomalies better than that of a binary classification with a neural network, and if so, why?
If I don't use a neural network for this task, I have to think about imbalanced classes and aligning the class sizes. Can I avoid this consideration with a neural network?
Hi,
unfortunately I had to leave yesterday's session a few minutes early. Was the autoencoder vs. ARIMA question discussed yesterday? I didn't find any discussion of it in the recording, or maybe didn't recognize it.
For your first question, it is a good idea to normalize numerical data, not only for deep learning but for any machine learning model. You need to normalize your data so that numerical features with a large magnitude won't dominate the model.
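In plain Python (pandas), the logic of min-max normalization using the training statistics only (roughly what the Normalizer and Normalizer (Apply) node pair does) looks like this sketch:

import pandas as pd

def min_max_normalize(train: pd.DataFrame, test: pd.DataFrame):
    """Scale both tables to the 0-1 range using the statistics of the training data only."""
    col_min, col_max = train.min(), train.max()
    train_norm = (train - col_min) / (col_max - col_min)
    # Test rows outside the training range end up below 0 or above 1,
    # which is exactly what the "Normalized value is out of bounds" warning reports.
    test_norm = (test - col_min) / (col_max - col_min)
    return train_norm, test_norm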
For your second question, we use an autoencoder, as opposed to a classifier, to detect anomalies because anomalies are very rare. In our example, there are 300,000 cases of normal observations and 500 anomalies. There is a tremendous imbalance in the outcome classes. Moreover, anomaly cases may be very heterogeneous and putting them into the same class in a classification model is not a good idea.
I am not sure if I understand your third question, but let me try to answer. Highly imbalanced classes create a challenge for neural network models, just like for any other classifier, so we can't simply build a deep learning classification model for anomaly detection.
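To make the autoencoder idea concrete, here is a minimal Keras sketch (the random placeholder data, layer sizes, and the 99th-percentile threshold are assumptions for illustration, not the settings from the course workflow):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder data: in practice x_normal holds the normalized normal observations
# and x_test the full test set containing both normal cases and anomalies.
rng = np.random.default_rng(0)
x_normal = rng.random((1000, 20)).astype("float32")
x_test = rng.random((200, 20)).astype("float32")
n_features = x_normal.shape[1]

# Train the autoencoder on normal data only, so it learns to reconstruct "normality".
autoencoder = keras.Sequential([
    layers.Dense(8, activation="relu", input_shape=(n_features,)),  # bottleneck
    layers.Dense(n_features, activation="sigmoid"),                 # reconstruction of the 0-1 normalized input
])
autoencoder.compile(loss="mse", optimizer="adam")
autoencoder.fit(x_normal, x_normal, epochs=10, batch_size=64, verbose=0)

# Rows the network cannot reconstruct well are flagged as anomalies.
reconstruction_error = np.mean((autoencoder.predict(x_test) - x_test) ** 2, axis=1)
threshold = np.percentile(reconstruction_error, 99)  # the cut-off is a free design choice
anomalies = reconstruction_error > threshold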
but I don't know where. The message in the Python network editor is:
2022-07-14 20:01:37.906781: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
Metal device set to: Apple M1 Ultra
2022-07-14 20:01:37.907004: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) → physical PluggableDevice (device: 0, name: METAL, pci bus id: )
systemMemory: 128.00 GB
maxCacheSize: 48.00 GB
WARNING:tensorflow:No training configuration found in save file, so the model was not compiled. Compile it manually.
Below is some example code that I use in one of my workflows to download a pretrained VGG16 model. Do you want to give your workflow a try using this code to import the VGG16 model?
# This script is partially copied from: https://keras.io/applications/#applications
from keras.applications.vgg16 import VGG16

# Create the base pre-trained model (ImageNet weights, without the fully
# connected classification head on top).
base_model = VGG16(weights='imagenet', include_top=False,
                   input_shape=(150, 150, 3))

# The DL Python Network Creator node picks up the network from the
# variable output_network.
output_network = base_model
Amazing, your code worked out as well. However, the problem I had was coming from a different place and I managed to solve it. I describe it here in case someone experiences the same.
The error I received was in the Keras Network Learner node, and it prevented me from running the node. The error message was as follows:
“Incompatible Port spect at port 0, expected: DLKerasNetworkPortObjectSpecBase, actual: TF2NetworkPortObjectSpec”
In this case, to solve the problem, I changed the preferences:
KNIME Preferences > KNIME > Python Deep Learning > Library used for the “DL Python”