This workflow trains a simple convolutional neural network (CNN) on the MNIST dataset via TensorFlow.
This is a companion discussion topic for the original entry at https://kni.me/w/G7oIwag-DwpWt9Hf
Hello,
One question:
This workflow doesn't work with images of real faces or of cars/motorbikes.
Do you know what I have to change in the Python code to make it work well?
Kind regards.
Hello @Pableras,
welcome to the KNIME Forum and Hub.
MNIST is a popular dataset for small deep learning examples, but as you already pointed out, the network used here doesn't work for other datasets.
The first problem is that your images are likely to have different shapes, i.e. image sizes. You could fix that by changing the input shape of the network from [28, 28, 1] to whatever your images require, e.g. [224, 224, 3] for a 224x224 RGB image.
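In the DL Python Network Creator script of this workflow, that change is just the input placeholder (a short sketch; the 224x224x3 shape is only an example):

import tensorflow as tf

# 224x224 RGB input instead of the 28x28x1 grayscale MNIST input
input_shape = (None, 224, 224, 3)
x = tf.placeholder(tf.float32, shape=input_shape, name='input')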
However, the images are now much larger, which will require a different network architecture.
You could try to add a couple more convolutional and pooling layers to this workflow, but you will probably find that the performance is still far from what you might expect from deep learning.
In order to change this you could try transfer-learning as is done here using VGG.
Note, however, that training a network as large as VGG can take some time depending on your available computation power (even though transfer-learning is somewhat less time-consuming).
As an alternative you can use a pretrained network as feature extractor and apply another machine learning approach on top, as is done here.
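To give a rough idea of that feature-extractor approach in plain Python, here is a minimal sketch assuming the standard Keras VGG16 application (in KNIME the same idea can be built with the Keras nodes; the names and shapes below are only illustrative):

import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input

# VGG16 without its classification head acts as a fixed feature extractor
base = VGG16(weights='imagenet', include_top=False,
             pooling='avg', input_shape=(224, 224, 3))

# Dummy batch standing in for your preprocessed images, shape (n, 224, 224, 3)
images = np.random.rand(4, 224, 224, 3) * 255.0
features = base.predict(preprocess_input(images))  # shape (4, 512)

# 'features' can now be fed to any classical classifier
# (random forest, SVM, ...) instead of training a CNN from scratch.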
Cheers,
Adrian
Thank you @nemad.
I'm doing my final project in Computer Engineering, which consists of classifying footballers' faces with KNIME, TensorFlow and Python. The Train MNIST classifier workflow covers these three points that I need.
I changed the image shape to [200, 200, 3] in the DL Python Network Creator and it works well, but the DL Python Network Learner does not work with the same script as in this example.
If I add a couple more convolutional and pooling layers, will it work well?
Maybe it would be better to use the example Celebrity recognition using AlexNet?
Thank you so much!!
Are you required to use the Python nodes for this?
If not, then I’d recommend you to use our Keras Layer nodes to define the network and the Keras Network Learner for training.
Under the hood those nodes also use Python but they are much more convenient to use and require fewer adjustments.
As I wrote in my former post, if you are looking for the best results, transfer-learning is the way to go.
The celebrity example uses a different integration (DL4J), which has essentially been replaced by our Keras integration because the latter is much more flexible. Note that AlexNet is a precursor of the VGG network used in the transfer-learning workflow I referred you to, so that workflow does something similar, just with a considerably more powerful network.
Thanks!!
Yes, I have to use Python, and if possible use more TensorFlow than Keras.
Python and TensorFlow are the important technologies to use in my project.
14 Deep Learning/03 TensorFlow/03 Train MNIST classifier -> I think this is the best option for my project. Do you think it would be possible to modify this example to read images of 150x150x3 (adding more conv2d and max pooling layers)?
That's of course possible; you just have to adjust the network definition in Python.
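A minimal sketch of such an adjusted DL Python Network Creator script for 150x150x3 images, in the same TF 1.x style as the MNIST example (the number of conv/pool blocks and the filter counts are only a starting point):

import tensorflow as tf
from TFModel import TFModel  # helper provided by the KNIME TensorFlow integration

input_shape = (None, 150, 150, 3)
num_classes = 3  # adjust to your number of classes

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=input_shape, name='input')

    # Four conv/pool blocks shrink the spatial size: 150 -> 75 -> 37 -> 18 -> 9
    net = x
    for filters in [32, 64, 128, 128]:
        net = tf.layers.conv2d(net, filters=filters, kernel_size=[3, 3],
                               padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(net, pool_size=[2, 2], strides=2)

    flat = tf.layers.flatten(net)  # 9 * 9 * 128 = 10368 features
    dense = tf.layers.dense(flat, units=512, activation=tf.nn.relu)
    y = tf.layers.dense(dense, num_classes, activation=tf.nn.softmax, name='output')

output_network = TFModel(inputs={'input': x}, outputs={'output': y}, graph=graph)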
Hello,
I executed the example DeepLearning/Keras/Cats and Dogs in KNIME and it worked well. The steps are: 01 Preprocess Image Data and then 02 Train simple CNN.
Now I'm trying to execute the same two processes, but I added another kind of image, so now I'm trying to classify three types of images: cats, dogs and ducks.
My question is: in 02 Train simple CNN, at the Rule Engine step (Turn probabilities to class labels), I don't understand this step well, but I think I have to change the rules like this:
$0$ > 0.5 AND $0$ < 1 => "Dogs"
$0$ >= 1 => "Ducks"
TRUE => "Cats"
Is this OK?
Thank you
Hi,
unfortunately, going from two classes to three classes is not so simple in this case.
That’s because the network does a binary classification by predicting the probability of one class (dog), which implicitly gives the probability of the second class as P(cat) = 1 - P(dog).
If you add a third class, you have to adjust a couple of things:
1. You now need three columns as learning target, one for the probability of each class (you can easily create these from a single String column containing the class labels by applying the One to Many node).
2. The output layer of your network now needs to predict the three probabilities P(cat), P(dog) and P(duck) instead of the single probability P(dog). To do this, simply change the number of units in the output layer to three and select Softmax as the activation function.
3. Once your network is trained and you have predicted your samples with the Keras Network Executor, you will receive the three probabilities P(cat), P(dog) and P(duck) (the names in your case depend on the name you specify for the output in the executor). To get the actual class prediction, pick the class with the highest probability, e.g. with a Many to One node.
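To illustrate the first point: the One to Many node essentially performs a one-hot encoding of the label column, roughly like this pandas sketch (the labels are made up):

import pandas as pd

# A single String column with the class labels
df = pd.DataFrame({'Class': ['cat', 'dog', 'duck', 'cat']})

# One 0/1 target column per class value
targets = pd.get_dummies(df['Class'])
print(targets)
#    cat  dog  duck
# 0    1    0     0
# 1    0    1     0
# 2    0    0     1
# 3    1    0     0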
Cheers,
Adrian
Thank you so much for your answers. Now I have some doubts about the Train simple CNN part (in Preprocess Image Data I don't change anything):
1 - You now need three columns as learning target, one for the probability of each class (you can easily create these from a single String column containing the class labels by applying the One to Many node). --> I added One to Many after partitioning the data into test and train sets.
2 - The output layer of your network now needs to predict the three probabilities P(cat), P(dog) and P(duck), instead of the single probability P(dog). To do this simply change the number of units in the output layer to three and select Softmax as activation function. --> At the network creator I change it to:
model.add(Dense(3))
model.add(Activation('softmax'))
3 - Once your network is trained and you have predicted your samples with the Keras Network Executor, you will receive the three probabilities P(cat), P(dog) and P(duck) (the names in your case depend on what name you specify for the output in the executor). To get the actual class --> After the Executor I added Many to One.
Is this OK? Do I have to delete the two Rule Engine nodes?
Regards
That’s correct, you no longer need the rule engine nodes.
One note on the output layer: instead of adding the activation separately, you can also write model.add(Dense(3, activation='softmax')).
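Spelled out, the two equivalent ways to define that three-class output layer look like this (a standalone Keras sketch; the input_shape is only illustrative):

from keras.models import Sequential
from keras.layers import Dense, Activation

# Variant 1: separate activation layer
model_a = Sequential()
model_a.add(Dense(3, input_shape=(512,)))
model_a.add(Activation('softmax'))

# Variant 2: activation passed directly to the Dense layer
model_b = Sequential()
model_b.add(Dense(3, activation='softmax', input_shape=(512,)))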
I modified this workflow with 3 kinds of images of shape (21, 28, 3). The execution is OK, but when I test images it doesn't work well because the result is always the same. This is my code for the DL Python Network Creator. Is something wrong?
import tensorflow as tf
from TFModel import TFModel

input_shape = (None, 21, 28, 3)
num_classes = 3

graph = tf.Graph()
with graph.as_default():
    # Create an input tensor
    x = tf.placeholder(tf.float32, shape=input_shape, name='input')
    # Define the graph
    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(inputs=x, filters=32, kernel_size=[5, 5],
                             padding="same", activation=tf.nn.relu)
    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                             padding="same", activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
    # Convolutional Layer #3 and Pooling Layer #3
    conv3 = tf.layers.conv2d(inputs=pool2, filters=128, kernel_size=[5, 5],
                             padding="same", activation=tf.nn.relu)
    pool3 = tf.layers.max_pooling2d(inputs=conv3, pool_size=[2, 2], strides=2)
    # Dense Layer (pool3 has shape [None, 2, 3, 128], i.e. 2 * 3 * 128 = 768 features)
    pool3_flat = tf.reshape(pool3, [-1, 768])
    dense = tf.layers.dense(inputs=pool3_flat, units=512, activation=tf.nn.relu)
    # Create an output tensor
    y = tf.layers.dense(dense, num_classes, activation=tf.nn.softmax, name='output')

output_network = TFModel(inputs={'input': x}, outputs={'output': y}, graph=graph)
@nemad, could you help with this?