Improving Image Classification Model for Brand Recognition - Deep Learning

Hello everyone,

I hope you’re all doing well.

I’m posting this message to seek assistance in improving my image classification model. I’ve adapted the workflow for classifying dogs and cats to my specific use case.

I’d like to use images of storefronts or advertisements to recognize certain brands. This involves deciphering logos or brand names on storefronts and signs.

Unfortunately, the results are not satisfactory. I’ve tried changing several parameters, such as the image size. In the original dogs and cats workflow, the images are 150×150 pixels. I’ve tried increasing the size up to 700×750 pixels; beyond that, the Keras Learner node crashes. I’ve also tried increasing the number of epochs up to 7; beyond that, it also crashes.
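As a rough way to see why the larger image size might be a problem, here is a back-of-the-envelope estimate of the memory needed just to hold the input images. It assumes all 350 images are kept in memory at once as float32 arrays with 3 channels; both are assumptions about how the learner handles the data, and the helper name `dataset_bytes` is only illustrative.

```python
def dataset_bytes(n_images, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold n_images dense arrays (float32 by default)."""
    return n_images * height * width * channels * bytes_per_value

# The two image sizes mentioned in the post, for the full 350-image dataset.
small = dataset_bytes(350, 150, 150)
large = dataset_bytes(350, 700, 750)

print(f"150x150: {small / 1e6:.1f} MB")
print(f"700x750: {large / 1e9:.2f} GB")
print(f"ratio:   {large / small:.1f}x")
```

This counts only the inputs. The activations of each convolution layer grow with the image area as well, so the real footprint during training is likely higher.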

Additionally, my dataset is not very large, consisting of only 350 images with an 80/20 split. Unfortunately, it’s challenging to increase the size of my dataset.

So, do you have any suggestions for improving my model? Perhaps an appropriate function or parameters for recognizing characters effectively? Or a way to prevent the Learner node from crashing?

Or a way to improve the image quality? Here’s a preview after transformation. Note that the original photo is sharper.

How the image looks in the Image Reader node:

image

After resizing to 700×750:

image
The original

All suggestions are welcome. Thank you very much.

Have you tried changing other parameters, e.g. learning rate, network layers (convolutions)…

Hello,

Yes, I have tried many changes, but the results are still bad. All predictions go to one category.
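When every prediction lands in one category, it can help to compare the reported accuracy with what a model would score by always guessing the most frequent class. The sketch below does that with the standard library only. The label list is made up for illustration and is not taken from the actual dataset.

```python
from collections import Counter

def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)

# Hypothetical training labels: 200 images of class 0 and 80 of class 1.
example_labels = [0] * 200 + [1] * 80
label, accuracy = majority_baseline(example_labels)
print(f"always predicting class {label} gives accuracy {accuracy:.2f}")
```

If the model's accuracy matches this baseline, it has probably not learned anything beyond the class frequencies.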

@Grayfox, a question: can you split the task into brands that have actual text in them, where you might be able to employ OCR, and brands that use only logos?

Also, could you expand your training dataset with more examples? Look into methods for artificially expanding your training set by creating new images that are, for example, blurred or partially hidden.
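A minimal sketch of the two augmentations mentioned above (blurring and partially hiding an image), using only NumPy. The function names `box_blur` and `occlude` and the random stand-in image are illustrative assumptions. In a real workflow the input would be the loaded photo as an (H, W, C) array.

```python
import numpy as np

def box_blur(img, k=3):
    """Blur an (H, W, C) image with a k x k mean filter (k must be odd)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros(img.shape, dtype=np.float32)
    # Sum the k*k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (k * k)

def occlude(img, rng, frac=0.25):
    """Return a copy with a random rectangle (frac of each side) set to zero."""
    h, w = img.shape[:2]
    bh, bw = max(1, int(h * frac)), max(1, int(w * frac))
    top = rng.integers(0, h - bh + 1)
    left = rng.integers(0, w - bw + 1)
    out = img.copy()
    out[top:top + bh, left:left + bw, :] = 0
    return out

rng = np.random.default_rng(0)
image = rng.random((150, 150, 3)).astype(np.float32)  # stand-in for a real photo
augmented = [box_blur(image), occlude(image, rng)]
```

Mirroring is left out on purpose, because flipping an image horizontally would reverse any text in a logo or brand name.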

Also discussing your task with ChatGPT might help to get ideas.

Hi, I want to thank you for the suggestions and your help. I haven’t had time to look at these proposals yet because my research has focused on how to create a CNN model with images of different sizes without having to resize the images.

One proposed solution I found is to define the model with unspecified image dimensions, like this:

1 - Replace Flatten with GlobalMaxPool2D.
2 - Change the input shape to (None, None, channels): use one None per spatial dimension of the image; the number of channels must still be specified.
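To illustrate why step 1 makes variable image sizes possible, here is a small NumPy sketch (not Keras code) of what global max pooling does to a feature map. The two feature maps are random stand-ins with different heights and widths but the same number of channels.

```python
import numpy as np

def global_max_pool(feature_map):
    """Collapse an (H, W, C) feature map to a length-C vector."""
    return feature_map.max(axis=(0, 1))

rng = np.random.default_rng(0)
small_map = rng.random((20, 30, 8))   # feature map from a small image
large_map = rng.random((90, 55, 8))   # feature map from a larger image

# Flattening gives vectors of different lengths, which a Dense layer cannot accept.
print(small_map.size, large_map.size)  # 4800 39600

# Global pooling gives one value per channel, whatever the spatial size.
print(global_max_pool(small_map).shape, global_max_pool(large_map).shape)  # (8,) (8,)
```

The output length depends only on the channel count, so the Dense layer that follows always sees a fixed-size input.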

So my model is as follows:

# variable name of the output network: output_network
import os
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.layers import Activation, Dropout, Flatten, Dense

# Create the model
model = Sequential()

model.add(Conv2D(3, (3, 3), activation="relu", input_shape=(None, None, 3)))
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(Conv2D(16, (3, 3), activation="relu"))
model.add(Conv2D(8, (3, 3), activation="relu"))

# Replace Flatten with a global pooling layer (global average pooling here)
model.add(GlobalAveragePooling2D())

# Add the output layer with softmax activation
model.add(Dense(2, activation="softmax"))

output_network = model

The node executes correctly.

However, things get complicated at the DL Python Learner node level, especially when manipulating the data.

Here is some code that I took from one of your workflows and tried to adapt to my case:

import numpy as np
from keras.utils import to_categorical

steps = 50
epochs = 25

image_column = input_table['Image']

x_train = []

for i, img in enumerate(image_column):
    
    img_array = img.array
    img_array = np.transpose(img_array, (1, 2, 0)) 
    x_train.append(img_array)

x_train = np.array(x_train)

y_train = input_table['ClassIndex']
y_train = np.array(y_train)
y_train_encoded = to_categorical(y_train, num_classes=2)

# Commented out to test whether training works without augmentation
#train_datagen = ImageDataGenerator(shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
#gen = train_datagen.flow(x_train, y_train)

input_network.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
input_network.fit(x_train, y_train_encoded, epochs=epochs)

# Output
output_network = input_network

After executing the Learner node, I got this error:

Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (11, 3).'

I’ve tried manipulating the data in every way possible to get the right format, whether by adding a column or specifying each dimension of the image. But handling NumPy arrays, especially in KNIME, is complicated for me.
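For reference, the model expects a 4-D array of shape (batch, height, width, channels). A list of images with different sizes cannot be stacked into such an array directly, which is one plausible reading of the error above, though that is an assumption since the actual data is not shown. A minimal sketch of one workaround is to zero-pad every image to the largest height and width in the batch. The helper name `pad_batch` and the dummy images are illustrative only.

```python
import numpy as np

def pad_batch(images):
    """Zero-pad a list of (H, W, C) arrays to the largest H and W, then stack."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    channels = images[0].shape[2]
    batch = np.zeros((len(images), max_h, max_w, channels), dtype=np.float32)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        batch[i, :h, :w, :] = img  # copy into the top-left corner
    return batch

# Three dummy images with different sizes, already in (H, W, C) order.
images = [np.ones((40, 60, 3)), np.ones((50, 30, 3)), np.ones((20, 20, 3))]
batch = pad_batch(images)
print(batch.shape)  # (3, 50, 60, 3)
```

Another option, if padding is not wanted, is to train on one image at a time by adding a batch axis with `img[np.newaxis, ...]`, which gives shape (1, H, W, C), and passing it to something like Keras `train_on_batch`. That is usually much slower.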

I’m not sure whether this method can be applied in KNIME. I had hope when the node executed at one point, but then my workflow crashed before I was able to save it.

So I lost the code that seemed to work, and I’m unable to reproduce it because I made so many changes.

I would like to know your opinion on the matter; maybe you know how to train the model.

Thank you very much.

I am not a data scientist, so I’m just curious: why avoid resizing? There are already built-in solutions for resizing, are there not? So the reason is probably not to save time.
br