Deep learning + text processing question (word2vec + CNN)

Dear KNIME wizards

I am trying to build a text classifier using deep learning. I am able to train a word2vec (or doc2vec) model (using the Word Vector Learner node) and learn/make predictions with traditional data mining classifiers like decision trees, naive Bayes, and SVM.

I get stuck when I want to use a convolutional deep learning network for classification. It looks like the convolutional deep learning in KNIME was designed only for images (it has image-specific parameters that must be defined) and doesn't accept the output from word2vec.

Does anybody have an idea on how to use a convolutional deep learning network for text classification in KNIME? I attach a workflow with an example for you to tweak. I would appreciate an example workflow, if anybody has one.

Thanks!! 

Hi mtopaz,

You are perfectly correct: convolutional networks are typically used for images. However, the use case depends on the type of convolution. For images, two-dimensional convolution is used most often because of the 2D nature of images. For text you would instead use 1D convolution because there is no second spatial dimension (you can picture the 1D convolution as a context window sliding over the words of the text).
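To make the "sliding context window" picture concrete, here is a minimal NumPy sketch of a 1D convolution over a sequence of word vectors. It is only an illustration, not something KNIME runs: the function name `conv1d_over_words`, the toy sentence, and the all-ones kernel are made up for this example, and it assumes stride 1 with no padding.

```python
import numpy as np

def conv1d_over_words(embeddings, kernel):
    """Slide a window of kernel.shape[0] words over the sentence.

    Assumes stride 1 and no padding. `embeddings` has shape
    (n_words, dim), `kernel` has shape (window, dim).
    """
    n_words, dim = embeddings.shape
    window, k_dim = kernel.shape
    if k_dim != dim:
        raise ValueError("kernel and embeddings must share the embedding dimension")
    if window > n_words:
        raise ValueError("window is longer than the sentence")
    # One output value per window position along the word axis.
    return np.array([
        np.sum(embeddings[i:i + window] * kernel)
        for i in range(n_words - window + 1)
    ])

# Hypothetical toy input: 6 words, each a 4-dimensional embedding.
sentence = np.arange(24, dtype=float).reshape(6, 4)
kernel = np.ones((3, 4))  # context window of 3 words
features = conv1d_over_words(sentence, kernel)
print(features.shape)  # one feature per window position
```

The window only moves along the word axis, which is why a single dimension is enough for text.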

In KNIME we currently only support 2D convolution (hence you need to enter an image size in the learner node configuration). However, in the example workflow we can use a little trick and simply interpret the input vector as an image. In this case the input vectors have a length of 200. Therefore, we can pick any image size whose dimensions multiply to 200, e.g. 10 (height) x 20 (width) x 1 (channels), and enter it into the learner node configuration. The input vector is then interpreted as a 10x20x1 image. Furthermore, you need to make sure that the configuration of the convolution layers is compatible with the input with regard to the kernel size, the stride, and the number of pooling and convolution layers in general. The main problem is that if these parameters are too large for the given image input size, the image size can shrink to a negative number, which throws a configuration error. This can be a bit tricky to get right, but I attached a workflow with a working configuration.
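The reshaping trick and the size check can be sketched in a few lines of Python. This is a rough aid for picking parameters, not the check KNIME itself performs: it assumes the standard output-size formula `(size + 2*padding - kernel) // stride + 1`, square kernels, and no padding, and the helper names and the layer list are hypothetical.

```python
import numpy as np

def conv_output_size(size, kernel, stride=1, padding=0):
    """Standard output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(height, width, layers):
    """Return (height, width) after each (kernel, stride) layer.

    Raises ValueError as soon as a kernel no longer fits the feature map,
    which is the situation that leads to a configuration error.
    """
    shapes = []
    for kernel, stride in layers:
        if kernel > height or kernel > width:
            raise ValueError(
                f"kernel {kernel} does not fit a {height}x{width} feature map"
            )
        height = conv_output_size(height, kernel, stride)
        width = conv_output_size(width, kernel, stride)
        shapes.append((height, width))
    return shapes

# A 200-dimensional document vector interpreted as a 10x20x1 "image".
vector = np.zeros(200)
image = vector.reshape(10, 20, 1)

# Hypothetical stack: 3x3 convolution, 2x2 pooling with stride 2, 3x3 convolution.
layers = [(3, 1), (2, 2), (3, 1)]
shapes = trace_shapes(10, 20, layers)
print(shapes)
```

With this stack the feature map shrinks from 10x20 to 2x7, so one more 3x3 convolution would no longer fit; that is the point at which smaller kernels or fewer layers are needed.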

Additionally, I added model training and scoring using a MultiLayerPerceptron. This is generally better suited if your input is not an image.

If you have further questions I'm happy to help.

Cheers

David

 

Thanks so much, David. This is very detailed and helpful information. Awesome work!
