Doc2Vec Learner Taking Forever?

Hi guys,

So, I have about 94 resumes that I am trying to push through the Doc2Vec Learner, with the resume ID column as the label column. Each resume ID is an alphanumeric text field. I executed the process once before I went to a 1.5-hour lunch, but when I got back the node was still showing that blue square at the bottom moving left and right, without a percentage. The whole time, this thing was maxing out my CPU. I cancelled it at about the 2-hour mark.

Is this expected?

Hi cageybee,

A few questions to better understand what the issue could be:

  1. Are you feeding a document column and a string column to the Doc2Vec node?
  2. Have you tried increasing the heap space for KNIME Analytics Platform?
  3. Have you already applied the text processing phases to the texts you are analyzing?
  4. Would it be possible to share the workflow?

Thank you,
Cheers,
Vincenzo

Vincenzo,

  1. When I use a string column instead of a document column, then Doc2Vec works.
  2. I have 4GB+ allocated for the heap space.
  3. When I ran it with the document column, I think some text processing was applied. But when I ran it with the string column, I didn’t do any processing on it.