Set both Batch Size and Number of Training Iterations for Doc2Vec?

I am using the KNIME Doc2Vec Learner node to build a word embedding. I know how Doc2Vec works. In KNIME I have the option to set the following parameters:

- Batch Size: The number of words to use for each batch.
- Number of Epochs: The number of epochs to train.
- Number of Training Iterations: The number of updates done for each batch.

From Neural Networks I know that:

 - one epoch = one forward pass and one backward pass of *all* the training examples
 - batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
 - number of iterations =  number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).

As far as I understand it, it makes little sense to set both batch size *and* the number of iterations, because one is determined by the other (given the data size, which is fixed by the circumstances); see the sketch just below. So why can I change both parameters?
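For reference, here is a minimal sketch of the relationship the question assumes, namely that the number of iterations per epoch follows from the data size and the batch size (the numbers and names are purely illustrative, not anything from KNIME or DL4J):

```java
public class EpochMath {
    public static void main(String[] args) {
        int numExamples = 10_000; // assumed data size, purely illustrative
        int batchSize = 32;
        // One epoch = one pass over all examples, so the number of
        // iterations (batch updates) per epoch is implied by the data size:
        int iterationsPerEpoch = (numExamples + batchSize - 1) / batchSize; // ceil
        System.out.println(iterationsPerEpoch); // prints 313
    }
}
```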

 

Hi M42,

generally, you are perfectly right. However, in this case (the way it's done by DL4J), 'Number of Training Iterations' means how often each batch is used to make an update. E.g., if the value is set to 2, each batch is simply used twice in a row. That's why you can configure both parameters in the node; the loop sketch below illustrates this.
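As a rough sketch of what that means, assuming a plain sequential loop over the batches (the names `IterationDemo`, `update`, etc. are illustrative pseudocode, not the actual DL4J internals):

```java
import java.util.List;

public class IterationDemo {
    // Stand-in for one gradient update on a batch (not real DL4J code).
    static void update(List<String> batch) {
        System.out.println("update on " + batch);
    }

    public static void main(String[] args) {
        List<List<String>> batches =
                List.of(List.of("w1", "w2"), List.of("w3", "w4"));
        int numEpochs = 1;
        int numIterations = 2; // reuse each batch twice, as described above

        for (int epoch = 0; epoch < numEpochs; epoch++) {
            for (List<String> batch : batches) {          // one pass over the data
                for (int it = 0; it < numIterations; it++) {
                    update(batch);                        // same batch, updated twice
                }
            }
        }
    }
}
```

With epochs = E and iterations = I, each batch therefore produces E * I weight updates in total.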

I hope that answers your question.

Cheers
David


Thanks, but not quite. If that is so, then I don’t see the difference between “number of iterations” and “number of epochs”: using each batch twice would be the same as training for twice as many epochs.

Apart from the order in which the batches are used for training (which makes a difference, but not a big one), it is the same. That's just the way it's done by DL4J; there is no further reason for it. The sketch below makes the ordering difference concrete.
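To illustrate, here is a minimal, runnable comparison of the resulting update order, assuming the same sequential batch loop as sketched above (class and method names are purely illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class OrderDemo {
    // Returns the order in which batches are used for updates.
    static List<String> updateOrder(int epochs, int iterations, List<String> batches) {
        List<String> order = new ArrayList<>();
        for (int e = 0; e < epochs; e++)
            for (String b : batches)
                for (int i = 0; i < iterations; i++)
                    order.add(b);
        return order;
    }

    public static void main(String[] args) {
        List<String> batches = List.of("B1", "B2");
        System.out.println(updateOrder(2, 1, batches)); // [B1, B2, B1, B2]
        System.out.println(updateOrder(1, 2, batches)); // [B1, B1, B2, B2]
    }
}
```

Both settings perform four updates with the same batches; only the order differs.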

Cheers
David
