Loading Pretrained word2vec Models with Word Vector Model Reader (Deeplearning4J)

Dear all,
I’m trying to load pretrained word2vec models with the Word Vector Model Reader node. Loading fails for every model, even the ones suggested in the node description (e.g. the Google News Vector binary). The error message is similar each time and varies only slightly with the model:

ERROR Word Vector Model Reader 0:115 Execute failed: Cannot allocate new IntPointer(8): totalBytes = 3G, physicalBytes = 6G

or

ERROR Word Vector Model Reader 0:115 Execute failed: Cannot allocate new FloatPointer(300): totalBytes = 319M, physicalBytes = 3G

I already set the memory limit in knime.ini to 32G, so Java heap space should not be the problem… What am I missing?
Has anybody solved this and been able to load German models? If so, which ones? I tried these without luck: https://deepset.ai/german-word-embeddings

Many thanks and kind regards, Andy

Hi @aherzberg,

DL4J uses off-heap memory, which is independent of Java’s heap space, so raising the heap limit in knime.ini does not help here. It could be that the off-heap memory limit is too low. It can be adjusted on the Deeplearning4J Integration preference page. To avoid issues, the Java heap space limit should be at most max_available_ram - dl4j_off_heap_limit, because Java can’t know about the off-heap memory in use. E.g. if you have 32 GB of RAM available and set the DL4J off-heap limit to 10000 (MB), Java heap space should be at most 22 GB.
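
To make the arithmetic concrete, here is a minimal sketch of that rule. The class and method names (HeapBudget, maxHeapMb) are made up for illustration and are not part of KNIME or DL4J; the numbers are the ones from the example above, and the result is only an upper bound, since the OS and other processes need RAM too.

```java
// Sketch only: HeapBudget and maxHeapMb are hypothetical names, not a KNIME or DL4J API.
final class HeapBudget {

    /** Largest Java heap (in MB) that still leaves room for the DL4J off-heap limit. */
    static long maxHeapMb(long totalRamMb, long offHeapLimitMb) {
        if (offHeapLimitMb < 0 || offHeapLimitMb >= totalRamMb) {
            throw new IllegalArgumentException("off-heap limit must be in [0, total RAM)");
        }
        return totalRamMb - offHeapLimitMb;
    }

    public static void main(String[] args) {
        long totalRamMb = 32L * 1024;   // 32 GB of RAM, as in the example
        long offHeapLimitMb = 10_000;   // value entered on the DL4J preference page (MB)
        long heapMb = maxHeapMb(totalRamMb, offHeapLimitMb);
        // Candidate -Xmx value for knime.ini, rounded down to whole GB
        System.out.println("-Xmx" + (heapMb / 1024) + "g");   // prints -Xmx22g
    }
}
```

With 32 GB of RAM and a 10000 MB off-heap limit this leaves 22768 MB, so a heap setting of 22 GB in knime.ini stays within the budget, whereas the 32G setting from the original post leaves no room for off-heap allocations.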

I hope that helps.

Cheers
David
