Loading Pretrained word2vec Models with Word Vector Model Reader (Deeplearning4J)

aherzberg · June 26, 2020, 7:49am

Dear all,
I’m trying to load pretrained word2vec models. Loading fails for every model, even the ones suggested in the node description (e.g. the Google News Vector Binary). The error message is much the same each time and differs only in the details, depending on which model I try:

ERROR Word Vector Model Reader 0:115 Execute failed: Cannot allocate new IntPointer(8): totalBytes = 3G, physicalBytes = 6G

or

ERROR Word Vector Model Reader 0:115 Execute failed: Cannot allocate new FloatPointer(300): totalBytes = 319M, physicalBytes = 3G

I already set the memory limit in knime.ini to 32G, so Java heap space should not be a problem… What am I missing?
Has anybody solved this and been able to load German models? Which ones? I tried these without luck: German Word Embeddings | deepset

Many thanks and kind regards, Andy

DaveK · July 3, 2020, 4:46pm

Hi @aherzberg,

DL4J uses off-heap memory, which is independent of Java’s heap space. It could be that the configured off-heap memory limit is too low. It can be adjusted on the Deeplearning4J Integration preference page. To avoid issues, the Java heap space limit should be configured to be at most max_available_ram - dl4j_off_heap_limit. For example, if you have 32 GB of RAM available and set the DL4J off-heap limit to 10000 (MB), the Java heap space should be at most 22 GB, because Java does not know about the off-heap memory in use.
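As a rough illustration of that arithmetic, here is a minimal sketch in Java. The class `HeapBudget` and its method `maxHeapMb` are hypothetical names made up for this example; they are not part of KNIME or DL4J. The sketch assumes both values are given in MB and simply subtracts the off-heap limit from the available RAM.

```java
// Hypothetical helper, not part of KNIME or DL4J: it only illustrates the arithmetic.
final class HeapBudget {

    /** Returns the largest Java heap limit (in MB) that still leaves room for the off-heap limit. */
    static long maxHeapMb(long totalRamMb, long offHeapLimitMb) {
        if (totalRamMb <= 0 || offHeapLimitMb < 0 || offHeapLimitMb >= totalRamMb) {
            throw new IllegalArgumentException(
                "off-heap limit must be non-negative and smaller than total RAM");
        }
        return totalRamMb - offHeapLimitMb;
    }

    public static void main(String[] args) {
        long totalRamMb = 32L * 1024;  // 32 GB of RAM, expressed in MB
        long offHeapLimitMb = 10_000;  // DL4J off-heap limit as set on the preference page
        System.out.println("Max heap (MB): " + maxHeapMb(totalRamMb, offHeapLimitMb));

        // Heap limit the current JVM was actually started with, for comparison.
        long currentMaxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Current JVM max heap (MB): " + currentMaxHeapMb);
    }
}
```

With these inputs the first line reports 22768 MB, i.e. roughly the 22 GB mentioned above. The second line only shows what the JVM running the snippet was given, which will differ from KNIME's own setting unless it is run with the same options.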

I hope that helps.

Cheers
David

aherzberg · July 28, 2020, 9:52am

Hi @DaveK,
fantastic - your suggestions work. I was able to load a German model now.
Thanks a lot and best regards!
Andy

system · August 4, 2020, 9:52am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.