Execute failed: Blas GEMM launch failed

Byron05 · December 1, 2020, 5:35am

I'm running what I thought was a simple test using the Redfield BERT extension. My workstation has 2 x NVIDIA 2080 Ti GPUs, and nothing else is running on them. I have no problems running BERT models in other projects, albeit outside of KNIME.

I thought it might be a Keras configuration issue, but I’m not sure.

Has anyone else run into this?

Error message below:

ERROR BERT Classification Learner 3:27 Execute failed: Blas GEMM launch failed : a.shape=(10240, 2), b.shape=(2, 768), m=10240, n=768, k=2
[[{{node functional_1/keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/transformer_encoder/StatefulPartitionedCall/type_embeddings/MatMul}}]] [Op:__inference_train_function_44360]

Function call stack:
train_function

Traceback (most recent call last):
File “”, line 15, in
File “D:\Program Files\KNIME\plugins\se.redfield.bert_0.0.1.v202011121720\py\BertClassifier.py”, line 120, in run_train
classifier.train(input_table, class_column, batch_size, epochs, optimizer, progress_logger, fine_tune_bert, validation_table)
File “D:\Program Files\KNIME\plugins\se.redfield.bert_0.0.1.v202011121720\py\BertClassifier.py”, line 64, in train
shuffle=True, validation_data=validation_data, callbacks=[progress_logger])
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\training.py”, line 108, in _method_wrapper
return method(self, *args, **kwargs)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\training.py”, line 1098, in fit
tmp_logs = train_function(iterator)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\def_function.py”, line 780, in call
result = self._call(*args, **kwds)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\def_function.py”, line 840, in _call
return self._stateless_fn(*args, **kwds)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\function.py”, line 2829, in call
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\function.py”, line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\function.py”, line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\function.py”, line 550, in call
ctx=ctx)
File “C:\Users\User\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\execute.py”, line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(10240, 2), b.shape=(2, 768), m=10240, n=768, k=2
[[{{node functional_1/keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/transformer_encoder/StatefulPartitionedCall/type_embeddings/MatMul}}]] [Op:__inference_train_function_44360]

Function call stack:
train_function

ScottF · December 1, 2020, 9:13pm

Hi @Byron05 -

We’ve done a bit of testing recently on the BERT nodes, but I don’t remember seeing this particular error pop up. Let me page @Artem and see if he can shed any light on this.

Redfield · December 2, 2020, 7:40am

Hello @Byron05, I have never faced this issue, so let’s investigate it together.
First of all, I would suggest checking software compatibility: the CUDA and Python versions.
In our tests we used TF 2.2.0, cuDNN 7.6.5, and CUDA 10.1.
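
To compare a local setup against those tested versions, a small script along these lines could be run in the same Python environment that KNIME uses. This is a minimal sketch, not part of the extension: the helper names `installed_versions` and `report` are made up for illustration, and `tf.sysconfig.get_build_info()` is only present in newer TensorFlow releases, so the code probes for it. The format of the CUDA/cuDNN strings it returns can differ by platform, so the output is meant to be read by eye rather than compared automatically.

```python
# Versions reported as tested in this thread (TF 2.2.0, CUDA 10.1, cuDNN 7.6.5).
TESTED = {"tensorflow": "2.2.0", "cuda": "10.1", "cudnn": "7.6.5"}


def installed_versions():
    """Collect what the local TensorFlow install reports; unknown values stay None."""
    found = {"tensorflow": None, "cuda": None, "cudnn": None, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not importable in this environment.
        return found
    found["tensorflow"] = tf.__version__
    # get_build_info() does not exist in every TF release, so look it up defensively.
    build_info = getattr(tf.sysconfig, "get_build_info", None)
    if build_info is not None:
        info = build_info()
        found["cuda"] = info.get("cuda_version")
        found["cudnn"] = info.get("cudnn_version")
    found["gpus"] = [gpu.name for gpu in tf.config.list_physical_devices("GPU")]
    return found


def report(found, tested=TESTED):
    """Format installed vs. tested versions as printable lines."""
    lines = []
    for name, wanted in tested.items():
        lines.append(f"{name}: installed={found.get(name)!s}, tested={wanted}")
    lines.append(f"visible GPUs: {len(found.get('gpus', []))}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(installed_versions())))
```

If the TensorFlow line differs from the tested version, or no GPUs are listed, that is a reasonable place to start looking.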
I did a brief search on your error message, and unfortunately there is no ready-made solution.
It seems that memory is not an issue in your case, but please check this post, where 3 potential solutions are described:
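
Separately from that post (whose contents are not reproduced here), one workaround often suggested for "Blas GEMM launch failed" is to let TensorFlow allocate GPU memory on demand instead of reserving it all at startup. The sketch below is an assumption about what may help, not a confirmed fix for this case; the helper name `enable_memory_growth` is hypothetical. It is meant for a standalone Python session in the same environment, and it has to run before any model or tensor touches the GPU.

```python
def enable_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand; return how many GPUs were configured."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not importable in this environment.
        return 0
    configured = 0
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except (RuntimeError, ValueError):
            # Raised when the device is already initialized or cannot be modified;
            # the setting only takes effect if applied at program startup.
            pass
    return configured


if __name__ == "__main__":
    print("GPUs with memory growth enabled:", enable_memory_growth())
```

If a small training run succeeds in that session with memory growth enabled but fails without it, that points toward GPU memory allocation rather than a version mismatch.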

I would also be glad to know a bit more about your Python environment. Our recommendation is to install the TF2 extension; then, in the KNIME settings, you will be able to create a deep learning Python environment. Please create one for TF+GPU, then install into it all the packages (I would recommend using Conda for this) mentioned here in the workflow description:

Please let me know if you make any progress; your feedback is valuable for improving our extension.

Best regards,
Artem.