How to use the XLM-RoBERTa model in the BERT Model Selector via URL?

maurice · February 26, 2021, 3:48pm

Hi,
I am trying to add the XLM-RoBERTa model to the BERT Model Selector via a URL.

I pasted the URL of the model archive into the advanced URL field:
https://dl.fbaipublicfiles.com/fairseq/models/xlmr.large.tar.gz
I got it from the fairseq/examples/xlmr page of the pytorch/fairseq GitHub repository, which is linked from the XLM-RoBERTa page of the transformers 4.3.0 documentation.

This is the error I get:
> ERROR BERT Model Selector 0:813 Execute failed: SavedModel file does not exist at: C:\tmp\96aaca13c84ac93a707e2fdba4863bae02bd5229/{saved_model.pbtxt|saved_model.pb}
> Traceback (most recent call last):
> File “”, line 4, in
> File “C:\Program Files\KNIME\plugins\se.redfield.bert_0.0.1.v202012081121\py\bert_utils.py”, line 8, in load_bert_layer
> return hub.KerasLayer(bert_model_handle, trainable=True)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow_hub\keras_layer.py”, line 146, in init
> self._func = load_module(handle, tags)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow_hub\keras_layer.py”, line 398, in load_module
> return module_v2.load(handle, tags=tags)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow_hub\module_v2.py”, line 102, in load
> obj = tf_v1.saved_model.load_v2(module_path, tags=tags)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow\python\saved_model\load.py”, line 578, in load
> return load_internal(export_dir, tags)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow\python\saved_model\load.py”, line 588, in load_internal
> loader_impl.parse_saved_model_with_debug_info(export_dir))
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow\python\saved_model\loader_impl.py”, line 56, in parse_saved_model_with_debug_info
> saved_model = _parse_saved_model(export_dir)
> File “C:\Users\mauri\Anaconda3\envs\py3_knime_tf2_cpu\lib\site-packages\tensorflow\python\saved_model\loader_impl.py”, line 113, in parse_saved_model
> constants.SAVED_MODEL_FILENAME_PB))
> OSError: SavedModel file does not exist at: C:\tmp\96aaca13c84ac93a707e2fdba4863bae02bd5229/{saved_model.pbtxt|saved_model.pb}

There is no documentation for the Redfield extension on this point.
Any help is appreciated to make BERT text classification possible for everyone.

Redfield · March 2, 2021, 1:24pm

Hello @maurice

First of all, according to your description, you pointed the node at a remote URL of an archive containing the model. The archive has to be unpacked first, and the description on GitHub clearly says so. That means you need to download the model, unpack it, and provide the local path.
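The download-and-unpack step described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names `unpack_archive` and `download_and_unpack` and the `unpacked` subfolder are made up for this example and are not part of the Redfield extension.

```python
import tarfile
import urllib.request
from pathlib import Path


def unpack_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and list the extracted files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    # Relative, forward-slash paths so the listing looks the same on every OS.
    return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file())


def download_and_unpack(url, work_dir):
    """Download the archive at url into work_dir, then unpack it (hypothetical helper)."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    archive_path = work / url.rsplit("/", 1)[-1]  # e.g. xlmr.large.tar.gz
    urllib.request.urlretrieve(url, str(archive_path))
    return unpack_archive(archive_path, work / "unpacked")


# Example call (not executed here, because it downloads the whole archive):
# files = download_and_unpack(
#     "https://dl.fbaipublicfiles.com/fairseq/models/xlmr.large.tar.gz",
#     "C:/tmp/xlmr",
# )
```

The returned file list is a quick way to see what the archive actually contains before pointing the node at the unpacked folder.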

Another thing is that this model appears to have been built with PyTorch. Our extension uses TensorFlow 2 as its backend, so we cannot guarantee that this particular model will work with it. Either way, we would appreciate your feedback on how it goes.
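The error message in the original post says the loader looks for `saved_model.pb` or `saved_model.pbtxt` in the model folder. A quick way to check whether an unpacked folder contains such a file is sketched below; the function name `find_saved_model_dirs` is hypothetical, and an empty result only tells you that the folder is not in the layout the error message asks for.

```python
from pathlib import Path

# File names taken from the error message: {saved_model.pbtxt|saved_model.pb}
SAVED_MODEL_FILES = ("saved_model.pb", "saved_model.pbtxt")


def find_saved_model_dirs(root):
    """Return every directory under root that contains a SavedModel file."""
    root = Path(root)
    found = {p.parent.as_posix() for name in SAVED_MODEL_FILES for p in root.rglob(name)}
    return sorted(found)


# Example call on a local folder (the path is only an illustration):
# print(find_saved_model_dirs("C:/tmp/xlmr/unpacked"))
```

If the list comes back empty, the model was most likely exported in another format, which would match the PyTorch point above.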

If this particular model is not critical for you, I would recommend looking for models in the Hugging Face repository; it is easier to fetch models from there.

And finally, BERT-related models are quite a mess in terms of deployment, because everyone trains their models with different frameworks, different package versions, etc. Unfortunately, this means that in many cases it is a lottery whether a given model will work with our nodes. That is why the Model Selector node has lists of trusted models that have been verified and tested.

maurice · March 2, 2021, 2:03pm

Thank you for the informative reply.
Can you please add the link to the GitHub page containing the documentation?
I seem to have missed that.
I only found this link.

But I see no documentation concerning alternative models in the readme.md.
Thanks in advance for your answer.
Kind regards,
Maurice
