Spacy Model Selector -> Error Github

Hi,

I am trying to get this workflow up and running, but workflow execution is throwing an error.

News relevance webinar – KNIME Community Hub

ERROR Spacy Model Selector 4:1320 Execute failed: github.com

Any ideas? Which log files should I look at?

I am using KNIME 4.7.2.
Thanks for your help and ideas on how to get this fixed.

Sven-Olaf

Besides that, I had to fix the Redfield NLP nodes.

Livrary spacy is not properly installed. Details:issubclass() arg 1 must be a class.


I found the solution here:

https://knowledge.broadcom.com/external/article/267049/typeerror-issubclass-arg-1-must-be-a-cla.html#:~:text=This%20is%20a%20problem%20that%20is%20caused%20by,should%20execute%20successfully%3A%20python%20-m%20spacy%20download%20en_core_web_md
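
For anyone hitting the same `issubclass() arg 1 must be a class` message: it appears to come from the Python environment rather than from the workflow itself. The snippet below is a minimal sketch for listing which versions of the relevant packages are installed. It assumes you run it with the same Python interpreter that the Redfield nodes use. The package list is my own guess at what is worth checking and is not taken from the linked article.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages worth checking; adjust it to your setup.
    packages = ["spacy", "pydantic", "typing_extensions"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output between a working and a failing environment may show which package differs.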

Furthermore, I got this output:

WARN Tag Filter 5:1323 Selected tag type “SPACY_POS_en_core_web_md-3.3.0” could not be found.
This might happend because it is either a dynamic TagSet that is undefined for the selected column ‘Processed document’
or it is a TagSet defined by a missing language extension!
Install additional language extensions at File->Install KNIME Extensions.
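
The tag type in this warning looks like it encodes the model name and model version. If so, a Tag Filter configured for one model would not find the tag set produced by another. The snippet below is a small sketch that splits such a name into its parts, so the configured tag type can be compared with the model actually selected. The naming pattern `SPACY_<kind>_<model>-<version>` is an assumption inferred from the message above, not a documented format, and `parse_tag_type` is a hypothetical helper.

```python
import re

# Assumed pattern, inferred from the warning text:
# SPACY_<tag kind>_<model name>-<model version>
TAG_TYPE_PATTERN = re.compile(
    r"^SPACY_(?P<kind>[A-Z]+)_(?P<model>[a-z_]+)-(?P<version>\d+\.\d+\.\d+)$"
)


def parse_tag_type(tag_type):
    """Return a dict with kind, model and version, or None if the name does not match."""
    match = TAG_TYPE_PATTERN.match(tag_type)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_tag_type("SPACY_POS_en_core_web_md-3.3.0"))
```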

In which log can I find more detailed information?

Hi,

See here for how to find the log files. @Artem from @Redfield might have more clues once you have provided them. Furthermore, it looks like the failing node is inside the component Processing in your workflow. Maybe you can see more details inside the component? Could you post an image of its contents?

I don’t understand the structure of your posts. Is the part about “Livrary spacy” a separate issue that you ran into, together with a proposed solution that the maintainers might include in their extension?

The third part seems to be about the Tag Filter node. See my link above for finding the log files. But I do not understand why you want the logs there. The message clearly states that you could try to

Install additional language extensions at File->Install KNIME Extensions

Have you tried that? See the following screenshot for the possible language extensions there.

Does that already solve the second and third issues?

Regards
Steffen


Thanks for providing ideas to narrow down my problem.

In the log file I found:

and

→ For this error I found a workaround: I downloaded the model SPACY_POS_en_core_web_sm-3.6.0 and stored it locally.

But this error is in the knime.log:


As I am analyzing only English text, I suppose this is not necessary, since there is no English language package available.

The console output is now:

Hmm, something is not working with the “Selected tag”.
Any ideas? Neither the Console.log nor the knime.log provides a hint.

Hi,

thanks for giving the details!
I assume @Artem from @Redfield can help you better here.

Best regards
Steffen

Thanks for your feedback.
Update:
Very strange: after playing around with the Spacy Model Selector, I am able to run the workflow, including spaCy model 3.6.

But the last node in the workflow, with this config:

produced an empty list. Is that correct, or should the output look different?

Hello @SOESCHEN

First of all I would recommend you to use this workflow if you are interested in the one that was presented at the webinar:

The workflow you mentioned is a part of the NLP course, and it is an exercise that needs to be solved.

One more thing, what Python environment are you using: the bundled or your own custom environment? Perhaps the problem could be that you may be missing some dependencies in case you are not using the bundled environment.
I can see you tried to download the model and use it. However, version 3.6.0 is not supported; currently only versions 3.2.0 and 3.3.0 are supported (support for 3.5.0 is coming soon).
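
As a quick sanity check before pointing the Spacy Model Selector at a locally stored model, the version can be compared against the supported ones. This is only a sketch: the supported versions are the ones named in this post, the package string is assumed to follow the `<name>-<version>` form used by spaCy model releases (for example `en_core_web_sm-3.6.0`), and both helper functions are hypothetical.

```python
# Versions named as supported in this thread; update when support changes.
SUPPORTED_MODEL_VERSIONS = ("3.2.0", "3.3.0")


def split_model_package(package):
    """Split 'en_core_web_sm-3.6.0' into ('en_core_web_sm', '3.6.0')."""
    name, sep, version = package.rpartition("-")
    if not sep:
        raise ValueError(f"no version suffix in {package!r}")
    return name, version


def is_supported(package, supported=SUPPORTED_MODEL_VERSIONS):
    """Return True if the model version is in the supported list."""
    _, version = split_model_package(package)
    return version in supported


if __name__ == "__main__":
    for package in ("en_core_web_md-3.3.0", "en_core_web_sm-3.6.0"):
        status = "supported" if is_supported(package) else "not supported"
        print(f"{package}: {status}")
```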

So please check your settings and provide an update, since it is a bit hard to tell what is wrong.


Hello @Artem

Thanks for getting in touch with me. Yes, it is hard to debug …

I downloaded the suggested workflow and changed the Python environment to “Bundled”

and the workflow is now running, with some errors:

WARN LoadWorkflowRunnable Warnings during load: Status: Warning: 12_07_2023_KNIME_Refield_Sate_of_the_Art 3 loaded with warnings
WARN LoadWorkflowRunnable Status: Warning: 12_07_2023_KNIME_Refield_Sate_of_the_Art 3
WARN LoadWorkflowRunnable Status: Warning: BERT Model Selector 3:4598
WARN LoadWorkflowRunnable Status: Warning: State has changed from IDLE to CONFIGURED
WARN Spacy Tokenizer 3:4613:0:1321 C:\KNIME\plugins\se.redfield.textprocessing.channel.bin.win32.x86_64_1.1.2.202212230250\env\lib\site-packages\spacy\util.py:865: UserWarning: [W095] Model ‘en_core_web_md’ (3.3.0) was trained with spaCy v3.3 and may not be 100% compatible with the current version (3.4.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
WARN Tag Filter 3:4613:0:1325:0:1323 Selected tag type “SPACY_POS_en_core_web_md-3.3.0” could not be found.
This might happend because it is either a dynamic TagSet that is undefined for the selected column ‘Processed document’
or it is a TagSet defined by a missing language extension!
Install additional language extensions at File->Install KNIME Extensions.
WARN Spacy POS Tagger 3:4613:0:1325:0:149 C:\KNIME\plugins\se.redfield.textprocessing.channel.bin.win32.x86_64_1.1.2.202212230250\env\lib\site-packages\spacy\util.py:865: UserWarning: [W095] Model ‘en_core_web_md’ (3.3.0) was trained with spaCy v3.3 and may not be 100% compatible with the current version (3.4.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
WARN Spacy Lemmatizer 3:4613:0:1325:0:150 C:\KNIME\plugins\se.redfield.textprocessing.channel.bin.win32.x86_64_1.1.2.202212230250\env\lib\site-packages\spacy\util.py:865: UserWarning: [W095] Model ‘en_core_web_md’ (3.3.0) was trained with spaCy v3.3 and may not be 100% compatible with the current version (3.4.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
WARN Spacy Vectorizer 3:4613:0:1325:0:1334 C:\KNIME\plugins\se.redfield.textprocessing.channel.bin.win32.x86_64_1.1.2.202212230250\env\lib\site-packages\spacy\util.py:865: UserWarning: [W095] Model ‘en_core_web_md’ (3.3.0) was trained with spaCy v3.3 and may not be 100% compatible with the current version (3.4.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate

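
Most of the lines above are the same spaCy `[W095]` warning repeated once per node. The snippet below is a rough sketch for condensing such log lines into the distinct model and spaCy version combinations they mention. The regular expression is written against the warning text exactly as it appears in this post (it accepts both curly and straight quotes), and `summarize_w095` is a hypothetical helper, not part of KNIME or spaCy.

```python
import re

# Matches the W095 text as quoted in this thread; quotes may be curly or straight.
W095_PATTERN = re.compile(
    r"\[W095\] Model [‘'](?P<model>[^’']+)[’'] \((?P<model_version>[^)]+)\) "
    r"was trained with spaCy v(?P<trained_with>[\d.]+) and may not be 100% "
    r"compatible with the current version \((?P<current>[^)]+)\)"
)


def summarize_w095(lines):
    """Return the set of (model, model version, trained-with, current spaCy) tuples."""
    found = set()
    for line in lines:
        match = W095_PATTERN.search(line)
        if match:
            found.add(match.group("model", "model_version", "trained_with", "current"))
    return found
```

Applied to the log above, this would reduce the four W095 lines to a single combination, which makes it easier to see that they all describe the same model and spaCy version pairing.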

So far I do not see any errors here. Is it possible for you to share the workflow?

Another thing you can try is to run the nodes before the Tag Filter node and reconfigure that node afterwards; in that case the tag set can be fetched.

I now have the workflow running with the “Bundled” Python environment configuration and KNIME 4.7.2…
As you maybe saw below the “News relevance webinar” my comment 1 month ago, it was figure some

I am using the workflow from here

News relevance webinar – KNIME Community Hub

to understand and learn how KNIME and spaCy work together.

Maybe you can give me a starting point for more background information. With only the workflows to go on, it is very time-consuming to understand how things work and what the output should be.

I guess if you want to have an overview of the Spacy nodes you can take a look at this workflow:
