spaCy Tokenizer error

Hi everyone, I am executing this workflow and I get the following error.

ERROR Spacy Tokenizer      4:308      Execute failed: Executing the Python script failed: Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "C:\Program Files\KNIME\plugins\se.redfield.textprocessing_1.1.1.202212060306\py\SpacyNlp.py", line 59, in run
    result_table =  nlp.process_table(input_table, column)
  File "C:\Program Files\KNIME\plugins\se.redfield.textprocessing_1.1.1.202212060306\py\SpacyNlp.py", line 22, in process_table
    for batch in input_table.batches():
AttributeError: 'ArrowDataSource' object has no attribute 'batches'
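For context, the `batches()` call in `SpacyNlp.py` simply iterates the input table in chunks, and the error means the object KNIME 4.7 now hands to the script no longer exposes that method. A minimal pure-Python sketch of the batching pattern the plugin relied on (the names `batches` and `batch_size` here are illustrative, not the actual KNIME API):

```python
def batches(rows, batch_size=3):
    """Yield successive fixed-size chunks of a row sequence,
    mimicking the table.batches() iteration the plugin used."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

rows = [f"document {i}" for i in range(7)]
chunks = list(batches(rows))
# 7 rows with batch_size=3 -> chunks of sizes 3, 3, 1
```

Whatever object replaces `ArrowDataSource` would need to provide an equivalent chunked iterator, which is what the plugin fix has to adapt to.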

I have Anaconda installed on my PC with the spaCy library, in an environment synchronized with KNIME.

Any help would be appreciated,

Cheers

Hi,

I have had the same problem since this morning. I use KNIME 4.7.
My workflow worked fine yesterday, but as of today the same workflow, with the same data, breaks…

Best regards,

Paul

Hello @mauuuuu5 and @goodvirus,

Thank you for reporting this issue.
We are working on a fix for this problem in 4.7.

Best regards,
Adrian

Hi nemad,

this is great! By the way, there also seems to be an error in the BERT nodes, which seems to be related:

ERROR BERT Classification Learner 5:392      Execute failed: Executing the Python script failed: Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert_1.0.1.202212060241\py\BertClassifier.py", line 102, in run_train
    classifier.train(input_table, class_column, batch_size, epochs, optimizer, progress_logger, fine_tune_bert, validation_table, validation_batch_size)
  File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert_1.0.1.202212060241\py\BertClassifier.py", line 54, in train
    self.model.fit(x=[ids, masks, segments], y=y_train,epochs=epochs, batch_size=batch_size,
  File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\spo\AppData\Local\Temp\__autograph_generated_file_1amc1d7.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\training.py", line 890, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\training.py", line 948, in compute_loss
        return self.compiled_loss(
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\losses.py", line 139, in __call__
        losses = call_fn(y_true, y_pred)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\losses.py", line 243, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\losses.py", line 1930, in binary_crossentropy
        backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
    File "C:\Users\spo\Desktop\KNIME\KNIME\plugins\se.redfield.bert.channel.bin.win32.x86_64_1.0.1.202212060241\env\lib\site-packages\keras\backend.py", line 5283, in binary_crossentropy
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)

    ValueError: `logits` and `labels` must have the same shape, received ((None, 19) vs (None, 1)).
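The mismatch `(None, 19)` vs `(None, 1)` says the classifier head produces 19 logits per row while the labels arrive as a single column, so the binary cross-entropy loss cannot pair them up. For a multi-class target, the labels would need to be expanded to the same width as the logits, roughly as in this hedged pure-Python sketch (the 19-class width is assumed from the traceback; the helper name `one_hot` is illustrative, not part of the node's API):

```python
NUM_CLASSES = 19  # width of the model's output layer, per the traceback

def one_hot(label_indices, num_classes=NUM_CLASSES):
    """Expand integer class labels (conceptually shape (n, 1)) into
    one-hot rows of width num_classes so they match the logits' shape."""
    return [
        [1.0 if j == idx else 0.0 for j in range(num_classes)]
        for idx in label_indices
    ]

labels = [0, 5, 18]        # single-column integer labels
targets = one_hot(labels)  # now shaped (3, 19), matching (None, 19) logits
```

Alternatively, Keras's sparse categorical cross-entropy accepts integer labels directly; either way, the loss and the label encoding have to agree, which is presumably what the node's fix addresses.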
Hello goodvirus,

Yes, that’s quite likely, and I’ll look into that problem next.
For spaCy there is already a fix under review (Adapt to backend changes by AtR1an · Pull Request #8 · RedfieldAB/spacy · GitHub), and I hope to get it out soon.

Best,
Adrian

Quick update: The bugfix releases for the BERT and spaCy extensions are now available on the 4.7 KNIME Community Extensions (Experimental) update site.

Merry Christmas,
Adrian
