Strings To Document - NullPointerException

Hi community and KNIME team,

When I try to use the “Strings To Document” node, a “NullPointerException” error always occurs.

I’m using KNIME Analytics Platform 4.3.2.

Can you help me? Thanks in advance.

The knime.log output follows:

2021-05-10 23:02:12,425 : ERROR : KNIME-Worker-90-Strings To Document 4:4 :  : Node : Strings To Document : 4:4 : Execute failed: ("NullPointerException"): null

java.lang.NullPointerException
at org.knime.ext.textprocessing.TextprocessingCorePlugin.resolvePath(TextprocessingCorePlugin.java:122)
at org.knime.ext.textprocessing.util.OpenNlpModelPaths.getEnTokenizerModelFile(OpenNlpModelPaths.java:118)
at org.knime.ext.textprocessing.nodes.tokenization.tokenizer.word.OpenNlpEnglishWordTokenizer.<init>(OpenNlpEnglishWordTokenizer.java:81)
at org.knime.ext.textprocessing.nodes.tokenization.tokenizer.word.OpenNlpEnglishWordTokenizerFactory.getTokenizer(OpenNlpEnglishWordTokenizerFactory.java:73)
at org.knime.ext.textprocessing.nodes.tokenization.TokenizerPool.<init>(TokenizerPool.java:97)
at org.knime.ext.textprocessing.nodes.tokenization.DefaultTokenization.createTokenizerPool(DefaultTokenization.java:73)
at org.knime.ext.textprocessing.nodes.tokenization.DefaultTokenization.lambda$1(DefaultTokenization.java:92)
at java.util.HashMap.computeIfAbsent(HashMap.java:1127)
at org.knime.ext.textprocessing.nodes.tokenization.DefaultTokenization.getWordTokenizer(DefaultTokenization.java:92)
at org.knime.ext.textprocessing.data.DocumentBuilder.<init>(DocumentBuilder.java:135)
at org.knime.ext.textprocessing.nodes.transformation.stringstodocument.StringsToDocumentCellFactory2.getCells(StringsToDocumentCellFactory2.java:152)
at org.knime.core.data.container.RearrangeColumnsTable.calcNewCellsForRow(RearrangeColumnsTable.java:568)
at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:787)
at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:1)
at org.knime.core.util.MultiThreadWorker$ComputationTask$1.call(MultiThreadWorker.java:442)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:334)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:210)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)

Debug Console Log

INFO  KNIMECorePlugin                 Setting console view log level to DEBUG
DEBUG NodeContainerEditPart            Strings To Document 4:4 (CONFIGURED)
DEBUG ExecuteAction                   Creating execution job for 1 node(s)...
DEBUG NodeContainer                   Strings To Document 4:4 has new state: CONFIGURED_MARKEDFOREXEC
DEBUG NodeContainer                   Strings To Document 4:4 has new state: CONFIGURED_QUEUED
DEBUG NodeContainer                   testes 4 has new state: EXECUTING
DEBUG Strings To Document  4:4        Strings To Document 4:4 doBeforePreExecution
DEBUG Strings To Document  4:4        Strings To Document 4:4 has new state: PREEXECUTE
DEBUG Strings To Document  4:4        Adding handler 24817332-5f53-4b0f-a462-40ec0a237d23 (Strings To Document 4:4: <no directory>) - 2 in total
DEBUG Strings To Document  4:4        Strings To Document 4:4 doBeforeExecution
DEBUG Strings To Document  4:4        Strings To Document 4:4 has new state: EXECUTING
DEBUG Strings To Document  4:4        Strings To Document 4:4 Start execute
DEBUG Strings To Document  4:4        Creating buffered document file store cell factory!
DEBUG NodeContainer                   ROOT  has new state: EXECUTING
DEBUG Strings To Document  4:4        Using Table Backend "BufferedTableBackend".
DEBUG Strings To Document  4:4        KNIME Buffer cache statistics:
DEBUG Strings To Document  4:4        	1 tables currently held in cache
DEBUG Strings To Document  4:4        	18 distinct tables cached
DEBUG Strings To Document  4:4        	17 tables invalidated successfully
DEBUG Strings To Document  4:4        	0 tables dropped by garbage collector
DEBUG Strings To Document  4:4        	0 cache hits (hard-referenced)
DEBUG Strings To Document  4:4        	18 cache hits (softly referenced)
DEBUG Strings To Document  4:4        	0 cache hits (weakly referenced)
DEBUG Strings To Document  4:4        	0 cache misses
DEBUG Strings To Document  4:4        Initializing tokenizer pool with 10 tokenizers.
DEBUG Strings To Document  4:4        Initializing tokenizer pool with 10 tokenizers.
DEBUG Strings To Document  4:4        Using table format org.knime.core.data.container.DefaultTableStoreFormat
DEBUG Strings To Document  4:4        reset
ERROR Strings To Document  4:4        Execute failed: ("NullPointerException"): null
DEBUG Strings To Document  4:4        Strings To Document 4:4 doBeforePostExecution
DEBUG Strings To Document  4:4        Strings To Document 4:4 has new state: POSTEXECUTE
DEBUG Strings To Document  4:4        Strings To Document 4:4 doAfterExecute - failure
DEBUG Strings To Document  4:4        reset
DEBUG Strings To Document  4:4        clean output ports.
DEBUG Strings To Document  4:4        Removing handler 24817332-5f53-4b0f-a462-40ec0a237d23 (Strings To Document 4:4: <no directory>) - 1 remaining
DEBUG Strings To Document  4:4        Strings To Document 4:4 has new state: IDLE
DEBUG Strings To Document  4:4        Creating buffered document file store cell factory!
DEBUG Strings To Document  4:4        Configure succeeded. (Strings To Document)
DEBUG Strings To Document  4:4        Strings To Document 4:4 has new state: CONFIGURED
DEBUG Strings To Document  4:4        testes 4 has new state: CONFIGURED
DEBUG NodeContainer                   ROOT  has new state: IDLE
DEBUG NodeTimer$GlobalNodeStats            Successfully wrote node usage stats to file: /home/natanael/works/knime-workspace.4.3.2/.metadata/knime/nodeusage_3.0.json

Hi @natanaeldgsantos -

Is this happening with the Strings to Document node all the time, or only with a particular CSV file? It seems like the tokenizer is running into a problem.

Would it be possible for you to provide a sample CSV file that reliably reproduces the problem, for our devs to investigate further?
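
For anyone reading the trace: the exception is thrown in `TextprocessingCorePlugin.resolvePath`, called from `OpenNlpModelPaths.getEnTokenizerModelFile` while the tokenizer pool is being created. The log does not show which value was null. One common way a bare NPE like this arises in Java is a lookup that returns null for something it cannot find, followed by a call on that result. The sketch below only illustrates that general pattern. It is not KNIME's code, and `ModelLookup`, its methods, and the paths are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in KNIME.
final class ModelLookup {
    // Stand-in for files that ship with an installation.
    private static final Map<String, String> BUNDLED = new HashMap<>();
    static {
        BUNDLED.put("models/en-token.bin", "/opt/app/models/en-token.bin");
    }

    /** Unchecked lookup: returns null when the entry is missing. */
    static String find(String relativePath) {
        return BUNDLED.get(relativePath);
    }

    /** Calls a method on the lookup result without a null check. */
    static int unsafePathLength(String relativePath) {
        return find(relativePath).length(); // NPE here if find() returned null
    }

    /** Checked variant: fails with a message that names what was missing. */
    static String resolveOrExplain(String relativePath) {
        String path = find(relativePath);
        if (path == null) {
            throw new IllegalStateException("Bundled file not found: " + relativePath);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(resolveOrExplain("models/en-token.bin"));
        try {
            unsafePathLength("models/missing.bin");
        } catch (NullPointerException e) {
            System.out.println("Unchecked lookup failed with a NullPointerException");
        }
    }
}
```

If something similar is going on here, the input data would not matter, which would fit an error that shows up with every data set.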

Hi Scott,

Yes, the error occurs every time, with several different columns from different data sets.

An example workflow is attached:

testes.knwf (3.5 MB)

I’m not able to reproduce the error with the workflow you’ve given. Having said that, I have an idea about what it might be…

You are currently using the default configuration for the Strings to Document node, which is using the “tweete” field for the Title Column, Full Text, and Authors. I would suggest using an empty string for the Title Column (since you don’t really have a title in the dataset that I can see) and the “nickname” field for the Authors.

In particular, we’ve recently seen the Authors field throw NPEs when it is fed data that is relatively large. So if you still have problems with the node you can try unchecking the Use authors from column box altogether.

Does that help?

EDIT: I added a ticket in our system to address this issue (AP-16723), with a +1 from you.

Hi @ScottF, thank you for your support.

I applied all your suggestions, but the same error still occurs. It also persists when I use other data sets.

So, at least in this version of KNIME Analytics Platform, I am unable to use this feature.

I will wait for a response on the ticket.

Thank you so much.

If you uncheck the Use authors from column box, does it then work? Or do you get an NPE in all cases?

Yes, I tried that option. The NPE occurs in all cases.

Very strange. I’m using AP 4.3.2 in Windows 10, and I loaded up your workflow and can’t reproduce. Are you on Linux perhaps? If so, what distribution?

I’m just thinking out loud here, but the other thing to try might be to change the encoding in the CSV Reader. I’m using the OS Default - maybe you are using something else?
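
As a rough illustration of why the reader's encoding setting can matter across operating systems, the Java sketch below decodes the same bytes with two different charsets. It is a generic example and not part of the CSV Reader node; the `EncodingCheck` class is made up for this post.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not related to any KNIME class.
final class EncodingCheck {
    /** Decodes raw bytes with an explicitly chosen charset. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] raw = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // The platform default differs between machines and JVM versions.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Decoding with the charset the bytes were written in gives one character.
        System.out.println("As UTF-8: " + decode(raw, StandardCharsets.UTF_8).length() + " char(s)");

        // Decoding the same bytes as ISO-8859-1 gives two unrelated characters.
        System.out.println("As ISO-8859-1: " + decode(raw, StandardCharsets.ISO_8859_1).length() + " char(s)");
    }
}
```

Setting the encoding explicitly in the reader, rather than relying on the OS default, takes this difference between machines out of the picture.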

Yes, Ubuntu Linux 18.04.5 LTS

Hi @ScottF, after installing the new version 4.3.3, the error has been resolved.

Thank you