error node in topic extraction

begumkaplan · May 10, 2020, 6:17pm

Hello,

I keep on getting an error in the topic extraction workflow. I attached the screenshot of the error along with my workflow and would be glad if you can help me with that. Thanks!

topic extraction beg 1.knar.knwf (328.5 KB)

ScottF · May 11, 2020, 4:30pm

Take a look inside the preprocessing component (which you can do by right-clicking and selecting Component --> Open, or by simply Ctrl-double clicking). You will see that as the error message suggests, there is a break in the workflow between the Stanford Lemmatizer and the Number Filter.

begumkaplan · May 11, 2020, 5:03pm

Oh thank you! I corrected it and now it is working however after the first preprocessing, now I am having problems with the case converter node. I want to include only the preprocessed document which I attached the screenshot but then after the case converter node, there is a stop filter node and it does not give me a choice to select the preprocessed document. I could only select “document” there and when I did that then I got error from the second preprocessed node. I attached the screenshots for those. I would be glad if you can help me with these as well.

ScottF · May 11, 2020, 9:07pm

It turns out that there are two Case Converter nodes - one for dealing with strings in KNIME data tables (from KNIME Core), and the other for handling text in documents (from the KNIME TextProcessing Extension). You want the latter:

2020-05-11 16_06_44-KNIME Analytics Platform

Sorry for the confusion.

begumkaplan · May 11, 2020, 11:59pm

I see! I used the latter one but still having problems with the exclusion. I want to exclude two words “http” and “.html” in which both do not have meanings but occurring a lot in the document. However, even though I included them in the stop filter node, I am still seeing those among the topic terms. Not sure where I am doing wrong. I attached the screenshot along with my workflow. I really appreciate your help.

topic extraction beg 1.knar.knwf (331.1 KB)

ScottF · May 12, 2020, 7:38pm

Looks like it’s because you are creating a column called “new column” in the Stop Word Filter. You want to replace the existing Preprocessed Document column instead.

begumkaplan · May 12, 2020, 9:35pm

Thank you very much! Now it works. I also have one more question related to this workflow. How can I create a tag cloud for each topic term? Currently, it is creating only one tag cloud and having few terms from each topic. Thanks in advance!

ScottF · May 13, 2020, 10:07pm

Here is the result data table from the final component. This is what you need, yes? Maybe you were looking at the interactive view of the component (which only shows one tag cloud at a time)?

begumkaplan · May 13, 2020, 11:48pm

Exactly, this is what I am looking for. Thank you! Is there a way to view this in a large format with a better image quality?

ScottF · May 14, 2020, 8:18pm

If you drag the edges of the rows/columns in the table shown above, the images will adjust in size to fit the space.

system · June 2, 2023, 9:42pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.