I’ve just run the example - but the stopword filter doesn’t appear to work - topics include: his, her, and, the etc. The filter option is on Stopword lists: English. Other options except Case sensitive are greyed out. (Use built-in list is ticked, and greyed out). No error messages in the Console. I’m a complete newbie so would appreciate being pointed in the right direction. Thank you.

Hi @StephenG and welcome to the forum -

I’m not sure exactly which workflow you’re describing, but I’ll attempt to answer anyway.

A common issue with the text processing nodes occurs when selecting the appropriate document column. You’ll notice that the first time you apply such a node in a workflow, you’ll be presented with an option to append a new downstream column:

You’ll want to be careful when doing any downstream processing that you select the Preprocessed Document column for any subsequent nodes. For example, in the Document Viewer node:

If you’re inconsistent in your selections, then sometimes it will seem like the text processing nodes aren’t working.

Thanks Scott. The workflow comes with the initial download:
I assumed it would run without needing to configure since it’s a demo. But now I’ve learned some things!

If you run it as is, the output includes stopwords. The Stop Word Filter is set to Append column: Preprocessed Document, but the Topic Extractor is set to read the Document column. Changing the Topic Extractor to read the Preprocessed Document column, as you indicated, results in output without stopwords. A colleague suggested changing the output of the Stop Word Filter to ‘Replace column’, and this also works.
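
For anyone who finds the append/replace distinction easier to see in code, here is a rough plain-Python analogy. It is not KNIME code and does not use any KNIME API: the table is just a list of dicts, and `stop_word_filter`, `terms`, and the tiny `STOPWORDS` set are hypothetical names made up for illustration. It only shows why a downstream step still sees stopwords when it reads the original column.

```python
# Plain-Python analogy only; none of these names come from KNIME.
STOPWORDS = {"his", "her", "and", "the"}  # tiny illustrative list


def stop_word_filter(rows, source="Document", append_as="Preprocessed Document"):
    """Remove stopwords from `source`.

    With `append_as` set, the result goes into a new column and the original
    column is left untouched. With `append_as=None`, the original column is
    replaced.
    """
    target = append_as or source
    out = []
    for row in rows:
        kept = [w for w in row[source].split() if w.lower() not in STOPWORDS]
        out.append({**row, target: " ".join(kept)})
    return out


def terms(rows, column):
    """Stand-in for a downstream step that reads one document column."""
    return [w for row in rows for w in row[column].split()]


rows = [{"Document": "the king and his crown"}]

# Append mode: both columns exist, so the downstream choice matters.
appended = stop_word_filter(rows)
wrong = terms(appended, "Document")               # unfiltered original
right = terms(appended, "Preprocessed Document")  # filtered copy

# Replace mode: only one column exists, so there is nothing to get wrong.
replaced = stop_word_filter(rows, append_as=None)
also_right = terms(replaced, "Document")

print(wrong)
print(right)
print(also_right)
```

In this sketch, append mode keeps the original text around for comparison but relies on every later step picking the new column, while replace mode removes that choice entirely.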


Ah, the Example Workflows folder, of course. I was thinking about workflows on the Hub, and forgot about what was right under my nose. 🙂

Glad you got it to work. It sounds like we need to tweak the Example workflows a little bit, since I noticed a deprecated node in there as well. Thanks for pointing this workflow out!

