stop-word list is not working

Hello

I have a Hebrew stop word list, and for some reason KNIME does not filter my documents based on this list.

Any help will be great!

I added the stopword list

Hi dimleyk,

this was mentioned in another thread. It seems that the Stop Word Filter is not working properly with custom lists: loading the lists appears to fail. With the built-in lists the node works fine. I have to investigate this further and will reply as soon as I know more.

Cheers, Kilian

Any other ideas on how to work around this?

Is it possible to filter by dictionary, or to tag words using the Dictionary Tagger and then filter those words?

Yes, there is a workaround. Read the file with the "File Reader", then use the Dictionary Tagger to tag the words, and then use the Tag Filter to filter them out. Make sure that the words are not set unmodifiable by the Tagger, or ignore the unmodifiability flag in the Tag Filter.
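To make the logic of the workaround concrete, here is a minimal sketch in plain Python. It is not KNIME's API; the function names `tag_terms` and `filter_tagged`, the `STOPWORD` tag, and the sample words are all made up for illustration. It only mirrors the two steps: tag every term found in the dictionary, then drop the tagged terms.

```python
def tag_terms(tokens, dictionary, tag="STOPWORD"):
    """Pair each token with `tag` if it is in the dictionary, else with None.

    Hypothetical stand-in for the tagging step; not a KNIME function.
    """
    return [(token, tag if token in dictionary else None) for token in tokens]


def filter_tagged(tagged_tokens, tag="STOPWORD"):
    """Keep only the tokens that do not carry `tag` (the filtering step)."""
    return [token for token, token_tag in tagged_tokens if token_tag != tag]


# Illustrative data only: a tiny Hebrew stop word set and a tokenized sentence.
stop_words = {"של", "את", "על"}
tokens = ["הספר", "של", "דני", "על", "השולחן"]

tagged = tag_terms(tokens, stop_words)
result = filter_tagged(tagged)
print(result)
```

The match here is an exact string comparison, so a term only gets tagged if it is spelled exactly as in the list.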

The bug will be fixed in the next version.

Cheers, Kilian

Hi there,

I was just testing the Stop Word Filter and noticed it was not working, so I searched the forum and ended up here. I do have the latest version of KNIME, from Dec/2016, and the Stop Word Filter is not working with an external file.

Any thoughts on this?

Thanks!

Gustavo

I think someone has now (3.3.2) corrected the Stop Word Filter node; it works well with external files. Thanks a lot to whoever fixed it (Kilian?). Please note, however, that for Turkish characters the flat text file should be saved as an ANSI file.

Yes, the issue got fixed already and custom lists can be used in the latest version.

The node uses the default character encoding, which can be set in the knime.ini file by adding the following line:

-Dfile.encoding=UTF-8
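As a rough illustration of why the encoding matters, the following plain Python sketch (again not KNIME code) writes a small list as UTF-8 and reads it back twice: once as UTF-8 and once as cp1254, the Windows code page usually meant by "ANSI" on Turkish systems. The helper `load_stop_words`, the file name, and the sample words are assumptions made for this example.

```python
import os
import tempfile


def load_stop_words(path, encoding):
    """Read one stop word per line, decoding the file with `encoding`.

    Hypothetical helper; bytes that cannot be decoded are replaced
    rather than raising an error.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        return {line.strip() for line in handle if line.strip()}


# Illustrative Turkish stop words containing non-ASCII characters.
words = ["için", "çok", "şu"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "stopwords.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(words))

    matching = load_stop_words(path, "utf-8")      # same encoding as the file
    mismatched = load_stop_words(path, "cp1254")   # different encoding

print("çok" in matching)
print("çok" in mismatched)
```

With the mismatched encoding the list still loads without an error, but the entries no longer equal the terms in the documents, so nothing is filtered. The file's encoding and the encoding used to read it have to agree.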

Cheers, Kilian

Thanks a lot for the correction.

Hi, 
I still encounter a problem with the Stop Word Filter node. I updated KNIME and the analytics extension to the latest version and modified the knime.ini file.

Still, when I use the Stop Word Filter with a custom .txt stop word list, it doesn't work. I even tried .csv and .rtf.

I attached a sample workflow with a stop-word-filter.

Cheers, 
Paco 

Hello
Do you have a KNIME stemmer for Hebrew text?
Also, did you manage to filter out stop words in Hebrew?

Best
Malik

Hi @malik

no, unfortunately we don’t have a language pack for Hebrew yet.

However, the Stop Word Filter node provides an optional input port, through which you can specify a custom stop word list. You can, e.g., download a Hebrew stop word list somewhere and use it via this port.

Cheers
Kilian
