Results of Stanford Tagger get lost when using a Python script Node

#1

I am building a topic modelling workflow that uses the Stanford Tagger and Tag Filter nodes.

I wrote a stemming function in Python because I ran into some problems with the Snowball Stemmer.
However, after the text has been processed by the Python node, the results of the tagger are lost and the Tag Filter no longer works.

Is there a way to stop the Python node from deleting the results of the tagger?

My stemming code, to reproduce the problem:

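from nltk.stem.cistem import Cistem  # assumption: Cistem is NLTK's German stemmer

def Stem(Text, Url):
    stemmer = Cistem('deutsch')

    # Replace double quotes and hyphens with spaces
    t = [s.replace('"', ' ') for s in Text]
    o = [s.replace('-', ' ') for s in t]

    # Split each line into tokens
    sl = []
    for line in o:
        sp = line.split()
        sl.append(sp)

    # segment() returns (stem, suffix); keep only the stem
    st = [[stemmer.segment(s)[0] for s in l] for l in sl]

    # Join the stems back into one string per line
    sx = []
    for line in st:
        sx1 = ' '.join(line)
        sx.append(sx1)
    return sx  # return after the loop so every line is included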
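
For anyone who wants to try the text handling without the workflow or the stemmer installed, here is a minimal sketch of what the function does to each row. The `SuffixStripper` class and the `stem_lines` name are hypothetical stand-ins made up for this illustration, not part of any library, and the suffix rule is deliberately crude. The only thing it shares with the real stemmer is that `segment()` returns a `(stem, suffix)` pair. Note that the result is a list of plain Python strings.

```python
class SuffixStripper:
    """Hypothetical stand-in stemmer: segment() returns (stem, suffix)."""

    SUFFIXES = ("en", "er", "e", "n")

    def segment(self, word):
        # Strip the first matching suffix, but only from longer words
        for suffix in self.SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)], suffix
        return word, ""


def stem_lines(lines, stemmer):
    # Replace double quotes and hyphens with spaces, as in the post above
    cleaned = [line.replace('"', " ").replace("-", " ") for line in lines]
    # Split into tokens, keep only the stem of each, and rejoin per line
    return [
        " ".join(stemmer.segment(token)[0] for token in line.split())
        for line in cleaned
    ]


dummy = ['Die "Katzen" laufen', "Haus-Türen"]
result = stem_lines(dummy, SuffixStripper())
print(result)  # ['Die Katz lauf', 'Haus Tür']
```

Passing the stemmer in as an argument makes it easy to swap the stand-in for the real one when testing on a dummy dataset.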

#2

Hi @gnime -

A couple of questions for you:

  • What specific problems did you run into with the Snowball Stemmer?
  • Can you post a sample workflow, including your code above, that works on a dummy dataset? Then we can dig into the tagging issue a bit more.

Thanks!
