Strip each word in a list of words from each document in a list of documents

I'm trying to update each document in a list of documents by removing, in place, every occurrence of each word in a list of words (stripping the word). The equivalent of:

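for (my $i = 0; $i < @documents; $i++) {   # @documents in numeric context is the element count
     my $document = $documents[$i];
     foreach my $strip_word (@strip_words) {
         $document =~ s/$strip_word//g;    # /g removes every occurrence, not just the first
     }
     $documents[$i] = $document;
}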

The list of documents comes from a Stop Word Filter and the list of strip words comes from a GroupBy. I've tried nested loops using String Manipulation, but that produces a separate row for every word/document pair (106 words × 1345 documents), so I end up with 142,000+ rows instead of 1345 cleaned documents. I've tried loop column appender, but that doesn't work. I basically need to feed each document back in on itself, so that every replacement is applied to the result of the previous one. Do I need to use a Java Snippet for this? (From what I understand, that only works at the row level as well.)

You would need a recursive loop for this, as I understand it. I'm not sure how well documents are supported there, but probably well. :) (A Java Snippet might be more performant, though, as a preprocessing step for the documents.)

Cheers, gabor
 

I've tried to do this using Transpose (to turn the list of strip words into columns on each document row) and then a column list loop + String Manipulation to replace all instances of each Row# value in the document and reassign the result to the document. I was hoping something like this would work:

But I'm not sure how to get at the currentRowName value so that I can use it in the replace() function. I'm not sure how to use a recursive loop in this case either.

In the end I managed to achieve this through Transpose + Column Aggregator (Concatenate) + Perl Scripting:

 # Use methods below if you need to dynamically get values
 # @column_names = sort { custom sort here } grep {/^Row\d+/} keys(%column);
 # @column_names = grep {/^Row\d+/} keys(%column);
 # The Concatenate column holds all strip words joined with ";"
 my @strip_words = sort(split(";", $column{'Concatenate'}));
 my $document = $column{'Document'};
 foreach my $strip_word (@strip_words) {
   # \Q...\E escapes regex metacharacters; \b limits the match to whole words
   $document =~ s/\b\Q$strip_word\E\b//g;
 }
 return $document;

This seems to perform pretty well. I think a node that does something similar (and drops the Concatenate column so I don't have to add a Column Filter myself) would be really useful, maybe even with in-place replacement.
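For anyone who wants to avoid one substitution pass per strip word, here is a rough sketch of a possible variation: join all the words into a single alternation and run one substitution per document. It is written as a standalone Perl script rather than as Perl Scripting node code, so `strip_all` and the sample strings are made up for illustration, and the `;`-separated word list only mirrors the Concatenate column described above. I have not benchmarked it against the loop version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any node): remove every whole-word
# occurrence of the given strip words from one document string.
sub strip_all {
    my ($document, @strip_words) = @_;
    return $document unless @strip_words;    # nothing to strip

    # quotemeta escapes regex metacharacters in each word; sorting longest
    # first lets a longer word win over a shorter one it starts with.
    my $alternation = join '|',
        map  { quotemeta }
        sort { length($b) <=> length($a) } @strip_words;

    my $pattern = qr/\b(?:$alternation)\b/;  # compiled once per call
    $document =~ s/$pattern//g;
    return $document;
}

# Assumed input shape: strip words concatenated with ";".
my @strip_words = split /;/, 'quick;lazy';
print strip_all('the quick brown fox jumps over the lazy dog', @strip_words), "\n";
```

Note that, like the loop version, this leaves the surrounding whitespace in place, so a removed word leaves a double space behind. Also, `\b` only behaves as a word boundary when the strip word starts and ends with a word character.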