I have a table that I read in from a File Reader node. The table has 5 columns, one of which is named Word. I want to apply the Snowball Stemmer node to just the Word column. How can I do this? I tried using a Strings To Document node and then passing its output into Snowball Stemmer. The problem is that the output of the Snowball Stemmer node does not include the other 4 columns from my original table. It only includes the "Document" and "Original Document" columns.
Does anyone know how I can get around this?
As I work more with the Text Processing nodes, I've realized that this is a common "feature" of all the Preprocessing nodes: they end up dropping all the columns except Document and Original Document. How can I get them to keep the other columns, too?
I'm running into the same problem!
I can picture a workaround where I retrieve the original info by indexing against the document column (I assume this is possible!), but I don't understand why the original columns couldn't have been retained in the first place.
I would love to hear if you have a solution.
Unfortunately, the preprocessing nodes drop additional columns. Preprocessing nodes only work on document columns or term columns. The only way to work around this problem is to join the preprocessed output back to the original table on the original document, which adds back the columns that the preprocessing nodes filtered out. Sorry about that.
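For anyone who wants to see the join-back idea spelled out, here is a minimal sketch in plain Python rather than KNIME nodes. Everything in it is an assumption for illustration: `toy_stem` is a crude suffix stripper and not the Snowball algorithm, the `Count` and `Source` columns are made up, and `preprocess` only imitates a node that returns "Document" and "Original Document" and nothing else. Inside KNIME the equivalent step would presumably be a join between the preprocessed table and the original table.

```python
# Hypothetical stand-in for a stemmer; NOT the Snowball algorithm.
def toy_stem(word):
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Made-up original table: the Word column plus two extra columns.
rows = [
    {"Word": "running", "Count": 3, "Source": "a.txt"},
    {"Word": "jumped", "Count": 1, "Source": "b.txt"},
    {"Word": "cats", "Count": 7, "Source": "a.txt"},
]


def preprocess(table):
    # Imitates the behaviour described above: only the processed value and
    # the original value survive, every other column is dropped.
    return [
        {"Document": toy_stem(r["Word"]), "Original Document": r["Word"]}
        for r in table
    ]


def join_back(processed, original, key="Word"):
    # Build a lookup from the original value to its full row, then attach
    # the dropped columns to each processed row.
    lookup = {r[key]: r for r in original}
    joined = []
    for p in processed:
        extra = lookup[p["Original Document"]]
        joined.append({**extra, "Stemmed": p["Document"]})
    return joined


result = join_back(preprocess(rows), rows)
for row in result:
    print(row)
```

One caveat with this sketch: the lookup is keyed on the original value, so if the same word appears in several rows with different extra columns, only the last one is kept. Joining on a unique row identifier instead would avoid that.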