I was wondering if there is a way to remove POS tags from documents once they have served their purpose. I want to do this so that when I apply TF-IDF to the processed documents, occurrences of the same term that carry different POS tags are counted as one term rather than as separate terms.
This seems like a pretty logical thing to do, but I can't yet see any way to do it. There doesn't appear to be any 'tag stripper' node, and I don't know how to adapt any other node to perform the same function.
The simplest workaround seems to be to convert the documents into strings (via the Document Data Extractor node), then convert the strings back into documents. This isn't too painful, but if there's a simpler way, I'd love to know about it.
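
To make the goal concrete, here is a rough Python sketch of the effect I'm after. It is not KNIME code: the `(term, tag)` tuples, the sample document, and the helper names `strip_tags` and `term_frequencies` are all made up for illustration, and it only computes plain term frequency (the TF part), not full TF-IDF.

```python
from collections import Counter

# Hypothetical tagged document: (term, POS tag) pairs.
# This is an illustration only, not KNIME's internal data model.
tagged_doc = [("book", "NN"), ("book", "VB"), ("flight", "NN"), ("book", "NN")]


def strip_tags(tagged_terms):
    """Drop the POS tag from each (term, tag) pair, keeping only the term text."""
    return [term for term, _tag in tagged_terms]


def term_frequencies(terms):
    """Relative term frequency: occurrences of each term divided by total terms."""
    counts = Counter(terms)
    total = len(terms)
    return {term: count / total for term, count in counts.items()}


# With tags attached, "book" as a noun and "book" as a verb are separate keys.
tf_tagged = term_frequencies(tagged_doc)

# With tags stripped, both uses of "book" collapse into a single term.
tf_plain = term_frequencies(strip_tags(tagged_doc))

print(tf_tagged)
print(tf_plain)
```

With the tags in place, "book" is split across two entries; once they are stripped, it is a single entry with the combined frequency, which is the behaviour I want before the TF-IDF step.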