Whether I use the Reference Row Filter or the set operator to keep only the terms common to the training and test documents, one particular term still shows up in training but not in testing.
What could be the reason? For now I have to remove this term from the training set before I can build classification models. Could duplicate rows in the training set be the cause?
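
To make clear what I expect the filtering to do, here is a minimal sketch in Python rather than workflow nodes. The term lists and the helper names (`common_terms`, `train_only_terms`) are made up for illustration and are not my actual data. It assumes terms are compared as exact strings, which may not be how the nodes compare them.

```python
# Hypothetical term lists; the real ones come from the term tables
# of the training and test documents.
train_terms = ["price", "market", "growth", "market", "Growth "]
test_terms = ["price", "market", "growth"]


def common_terms(train, test):
    """Keep every training row whose term also occurs in the test terms.

    Comparison is by exact string equality, so duplicate training rows
    are kept as duplicates.
    """
    test_set = set(test)
    return [term for term in train if term in test_set]


def train_only_terms(train, test):
    """Return the distinct terms that occur in training but not in test."""
    return set(train) - set(test)


kept = common_terms(train_terms, test_terms)

# After filtering, nothing in the training terms should be missing from test.
leftover = train_only_terms(kept, test_terms)

# Terms removed by the filter; useful for spotting near-duplicates that
# differ only by case or whitespace.
dropped = train_only_terms(train_terms, test_terms)

print("kept:", kept)
print("leftover after filtering:", leftover)
print("dropped by filter:", dropped)
```

In this sketch the duplicate "market" row survives the filter but is still a common term, so duplicates alone would not leave a training-only term behind. That is why I am unsure whether duplicates explain what I am seeing.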