I am trying to adopt the Simple Document Classification Using Word Vectors workflow for a smallish data set of news articles.
Like in the provided example, I am creating a table with a content column and a label column. The table contains about 2200 rows (news articles) mapped to 10 categories and is used as the input to the Doc2Vec Learner. As in the example workflow, this is then followed by the Vocabulary Extractor node. So far so good.
When I run this and look at the Label output of the Vocabulary Extractor, however, I only get a single label instead of the expected 10. I don’t understand what is going on here, and I was not able to find more detailed relevant information in the documentation or the forum.
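One way to rule out the input table itself as the cause would be to count the distinct labels before they reach the Doc2Vec Learner. Below is a minimal sketch in plain Python, outside of KNIME; the sample rows, the label names, and the `label_counts` helper are all made up for illustration and are not part of the example workflow.

```python
from collections import Counter

# Hypothetical stand-in for the input table: (content, label) pairs.
# In the real data there would be about 2200 rows and 10 labels.
rows = [
    ("Stocks rallied on Monday", "business"),
    ("The team won the final", "sport"),
    ("A new phone was released", "tech"),
    ("Markets closed lower", "business"),
]


def label_counts(rows):
    """Count how many documents carry each label."""
    return Counter(label for _content, label in rows)


counts = label_counts(rows)
print(len(counts), "distinct labels")
for label, n in counts.most_common():
    print(label, n)
```

If a check like this shows all 10 labels in the table, the labels are presumably being lost somewhere between the table and the Vocabulary Extractor output rather than in the data.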
Btw, the only other change I made to the sample flow was to remove the “Rules” node in “Read Training Documents”, which as far as I understand just shortens the label names to a three-letter acronym and does not seem essential.
Any help would be greatly appreciated.