multi label text classification

metinergoktas · April 29, 2024, 7:26pm

Hi there,

I need to build a text classification model, but since some of the texts I will classify may belong to more than one class, some texts need to be assigned to more than one class as a result of classification. Can you guide me on how I can do this. If there is a sample flow or model about this, I would be very grateful if you can share it.

HansS · April 29, 2024, 7:58pm

Hi @metinergoktas

The first question that comes to my mind is, do you already know which classes you want to predict, or should the classes arise from the data?

gr. Hans

metinergoktas · April 29, 2024, 8:07pm

Hi @HansS

my classes are known. Classes are certain. But some instances may be member of more than class. I have not a certain model on my minds to accomplish these multi label classification task.

thanks.

HansS · April 29, 2024, 8:20pm

In that case (without knowing your use case) I would train a separate model for each class. So that each instance can belong to multiple classes.
Do you have a labeled dataset for every possible class outcome, that can be input to train your model?

gr. Hans

metinergoktas · April 29, 2024, 8:25pm

thanks for your kindly reply @HansS

Yes I have labeled data set that I could train a model. But I could not clearly understand your expression : “I would train a separate model for each class. So that each instance can belong to multiple classes.”

Could you give me an example.

HansS · April 29, 2024, 8:32pm

If you want to predict whether someone has an opinion about, for example, basketball, baseball, soccer or swimming, then that person (instance) can have an opinion about several sports (classes). My approach would then be to train a separate model for each sport that provides insight into whether people have an opinion about it or not.
Does this help?

metinergoktas · April 29, 2024, 8:55pm

So, you would convert to problem into binary classification for each class. If I have an 8 classes, I will train a model for each 8 classes then some instances may be the member of several classes. Thus, actually, we divide the multi label classification problems into several binary classification problems. Could I understand right?

system · May 7, 2024, 6:57pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.