Text Classifying/Text Mining Question - Improve Accuracy

Hello,

I am trying to classify text (ranging from 1 word to a few sentences) into ~8 different classifications. I am using the textclassifierlearner and textclassifierpredictor nodes. The classification isn't very accurate even though my learner has ~1300 examples of classified text. Do you have any tips to increase my accuracy or any suggestions for another node that would work better?

Thank you!

In the following, I assume that you're talking about the Palladian text classifier.

  • Have you played with different feature settings?
  • Have you tried different preprocessing options?
  • Have you tried different scorers?
  • Are the classes equally balanced?
  • How do you assess accuracy? Do you look at the result, or do you use a test set?
  • Have you checked for frequent confusions? (Knowing which categories are often confused with each other helps you optimize your settings.)
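
To make the last three points concrete, here is a minimal sketch in plain Python (standard library only) of how one might check class balance, measure accuracy on a held-out test set, and list the most frequent confusions. It is not part of Palladian or KNIME; the function names and the example labels ("billing", "shipping", "returns") are hypothetical and only stand in for your own categories and your predictor's output.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the labelled examples."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def accuracy(actual, predicted):
    """Fraction of held-out examples whose predicted class is correct."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def top_confusions(actual, predicted, n=3):
    """Most frequent (actual, predicted) pairs where the two differ."""
    errors = Counter(
        (a, p) for a, p in zip(actual, predicted) if a != p
    )
    return errors.most_common(n)


# Hypothetical held-out test labels and classifier output (assumed data).
actual = ["billing", "billing", "shipping", "shipping",
          "returns", "returns", "billing", "shipping"]
predicted = ["billing", "shipping", "shipping", "shipping",
             "billing", "returns", "shipping", "shipping"]

print(class_balance(actual))
print(accuracy(actual, predicted))
print(top_confusions(actual, predicted, n=1))
```

In this made-up example, "billing" is misclassified as "shipping" more often than any other error occurs, which would suggest looking at how those two categories overlap before changing anything else.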

In general, despite its simplicity, the PTC works very well compared to much more complicated setups. But if your goal is to squeeze out the last few percent or per mille of accuracy, the PTC is probably not the right tool for you.

Can you describe in more detail what kind of data you want to classify?

-- Philipp

Can we use the Palladian Text Classifier when we have 5 classes? For example, we want to predict whether a sentence is assertive, directive, commissive, declarative, or expressive.

Can we use the Palladian Text Classifier when we have 5 classes?

Yes.