I’ll try to browse the Hub, thanks.
The dataset I’m using consists of 3 columns:
-polarity: the value is either 1 (for reviews that had 1 or 2 stars) or 2 (for reviews that had 4 or 5 stars); reviews with 3 stars were eliminated.
-title: the title the reviewer gave the review
-body: the full text of the review
My idea was to train an algorithm that could recognize the “sub-topics” in the body column (e.g. the review might be about some headphones, and in the text the reviewer could point out negative/positive aspects of individual components: “bad construction”, “good sound quality”, etc.).
So I would need something to split the body into sentences and then recognize the subject of each one and the related “subjectivity/opinion words”.
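
To make this more concrete, here is a rough sketch of the kind of output I have in mind, using only the Python standard library. The aspect list, the opinion lexicon, and the function names are all made up for illustration, and the keyword matching is just a stand-in: a real solution would presumably need a trained model or a proper parser instead.

```python
import re

# Hypothetical, hand-written vocabularies, purely for illustration.
ASPECTS = {"construction", "sound", "battery", "cable"}
OPINIONS = {"bad": "negative", "poor": "negative",
            "good": "positive", "great": "positive"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def extract_pairs(body):
    """Return (aspect, opinion word, label) triples, sentence by sentence."""
    pairs = []
    for sentence in split_sentences(body):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        aspect_positions = [i for i, t in enumerate(tokens) if t in ASPECTS]
        for i, tok in enumerate(tokens):
            if tok in OPINIONS and aspect_positions:
                # Attach the opinion word to the closest aspect in the sentence.
                nearest = min(aspect_positions, key=lambda j: abs(j - i))
                pairs.append((tokens[nearest], tok, OPINIONS[tok]))
    return pairs

review = "The construction is bad. Sound quality is good, though!"
print(extract_pairs(review))
# [('construction', 'bad', 'negative'), ('sound', 'good', 'positive')]
```

The obvious limits are that it only finds words already in the two lists and that "closest word" is a crude way to link an opinion to its subject, which is exactly the part I would like a model to learn.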