I’m sorry to bother you, but I’m feeling particularly dumb and stuck. I hope you can help me.
I have tweets about AI.
I would like to analyse the tweets in two separate groups:
- Group 1 where the word ‘artificialintelligence’ is mentioned but NOT the word ‘machinelearning’.
- Group 2 where ‘machinelearning’ is mentioned but NOT the word ‘artificialintelligence’.
How do I achieve this?
Many thanks for your help!!
The Rule Engine node can help you out here:
$column1$ LIKE "*artificialintelligence*" AND NOT $column1$ LIKE "*machinelearning*" => "GROUP1"
$column1$ LIKE "*machinelearning*" AND NOT $column1$ LIKE "*artificialintelligence*" => "GROUP2"
TRUE => "UNK"
You can subsequently filter or split the desired group.
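If it helps to see the logic outside the Rule Engine, here is a minimal Python sketch of the same three rules. It is only an illustration: the function name `classify` is made up for this example, and it assumes a plain, case-sensitive substring test on the tweet text.

```python
def classify(tweet: str) -> str:
    """Mirror the three rules above; the first matching rule wins."""
    # Assumption: plain case-sensitive substring checks on the raw tweet text.
    has_ai = "artificialintelligence" in tweet
    has_ml = "machinelearning" in tweet

    if has_ai and not has_ml:
        return "GROUP1"
    if has_ml and not has_ai:
        return "GROUP2"
    # Catch-all, like the TRUE => "UNK" rule: both terms present, or neither.
    return "UNK"


if __name__ == "__main__":
    for text in [
        "#artificialintelligence is changing everything",
        "new #machinelearning course",
        "#artificialintelligence and #machinelearning together",
        "nothing relevant here",
    ]:
        print(classify(text), "|", text)
```

Tweets that mention both terms, or neither, fall through to "UNK", which is why you filter or split on GROUP1 and GROUP2 afterwards.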
THIS IS ABSOLUTELY BRILLIANT!!! Thank you very much!!!
I can see this working on string variables (each tweet).
Will it also work on preprocessed documents, where I have coded the various versions of ‘artificialintelligence’ (AI, Artificial intelligence, etc.) under the single term ‘artificialintelligence’? When I tried this, all 190K tweets/rows ended up in just one category…
Many thanks for all your assistance!
You can use
to specify additional rules for AI, Artificial intelligence etc…
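To illustrate what additional rules for the variants might look like, here is a rough Python sketch that runs on the raw tweet strings. Everything in it is an assumption for illustration: the function name, the pattern names, and the variant lists. Only "AI" and "Artificial intelligence" come from your post; "machine learning" written with a space is my guess at a comparable variant.

```python
import re

# Assumed variant lists; extend them to match whatever you coded in preprocessing.
# \bai\b matches "AI" only as a whole word, so words like "rain" do not count.
AI_PATTERN = re.compile(
    r"artificialintelligence|artificial intelligence|\bai\b", re.IGNORECASE
)
ML_PATTERN = re.compile(r"machinelearning|machine learning", re.IGNORECASE)


def classify_with_variants(tweet: str) -> str:
    """Same grouping as the rules above, but each term also matches its variants."""
    has_ai = AI_PATTERN.search(tweet) is not None
    has_ml = ML_PATTERN.search(tweet) is not None

    if has_ai and not has_ml:
        return "GROUP1"
    if has_ml and not has_ai:
        return "GROUP2"
    # Both terms or neither.
    return "UNK"


if __name__ == "__main__":
    for text in [
        "AI is everywhere",
        "I love machinelearning",
        "Artificial intelligence and machine learning",
        "rain today",
    ]:
        print(classify_with_variants(text), "|", text)
```

If every row still lands in one category, it is worth checking what the column being tested actually contains after preprocessing, since the patterns can only match text that is really there.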