I need some help with a cluster analysis: I want to investigate whether a customer who buys A is likely to buy B as well. My dataset contains a customer ID, an event ID, the title of the event, and the genre of the event, so I can see which event genres each customer has bought tickets to.
How do I use KNIME to perform a cluster analysis that examines whether customers who buy tickets to, e.g., an opera are also likely to buy tickets to a rock concert?
I think what you want can be achieved with association rules.
Here is my suggestion:
Use a GroupBy node, set "customer ID" as the grouping column, and in the "Manual Aggregation" tab select "Event ID" and apply the "Set" aggregation method.
After that, use the Association Rule Learner (Borgelt) node to find out how frequently sets of event IDs are bought. That's how you can find out which events are bought together most frequently, or how often particular events are bought together.
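To make the idea behind those two steps concrete, here is a minimal sketch in plain Python (not KNIME code). The purchase data and the helper names `support` and `confidence` are made up for illustration. It only mimics what the GroupBy "Set" aggregation and an association rule learner compute for pairs of events, and it is not the Borgelt implementation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical ticket purchases: (customer ID, event ID)
purchases = [
    ("c1", "E1"), ("c1", "E2"),
    ("c2", "E1"), ("c2", "E2"), ("c2", "E3"),
    ("c3", "E1"),
    ("c4", "E2"), ("c4", "E3"),
]

# Step 1: one set of event IDs per customer
# (the counterpart of GroupBy with the "Set" aggregation).
baskets = defaultdict(set)
for customer, event in purchases:
    baskets[customer].add(event)

# Step 2: count how many customers bought each event and each pair of events.
item_count = defaultdict(int)
pair_count = defaultdict(int)
for events in baskets.values():
    for event in events:
        item_count[event] += 1
    for pair in combinations(sorted(events), 2):
        pair_count[pair] += 1

n_customers = len(baskets)


def support(a, b):
    """Share of all customers who bought both a and b."""
    return pair_count[tuple(sorted((a, b)))] / n_customers


def confidence(a, b):
    """Share of the customers who bought a that also bought b (rule a -> b)."""
    return pair_count[tuple(sorted((a, b)))] / item_count[a]


print(support("E1", "E2"))     # E1 and E2 bought together
print(confidence("E3", "E2"))  # rule E3 -> E2
```

Support answers "how often are these events bought together", while confidence answers "if a customer buys A, how likely is B as well", which is the original question.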
In addition to what @armingrudd has suggested, you might want to check some of the workflows dealing with association rules that we've made available on our Hub:
Thank you, but I don't know how to configure the Association Rule Learner: it either returns an empty table or cannot be executed at all.
I don't understand why the Event ID should be used, as "Genre" is my key here. The issue is that the data are strings and thus cannot be handled by the cluster nodes.
If you want to check which genres are bought together, you can use "Genre" instead of the event ID.
In that case, however, I recommend using the "List" aggregation method in the GroupBy node, since a customer might buy tickets to several events of the same genre. I'm not sure, though; you may find the "Set" method more useful.
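The difference between the two aggregation methods can be illustrated with a small plain-Python sketch (again not KNIME code, and the purchase data are hypothetical). A list keeps repeated genres, while a set keeps each genre once per customer. How the association rule node treats repeated items in a list is not shown here, so it is worth checking on your own data.

```python
from collections import defaultdict

# Hypothetical purchases: (customer ID, genre); c1 bought two opera tickets.
purchases = [
    ("c1", "Opera"), ("c1", "Opera"), ("c1", "Rock"),
    ("c2", "Rock"),
]

as_list = defaultdict(list)  # counterpart of the "List" aggregation
as_set = defaultdict(set)    # counterpart of the "Set" aggregation
for customer, genre in purchases:
    as_list[customer].append(genre)
    as_set[customer].add(genre)

print(as_list["c1"])  # repeated genres are kept
print(as_set["c1"])   # each genre appears once
```

If the question is only whether opera buyers also buy rock, the set form is sufficient; the list form matters only if the number of purchases per genre should play a role.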
As for working with the association rule node and its configuration, follow the link provided by @ScottF and you will find several use cases.