I have a dataset as follows:
cookie ID | WebpageType | other vairable |
---|---|---|
12345 | news, news, sports, fashion | |
12345 | news, blog, sports | |
55555 | fashion, food, women, | |
88888 | food, food, food, livestyle |
What i want is something alog the line of this:
cookie ID | news | sports | fashion | blog | food | women | livestyle | ||
12345 | 3 | 2 | 1 | 1 | 0 | 0 | 0 | ||
55555 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | ||
88888 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | ||
I have tried the GroupBy node, goruping the cookie Id´s and aggregating the column of WebpageType with unique concatenate with count, and its the closest i have come, but i cant use that for the clustering i want to do afterwards. Can anyone help?