Counting from Top 5

Hi all,

I have a massive dataset with “City” names and genders. Something like this:

| City | Gender |
| London | Male |
| Paris | Female |
| London | Male |
| London | Female |

From the same dataset I also get a Top 5 list of cities. Obviously, it keeps changing depending on the input data. I’d like to count the people by gender for each of the Top 5 cities. I’d like an output something like this:

| City | Male | Female |
| London | 110 | 50 |
| Paris | 10 | 150 |

I have no idea how to start, but I guess I will need a loop, maybe? :slight_smile:

All help appreciated
Cheers

I think the GroupBy node can help solve this problem :smiley:

Thanks for the quick response.
Unfortunately, that’s not the solution. I don’t know the Top 5 in advance; it comes from a different flow.

@jtamasi Could you share your workflow? :smiley:

After the GroupBy, try the Top K Selector node: https://nodepit.com/node/org.knime.base.node.preproc.topk.TopKSelectorNodeFactory
Gr. Hans

Thanks, everybody, for the help. Eventually I figured it out.
“Table Row To Variable Loop Start” with a “Rule-based filter” did the trick :slight_smile:
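
For anyone who wants to check the numbers outside KNIME, here is a rough pandas sketch of the same filter-then-count idea. It is not the workflow itself: the column names `City` and `Gender`, the sample rows, and the helper `count_gender_in_top_cities` are all assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
people = pd.DataFrame({
    "City": ["London", "Paris", "London", "London", "Paris", "Berlin"],
    "Gender": ["Male", "Female", "Male", "Female", "Female", "Male"],
})

def count_gender_in_top_cities(data, k=5):
    """Keep only rows from the k most frequent cities, then count by gender."""
    # The top-k list is derived from the data, so it changes with the input.
    top_cities = data["City"].value_counts().head(k).index
    filtered = data[data["City"].isin(top_cities)]
    # One row per city, one column per gender, cell = number of people.
    return pd.crosstab(filtered["City"], filtered["Gender"])

counts = count_gender_in_top_cities(people, k=2)
print(counts)
```

Here the filter plays the role of the rule-based filter, and no explicit loop is needed because `isin` checks all top cities at once.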

I asked before I’d tried hard enough :slight_smile:

Cheers

Hi @jtamasi,

If I got you right, you are looking for the Pivoting node :wink:
After it, you can use the Top K Selector node mentioned above.
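
Just to illustrate the pivot-then-select order in code form, a minimal pandas sketch could look like the one below. It only mirrors the idea and is not what the KNIME nodes do internally; the column names, the sample rows, and the helper `pivot_then_top_k` are assumptions.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
rows = pd.DataFrame({
    "City": ["London", "Paris", "London", "London", "Paris", "Berlin"],
    "Gender": ["Male", "Female", "Male", "Female", "Female", "Male"],
})

def pivot_then_top_k(data, k=5):
    """Pivot to City x Gender counts first, then keep the k largest cities."""
    # Pivot step: count rows per (City, Gender) and spread genders into columns.
    pivot = data.groupby(["City", "Gender"]).size().unstack(fill_value=0)
    # Top-k step: rank cities by their total count across genders.
    top_index = pivot.sum(axis=1).nlargest(k).index
    return pivot.loc[top_index]

top_table = pivot_then_top_k(rows, k=2)
print(top_table)
```

Because the pivot already holds the per-gender counts for every city, the top-k step only has to pick rows, so the Top 5 does not need to be known in advance.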

Br,
Ivan
