Hi there,
I have some workflows that used to run just fine, but recently my input data has been scaling up massively, and the GroupBy nodes were the first to notice because of the default limit of 10,000 maximum unique values per group. At a workshop last year I asked for a workaround and was told to simply enter a very high number, but that suggestion hurts performance when there are far fewer unique values and doesn't scale when there are more.
My current solution is to use an Extract Table Dimension node in parallel and feed the row count in as a flow variable for the maximum unique values setting, since the number of rows is always a safe upper bound on the number of unique values. I'm sure I'm not the only one who has been caught out by GroupBy and Pivoting nodes failing to operate once the data scales up, so maybe it would be a good idea to add "use number of rows as max unique values" as a feature to the GroupBy and Pivoting nodes? In the worst case it might take a lot longer than it should, but even that is far less disruptive than noticing after a weekend that the workflow's output is useless because the very first GroupBy node failed to count properly.
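For anyone who prefers scripting over an extra Extract Table Dimension branch, here is a minimal sketch of the same idea in a Python Script node, assuming the current knime.scripting.io API; the flow variable name "max_unique_values" is just illustrative, and you'd point the GroupBy node's "Maximum unique values per group" setting at it:

```python
import knime.scripting.io as knio

# The row count of the incoming table is a safe upper bound for the
# number of unique values any grouping column can produce.
row_count = knio.input_tables[0].num_rows

# Expose it as a flow variable so a downstream GroupBy/Pivoting node can
# use it for its "Maximum unique values per group" setting.
knio.flow_variables["max_unique_values"] = row_count

# Pass the table through unchanged.
knio.output_tables[0] = knio.input_tables[0]
```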
Thoughts?