Column Aggregator by Count Not Working

Hi there,

I am trying to get total counts of the values of specific variables in a data set. I originally tried the GroupBy node, but was told that the Column Aggregator node would work just fine, as I only want to count how often each value occurs for one variable. Each time I configure the Column Aggregator, I select the aggregation column I want to work on and then select “count” in the options settings. Every time I execute the node, the output table has the same number of rows as the input and shows clearly matching values as separate rows. Every value has a “1” in the count.

Below, I’ve attached two screenshots of the values not matching correctly.

You’ll also see my entire KNIME workflow at the bottom.

The variable whose values I am trying to count comes from the FEC (Federal Election Commission) and classifies congressional spending into categories. Each number in column 16 is one of those categories.


congressional_disbursments.knwf (97.8 KB)

Hey @cholbert

Thanks for providing the information and the workflow (unfortunately, we cannot read files that are located on your PC :))! Anyhow, if I understood correctly, you want to count all occurrences of, e.g., 004, right? For this scenario you should use the GroupBy node and not the Column Aggregator. In your case, the Column Aggregator counts the selected columns within each row, which will always give 1 for a single column. For two columns you would get Count=2, or, for a Max aggregation, the highest value per row. To use the GroupBy node, you would need to create some sort of dummy column and do the count aggregation on that column.
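To make the per-row behaviour concrete, here is a minimal plain-Python sketch of what "aggregating across columns within a row" means. It is not KNIME code; the column names, the sample values, and the helper `aggregate_across_columns` are all hypothetical and only illustrate the idea.

```python
# Hypothetical rows; column names and values are illustrative only.
rows = [
    {"CATEGORY": "004", "AMOUNT": 120.0},
    {"CATEGORY": "004", "AMOUNT": 75.5},
    {"CATEGORY": "001", "AMOUNT": 300.0},
]


def aggregate_across_columns(row, columns, func):
    """Apply func to the selected columns of a single row (hypothetical helper)."""
    return func([row[c] for c in columns])


# Count over one selected column: one value per row, so every result is 1.
count_one = [aggregate_across_columns(r, ["CATEGORY"], len) for r in rows]

# Count over two selected columns: every result is 2.
count_two = [aggregate_across_columns(r, ["CATEGORY", "AMOUNT"], len) for r in rows]

print(count_one)  # [1, 1, 1]
print(count_two)  # [2, 2, 2]
```

The output keeps one result per input row, which matches the symptom described in the question: the row count is unchanged and every count is 1.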

Hope this helps

Best regards
Lars


As @laaaarsi explained well, and as a little hint:
the Column Aggregator aggregates across the columns within each row, not down the rows of a column
br


@laaaarsi Thank you so much for the quick reply. And apologies for sharing a file located on my PC; I was unaware of the KNIME Hub sharing function.

Yes, you do understand me correctly. Okay. I’m sort of confused about the dummy column portion. How would I go about creating that dummy column?

Yes, very helpful. Thanks for being willing to help a stranger.

Here is the KNIME Hub URL: congressional_disbursments – KNIME Community Hub

And below is a screenshot of the point after which I’ll need to add the GroupBy node with the dummy column.

Sincerely,
Connor

@Daniel_Weikert Thank you. Yup, you can tell this is my first rodeo with KNIME, and I probably need to stop listening to ChatGPT.

You could, for instance, use the Constant Value Column node (Constant Value Column – KNIME Community Hub) and set it to some arbitrary number. Note that you only need this if your table has just one column. After that, you can group by your “CATEGORY” column and do the count aggregation on the dummy column.

Little example in the workflow below

single_column_group_by.knwf (7.7 KB)
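For readers who cannot open the workflow, the two steps (add a constant dummy column, then group by “CATEGORY” and count the dummy column) can be sketched in plain Python. This is only an illustration of the logic, not what the nodes do internally; the sample category values and the column name `DUMMY` are assumptions.

```python
from collections import defaultdict

# Hypothetical category codes standing in for the single input column.
categories = ["004", "001", "004", "010", "004", "001"]

# Step 1: append a constant dummy column (the value itself is arbitrary).
rows = [{"CATEGORY": c, "DUMMY": 1} for c in categories]

# Step 2: group by CATEGORY and count the dummy entries in each group.
counts = defaultdict(int)
for row in rows:
    counts[row["CATEGORY"]] += 1  # one dummy value per row
counts = dict(counts)

print(counts)  # {'004': 3, '001': 2, '010': 1}
```

The dummy value never enters the result; it only gives the count aggregation a column to operate on that is different from the grouping column.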


Was this workflow proposed by ChatGPT, or only the specific node?
br

Thank you. I’ll keep that method in my back pocket.

I also found another way of doing it, using the Value Counter node.

Below, you can see the node and then a table of occurrences like I was looking for.
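As a rough analogy for anyone reading without the screenshots, counting the occurrences of each value in a single column looks like this in plain Python. The sample values and the variable name `column_16` are made up for illustration; this is not the node's implementation.

```python
from collections import Counter

# Hypothetical category codes standing in for the values of column 16.
column_16 = ["004", "001", "004", "010", "004", "001"]

# One entry per distinct value, with the number of times it occurs.
occurrences = Counter(column_16)

# List the values from most to least frequent.
for value, count in occurrences.most_common():
    print(value, count)
```

Unlike the dummy-column approach, no extra column is needed here, because the counting is done directly on the values of the one selected column.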



@Daniel_Weikert Only that specific node. ChatGPT is somewhat decent at giving directions, but it was clearly off the mark with the Column Aggregator. The Value Counter node solved my problem.
