Counting values with reference to different numbers of rows

navo · November 17, 2021, 4:50pm

Dear all

I have trouble with the following topic:

In column B1 and B2 are numbers from 1 to 5. I want to count how often a Name with different topics have the numbers 1,2,3,4,5.

It should look like this:

So I tried it with groupBy, but it doesnt works. Have anyone of you an idea how I can fix this?

Many thanks in advanced.

Best regards

takbb · November 17, 2021, 6:12pm

Hi @nave, and welcome to the KNIME community.

The approach I chose for this was to divide your table into two datasets, and rename columns B1 and B2 respectively to an arbitrary name “B-Value”. These data sets could then be recombined by concatenation and at this point we can use Group By to count the occurrences of “B-Value” for each name.

The key to transforming data to “tabulated” format as you required in your output is the “Pivoting” node, and the data can then be tidied up so that any “missing” counts are set to zero.

Counting References.knwf (19.8 KB)

(I also added a counter but this was simply so I could re-order the rows later back into the required order since names weren’t alphabetical)

navo · November 17, 2021, 6:48pm

A thousand thanks to you takbb!
It was not the completely correct solution, but it was the hint I needed. Thank you!

HansS · November 17, 2021, 6:56pm

Hi @navo

There is an other way to solve this problem. And that is by using the Unpivot node. See Counting References Unpivoting.knwf (29.4 KB)

Happy KNIMEing
gr. Hans

bruno29a · November 17, 2021, 6:56pm

Hey @navo , I’ve not run @takbb 's workflow, but it’s exactly how I would have done it on my side - “merge” the 2 B columns since we don’t care if the values are from B1 or B2, do a GroupBy to get the counts, and pivot to move from vertical to horizontal.

I’m not sure what’s missing there. This should give the answer you were looking for. Can you elaborate what was missing and how you fixed it?

bruno29a · November 17, 2021, 6:57pm

Nice alternative @HansS !!

navo · November 17, 2021, 7:20pm

Hey all,

The workflow of @takbb gave me at the end always only “1” as counted, so I changed a little bit and came to my desired result (without topic). However, I have now also tested the workflow of @hanss and that works as well.

I share the workflow where I tested both and came to the same result.

… and thanks again.

test_file.knwf (29.3 KB)

bruno29a · November 17, 2021, 7:39pm

Thanks for sharing @navo , I suspect it might having been something in the groupby of @takbb 's workflow.

Yes, @HansS 's workflow is a great alternative way and quite straight forward.

takbb · November 17, 2021, 11:07pm

Oops! My apologies @navo !

Yes @Bruno29a, you are quite right that it was my GroupBy. I hadn’t spotted that by introducing the counter to enable sorting into original order, it mucked up the Group By. Not sure how I didn’t spot that. It was all going so well up to that point!

Anyway, for completeness I haver attached workflow with correction to the group by. I had forgotten to include min(counter) in the group by so every counter was causing a group of exactly one! With that in place, it worked as intended with the row ordering.

Counting References-correction-2.knwf (20.2 KB)

Not that you need this now, given the far more elegant solution from @HansS that I wish I’d thought of… Nice one @HansS !

btw, you may be interested to know that Hans’s solution can be trimmed to just 4 nodes instead of 5

provided you tick this box:

HansS · November 18, 2021, 6:56am

Hi @takbb

Tnx Brian for introducing me the sort option in the Pivoting node. I overlooked it many, many times .

gr. Hans