Sum by particular columns

I have a table with sales in different country, from Jan 2021 to Dec 2022, for example, I want to sum the sales figure by quarter, how to do it in KNIME?

There are a few ways, depending on your specific goals, but I’d go with the Math Formula node or the Column Aggregator node or the Column Expressions node. For the first 2 options, you’ll need to use a different node for each calculation. The Column Expressions node would allow you to use a single node for all the calculations.

1 Like

That means I need to hard code in the formula that 2021-01-01, 2021-02-01, 2021-03-01 as 2021Q1 and so on…? I try to avoid this as the monthly data maybe updated or I may need to sum up the fiscal year which is from April 21 to Mar 22, etc.

Hi @anguslou,

Find below a possible solution. I made this by using a table with the quarters.

It’s easier to maintain this table (in my opinion), which is not, or less, the case for a hardcoded solution.

Hope it helps.
Br,
Samir

3 Likes

Hi @anguslou ,

I took a similar approach to @SamirAbida, as for this type of job, unpivoting is often a good first step, as it allows you to put your tabulated data into “rows” which are much better to work with.

The derivation of “quarter names” can be done with String Manipulation which allows some mathematical operations, and I noticed that your Quarters appear to be based on Calendar years rather than Financial years, but I created two different String Manipulation nodes to derive each. You can manually adjust the workflow and join which of these is appropriate to the subsequent GroupBy node.

I added a couple of extra columns for testing purposes.
Using “Financial years” the output is as follows:

and as “Calendar years”…

The String Manipulation calculation for “Financial Years” takes a little explaining:

string(
toInt(substr($ColumnNames$,5,2))/4==0
?join(string(toInt(substr($ColumnNames$,0,4))-1),"Q4")
:join(substr($ColumnNames$,0,4),"Q", string(toInt(substr($ColumnNames$,5,2))/4))
)

What this says is if the integer value of the two characters in the “date” starting at position 5 (counting from 0), when divided by 4 equals 0, then we want to subtract 1 from the year portion, and then append “Q4”. Otherwise, use the year portion, and append Q followed by the int value of the month part divided by 4.

The “Calendar year” version of the calculation is similar but simpler, as it doesn’t require the conditional part.

join(substr($ColumnNames$,0,4),"Q",string(toInt(substr($ColumnNames$,5,2))/4+1))`

It uses the year, as supplied and divides the month by 4 as an integer and then adds 1 to it. So 01,02,03 become 1; 04, 05, 06 become 2 and so on…

Sum of individual variable columns.knwf (18.9 KB)

8 Likes

Hi takbb, thank you. This is exactly what I want. I also learn how you are using “?” and “:” in String Manipulation, that is very useful.

3 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.