Do Math sum() operation on columns defined in a table

I have a KNIME flow that does segmentation and calculations on data from a number of retail clients. The data is uniform and the flow (which is actually 4 separate flows) works really well. The average data set is about 4 million rows per day. The problem I have is to calculate product category total per customer (about 100’000) when each client has a different product category and sub-categories.

Currently, I have Math nodes that do long addition sums to calculate the totals, but it is not sustainable. Is there possible a way for me to define which columns to SUM so that I only need to maintain a table for each client?

What I would like to do is create a table the category and sub-category columns and use the list to do the math instead of trying to maintain long sum sentences. using the list of columns I would end up with either 2 maths nodes or a column expression node
with an expression that looked something like …

column(“Chicken”)+column(“Beef”)+column(“Pork”)+column(“Lamb”)+column(“Ostrich”)

tC/.

It may not be flexible enough, but take a look at


node.

I’m probably missing something here, but could you use a Group By node to group by customer, category, and sub-category, and aggregate on the all the relevant columns using “sum”?

Hi @TigerCole,

Here is what you can do to have the formula based on your table:

  • Use the String Manipulation node to add “$” (dollar sign):
    join("$", $subCategory$, "$")

  • Use the GroupBy node to Concatenate subCategory (or category) column and input “+” (plus sign) as delimiter.

  • Finally, the Table Row to Variable node converts your formula to a flow variable which can be passed to the Math Formula node.

:blush:

4 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.