Compare YOY or Month on Month


I am new to knime and find this software very fascinating.

What i am after?
need to compare Month On Month for 12 Month Data sets.
For Eg Base Month July 21 against August 21 and July 21 against September 21 and so on.
Comparison for Sales by customer.

  1. i have csv data sets for each month with around 200000 rows. (I am using Csv Reader node 12 times)
  2. I have used Group by node on each Csv reader node ( to group each Month by customer, State etc)

Now i am not sure what to do next.

What i want is to line up sales for EACH MONTH next to July 21 customers.

In excel one can use vlookup to line up sales for each month and create a new column.

Thank you for some advise.

Hello and welcome to the community!

  1. You may not have to use 12 csv reader nodes if you select “files in folders” instead of “file” (warning: your files must be standardized for this to work).

  1. Great

EACH MONTH: I am not sure what you expect the final output to be. Could you attach a screenshot of what you data looks like in Excel before and after?

If you provide data and a workflow users on the forum can usually help you solve a problem a bit faster~

Attaching an excel file with input and expected output also helps like what this user did: Splitting multiple rows by column value to append according to year and month

Thank you

1 Like

With files in folder you would stack all your 12 months vertically in one gib table. With lag column you could potentially compare your month values

If you use a loop with column append end node you have them stacked horizontally (similar to when you use joiner node for all your 12 files and one or more join criteria, e.g. customer name). Then you can calculate difference between the columns
as already pointed out by victor_palacios data sample is certainly helpful here.

1 Like

Hello Victor and Daniel

Thank you for your reply.

Please find attach an example of what i am after.

Book1.xlsx (43.4 KB)

1 Like

Hello @Sid_melb and welcome to the KNIME forum.

Just trying to focus a bit more in the challenge (once I saw your example). We had a post in the forum happened last December month. The challenge was also focused in retail data and difference calculations.

There were some differences with your described challenge:

  1. the requested granularity was based on a month by month basis (instead of a pinned reference month)
  2. the target field was the product instead the client
  3. the differentials were presented as quantities instead of percentages

However I think that there are few similarities with that challenge, and maybe you can get some insights from it, aiming to plan your next steps.


1 Like

You can see a simple example here where I used your excel sheet to make sure my Quantity YOY matched yours with 3 nodes (I did something silly with reading in the data, but you can ignore that and just check out the 3 nodes Sorter, Lag Column, and Math Formula which will replicate what you did in Excel.)

Screen Shot 2022-07-01 at 3.00.48 PM

This is a starting point and other forum members have pointed you to other threads with similar topics.

The Transpose node may also help when you read several files at once but the format can be adjusted with other nodes as well to replicate this simple example.

1 Like

Hi Victor and [gonhaddock]

Thank you for your response

As per the below screenshot if i compare the quantity from month Feb22 and April22. There is 0% increase. This does not reflect in the new column.

What if i keep Jan 22 as the base.
I have redrafted the excel sheet.
Book1.xlsx (43.9 KB)

Hi Just wondering if anyone found any solution?

The lag column gives you the lag specified. If you want a different lag from -1 then change the setting in the node. If you want to fix it to a base you can simply add this base number as a constant column and then divide by this column value.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.