I’m accessing a table with all sales for every product. Now I’d like to generate a list of top-selling articles and the changes over the last 30, 60 and 90 days.
Units sold (total): 16
Last month: 1
Delta 30 days: -5
Delta 60 days: -8
Delta 90 days: -14
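A minimal sketch of how deltas like these can be computed, assuming hypothetical monthly figures (chosen so the 30/60/90-day deltas match the numbers above; how the stated total of 16 relates to them is not clear from the example, so it is ignored here):

```python
# Hypothetical monthly unit sales, oldest first: Aug, Sep, Oct, Nov.
sales = [15, 9, 6, 1]

last = sales[-1]             # most recent month (Nov): 1
delta_30 = last - sales[-2]  # Nov - Oct = 1 - 6  = -5
delta_60 = last - sales[-3]  # Nov - Sep = 1 - 9  = -8
delta_90 = last - sales[-4]  # Nov - Aug = 1 - 15 = -14

print(delta_30, delta_60, delta_90)  # -5 -8 -14
```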
Hi @sven-abx, I put something together in relation to your last comment.
Since I don’t have good sample data from you and I don’t know how you are doing your calculation, I put together something that shows how to access the columns dynamically. You can adapt it to your use case. The workflow looks like this:
I’m not 100% sure, but my intuition tells me it will be much easier if you have the data in “rows instead of columns”, e.g. a repeated artikelnr., then a sales column and a month column (= constant column names). Then you sort by artikelnr and month ascending and use 3 lag columns to calculate the differences.
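A sketch of this long-format idea, with illustrative column names (artikelnr, month, sales) and data: one row per article and month, sorted, then a lag of 1/2/3 rows within each article gives the 30/60/90-day deltas.

```python
# One row per (article, month) with a constant schema.
rows = [
    {"artikelnr": "A1", "month": 8,  "sales": 15},
    {"artikelnr": "A1", "month": 9,  "sales": 9},
    {"artikelnr": "A1", "month": 10, "sales": 6},
    {"artikelnr": "A1", "month": 11, "sales": 1},
]
rows.sort(key=lambda r: (r["artikelnr"], r["month"]))

for i, r in enumerate(rows):
    for lag in (1, 2, 3):
        j = i - lag
        # Only lag within the same article, otherwise leave the delta empty.
        same = j >= 0 and rows[j]["artikelnr"] == r["artikelnr"]
        r[f"delta_{lag * 30}"] = r["sales"] - rows[j]["sales"] if same else None

print(rows[-1])  # newest month carries delta_30=-5, delta_60=-8, delta_90=-14
```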
Thanks for your reply. First I was thinking of doing it the way you describe, but then I figured out that it is possible no sales happened in a month, and that month is then missing.
Thanks for your reply, I will give it a try.
@ipazin maybe this helps to clarify what I would like to do:
I want to calculate the changes (deltas) in sales from 30, 60 and 90 days ago to today.
Example: in November (11) 1 item of this article was sold, and in October (10) 6, so sales decreased by 5 items.
The difficult thing is: the column names can change, because the workflow is executed once a month and loads the data of the previous months. On the first of December I fetch the data of November, October, September and August; in January, the data of December, November, October and September; and so on.
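One way to deal with the shifting column names is to avoid hard-coding them and address the month columns by sorted position instead. A sketch with hypothetical column names:

```python
# A single article row with month-named columns (names are hypothetical
# and change every run, so we select them dynamically).
row = {"artikelnr": "A1", "2023-08": 15, "2023-09": 9, "2023-10": 6, "2023-11": 1}

month_cols = sorted(c for c in row if c != "artikelnr")  # newest last
newest = row[month_cols[-1]]
deltas = {f"delta_{i * 30}": newest - row[month_cols[-1 - i]] for i in (1, 2, 3)}
print(deltas)  # {'delta_30': -5, 'delta_60': -8, 'delta_90': -14}
```

Note this positional trick only works if the names still sort chronologically; plain month names without a year do not (see the sorting discussion below in the thread).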
I do understand the logic, but I don’t know what your input data looks like, what output format you want, and whether only the column names are changing or the number of columns as well. Without that we can only guess…
Hi @kienerj, the lag option is a good idea, but the sorting is an issue, especially if you are dealing with the end of last year plus the beginning of this year.
As per @sven-abx 's example, for February you would be taking these months: 10/11/12/1. So technically speaking, it should be sorted as 10, 11, 12, and then start over with 1, 2, 3, etc. If you sort, it would actually be 1, 2, 3, …, 10, 11, 12. We can’t see the data to know if there is any value that indicates which year the months are for. If there is some sort of year, then yes, you can sort properly. If not, the sort would not work.
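A small sketch of this wrap-around problem, assuming a hypothetical year field is available:

```python
# Month numbers alone sort 1, 2, ..., 10, 11, 12 and put January
# before October. Sorting on (year, month) tuples fixes the order.
months = [(2023, 10), (2023, 11), (2023, 12), (2024, 1)]

wrong = sorted(m for _, m in months)  # [1, 10, 11, 12] -- January first, broken
right = sorted(months)                # year first, then month: correct order
print(wrong)
print(right)
```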
@sven-abx I think it would be best if you could provide us with an example of the changing names. It could be an artificial one that also demonstrates missing names.
In any case there will have to be some indication, maybe information in a data row, that gives you a starting point or an end point. Even then, if a missing February could either be 6+ or 7+, I guess we will have a hard time coming up with a solution.
KNIME can do all sorts of logic and tricks, but if there is no consistent system (however strange it might seem) and no indication, it will not be possible.
You might have to talk to your data source since representing important information in such a way carries severe risks of misinterpretation.
Some time ago I created a workflow which runs on a regular basis on SAP data (xls export).
After some months I had the problem that the structure of the data changed in terms of columns (the same data in other columns).
I solved that by renaming the columns after reading the xls data file.
To allow a kind of configuration I’m using an (xls) config file. That file defines the columns to be renamed.
With that approach I have no need to touch the workflow itself; it operates on the renamed columns only.
In this case I would rename like this:
in_xls_file | in_knime_WF
January | month
December | month-1
November | month-2
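A sketch of this rename step, with the mapping inlined (in the real workflow it would come from the xls config file):

```python
# Map volatile month-named columns to stable names the rest of the
# workflow can rely on; unknown columns pass through unchanged.
rename = {"January": "month", "December": "month-1", "November": "month-2"}

row = {"artikelnr": "A1", "January": 1, "December": 6, "November": 9}
renamed = {rename.get(col, col): val for col, val in row.items()}
print(renamed)  # {'artikelnr': 'A1', 'month': 1, 'month-1': 6, 'month-2': 9}
```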
I think you pivoted the values. You don’t need to, because you can use the “Lag Column” node.
This node picks the value from a previous row and pulls it into the current row.
Just take care about the order: your values should be ordered by date or by an index, hence I suggest using a “Sorter” node before lagging the column.
Once you have done that, use a Math Formula or a Column Expressions node to get the delta.
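A sketch of this sort-lag-subtract sequence outside KNIME, with illustrative values:

```python
# After sorting, a lag of 1 copies the previous row's value into the
# current row; the delta is then a simple subtraction.
sales = [15, 9, 6, 1]       # sorted oldest -> newest
lag1 = [None] + sales[:-1]  # what a lag-1 column produces
delta = [s - l if l is not None else None for s, l in zip(sales, lag1)]
print(delta)  # [None, -6, -3, -5]
```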
I’ve put together a possible solution that covers all the bullets mentioned in your posts. As I interpreted them, these are:
Customized reporting column names
Calculating deltas on a monthly basis (or any other customized granularity)
I’ve divided the workflow into two parts: the upper section brings in some raw retail transaction data from the past 12 months, along with a design of your desired report granularity, numbered in months back from the reporting date.
There’s a small intermediate section that takes care of report sorting for rows and columns.
The lower section iterates over the report items, populating the missing ones and calculating the deltas. The method applied to calculate the differences is based on the ‘Moving Aggregation’ node functions, as anticipated in my previous post.
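A sketch of the missing-month handling mentioned earlier in the thread, with hypothetical month keys: a month with no sales is absent from the data and must be filled with 0 first, otherwise any lag or aggregation lines up the wrong months.

```python
# October (10) had no sales and is therefore missing from the data.
sales = {8: 15, 9: 9, 11: 1}

filled = {m: sales.get(m, 0) for m in range(8, 12)}
print(filled)  # {8: 15, 9: 9, 10: 0, 11: 1}
```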
All relevant nodes are highlighted in shaded grey. The workflow ends with two reports that allow you to compare the results. If you inspect the LOREM item row, you will see how null handling is achieved.