Remove duplicates and keep most recent date

Hi all;

I’m looking for some help with something I need to implement in KNIME.

Statement:
I have raw data that is downloaded from SAP. This raw data is then processed in KNIME to produce the final output that I need.

Problem:
Part of the information I need from the download is the price of each product code. SAP can currently hold more than one price per code; the prices differ only in their valid date.
This means that my raw data contains duplicates (the same code repeated once for every price available in SAP).

***In KNIME I need to remove the duplicates and keep ONLY the row whose price valid date is the most recent one for each code.

Any ideas on how to achieve this?

Use a GroupBy node with the product code as your grouping column, and in the aggregation tab select Maximum for the date column. That will give you one row per product code with its latest date. Then use a Joiner node with an inner join to join this table back to your original table, matching on both columns from the GroupBy table (product code and date).
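For anyone who finds it easier to read as code, here is a rough pandas sketch of the same GroupBy-then-Joiner logic. It is only an illustration, not a KNIME workflow: the column names `product_code`, `price` and `valid_from` and the sample rows are made up, so swap in whatever your SAP export actually contains.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions, not real SAP fields.
raw = pd.DataFrame({
    "product_code": ["A", "A", "B", "B", "C"],
    "price": [10.0, 12.0, 5.0, 4.5, 7.0],
    "valid_from": pd.to_datetime(
        ["2020-01-01", "2021-01-01", "2021-06-01", "2019-03-01", "2020-05-05"]
    ),
})

# "GroupBy" step: one row per product code holding its latest valid date.
latest = raw.groupby("product_code", as_index=False)["valid_from"].max()

# "Joiner" step: inner join back to the original table on BOTH columns,
# so only the rows carrying the latest date per code survive.
result = raw.merge(latest, on=["product_code", "valid_from"], how="inner")

print(result)
```

One thing to keep in mind with this approach: if a code has two prices with the same latest date, both rows are kept, so an extra de-duplication step may still be needed in that case.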

Steve

Thanks! This just saved my day!

Hey @chaconq,

Congrats, the community solved the problem! Please be so kind as to mark @s.roughley's answer as correct so that anyone can easily see the topic was resolved successfully.

Kind regards,

Patrick

I’m new here, thanks for the follow-up on that. I’ve marked the answer as correct.

👍
