Java Snippet that assigns value based on values in other rows


I would like to design a Java snippet that assigns a value based on values in other rows.

The specific use case is I have a sales database that has subscription pricing data of different products and the sub-modules contained in that product. I want to measure the price increase across consecutive years if the customer renews their subscription. It gets complicated when the customer buys an extra module for the same product, because it will calculate the price increase only for the existing modules, but count the other modules as 'new' and won't factor them into the price increase.

In the example below for a given customer, product B would count as a 10% price increase on the previous year. Product C would count as a new sale. Product A as I have currently set up the database counts as a 10% price increase, but does not factor in the two other new module sales (it is actually a 40% price increase for this product).

The Java snippet would be capable of producing the 'Renewal?' column I have added where it would be able to assign 'Renewal' to all modules of Product A, even if the individual module sales themselves were new sales. This would require the ability to dynamically look to previous rows, and see if modules for the same product invoiced on the same day had been renewed. I do not currently know how to do this, appreciate your assistance!

(Perhaps there are alternative solutions and feel free to suggest something.. however I would favor an answer in this specific framework provided, as due to the 'messiness' of the sales data, any other solution I have thought of so far has not been feasible)

ProdID ModID Inv_date Sub Yr $ Prev Yr $ Renewal?
A 1 01/06/2016 2016 11k 2015 10k Renewal
A 2 01/06/2016 2016 1k     Renewal
A 3 01/06/2016 2016 2k     Renewal
B 1 01/11/2016 2016 11k 2015 10k Renewal
C 1 01/06/2016 2016 10k     New


a Java Snippet cannot look through the rows, unless you use a global array to keep track of the totals as you encounter them. This may be cumbersome depending on your data and their overall size.

I would rather take a more "KNIME'ed" approach:

  • Use the GroupBy node to group your entries by Product and Date, summing up the subscription fees for the current year and for the previous year. This will generate one row for each Product/Date combination.
  • Replace the missing value for those Product / Date combinations which do not have prev year fee with a 0. This will allow use to track them in the next Math Formula node.
  • Use a Math Formula node with an if( ) function to calculate the price increase. This applies only to those rows that have a non zero fee for the previous year.
  • Use a Rule Engine node to assign a type of contract "Renewal" when a price increase can be calculated, or "New Contract" if it cannot.

See the attached workflow which implements the steps described above.


Hi Marco,

Thanks for your answer, this already helps. However I should have given more detail in my original post.

What I would like to account for is scenarios like I've given below. The customer can subscribe to new modules whenever it feels like - below they have subscribed to module 4 of product A three months later. Your analysis would count this as a 'New Contract', whereas in reality this should be counted as a 'Price Increase' for their existing subscription, as they are already subscribed to Product A. Do you have any suggestions on how to account for this?

ProdID ModID Inv_date Sub Yr $ Prev Yr $
A 1 01/06/2016 2016 11k 2015 10k
A 2 01/06/2016 2016 1k    
A 3 01/06/2016 2016 2k    
B 1 01/11/2016 2016 11k 2015 10k
C 1 01/06/2016 2016 10k    
A 4 01/09/2016 2016 5k    







Hi Hami,

I understood from your previous message that dates were also relevant to determine Price Increase vs. New Contract. If not, you can simply remove the date as a grouping column in the GroupBy node and the grouping will only happen on the product (A, B, C in your case) and no longer on product + date. The rest still applies.

I imagine you will also have some sort of Customer ID in your columns. That should also be a grouping factor to avoid mixing up different customers.