For example, for product A in Tables 1 and 2, I compare the first rows and find that 2 > 1. So I split the first row in Table 1 into two rows with quantities 1 and 1. This continues for all rows of product A, and then we move on to product B.
The outcome is that the total quantities stay the same, but the rows are split so that Table 1 and Table 2 have matching quantities row by row.
This is possible with KNIME, but I'm afraid it might get too complicated. KNIME mostly performs data manipulation row-wise, and this requires a different approach for which the standard nodes don't work, especially when the data comes from multiple tables. Scripting (R/Python/Java) is much better suited for this kind of task. If you want to stay with KNIME on this one (whether because you plan to further manipulate/visualize your data after this task, or because you have other projects in mind and are keen on learning KNIME), you can still do it within KNIME, as the above-mentioned languages are integrated into the platform. But I don't see this as a good start-learning-KNIME task/project.
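For reference, the splitting logic described above might look roughly like this in Python (a minimal sketch, not a full KNIME Python Script node; the function name `split_to_match` and the `(product, quantity)` row representation are assumptions for illustration):

```python
from collections import defaultdict

def split_to_match(rows1, rows2):
    """Split rows so both tables end up with identical quantity sequences.

    rows1, rows2: lists of (product, quantity) tuples.
    Assumes the total quantity per product agrees between the two tables.
    Total quantities are preserved; only rows get split.
    """
    # Group quantities by product, preserving row order within each product.
    by_prod1, by_prod2 = defaultdict(list), defaultdict(list)
    for p, q in rows1:
        by_prod1[p].append(q)
    for p, q in rows2:
        by_prod2[p].append(q)

    out1, out2 = [], []
    for product, quantities in by_prod1.items():
        q1, q2 = quantities[:], by_prod2[product][:]
        i = j = 0
        while i < len(q1) and j < len(q2):
            a, b = q1[i], q2[j]
            common = min(a, b)          # both tables get a row of this size
            out1.append((product, common))
            out2.append((product, common))
            if a > b:                   # split row i: carry the remainder
                q1[i] -= b
                j += 1
            elif b > a:                 # split row j: carry the remainder
                q2[j] -= a
                i += 1
            else:                       # quantities already match
                i += 1
                j += 1
    return out1, out2
```

For the example above, `split_to_match([("A", 2)], [("A", 1), ("A", 1)])` splits Table 1's row of quantity 2 into two rows of quantity 1 each, so both tables come out as `[("A", 1), ("A", 1)]`.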
What about using the Custom Join Condition component by @takbb to join the 2 tables via your rules, then the Reference Row Filter to get the left and right rows that didn't match? This could be done on the side, apart from the main workflow. You could then use the custom join output to isolate the target rows on each table branch using the Reference Row Splitter, and apply the necessary ETL rules to each table separately. After that you should just be able to concatenate them back together and sort them into the correct order.
Since this would require a row-level join, you would have to apply the Counter Generation node to each table first and make that column a matching join condition. This column would also serve as your sort reference to rebuild your table order.
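To illustrate the counter idea, here is a rough pandas equivalent (not the actual KNIME nodes; the column names `product` and `qty` are assumptions):

```python
import pandas as pd

# Hypothetical example tables; the column names are assumptions.
t1 = pd.DataFrame({"product": ["A", "A", "B"], "qty": [2, 3, 5]})
t2 = pd.DataFrame({"product": ["A", "A", "B"], "qty": [1, 4, 5]})

# Stand-in for the Counter Generation node: a running row counter that
# makes the join row-level and doubles as the sort key for restoring
# the original row order afterwards.
for t in (t1, t2):
    t["counter"] = range(len(t))

# Row-level outer join on the counter; rows whose quantities disagree
# (or that have no partner at all) are the ones needing transformation.
joined = t1.merge(t2, on="counter", suffixes=("_t1", "_t2"), how="outer")
needs_work = joined[joined["qty_t1"] != joined["qty_t2"]]
matched = joined[joined["qty_t1"] == joined["qty_t2"]]
```

The counter column plays the same role as the Counter Generation output: it turns the rule-based join into a row-level one and lets you sort everything back into the original order at the end.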
Instead of the Custom Join Condition component part, you could also just append the 2 tables together for the side calculation, test the values via formulas, filter out the rows that don't require change, then use the Column Splitter to separate the 2 tables again. This could also serve as your reference rows to isolate which rows require transformation in each table.
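A pandas sketch of this alternative, again with assumed column names, could look like:

```python
import pandas as pd

# Hypothetical tables; the column names are assumptions for illustration.
t1 = pd.DataFrame({"product_t1": ["A", "A"], "qty_t1": [2, 3]})
t2 = pd.DataFrame({"product_t2": ["A", "A"], "qty_t2": [1, 4]})

# Append the two tables side by side for the off-to-the-side calculation.
side = pd.concat([t1.reset_index(drop=True),
                  t2.reset_index(drop=True)], axis=1)

# Formula step: flag the rows whose quantities disagree, then drop the
# rows that don't require any change.
side["needs_change"] = side["qty_t1"] != side["qty_t2"]
to_fix = side[side["needs_change"]]

# Column-split back into the two original tables; the flagged rows are
# the reference for which rows require transformation in each table.
left = to_fix[["product_t1", "qty_t1"]]
right = to_fix[["product_t2", "qty_t2"]]
```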
Glad you have it working. If you end up needing to process larger datasets with hundreds or thousands of rows, you may want to try the join approach; the workflow would execute considerably faster than a row-by-row iterative loop. For small tables like your example, iterative looping will work great.
Great. Iteration over rows is built in, but the Joiner in the product-level loop was joining on the old RowID. That's something I could only run into with a larger product list.