I have two tables. One table contains 15000 destinations with a total of 230 orders. Each order occurs any number of times, so that in the end there are 15000 destinations (orderlines). Then I have a second table with 15000 lines. It contains 40 articles and their withdrawal quantities. The articles occur several times. The most frequent article occurs 200 times and the rarest article only 3 times.
Now I want to join both tables so that no article appears twice in any order.
How can I realise this in knime?
Hi @MCBirne , to make it a little easier for people to visualise, could you maybe upload some demo data. It need only be a few lines in each table, but this will make it easier for people to see what the link is between the tables.
I find it also easy to conceptualise if you are about to give each table a name so we understand its purpose and how the tables may be related.
From your description, all I have at the moment is this, with no obvious link between the tables.
e.g.
order_line
id | destination | order_number |
---|---|---|
1 | A | ORD01 |
2 | B | ORD01 |
3 | C | ORD01 |
4 | X | ORD02 |
5 | Y | ORD02 |
6 | A | ORD02 |
article_withdrawals
id | article | withdrawal_qty |
---|---|---|
1 | ART01 | 10 |
2 | ART02 | 5 |
3 | ART03 | 8 |
4 | ART04 | 100 |
I could make some assumptions to fill in the gaps, but I may make totally the wrong assumptions
OK, here’s an example:
Table with orders:
RowID | Order_ID |
---|---|
1 | 1471 |
2 | 1471 |
3 | 1471 |
4 | 5086 |
5 | 5086 |
6 | 5086 |
7 | 5086 |
8 | 5086 |
9 | 5897 |
10 | 5897 |
11 | 5567 |
12 | 5567 |
13 | 5567 |
Table with articles:
RowID | acticle | quantity |
---|---|---|
1 | A1 | 5 |
2 | A1 | 7 |
3 | A1 | 8 |
4 | A1 | 9 |
5 | A2 | 1 |
6 | A2 | 5 |
7 | A2 | 8 |
8 | A3 | 7 |
9 | A3 | 9 |
10 | A3 | 3 |
11 | A4 | 8 |
12 | A4 | 1 |
13 | A5 | 7 |
Now I would like to randomly assign articles to each order, but no article should work more than once in an order.
There is no logical connection between the two tables, but they have the same number of rows.
It’s really hard to follow if theres no logical connection between both tables.
How’s the desired output of your to input tables? Could you describe the background this task?
Do you just want to randomly add articles to each order, with the constraint that for each order there should not be the same article twice?
If that is the case this might be an approach:
You iterate over each order id, get the number of rows. Then you start another group loop that iterates over each article group - you shuffle it to change order of rows randomly and then only keep top row. At the end of the loop you will have one article of each type. This is then shuffled randomly again before you filter the table for the same number of rows than your order table - after that you append the two tables and collect in an output port.
Example WF:
randomassignorders.knwf (96.0 KB)
Thank you for the suggested solution. I have tested it with a lot of data. I noticed that not all articles are distributed.
That’s why I only used it as a second step. I first distributed the items randomly with shuffle. Then I identified duplicates and selected new ones for them with the variant. As a result, I don’t have all the rows in there, but the error is smaller.
Many thanks for your support
That sounds like a good way to optimize - with lots of data loops slow down the process so handling the exceptions only this way is a neat approach.
If the above helped you get there please flag as solution so that it is tagged for anyone who comes across this looking for help on a similar problem
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.