Solve assignment and transportation problem

Hello KNIMERs,

Is it possible to solve assignment and transportation
problem with KNIME/KNIME Optimization extension?

Could I formulate additional limitations beside the limitations of the standard assignment/transportation problem?

For example in case of assignment problem: I have 3 workers (A B C), they have to fulfil 3 different tasks (1 2 3), they can complete the tasks with different cycle time. The objective function is to minimize the total cycle time, what was spent on the tasks.
Additional limitation: Worker B is able to fulfil only task “2”

Thank you in advance!

Roland

Hi @rolandnemeth ,
This problem could be solved with the Python or R extension of KNIME. You could try setting up Python or R using the below links.

Python:

R: KNIME Interactive R Statistics Integration Installation Guide

Thanks,
Sanket

3 Likes

Hi @sanket_2012,

Thanks for your feedback.

I have successfully installed the Python extension and created a Conda environment.

Could you suggest a library, what I can use to formulate the mathematical problem?

Thank you in advance,
Roland

@rolandnemeth you might start researching possible libraries and maybe try a few things. Also you might try and utilise our latest AI friend in the task.

Depending on the nature of the assignment you might want to consider what your original input might be and you might also be required to be able to explain why you chose a certain method.

1 Like

@mlauber71, thanks for sharing the related article, I will check it.

@rolandnemeth

My apologies, I did not see this question earlier. Hope the following is helpful.

The KNIME Parameter Optimisation nodes are targetted on optimisation problems where algorithms generate an error/loss function which needs to be optimised. These algorithms typically generate a convex or concave manifold, so the optimiser is able to use gradient descent type algorithms to improve the efficiency of the optimisation problem. The node is not exclusively gradient descent style, it includes random sampling, but does indicate what it is trying to do. An example would be fitting a linear regression line where the parameters a, b need to be estimated for the function y = ax + b given as set of known data points (xi, yi).

From what you have described you have a combinatorics problem, which is selecting the best option from all the possible permutations of workers and tasks. This is a big topic which can be split in two: The brute force method, which is to determine all the permutations, calculate the result and rank them from best to worst; and the approximate method: if you have a particular problem there may be an algorithm which gets you close to the answer for much lower processing cost, however, may not guarantee the best answer (you will need to do some research for your specific problem to see if it has been addressed).

I’ve outlined the brute force method below using KNIME nodes. The workflow can be downloaded from the following embedded link.
Combinatorics 2013-01-30.knwf (53.1 KB)

Create permutations

The first task is to create all the permutations of workers and tasks. This is achieved by creating three tables, one for each task. Each table has one string column with the name of the task, and permissible workers listed in the rows of the column. So task one has a row with A and one with C; task two as rows with A, B, C.

The cross-joiner node creates combination rows taking each row from table one and each row from table two and joining them. So by applying two cross-joiners to each of the two tables we end up with one table with three columns (task_1, task_2, task_3) and all the permutations of workers in the rows.

I then created a label for each permutation and used it as the RowID. I then unpivoted the table so that there is a list of permutation, worker, task ready for calculating the hours spent processing the task by each worker.

Calculate each permutation

There are two tables. The first lists the workers and the rate at which they process items per hour; the second lists the tasks and the quantity of items to be processed. These are joined to the permutations table calculated earlier and a maths formula applied to calculate the amount of time each worker spends processing each task. The table is then grouped to calculate the amount of time that each worker is working in total per permutation (they may have been allocated more than one task per permutation).

Rank permutations

I didn’t know whether workers had to work in series (production line) or could work in parallel, so bifurcated the workflow (though both are similar in principle).

The workers in parallel calculates the amount of time required to complete all tasks by selecting the amount of time taken by longest working worker in each permutation (e.g. if A takes 20 hours and B takes 15 hours the permutation takes 20 hours). The workers in series just sums the amount of time taken by each worker.

The permutations are then sorted based upon the amount of time it will take to complete all tasks. The shorted permutation is selected (first row) and the definition of the permutation is added to the table from the permutations table calculated in the firs step.

This is a very basic example to explain workflow. However, once the principle is understood of creating permutations;, scoring the permutations, ranking and selecting the best, then it should be easy to extend the approach to more complex scenarios.

Hope that helps

DiaAzul
LinkedIn | Medium | GitHub

3 Likes

Hello @DiaAzul @sanket_2012 @mlauber71

Many thanks for your response.

In my opinion the approach of @DiaAzul fits to “small” assignment problem.

In my example, in my former post can be found this kind of problem, with only 3 workers and 3 different tasks. I just wanted to illustrate my problem.

In the fact, I would like to find a solution/algorythm for waresouse organisation, ca. 250 partnumbers have to be assigned roller channels.

The permutation of 250 store location is a very large dataset, I think my hardware is not able to handle this amount of data.

Simple constraints can be formulated easily in Python, as proposed by @sanket_2012 and @mlauber71

There are more locations as partnumber, thus the inbalanced problem can be convert to balanced.
Only one partnumber must be assigned to one storage location
Only one storage location must be assigned to one partnumber

But, somehow I have to formulate further constraints:

Because of the outer dimenson of the boxes not all of the partnumbers fit to each storage location, thus there are decision variables, which value has to be zero.

Number of boxes to be placed is known by partnumbers.

Because of the weight not all boxes can be placed in the lower and higher shelf levels for ergonomic reasons, thus there are further decision variables, what has to be zero.

Furthermore the lasting of the shelf levels inside rasters is limited (500kg), the weight of the boxes is known, the storage capacity of the locations/channels is known. The information is available about the location groups, which locations are inside the same raster on the same level.

Do you have any idea how these constraints can be formulated in Python?
Do you know any other solution/method for this problem?

Thank you for your response in advance!

Roland

1 Like

@rolandnemeth

I’m not very smart. So apologies if this is too simplistic.

You have 250 roller channels which from your description can be split into two/three groups (those too low, those too high, and those in the middle) on the basis of a parts weight; then you have another potential set of groups (small, medium, wide) that you can use to categorise the roller channels again. Therefore, you have nine potential sets of roller channels into which you put the channel numbers.

You then take the part numbers and match them to the sets of roller channels based upon the available criteria. Each part number is allocated then next roller channel number within the set until they are all allocated.

If a set of roller channels is exhausted then you need a strategy for allocating part numbers to the second most optimal roller channel set.

You don’t need to work out the combination of all roller channels only the abstract types that you define according to your constraints. If you try and turn this into a naive allocation problem then you are correct the problem will exceed the capabilities of your computer, far better to reformulate the problem into something more tractable.

DiaAzul
LinkedIn | Medium | GitHub

1 Like

@rolandnemeth you might have to think along the lines of such an optimization package like PuLP or Pyomo though I am not an expert. I tried to ‘motivate’ ChatGPT to create an example but it was not instantly able to compile a working example that would be good to understand and adapt.

You might want to research further in this direction.

https://www.coin-or.org/PuLP/CaseStudies/a_transportation_problem.html

1 Like

What I realised trying to set up this example is what other people also have noted is that ChatGPT is not a math processor - it compiles texts and if the text happens to have consistent example it is fine. Otherwise it would just print whatever numbers would look ‘good’ or might have come up in similar texts. Also there is no verification for code being provided. The code typically does look good but if it does work: that is up to you.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.