Excel Goal Seek function in KNIME

Hi everybody,

I’ve found some different posts on this topic already but unfortunately none of them helped me with my problem and maybe some of you pros can help me :).

I want to basically create the Excel Goal Seek function inside KNIME by maximizing the Revenue column and hitting 90% in the Percentage column by changing the amount of pieces sold (between the from and to values) for each row.

Is something like this possible?

Here are some sample data a excel file:
Example.xlsx (11.3 KB)

Best regards and thank you in advance!

hi @TimoW,

For Excel to perform a goal seek, the spreadsheet would need to have a formula in it so that varying one column will cause a recalculation of another.

Unfortunately your spreadsheet contains only static data with no formulas so we cannot see what the basis for your calculation would be that you need replicated in Knime.

What is the relationship between your data items and how is the percentage column calculated? Are you able to provide a spreadsheet that has the equivalent with a working goal seek?

I haven’t got an answer to how you’d achieve it but in order to think about possibilities, I think more information about your specific requirement would be needed, as I’m not sure that Knime would provide any kind of generic solution for this.

HI @takbb

thank you first of all for your fast response! Yeah not providing an example is not the best idea sorry…

Unfortunately I cannot upload an xlsm file here so you can use either this link to get the xlsm file:

or here is the regular xlsx:
TestMacroGoalSeek.xlsx (14.7 KB)

The following picture shows the set, changing and to value cell for the manual goal seek:

The set cell is calculating the Percentage * Volume for each row divided by the count of rows and i want to hit the to value by changing the Volume in the first row.

Thank you and best regards!

Well, it is an interesting challenge, and I’d say it’s not really something that Knime is designed for, as it becomes more of a programming challenge, than a data science one. There are really a couple of ways of writing a goal seeking algorithm that spring to mind. The first (simplest and less efficient, and the one I’ve had a stab at here) is to provide the system with a min value and a max value and then get it to iterate between the two, varying each time by a small increment (or decrement) until the final result is within the required tolerance of the required answer.

An alternative to that would be to perform an iteration loop by making a series of “guesses”. Based on each guess, the logic would be to move the next guess either higher or lower depending on whether the result of the current guess were moving toward or away from the required target compared with the previous guess. Typically each guess is made at a perceived “mid point” between the previous guess and the required direction of travel. This is generally more efficient and requires fewer tries to get to the required outcome, but the logic for that would be a little more involved. I suspect though that this is something akin to the mechanism used by Excel.

I have attempted to demonstrate a workflow which tries the first of these two, but I certainly wouldn’t recommend using it for anything other than demonstration purposes, or if you have time some sort of refinement. (Iterations of this type are expensive in terms of time to create, and time to process)

Once I finally had a solution, however, I struggled to make it fit the results of your spreadsheet. I finally realised that your spreadsheet’s goal seek divides by 6 instead of 5 (the actual number of data rows). My workflow counts the number of data rows rather than hardcoding the number and as a consequence you won’t get it to agree with the excel example you’ve given

The workflow is supplied with a “parameters” table (Table Creator) into which you need to enter the required target value, the minimum (current value), maximum, increment step, and tolerance from the required target. If you make the increment step too large, there is every chance an iteration will “jump over” the required answer because it will never be within tolerance, but if the step is too small, the iterations will take too long!

I would doubt that this workflow is likely to be the answer you are looking for in terms of achieving this with Knime. As you can see to handcraft this is not the “single click” that you’d have in Excel, and I’ve had to jump through a few hoops to make this work. It doesn’t necessarily mean though that such a task is completely out of the question, but I’d say it’s not really the type of job that a workflow solution is designed for.

1 Like

I think a simpler way to go is to use a Parameter Optimization loop:


@elsamuel, I really like your solution, but try as I might, I couldn’t get it to quite return the expected result, and I wonder whether it is just something in my environment.

The expected answer in Excel from the Goal Seek for the Volume in the first row was 5.57 ish, so depending on the step size something around 5.6. But on your workflow running in my knime 4.3.2 environment, it returns 5.022. I tried different step size but it didn’t make much difference.

So I tried to work out why this would be, and when I changed out the Column Expression Node for a combination of Row Splitter, Math formula and concatenation, (but keeping the logic of what your node was doing), the optimization loop came back with 5.62, which is much closer to the expected result.

I’ve attached a workflow containing both yours and the modified version. I wonder if it behaves that way in your environment too, or is it something quirky with my setup? (Or I might just be overlooking something)

Goal Seek Modified.knwf (45.9 KB)

Regardless, thanks for the much improved workflow which is much simpler and better suited to this task. I hadn’t used the Parameter Optimization Loop before (wish I’d seen it yesterday :wink: when I was struggling to change the value of a flow variable in a loop! It is definitely one to add to the armoury!


Thanks for pointing that out.

At the last minute, I’d modified the expression I used to identify the first row (using the Column Expressions node), and that broke the entire thing. It’s been corrected.

I’ve also decreased the step size for the parameter variation (to get closer to the “correct value”), and switched to a Bayseian search strategy to speed things up a bit. The output is now 5.578.


Hi @takbb and @elsamuel

thank you for your incredible help!!!

@takbb I think the difference in what we get from the workflow also comes from the random seed and method that is used in the Parameter optimization loop start. I changed it to Brute Force as a test and it returns the 5,6 :slight_smile:

@elsamuel s version also returns the expected values now :slight_smile:

As I said i can’t thank you enough guys!

Best regards


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.