"vlookup" return interpolated values if no exact match?

Hi @b_mitchell, OK, I found the cause of the failure… it was between my keyboard and my chair! Apologies. If you update the component and/or re-download the demo workflow, it should now give you the expected results.

At some point I had managed to re-wire the value lookup so that it was returning the A value for both lookups, irrespective of the chosen “B” column. Either a moment of madness or a mis-click somewhere, and this is the actual lookup it was performing:

| A | B | Value |
|---|---|---|
| 10.0 | 10.0 | 50.0 |
| 20.0 | 20.0 | 55.0 |
| 15.0 | 15.0 | 52.5 |
| 30.0 | 30.0 | 60.0 |
| 42.0 | 42.0 | 65.0 |
| 60.0 | 60.0 | 65.0 |
| 45.0 | 45.0 | 65.0 |
| 38.0 | 38.0 | 62.714285714285715 |
| 9.0 | 9.0 | 50.0 |
| 1.0 | 1.0 | 35.0 |
2 Likes

Perfecto! Thanks @takbb
… again!

1 Like

No probs @b_mitchell. By the way, are you in the UK? I spotted what I assume to be a fellow Brit, because you put the (correct!) “s” on the end of “Maths” in an earlier post :wink:


Hi,
I’ve shared a Java Snippet to perform 1D & 2D interpolation with @b_mitchell (I work with him) using an external Java library. It took a while for me to work out the syntax, but it’s super simple once I had.

The problem with this is that I have to embed the ‘base’ map (the one being used to look up new values) in the Java code; I can’t “read in” the axes or data as an array, since the Java Snippet works on a row-by-row basis.
Is there a way to do this?

Here is the example workflow I used:
One & Two Dimensional interpolation example.knwf (12.9 KB)

For the sake of argument, a more accurate test would be to look for subtle Monty Python references and quotes. However, even this formal approach is not 100% accurate. In case you think that you have identified me as a fellow UK resident…

“Not necessarily, I could be arguing in my spare time…”

2 Likes

lol, no I’ve already got you categorised… (with an “s”) @iCFO :wink:

1 Like

I certainly am! In Kent, and the weather is overcast :england:

2 Likes

@b_mitchell Snap! Just… I’m in the London Borough of Bromley but I still put Kent on my address!

1 Like

Just for you @iCFO

2 Likes

Well played! I had to open the component to see if you used an actual “Python Argument” in the coding of your Python Argument component that launched the video of the Python Argument… A missed pun opportunity there, but you still stuck a solid landing.

It would have been very classic Monty Python to require the user to have set up a Python integration / installed a Python module in their local environment to launch a web browser, just in order to eventually view a short joke.

1 Like

No it wasn’t… :wink:

Better stop now or we’ll get reported for spamming. Oh, the irony! lol

1 Like

Hi @b_mitchell

I believe that after so much work done by Brian @takbb to help you, he really deserves to have the solution he provided you ticked :white_check_mark: as “the solution” :slight_smile:

This is usually good practice, as explained by @takbb himself in the following topic:

Besides this reason, it also helps other KNIMErs to easily and directly find the right solution at the beginning of the topic when searching for it :wink:

Hope this helps
Best
Ael

4 Likes

Very kind @aworker, although I think at one point (around post #22) @b_mitchell had indeed marked it as the solution… and then things evolved a little, lol

Hi @noddy101. Sorry I didn’t spot your additional note. Welcome also to the KNIME forum!

This would be a struggle with the Java Snippet, which, as you say, is limited to row-at-a-time processing. Potentially, if you could “serialize” the “lookup table” into one long String (e.g. using a couple of GroupBy/Aggregator-type nodes), concatenating the rows and columns in a way that could then be “unpacked” within the Java Snippet and turned into an array, then it could possibly work.

A further complication is the inclusion of the additional jars. If you can locate the jars in a common location, so that everybody who uses the workflow is able to access them via the exact same path, then it would work; but if not, it can become a struggle. (e.g. you could have everybody install the jars locally in the same folder, or put them on a standard local network share)

You could even put the jars in the data folder within the workflow. However, this is the sort of thing you might want to package up as a component, and unfortunately a component doesn’t take any additional folders with it, so your jars “stay behind” when you try sharing it, which is a little frustrating!

So, all in all, it’s a bit problematic, with enough hurdles that I’d say the Java Snippet isn’t a good fit for this scenario, even though it sounds like it would be a useful library to use.

A better option in this case would probably be Python, which does have access to the entire table (or tables) supplied to it as dataframes. If you can find a similar library for Python, that to me would be a much better fit here.
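
For illustration, a minimal sketch of that idea in a KNIME Python Script node is shown below. It assumes the newer knime.scripting.io API, two input ports (the lookup table on the first and the query values on the second), numpy available in the configured Python environment, and placeholder column names “A” and “Value”; treat it as a sketch of the approach rather than a drop-in replacement for the component.

```python
# Sketch for a KNIME "Python Script" node with two input tables (assumed setup):
#   port 1: the lookup table   (placeholder columns "A" and "Value")
#   port 2: the query values   (placeholder column "A")
import knime.scripting.io as knio
import numpy as np

lookup = knio.input_tables[0].to_pandas()    # whole lookup table as a DataFrame
queries = knio.input_tables[1].to_pandas()   # whole query table as a DataFrame

# np.interp requires the known x-axis to be sorted ascending
lookup = lookup.sort_values("A")

# Linearly interpolate every query value against the full lookup table at once
queries["Value"] = np.interp(queries["A"], lookup["A"], lookup["Value"])

knio.output_tables[0] = knio.Table.from_pandas(queries)
```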

If you want to explore any of this further and have other questions about it, it would probably be best now to re-ask this as a new forum question, as this thread has got quite long and has probably now reached its natural end. By all means reference back to this thread. Thanks!

Thanks for the quick reply @takbb.

I’ve hardly used Python at all, and not in KNIME at all. I was using the Java Snippet quite a lot for other actions within KNIME, so it wasn’t much of a leap to get the jar libraries working.

I’ve managed to get 1D interpolation working in Python; it took a bit of trial and error as I’m not familiar with it.
In the end it was quite simple: the 1D routine is contained within the numpy library.
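
For reference, a minimal stand-alone sketch of that 1D routine (numpy’s np.interp) is below; the axis and value numbers are made up purely for illustration.

```python
# Stand-alone example of numpy's 1D linear interpolation (illustrative values only)
import numpy as np

xp = np.array([10.0, 20.0, 30.0, 40.0])    # known x-axis (must be increasing)
fp = np.array([50.0, 55.0, 60.0, 65.0])    # known values at those x positions

x_new = np.array([15.0, 38.0, 9.0, 45.0])  # points to look up

# By default np.interp clamps values outside the known range to the end points
print(np.interp(x_new, xp, fp))            # -> 52.5, 64.0, 50.0, 65.0
```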

2D interpolation looks a little more difficult, as I need to rearrange the Pandas DataFrame for the interpolation routine, which I don’t know how to do.

The 2D routine (LinearNDInterpolator or griddata) is contained in the scipy library; I think it needs to be one of these, as they work on irregular grids.
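
As one possible way to do that rearranging, here is a hedged sketch using scipy.interpolate.griddata; the map layout (A values down the index, B values across the columns) and all of the numbers are assumptions for illustration only.

```python
# Sketch: 2D interpolation with scipy from a "map"-style DataFrame
# (index = A axis, columns = B axis, cells = values); numbers are illustrative.
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

base_map = pd.DataFrame(
    [[50.0, 52.0, 55.0],
     [55.0, 57.0, 60.0],
     [60.0, 62.0, 65.0]],
    index=[10.0, 20.0, 30.0],   # A axis
    columns=[1.0, 2.0, 3.0],    # B axis
)

# Rearrange the 2D map into long form: one (A, B, value) triple per row
long_form = base_map.stack().reset_index()
long_form.columns = ["A", "B", "value"]

points = long_form[["A", "B"]].to_numpy()   # (n, 2) known coordinates
values = long_form["value"].to_numpy()      # known values at those coordinates

# Points to interpolate at: (A, B) pairs
query = np.array([[15.0, 1.5], [25.0, 2.5]])

print(griddata(points, values, query, method="linear"))
```

As far as I understand it, griddata with method="linear" uses LinearNDInterpolator under the hood, so either form should work for this kind of irregular-grid lookup.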

Anyway, here is the 1D example.
KNIME_python_1D_interp.knwf (11.9 KB)

My concern about this was that the whole table gets loaded (but only one column of new x values is required). I guess KNIME deals with large datasets, and the Python node can too?

It would be better to get the user (or a flow variable) to select and load only the required column into the Python node. I don’t know how to do that.
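
One possible approach (a hedged sketch only; the flow variable name “x_column” and the column layout are assumptions) would be to let a flow variable name the column and then pick out just that column inside the Python node:

```python
# Sketch: use a flow variable to choose which input column to interpolate on.
# "x_column" is an assumed flow variable name set upstream in the workflow.
import knime.scripting.io as knio

x_col = knio.flow_variables["x_column"]           # e.g. "new_x"
x_new = knio.input_tables[0].to_pandas()[x_col]   # only the selected column is used
```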

Hi @noddy101, what you have done with the workflow looks fine to me, so I wasn’t quite sure what your concern was. Do you mean if the “pseudo data” table in your workflow contained other additional columns? You could always add a Column Filter between the Table Creator and the Python Script if that is the issue, to restrict the data sent to the Python Script to just that column.

KNIME should be capable of handling some reasonably large data sets, but performance is heavily dependent on the amount of memory available. Sorry, I’ve not used the interpolation routines in scipy, but if you need help getting those working, it’s possible somebody else on the forum has used them, so I’d recommend starting a new thread. People will see this thread marked as “resolved”, so they are less likely to read this.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.