VLOOKUP with Wildcard

Hi,

Which node do I use to simulate the VLOOKUP with wildcard on KNIME i.e., VLOOKUP(A2&“*”,Sheet2!A2:A100,1,0)

I tried adding the wildcard (*) to all the values in the column using string manipulation and then use Joiner node, but that did not help…

the topics which are present in this forum related to this scenario were not helpful either… I mean the solutions of those topics got entirely deviated, because of the requirement…
my requirement is pretty much simple vlookup with wildcard.

Hi,
Unfortunately, KNIME’s Joiner node can only join two tables based on a full match, not a partial match. What you’d need to do is use a Cross Joiner and then a Rule-based Row Filter with the LIKE operator.
Kind regards,
Alexander

1 Like

Hi @Asghar , depending on exactly how you want to do the lookup, if you are using KNIME 5.1 and later, the new Value Lookup node may help as it allows you to lookup with wildcards, but the option has been hidden away.

Click the Show advanced settings link at the bottom of the configuration panel

This then reveals the following options.

I have no idea why the decision was made to hide what is probably the most sought-after addition to the lookup functionality!
You can see it demonstrated in this webinar video
https://www.knime.com/events/automate-away-spreadsheet-tasks-trusted-knime-community-contributor-brian-bates
(fast forward to 33 minutes for the Value Lookup, and then at around 35-36 minutes for an example of Value Lookup with wildcard.)

If that doesn’t serve your purpose, then you may be interested in my “joiner components”, demonstrated here:

and available from here:

If you have small datasets, the following may be a good fit for you:

If you have larger datasets, and are comfortable using sql-like syntax, the following is the fastest of these components:

There are some caveats, as internally they use they use an H2 database (you don’t need to do anything as that is already available to you) but it can give rise to some limitations in their use depending on your data types, but it may be of help

2 Likes

Thank you @AlexanderFillbrunn @takbb for the quick response.

@takbb - actually I tried using the Value Lookup node too, after trying on the Joiner node. Substring is giving me additional results which I do not want and Wildcard did not give any results.

my lookup value looks something like this with wildcard
image

however it did not give any results… when the other table has the following

mkarchitectsinc
peratonincorporation

will try the suggestion provided by @AlexanderFillbrunn. since my data is few thousand rows… the cross joiner will probably work…

1 Like

@Ashgar, yes it is true that Value Lookup requires that the wildcard or substring is in the “dictionary” data attached to the lower port.

For some use cases simply switching the port connections can yield the results you need.

i.e. if you attached the

mkarchitectsinc
peratonincorporation

to the top port,

it should match to
mkarchitects*
peratonic*

on the lower port.

This does of course mean that it won’t necessarily return the full set of rows (because it will return the full set of rows from the upper port, and only “chosen match” from the lower) so it really does depend on your use case and your data if that is acceptable!

@takbb I first thought about your extended joiner component. Why do you recommend only small datasets? Is there some slow node icorporated? Would expect that h2 databases were used
br

1 Like

@takbb - it did work after switching the ports,
but this -

mkarchitectsinc
peratonincorporation

will always be part of the dictionary

therefore, switching the port is not acceptable to the use case I’m working on.

thank you for clarifying on the Value Lookup though, it is useful for my other task.

Yes @Daniel_Weikert , I do use H2 database for the joiners, and I’m a little vague about what constitutes a “large” or “small” data set, but the only optimised joiner I have which works better for larger datasets is the last one I posted the “Join Custom Condition -indexed” component, because I added extra code in that to create indexes in the H2 database for the columns used in the join.

Details on the difference this made can be found here:

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.