Which node do I use to simulate the VLOOKUP with wildcard on KNIME i.e., VLOOKUP(A2&“*”,Sheet2!A2:A100,1,0)
I tried adding the wildcard (*) to all the values in the column using string manipulation and then use Joiner node, but that did not help…
the topics which are present in this forum related to this scenario were not helpful either… I mean the solutions of those topics got entirely deviated, because of the requirement…
my requirement is pretty much simple vlookup with wildcard.
Hi,
Unfortunately, KNIME’s Joiner node can only join two tables based on a full match, not a partial match. What you’d need to do is use a Cross Joiner and then a Rule-based Row Filter with the LIKE operator.
Kind regards,
Alexander
Hi @Asghar , depending on exactly how you want to do the lookup, if you are using KNIME 5.1 and later, the new Value Lookup node may help as it allows you to lookup with wildcards, but the option has been hidden away.
Click the Show advanced settings link at the bottom of the configuration panel
If that doesn’t serve your purpose, then you may be interested in my “joiner components”, demonstrated here:
and available from here:
If you have small datasets, the following may be a good fit for you:
If you have larger datasets, and are comfortable using sql-like syntax, the following is the fastest of these components:
There are some caveats, as internally they use they use an H2 database (you don’t need to do anything as that is already available to you) but it can give rise to some limitations in their use depending on your data types, but it may be of help
@takbb - actually I tried using the Value Lookup node too, after trying on the Joiner node. Substring is giving me additional results which I do not want and Wildcard did not give any results.
my lookup value looks something like this with wildcard
however it did not give any results… when the other table has the following
mkarchitectsinc
peratonincorporation
will try the suggestion provided by @AlexanderFillbrunn. since my data is few thousand rows… the cross joiner will probably work…
@Ashgar, yes it is true that Value Lookup requires that the wildcard or substring is in the “dictionary” data attached to the lower port.
For some use cases simply switching the port connections can yield the results you need.
i.e. if you attached the
mkarchitectsinc
peratonincorporation
to the top port,
it should match to
mkarchitects*
peratonic*
on the lower port.
This does of course mean that it won’t necessarily return the full set of rows (because it will return the full set of rows from the upper port, and only “chosen match” from the lower) so it really does depend on your use case and your data if that is acceptable!
@takbb I first thought about your extended joiner component. Why do you recommend only small datasets? Is there some slow node icorporated? Would expect that h2 databases were used
br
Yes @Daniel_Weikert , I do use H2 database for the joiners, and I’m a little vague about what constitutes a “large” or “small” data set, but the only optimised joiner I have which works better for larger datasets is the last one I posted the “Join Custom Condition -indexed” component, because I added extra code in that to create indexes in the H2 database for the columns used in the join.
Details on the difference this made can be found here: