I have a table with 1.4M rows. Three columns in the table are "code values" that must be translated into descriptions by joining to their corresponding lookup tables. The first two columns are customer ID strings, generally 10-15 characters long and made up entirely of numbers despite their data type being string. Both of these customer lookups join to a customer table with 31,000 rows in it. In both cases the join node is very fast, running in less than a minute.
The third lookup node is my issue. It matches a customer group column in the source table, generally 15-20 characters of mixed-case alphanumeric values, to a lookup table that contains only 22 rows. I would have expected this one to significantly outperform the first two nodes, but it's the exact opposite: this join takes 3-4 hours to complete.
I have done everything I can to ensure the data types and values match: stripping off trailing spaces, converting both sides to upper case, etc. Nothing seems to improve the performance of the join. I also tried changing the memory settings (use disk, use memory, keep only small tables in memory) and got the same results.
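For reference, this is roughly the kind of key cleanup I'm applying, sketched here in pandas with made-up sample data (the column names and values are hypothetical, just to show the normalization and a match-rate check):

```python
import pandas as pd

# Hypothetical stand-ins for the 1.4M-row source and the 22-row group lookup.
source = pd.DataFrame({"cust_group": ["Grp-A ", "grp-a", "GRP-B", "grp-c  "]})
groups = pd.DataFrame({"cust_group": ["GRP-A", "GRP-B"],
                       "description": ["Group A", "Group B"]})

def normalize(s: pd.Series) -> pd.Series:
    # Trim surrounding whitespace and upper-case so both sides of the
    # join use an identical canonical key.
    return s.str.strip().str.upper()

source["key"] = normalize(source["cust_group"])
groups["key"] = normalize(groups["cust_group"])

# indicator=True flags rows that failed to match, which helps rule out
# data problems before blaming the join itself.
merged = source.merge(groups[["key", "description"]], on="key",
                      how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
```

Even with the keys normalized this way and no obvious mismatch problem, the slow join behaves the same.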
Is there anything you can think of to tune so that the smaller join performs as well as the faster ones?