This question will almost certainly need an "official" answer from one of the KNIME core developers, since they know all the gory details of the Joiner node.
Based on my (limited) knowledge, the Joiner itself splits large input tables into smaller chunks and sorts those chunks internally. AFAIK it does not detect pre-sorted tables, and it reuses the code from the Sorter node for this sorting, so I would not expect a benefit from your point (3), nor a big benefit from approach (2), unless your keys have a very special distribution that you can exploit somehow.
The runtime of string comparisons certainly depends on the length of the strings, so (4) might help, but I would not expect a dramatic effect.
I have seen significant speedups when I added special handling for rows with missing values. However, those tables contained a significant number of missing values, and it was also a couple of KNIME versions ago.
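To illustrate what such special handling could look like, here is a minimal sketch in plain Java, not KNIME API code. It assumes that "missing" means a null join key and that rows with a missing key should not match anything. The names Row, Split, splitByMissingKey and innerJoin are made up for this example, and the simple hash join is only a stand-in for whatever the Joiner does internally. The idea is to separate the rows with missing keys first, so the join only has to process rows that can actually match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingKeySplit {

    /** Hypothetical row: a join key (null stands for a missing value) plus a payload. */
    record Row(String key, String payload) {}

    /** The two partitions of one input table. */
    record Split(List<Row> withKey, List<Row> missingKey) {}

    /** Separates rows whose join key is missing from rows that can take part in the join. */
    static Split splitByMissingKey(List<Row> rows) {
        List<Row> withKey = new ArrayList<>();
        List<Row> missingKey = new ArrayList<>();
        for (Row row : rows) {
            if (row.key() == null) {
                missingKey.add(row);
            } else {
                withKey.add(row);
            }
        }
        return new Split(withKey, missingKey);
    }

    /** Simple hash-based inner join (a stand-in, not the Joiner's algorithm); expects no missing keys. */
    static List<String> innerJoin(List<Row> left, List<Row> right) {
        // Index the right table by key so each left row needs only one lookup.
        Map<String, List<Row>> index = new HashMap<>();
        for (Row r : right) {
            index.computeIfAbsent(r.key(), k -> new ArrayList<>()).add(r);
        }
        List<String> joined = new ArrayList<>();
        for (Row l : left) {
            for (Row r : index.getOrDefault(l.key(), List.of())) {
                joined.add(l.payload() + "|" + r.payload());
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Row> left = List.of(new Row("a", "L1"), new Row(null, "L2"), new Row("b", "L3"));
        List<Row> right = List.of(new Row("a", "R1"), new Row(null, "R2"));

        Split leftSplit = splitByMissingKey(left);
        Split rightSplit = splitByMissingKey(right);

        // Only rows with a key go through the (expensive) join.
        System.out.println(innerJoin(leftSplit.withKey(), rightSplit.withKey()));
        // Rows with missing keys are kept aside and can be appended afterwards if needed.
        System.out.println(leftSplit.missingKey().size() + " left rows had no key");
    }
}
```

In a workflow, the equivalent would be to split off the rows with missing keys before the Joiner and, if you need them in the result, concatenate them back afterwards.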
One more thing worth mentioning: if there is any way for you to make more memory available to KNIME, this would almost certainly help.
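For reference, KNIME runs on the JVM, and to my knowledge the heap limit is usually set with the -Xmx line in the knime.ini file in the installation folder (for example -Xmx8g, a value picked purely for illustration). The small plain-Java sketch below is not KNIME-specific; the class name HeapCheck is made up. It only shows how a JVM reports the maximum heap it is allowed to use, which is the number that setting changes.

```java
public class HeapCheck {

    /** Maximum heap this JVM may use, in MiB (governed by the -Xmx setting). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with different -Xmx values (e.g. java -Xmx2g HeapCheck) shows the reported limit change accordingly.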