Hi @Muath welcome to the KNIME Community
Yes you are correct that KNIME doesn’t see “?” as a missing value. Missing values in a KNIME table are depicted as a red “?” but in that case there is no actual data. In your case you do actually have physical “?” strings.
You therefore need to process each column and where a “?” is found, replace it with the missing value. This ought to be straightforward but there can be complications.
You could use String Manipulation Multi Column, which is the simplest one-node solution that I can think of, but the syntax isn’t immediately obvious:
toInt(
string(
$$CURRENTCOLUMN$$.equals("?")
?toNull("")
:$$CURRENTCOLUMN$$
)
)
Alternatively, I have a couple of components; one of which was written for exactly this job:
Note that because this uses regular expressions, you need to escape the “?” by placing a \
in front of it.
As I see @qqilihq has posted, you can use Rule Engine, especially if it is only one or two columns, but if you have a large number of columns this can become less practical, and it is a pity that KNIME still doesn’t have a multi-column version of the Rule Engine… which is why I wrote the second component :
This uses a slightly extended syntax similar to the regular Rule Engine, but with some multi-column support:
NOT $CURRENTCOLUMN$ = "?" => $CURRENTCOLUMN$
After either of the component solutions, you will need to modify the String to be an Integer which you could do by then passing the data to the Column Auto Type Cast node.
In the String Manipulation (multi column) code, it performs the toInt( ) conversion for you, but of course this presupposes that all of the columns you are trying to fix are supposed to be Integers. If not, leave the toInt() off, and then fix them as with the components using a call to Column Auto Type Cast node.
I’ve uploaded a workflow demonstrating the above here:
edit: apologies that I see that at the beginning I had repeated much of what @qqilihq had already said! It shows it must be true