I used a Logistic Regression Learner (with the Gauss prior setting) to address overfitting, and I get an accuracy of 94.3%. But when I deploy the model on an unlabeled data set, it still cannot detect the NTF failures.
Usually you have to apply the same preprocessing steps to your training and your test data. The exception is methods that change the composition of your training set, such as oversampling or stratified sampling. Speaking of such methods, are you using stratified sampling in your Partitioning node? This can help if your classes are not evenly distributed (e.g. one is much rarer than the others).
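If it helps to see the idea outside of KNIME, here is a minimal scikit-learn sketch of a stratified split; the data frame and the "NTF"/"OK" labels are toy placeholders standing in for your labeled set:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy, imbalanced data standing in for your labeled set: "NTF" is rare.
df = pd.DataFrame({
    "sensor_a": range(100),
    "label": ["OK"] * 90 + ["NTF"] * 10,
})

# stratify keeps the class proportions identical in both partitions, so the
# rare NTF class is not accidentally missing from train or test.
train, test = train_test_split(
    df, test_size=0.3, stratify=df["label"], random_state=42
)
print(train["label"].value_counts(normalize=True))  # ~90% OK / ~10% NTF
print(test["label"].value_counts(normalize=True))   # same proportions
```

This is also why a high overall accuracy can be misleading with a rare class: a model can score 94% while missing most NTF rows, so it is worth checking the per-class counts, not just accuracy.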
I believe this could be a normalization issue.
You should calculate the normalization model on your training data after you split it with the Partitioning node.
If you use your model to predict unseen data, you need to normalize that data with the normalization model produced by the Normalizer node, applied via the Normalizer (Apply) node.
It’s extremely important to reuse that normalization model on unseen data, because the new data could have a different distribution (e.g. unbalanced classes), and re-fitting the normalizer on it would produce a different scaling than the one your model was trained with.
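Here is a minimal sketch of the same principle in scikit-learn (the analogue of the Normalizer / Normalizer (Apply) node pair); the arrays are toy placeholders:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the training partition and the unseen data.
X_train = np.array([[1.0, 200.0], [2.0, 180.0], [3.0, 220.0]])
X_new = np.array([[2.5, 400.0]])  # unseen data with a shifted distribution

scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)  # fit = build the normalization model
X_new_norm = scaler.transform(X_new)          # apply = reuse the *same* model

# Re-fitting on the new data instead would produce a different scaling:
# StandardScaler().fit_transform(X_new) would map this single row to all
# zeros, which is not the feature space your model was trained on.
```

The design point is that `fit` happens exactly once, on the training partition, and every later data set only gets `transform`, just like feeding the Normalizer's model port into Normalizer (Apply).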