I have been trying to create a PMML output of my workflow and noticed that the plain Normalizer > Normalizer (Apply) sequence gives different output than Normalizer (PMML) > Normalizer Apply (PMML), even though both use the same settings.
Normalization method: Z-Score Normalization
Excluded columns: Unit number, time_in_cycles, RUL
Example of different normalized output:
Normalizer > Normalizer (Apply)
Mean (sensor_measurement_1) range is [0,2]
Normalizer (PMML) > Normalizer Apply (PMML)
Mean (sensor_measurement_1) range is [-24,-22]
I think this is a defect.
How could I send you the files?
Could you advise?
I tried to reproduce this, but without any luck.
Could you make me a small workflow where you have exactly this problem?
Hi Iris, I’d like to send you the files. Could you advise, please?
Any news on this topic?
Hi @ribizli -
Were you already able to send your files to Iris directly? If not, feel free to email them to me here: firstname.lastname@example.org. Thanks!
Yes, we did find the problem here.
It comes down to columns with nearly constant values, which are calculated incorrectly in the PMML node.
I will check back with our developers and get back to you as soon as this is fixed.
Best wishes, Iris
In the meantime, do you have any suggestion for a workaround?
Any news on the Normalizer Apply (PMML) node bug fix? Or maybe some tips on a workaround?
I don’t have a workaround yet, but I can tell you what the problem is. Z-Score-Normalization in PMML is encoded like this (http://dmg.org/pmml/v4-2-1/Transformations.html):
<LinearNorm orig="0" norm="-m/s"/>
<LinearNorm orig="m" norm="0"/>
To reverse this and retrieve the scale that needs to be applied to the values, we have to divide the norm of the first LinearNorm element by the orig of the second LinearNorm element, i.e. (-m/s) / m, which gives the scale 1/s up to sign. This is where we run into numerical precision problems, and they lead to the behavior you observed.
We will investigate this issue further and try to find a solution that is hopefully within the PMML standard.
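As a rough illustration of the reversal described above, here is a minimal Python sketch. It is not the KNIME node's actual implementation: the function names, the sample values, and the assumption that the decoded parameters are applied as `x * scale + offset` are all hypothetical. It encodes a z-score as the two LinearNorm points, recovers the scale by dividing the first norm by the second orig, and compares the result with the direct formula `(x - m) / s`.

```python
def encode_zscore(m, s):
    """Return the two (orig, norm) points that encode z = (x - m) / s."""
    return [(0.0, -m / s), (m, 0.0)]


def decode_scale_offset(points):
    """Recover scale and offset from the two points (hypothetical decoder)."""
    (orig0, norm0), (orig1, norm1) = points
    scale = -norm0 / orig1  # -(-m/s) / m = 1/s
    offset = norm0          # -m/s, the normalized value at x = 0
    return scale, offset


def zscore_direct(x, m, s):
    """Plain z-score, computed from mean and standard deviation."""
    return (x - m) / s


def zscore_via_points(x, m, s):
    """Z-score computed from the parameters decoded out of the two points."""
    scale, offset = decode_scale_offset(encode_zscore(m, s))
    return x * scale + offset


if __name__ == "__main__":
    # Well-behaved column: both routes agree.
    print(zscore_direct(12.0, 10.0, 2.0), zscore_via_points(12.0, 10.0, 2.0))

    # Nearly constant column (made-up values): s is tiny, so x * scale and
    # offset are both huge and nearly cancel, which can lose precision.
    m, s = 518.67, 1e-13
    x = m + 2e-13
    print(zscore_direct(x, m, s), zscore_via_points(x, m, s))
```

With a tiny standard deviation, `x * scale` and `offset` are two very large numbers of opposite sign, so their sum keeps only a few significant digits. This is one plausible way the precision loss mentioned above could show up; the exact numbers depend on the data and on how the values are serialized in the PMML file.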