Normalizer Apply (PMML) creates different table than Normalizer (Apply)

Hi,

I have been trying to create a PMML output of my workflow and noticed that the plain Normalizer > Normalizer (Apply) sequence gives different output than Normalizer (PMML) > Normalizer Apply (PMML), even with the same settings.

Normalization method: Z-Score Normalization

Excluded columns: Unit number, time_in_cycles, RUL

Example of different normalized output:

  1. Normalizer > Normalizer (Apply)
    Mean (sensor_measurement_1) range is [0,2]

  2. Normalizer (PMML) > Normalizer Apply (PMML)
    Mean (sensor_measurement_1) range is [-24,-22]

I think this is a defect.
How could I send you the files?
Could you advise?

Thanks,
ribizli

Hi @ribizli

I tried to reconstruct this but without any luck.

Could you make me a small workflow where you have exactly this problem?

Thank you!
Iris

Hi Iris – I’d like to send you the files. Could you advise, pls?

Thank you!

Hello there,
Any news on this topic?
Thanks,
ribizli

Hi @ribizli -

Were you already able to send your files to Iris directly? If not, feel free to email them to me here: scott.fincher@knime.com. Thanks!

Hi Both,

Yes, we did find the problem here.
It goes back to nearly constant value columns, which are wrongly calculated in the PMML node.

I will check back with our developers and get back to you as soon as this is fixed.
Best wishes, Iris

Hi Iris!
In the meantime, do you have any suggestion for a workaround?
Thanks!

Hi there,
Any news on the Normalizer Apply (PMML) node bug fix? Or maybe some tips on a workaround?
Thanks,
ribizli

Hi,
I don’t have a workaround yet, but I can tell you what the problem is. Z-Score-Normalization in PMML is encoded like this (http://dmg.org/pmml/v4-2-1/Transformations.html):

<NormContinuous field="X">
    <LinearNorm orig="0" norm="-m/s"/>
    <LinearNorm orig="m" norm="0"/>
</NormContinuous>

Reversing this to retrieve the scale that needs to be applied to the values, we have to divide the norm of the first LinearNorm element by the orig of the second LinearNorm element. Here we run into numerical precision problems that lead to the behavior you observed.
We will be further investigating this issue and try to find a solution that is hopefully within the PMML standard.
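
To make the arithmetic concrete, here is a minimal Python sketch of the round trip described above. It is not the KNIME implementation; the function names are made up for illustration. It encodes a z-score as the two LinearNorm points, recovers scale and shift from them, and compares the result with the direct formula (x - m) / s.

```python
def encode_zscore(mean, std):
    """Return the two (orig, norm) points that encode z = (x - mean) / std,
    mirroring the two LinearNorm elements: (0, -m/s) and (m, 0)."""
    return [(0.0, -mean / std), (mean, 0.0)]


def decode_linear(points):
    """Recover (scale, shift) so that norm = scale * x + shift.
    With the points above this is scale = (m/s) / m, i.e. a division by
    the orig of the second point. It fails if mean is exactly 0."""
    (x1, y1), (x2, y2) = points
    scale = (y2 - y1) / (x2 - x1)
    shift = y1 - scale * x1
    return scale, shift


def zscore_direct(x, mean, std):
    """Plain z-score, computed without going through the two points."""
    return (x - mean) / std


def zscore_via_pmml(x, mean, std):
    """Z-score computed from the recovered scale and shift."""
    scale, shift = decode_linear(encode_zscore(mean, std))
    return scale * x + shift


if __name__ == "__main__":
    # First pair: well-behaved column. Second pair: nearly constant column
    # (tiny std); both pairs are made-up example values.
    for mean, std in [(10.0, 2.0), (518.67, 1e-13)]:
        x = mean + 3 * std
        print(mean, std, zscore_direct(x, mean, std), zscore_via_pmml(x, mean, std))
```

For a well-behaved column both routes agree. For a nearly constant column, s is tiny, so -m/s becomes very large, and scale * x + shift subtracts two huge, almost equal numbers, which is one way the two results can drift apart. Whether this is exactly what happens inside the node is an assumption on my side.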
Kind regards
Alexander
