I work for a software company that makes credit-risk scorecards using logistic regression models.
We have a new feature that exports a scorecard in PMML format, but I am struggling to use the PMML file to ‘score up’ a dataset. Can anyone help?
Below is an example of the PMML file that our software produces (a logistic regression model with one characteristic, called “BANK_TERM”):
<?xml version="1.0"?>
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Header copyright="Paragon Business Solutions 2016" description="Paragon Modeller Scorecard">
<Application name="Paragon Modeller" version="1.7.0.1"/>
</Header>
<DataDictionary numberOfFields="2">
<DataField name="GOOD" dataType="double" optype="continuous"/>
<DataField name="BANK_TERM" dataType="double" optype="continuous"/>
</DataDictionary>
<TransformationDictionary>
<DerivedField name="BANK_TERM_Grouped" dataType="string" optype="categorical">
<Discretize field="BANK_TERM" mapMissingTo="UnGrouped">
<DiscretizeBin binValue="UnGrouped">
<Interval closure="openOpen" rightMargin="0"/>
</DiscretizeBin>
<DiscretizeBin binValue="0">
<Interval closure="closedOpen" leftMargin="0" rightMargin="1"/>
</DiscretizeBin>
<DiscretizeBin binValue="1-100">
<Interval closure="closedOpen" leftMargin="1" rightMargin="101"/>
</DiscretizeBin>
<DiscretizeBin binValue="101-200">
<Interval closure="closedOpen" leftMargin="101" rightMargin="201"/>
</DiscretizeBin>
<DiscretizeBin binValue="201-311">
<Interval closure="closedOpen" leftMargin="201" rightMargin="400"/>
</DiscretizeBin>
<DiscretizeBin binValue="400-500">
<Interval closure="closedOpen" leftMargin="400" rightMargin="501"/>
</DiscretizeBin>
<DiscretizeBin binValue="501-4500">
<Interval closure="closedOpen" leftMargin="501"/>
</DiscretizeBin>
</Discretize>
</DerivedField>
</TransformationDictionary>
<RegressionModel modelName="Logistic WoE Model 12 (12/12/2019 10:00:18)" functionName="regression" algorithmName="stepwise least squares" targetFieldName="GOOD" normalizationMethod="logit">
<MiningSchema>
<MiningField name="GOOD" usageType="target" outliers="asIs" missingValueTreatment="asIs" invalidValueTreatment="asIs"/>
<MiningField name="BANK_TERM" outliers="asIs" missingValueTreatment="asIs" invalidValueTreatment="asIs"/>
</MiningSchema>
<Output>
<OutputField name="RawScore" optype="continuous" dataType="double" feature="predictedValue" targetField="GOOD"/>
<OutputField name="Score" optype="continuous" dataType="double" feature="predictedDisplayValue" targetField="GOOD">
<NormContinuous field="RawScore">
<LinearNorm orig="0" norm="100"/>
<LinearNorm orig="1" norm="157.7078016355585"/>
</NormContinuous>
</OutputField>
</Output>
<RegressionTable intercept="0">
<CategoricalPredictor name="BANK_TERM_Grouped" value="UnGrouped" coefficient="0"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="0" coefficient="-0.455779235502154"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="1-100" coefficient="-0.000006210746542"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="101-200" coefficient="0.099084236747815"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="201-311" coefficient="0.775581076537226"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="400-500" coefficient="0.72822894463888"/>
<CategoricalPredictor name="BANK_TERM_Grouped" value="501-4500" coefficient="1.731641380474746"/>
</RegressionTable>
</RegressionModel>
</PMML>
When I load this file into a “JPMML Regression Predictor” node in KNIME, I can extract the predicted probability, Probability (good), using an Interactive Table, but I have no idea how to extract the two OutputField values defined in the file (RawScore and Score).
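For reference, here is a minimal Python sketch of the hand calculation I would expect the file to describe, so that any values coming out of KNIME can be checked against it. The bin edges, coefficients and LinearNorm points are copied from the PMML above; the names `BINS`, `group_bank_term` and `score_row` are my own and purely illustrative. It assumes that RawScore is meant to be the linear predictor (the log-odds) and that Score is the straight line through the two LinearNorm points. As far as I understand the PMML specification, a strictly conformant engine may instead report the logit-normalized probability as the predicted value, so treat this as a sanity check rather than a statement of what JPMML does.

```python
import math

# (label, left edge inclusive, right edge exclusive, coefficient), copied from the
# DiscretizeBin and CategoricalPredictor elements in the PMML above.
# None means the interval is unbounded on that side.
BINS = [
    ("UnGrouped", None, 0.0, 0.0),
    ("0", 0.0, 1.0, -0.455779235502154),
    ("1-100", 1.0, 101.0, -0.000006210746542),
    ("101-200", 101.0, 201.0, 0.099084236747815),
    ("201-311", 201.0, 400.0, 0.775581076537226),
    ("400-500", 400.0, 501.0, 0.72822894463888),
    ("501-4500", 501.0, None, 1.731641380474746),
]
INTERCEPT = 0.0                  # RegressionTable intercept
NORM_AT_0 = 100.0                # LinearNorm orig="0"
NORM_AT_1 = 157.7078016355585    # LinearNorm orig="1"


def group_bank_term(value):
    """Return (bin label, coefficient) for a BANK_TERM value."""
    # mapMissingTo="UnGrouped": missing values fall into the first bin.
    if value is None or math.isnan(value):
        return BINS[0][0], BINS[0][3]
    for label, left, right, coef in BINS:
        if (left is None or value >= left) and (right is None or value < right):
            return label, coef
    raise ValueError(f"no bin covers BANK_TERM={value}")


def score_row(bank_term):
    """Hand-compute the outputs for one record (assumed semantics, see above)."""
    label, coef = group_bank_term(bank_term)
    raw_score = INTERCEPT + coef                       # assumed: log-odds
    probability = 1.0 / (1.0 + math.exp(-raw_score))   # logit normalization
    # Line through (0, NORM_AT_0) and (1, NORM_AT_1), extrapolated linearly.
    score = NORM_AT_0 + raw_score * (NORM_AT_1 - NORM_AT_0)
    return {"group": label, "raw_score": raw_score,
            "probability": probability, "score": score}


if __name__ == "__main__":
    for example in (None, 0, 50, 150, 250, 450, 1000):
        print(example, score_row(example))
```

If the scores KNIME eventually returns differ from these, the first thing I would check is whether the engine applies the LinearNorm to the probability rather than to the log-odds.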
Can anyone help with this?