Need to write a node to deal with the output of the decision tree learner

Hi all,
I am a new user here, and I need to write an operator that
compares two decision trees generated during learning and outputs the difference between them.

I have several questions:
(1) To start, what is the format of the output of a decision tree learner, and how can I look at it? I realize this may be a naive question, so maybe someone can at least tell me where to start (docs, etc.). Thanks a lot.
(2) I already have my code for comparing two trees. The program should work like this: I have 1000 rows of data and am going to generate 500 decision trees (using 2 rows for each training run); my operator is then supposed to compare the 1st and 2nd tree, the 2nd and 3rd tree, etc. Can the decision tree learner node currently run in a loop, e.g. generate 20 trees at one time? If not, is there a way to work around this, or do I also need to modify the learner node?
Guys, thanks a lot for your help and guidance. If this is too much, please let me know where I should start :)

Thank you!
best
Donna

Hi Donna,

(1) The output of the Decision Tree Learner node is a PMML model that can be accessed programmatically as described here. You might also want to check the PMMLDecisionTreeHandler class, which gives some more insight into the underlying decision tree PMML.
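
To get a feeling for what such a PMML tree looks like, here is a minimal sketch that walks a TreeModel with the standard JDK XML parser. It does not use the KNIME API at all: the class name PmmlTreePrinter and the SAMPLE document are made up for illustration, and the sample is a hand-written, simplified fragment rather than real Decision Tree Learner output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class PmmlTreePrinter {

    // Hand-written, simplified PMML fragment (an assumption, not KNIME output).
    static final String SAMPLE =
          "<PMML><TreeModel functionName=\"classification\">"
        + "<Node id=\"0\" score=\"yes\"><True/>"
        + "<Node id=\"1\" score=\"yes\">"
        + "<SimplePredicate field=\"age\" operator=\"lessOrEqual\" value=\"30\"/></Node>"
        + "<Node id=\"2\" score=\"no\">"
        + "<SimplePredicate field=\"age\" operator=\"greaterThan\" value=\"30\"/></Node>"
        + "</Node></TreeModel></PMML>";

    /** Returns one line per tree node, indented by depth: "[condition] -> score". */
    static String describe(String pmml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pmml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML", e);
        }
        Element treeModel = (Element) doc.getElementsByTagName("TreeModel").item(0);
        StringBuilder out = new StringBuilder();
        // The root of the tree is the <Node> element directly below <TreeModel>.
        for (Node c = treeModel.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && "Node".equals(c.getNodeName())) {
                walk((Element) c, 0, out);
            }
        }
        return out.toString();
    }

    private static void walk(Element node, int depth, StringBuilder out) {
        // A node without a SimplePredicate (e.g. <True/>) is printed as "true".
        String condition = "true";
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && "SimplePredicate".equals(c.getNodeName())) {
                Element p = (Element) c;
                condition = p.getAttribute("field") + " " + p.getAttribute("operator")
                        + " " + p.getAttribute("value");
            }
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append("[").append(condition).append("] -> ")
           .append(node.getAttribute("score")).append("\n");
        // Child nodes are nested <Node> elements.
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && "Node".equals(c.getNodeName())) {
                walk((Element) c, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) {
        System.out.print(describe(SAMPLE));
    }
}
```

Real PMML files carry more than this (data dictionary, mining schema, other predicate types), so treat it as a way to look at the structure, not as a complete reader.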

(2) Yes, any node in KNIME can be embedded into a loop. This feature is available as soon as the expert mode has been enabled in the knime.ini file. To learn more about looping and variables in KNIME, I would suggest downloading workflows from our public server, which is available as an additional KNIME view, or checking out the meta-node category, which contains a number of pre-configured workflow snippets for cross-validation and feature elimination.

However, we currently don't have a node that compares decision tree models directly, but it would be nice to have one :-)
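
For anyone landing here without comparison code of their own, the consecutive comparison (1st vs 2nd tree, 2nd vs 3rd, and so on) could be sketched roughly as follows. TreeNode, diff and diffConsecutive are hypothetical names on a deliberately tiny stand-in tree type; this is not KNIME's DecisionTree class, and a real diff would probably need a smarter way to match subtrees than comparing children by position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TreeDiffSketch {

    /** Hypothetical minimal tree node: a split condition (or leaf class) plus children. */
    static final class TreeNode {
        final String label;
        final List<TreeNode> children;

        TreeNode(String label, TreeNode... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Lists the positions at which two trees differ, comparing children by index. */
    static List<String> diff(TreeNode a, TreeNode b) {
        List<String> out = new ArrayList<>();
        diff(a, b, "root", out);
        return out;
    }

    private static void diff(TreeNode a, TreeNode b, String path, List<String> out) {
        if (!a.label.equals(b.label)) {
            out.add(path + ": '" + a.label + "' vs '" + b.label + "'");
        }
        // Recurse only into the children that both nodes have.
        int shared = Math.min(a.children.size(), b.children.size());
        for (int i = 0; i < shared; i++) {
            diff(a.children.get(i), b.children.get(i), path + "/" + i, out);
        }
        if (a.children.size() != b.children.size()) {
            out.add(path + ": " + a.children.size() + " vs "
                    + b.children.size() + " children");
        }
    }

    /** Compares tree i with tree i+1 for every neighbouring pair in the list. */
    static List<List<String>> diffConsecutive(List<TreeNode> trees) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i + 1 < trees.size(); i++) {
            result.add(diff(trees.get(i), trees.get(i + 1)));
        }
        return result;
    }

    public static void main(String[] args) {
        TreeNode t1 = new TreeNode("age <= 30", new TreeNode("yes"), new TreeNode("no"));
        TreeNode t2 = new TreeNode("age <= 30", new TreeNode("yes"), new TreeNode("yes"));
        System.out.println(diffConsecutive(Arrays.asList(t1, t2)));
    }
}
```

With n trees this yields n - 1 difference lists, so 500 trees would give 499 comparisons.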

Best regards, Thomas

Dear Thomas:

Thanks a lot! I will start trying it. I already have the code to compare trees. I will let you know how things go.

 

donna

Dear Thomas,

I cannot use DecTreePredictorNodeModel as input in the execute method. It says: "The method execute(DecTreePredictorNodeModel, ExecutionContext) of type EEATTreeDiffNodeModel must override or implement a supertype method". I guess my question is this: I know that once I get the tree, I can use the Java API to manipulate it, but I do not know how to read it into my method. If you can help me with that, I will really appreciate it! Thanks! Donna


Hi Donna,

When MyNodeModel extends the abstract NodeModel class and uses ports other than plain data ports (decision tree PMML ports, for example), you need to implement the #configure and #execute methods that take PortObjectSpec[] and PortObject[] arrays, as shown below. I think your NodeModel implementation should look something like this (copied from DecTreePredictorNodeModel):

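// Core KNIME imports. PMMLDecisionTreePortObject and DecisionTree belong to
// KNIME's decision tree implementation; take their imports from
// DecTreePredictorNodeModel.
import org.knime.core.node.ExecutionContext;
import org.knime.core.node.InvalidSettingsException;
import org.knime.core.node.NodeModel;
import org.knime.core.node.port.PortObject;
import org.knime.core.node.port.PortObjectSpec;
import org.knime.core.node.port.PortType;
import org.knime.core.node.port.pmml.PMMLPortObjectSpec;

public class MyNodeModel extends NodeModel {

    public MyNodeModel() {
        // The first array declares the input port types (here: one decision tree PMML port).
        super(new PortType[]{PMMLDecisionTreePortObject.TYPE}, …);
    }

    @Override
    protected PortObjectSpec[] configure(final PortObjectSpec[] inSpecs)
            throws InvalidSettingsException {
        PMMLPortObjectSpec treeSpec = (PMMLPortObjectSpec)inSpecs[0];
        // check treeSpec here and return the specs of the output ports
    }

    // The parameter is the generic PortObject[]; cast inside the method to get the tree.
    @Override
    protected PortObject[] execute(final PortObject[] inPorts,
            final ExecutionContext exec) throws Exception {
        DecisionTree decTree = ((PMMLDecisionTreePortObject) inPorts[0]).getTree();
        // work with decTree here and return the output port objects
    }

    // the remaining abstract NodeModel methods are omitted here
}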
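
As background on the compiler message itself: "must override or implement a supertype method" appears whenever a method marked @Override has a parameter list that no superclass method declares. The following self-contained sketch reproduces the situation with made-up stand-in types (Port, TreePort and BaseModel are hypothetical and not KNIME classes); the working pattern is to keep the generic signature and cast inside the method.

```java
final class OverrideSketch {

    /** Stand-in for a generic framework port object (hypothetical, not a KNIME type). */
    interface Port { }

    /** Stand-in for a concrete model port that carries a tree description. */
    static final class TreePort implements Port {
        final String tree;

        TreePort(String tree) {
            this.tree = tree;
        }
    }

    /** Stand-in for the framework base class: it only knows the generic signature. */
    abstract static class BaseModel {
        protected abstract Port[] execute(Port[] inPorts);

        // The framework calls execute(Port[]) and nothing else.
        final Port[] run(Port... inPorts) {
            return execute(inPorts);
        }
    }

    static final class MyModel extends BaseModel {
        // Writing "@Override protected Port[] execute(TreePort in)" here would not
        // compile: it is a new overload, not an override of execute(Port[]).
        @Override
        protected Port[] execute(Port[] inPorts) {
            // Keep the generic parameter type and cast to the concrete port inside.
            TreePort in = (TreePort) inPorts[0];
            return new Port[] { new TreePort("seen: " + in.tree) };
        }
    }

    public static void main(String[] args) {
        Port[] out = new MyModel().run(new TreePort("leaf"));
        System.out.println(((TreePort) out[0]).tree);
    }
}
```

The cast fails with a ClassCastException if the port at that index has a different type, which is why the port types declared in the constructor need to match what execute expects.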