You are asking a lot of very specific questions concerning the use of predictive methods with KNIME. I will include a few links to past discussions that also contain links to articles about evaluation metrics and example workflows.
Your questions suggest to me that you might benefit from familiarizing yourself with some basic concepts of predictive model building, for example with this e-learning course:
Validate your knowledge and skills with the KNIME Certification Program. Complete the certification for your chosen learning path, and become a KNIME-certified data analyst, data engineer, or data scientist.
And if you want to get deeper into it, there is a full-blown Udemy course about KNIME that also covers data mining:
https://www.udemy.com/knime-bootcamp/
And then, of course, there is the KNIME forum to help you with further questions:
Models for 0/1 or Yes/No Targets:
One ‘brutal’ way could be to convert the decision tree rules into SQL and then use the SQL code. Or use the ruleset to make some kind of If then rules. Not very elegant I have to admit.
kn_example_decision_tree.knwf (510.8 KB)
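To make the 'brutal' rules-to-SQL idea a bit more concrete, here is a minimal Python sketch. It is not what the attached workflow does; the nested-dict tree format, the column names (`age`, `income`), the thresholds, and the helper names `tree_to_rules` and `rules_to_sql` are all made up for illustration. In practice you would start from whatever ruleset your decision tree exports.

```python
# Hypothetical tree format (an assumption, not a KNIME export format):
# inner nodes test `column <= threshold`, leaves hold the predicted class.
EXAMPLE_TREE = {
    "column": "age", "threshold": 30,
    "left": 0,  # branch taken when age <= 30
    "right": {
        "column": "income", "threshold": 50000,
        "left": 0,   # income <= 50000
        "right": 1,  # income > 50000
    },
}


def tree_to_rules(node, conditions=()):
    """Walk the tree and return one (conditions, prediction) pair per leaf."""
    if not isinstance(node, dict):
        return [(list(conditions), node)]
    col, thr = node["column"], node["threshold"]
    rules = tree_to_rules(node["left"], conditions + (f"{col} <= {thr}",))
    rules += tree_to_rules(node["right"], conditions + (f"{col} > {thr}",))
    return rules


def rules_to_sql(rules, target="prediction"):
    """Render the leaf rules as a single SQL CASE expression."""
    lines = ["CASE"]
    for conds, pred in rules:
        where = " AND ".join(conds) if conds else "1 = 1"
        lines.append(f"  WHEN {where} THEN {pred}")
    lines.append(f"END AS {target}")
    return "\n".join(lines)


sql = rules_to_sql(tree_to_rules(EXAMPLE_TREE))
print(sql)
```

The resulting `CASE` expression can be pasted into a `SELECT` statement. Every leaf becomes one `WHEN` branch, so a deep tree produces a long expression, which is the inelegant part.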
Understand metrics like AUC and Gini (and use H2O.ai)
I attached a working example of a Random Forest with H2O.ai nodes and a 0/1 target from a Kaggle competition (please note that the example only uses the first 1,000 lines; you should remove this restriction or use your own data for your own purposes).
In order to get an idea about the quality of your model you might want to read this article:
I always advise not just to use a Scorer but rather a metric like Gini/AUC to see the relative quality of the model. In the end the ‘score’ is not a magic…
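As a small illustration of what AUC and Gini measure, here is a plain Python sketch. It computes AUC as the share of (positive, negative) pairs in which the positive case gets the higher score (ties count half) and uses the common relation Gini = 2 * AUC - 1. The function names and the toy labels and scores are made up for the example; the H2O.ai and KNIME nodes calculate these metrics for you.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC: 0 is random, 1 is a perfect ranking."""
    return 2 * auc(labels, scores) - 1


# Toy data (illustrative only): 1 = event, 0 = no event.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print("AUC: ", auc(labels, scores))
print("Gini:", gini(labels, scores))
```

Both numbers depend only on how well the scores rank the cases, not on a particular cut-off, which is why they say more about the relative quality of a model than the accuracy from a Scorer alone. The pairwise loop is fine for a toy example but slow on large data.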
I think you could do two things. You could treat your outlier group as label/target (1/0 or TRUE/FALSE) and the rest of the data as explanatory variables. You would have to remove the time column since this ‘leaks’ the information you want to find. You could then use an algorithm like Random Forest Learner *1) that also gives you a list of the most important variables, the ones that distinguish the outliers from the regular cases.
You could also just use the time for each transaction…
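The first option from the quote above (outlier flag as target, time column removed, Random Forest for variable importance) could look roughly like this in Python. This is a sketch with scikit-learn standing in for the KNIME Random Forest Learner; the synthetic data, the column names (`amount`, `n_items`, `time_seconds`), and the outlier threshold of 40 are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic transactions (illustrative assumption): time depends on amount.
rng = np.random.default_rng(0)
n = 500
amount = rng.uniform(0, 100, n)
n_items = rng.integers(1, 10, n)
time_seconds = 0.5 * amount + rng.normal(0, 2, n)

# Target: 1 if the transaction is in the outlier group (slow), else 0.
is_outlier = (time_seconds > 40).astype(int)

# Leave the time column out of the features, since it 'leaks' the target.
feature_names = ["amount", "n_items"]
X = np.column_stack([amount, n_items])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, is_outlier)

# Importance per variable: which columns separate outliers from regular cases.
importances = dict(zip(feature_names, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In this toy setup `amount` should come out as the dominant variable because the outlier flag was built from it. With real data the ranking is only a hint about where to look, not proof of a cause.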