Hey KNIME users,
For a study course I need to present the software KNIME, and I want to show three small examples from our lectures. Unfortunately, we are not able to get them running.
We want to show:
1. an association analysis (settings: lower bound for minimum support: 0.25, metric type: confidence with 0.60)
2. a cluster analysis
3. a classification model (a decision tree), with German and English Excel sheets
In the appendix you can see the Excel files and, in the pictures, the expected solutions.
Now the question: which nodes do we need for each example? I have tried out a lot but can't find the right solution.
Thanks in advance for your help. If more information is necessary, just let me know.
Kind regards
Philipp
Really? It's not that hard. The search terms "association rules", "cluster", and "classification" entered into the Node Repository search field will get you straight to the corresponding nodes.
Philipp
I was able to find the nodes. For the association analysis I used the following nodes:
XLS Reader (for the Excel file) => Column Filter => Apriori (to configure the settings and generate the rules)
But in the end the result is not the same as the one we had in our lecture. The items are not counted correctly. It looks as if the Apriori node counts only the items in specific columns / rows and not those from the whole table. I don't know why.
Will I need more nodes?
I tried to keep it as simple as possible so that my colleagues and I can understand it when I present it to them.
For the association rule learner, you'll need to input your transactions through one column, either a collection column, or a bit vector. To create columns of these types, there are two dedicated nodes available (Create Bit Vector, Create Collection Column).
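To make the "one collection per row" idea concrete, here is a minimal Python sketch of how support and confidence are counted over whole transactions. It is not what the KNIME node does internally; the basket data and the function names `support` and `rules` are made up for illustration, and only the two thresholds (0.25 and 0.60) come from this thread.

```python
from itertools import combinations

# Hypothetical transactions: each row is one basket held as a single set,
# the analogue of one collection cell per row.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
]

def support(itemset, rows):
    """Fraction of rows that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= row for row in rows) / len(rows)

def rules(rows, min_support=0.25, min_confidence=0.60):
    """Yield (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*rows))
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, rows)
        if pair_support < min_support:
            continue
        # Check both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, rows)
            if confidence >= min_confidence:
                yield lhs, rhs, pair_support, confidence

found = list(rules(transactions))
for lhs, rhs, sup, conf in found:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

If the items of one transaction are spread over several separate columns instead, a count like this only ever sees part of each basket, which would match the "items are not counted correctly" symptom described above.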
Thank you for your help. This was the node that I needed to add to my workflow.
But one last question about the association analysis: how can I write the whole result table into my Excel sheet?
The items are always missing, as you can see in the screenshots. Do you know how I can transfer them to the Excel sheet as well?
Thanks in advance!
And about the decision tree:
I used the Excel file in the appendix and want to create a decision tree that looks like this:
First it is split by age. Then the persons under 30 are split by whether they are students or not, and the persons older than 40 are split by the quality level. All persons between 30 and 40 buy a computer, so no further split is necessary or possible there.
I used the workflow you can see in the appendix. But my decision tree includes only the age (see appendix), with no further split. I don't know why. Do you have an explanation for that? Is there something totally wrong in my workflow, or is something missing?
Settings for the nodes:
Partitioning: I used Relative with 100% (just for these examples; I know that with this setting all the data is used as training data) and selected stratified sampling on "buying computer".
Color Manager: I selected the column "buying computer" (two possible values: "yes" (blue) or "no" (red)).
Decision Tree Learner: see appendix.
You can't have collection cells in Excel. Therefore you need to convert the collection into something that Excel can handle, like String.
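As a rough illustration of that conversion outside KNIME, the Python sketch below flattens each collection into one delimited string before writing the table out. The sample rows, the helper name `collection_to_string`, and the CSV target are assumptions for the example only; inside a workflow the same step would be done with a suitable node before the Excel writer.

```python
import csv
import io

# Hypothetical rule table: the "items" value is a collection (a set),
# which a spreadsheet cell cannot hold directly.
rules = [
    {"items": {"bread", "butter"}, "support": 0.5},
    {"items": {"milk"}, "support": 0.75},
]

def collection_to_string(items, separator=", "):
    """Join the items of a collection into one sorted, delimited string."""
    return separator.join(sorted(items))

# Write to an in-memory buffer; a real file would work the same way.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["items", "support"])
for rule in rules:
    writer.writerow([collection_to_string(rule["items"]), rule["support"]])

output = buffer.getvalue()
print(output)
```

Sorting the items first keeps the string stable, so the same itemset always produces the same cell text.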
EDIT: For the decision tree: read the Node Description or at least have a look at all the dialog components and you will definitely figure it out yourself.
OK, I guess it was just the setting for "Min number records per node". Now I can see the decision tree as I want it. Thank you for your support!
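For anyone who runs into the same thing, here is a small Python sketch of why such a minimum-records setting can stop a tree after the first split. It assumes the rule "a node is only split further if it holds more records than the threshold"; the exact rule and default value should be checked in the node description, and the branch sizes below are made-up numbers, not taken from the Excel file.

```python
def may_split(num_records, min_records_per_node):
    """Return True if a node holds enough records to be split further.

    Assumed rule: nodes at or below the threshold become leaves.
    """
    return num_records > min_records_per_node

# Hypothetical branch sizes after a first split on age.
branch_sizes = {"under 30": 5, "30 to 40": 4, "over 40": 5}

for threshold in (10, 2):
    splittable = [name for name, size in branch_sizes.items()
                  if may_split(size, threshold)]
    print(f"threshold={threshold}: branches that may split further: {splittable}")
```

With a small table, a high threshold turns every branch into a leaf right after the first split, while a low threshold lets the learner go on and test further attributes.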
It's me again.
Now I have a data table about 1000 persons with the following attributes:
Alter / Age (under 25, 25-50, older than 50)
Art der Mitgliedschaft / Type of membership (Premium or Normal)
Laenge der Mitgliedschaft / Duration of membership (short, middle, long)
Geschlecht / Gender (male or female)
And the stratified sampling attribute is: Kauft jemand ein Produkt / Whether someone buys a product (Yes / No)
But the decision tree looks weird. In some branches it splits by gender after the age, and in others it does not. I don't know why. Attached you can find the resulting decision tree and the settings of the Decision Tree Learner.
Can someone help?
Thanks in advance