This thread is for posting solutions to “Just KNIME It!” Challenge 21.
Here is the challenge: Just KNIME It! | KNIME
Feel free to link your solution from KNIME Hub as well!
Have an idea for a challenge? We’d love to hear it! Please write it here.
And remember: the more you participate, the more participation badges you may end up getting. Fancy, huh? Just remember to correctly mark your solution in the Hub with tag justknimeit-21.
Here is my solution.
Here is my solution. It’s been a long time since I have done any stats. Hopefully I did that right.
Hi guys, here’s my solution.
Here’s my solution
I calculated the absolute spend to compensate for the switched columns, which otherwise result in negative spend.
Here is my solution to this challenge:
I decided to consider negative costs errors and to filter them out.
As regards correlation, I found the following:
Here is my solution for Challenge 21.
It can be seen that, in terms of both cost and time, participant IDs 612 and 603 are leading.
Hello KNIMErs, Here is my solution for Challenge 21
My take on Challenge 21- Eating out…
Simple approach… basic nodes
Here’s my take:
My findings were as follows:
- The top 10 is actually a top 12: ranking by money spent (and also considering time spent) yields multiple #1s, #6s and #10s. For the three participants tied at #10 there is, as far as I can see, no other criterion that would qualify just one of them for the list, so it is only “fair” to include all of them.
Lesson learned for me: this is something I would discuss with the stakeholder once I have a first solution → are there any other criteria that could limit this list to a “true” top 10?
- The participants who spend the most money are NOT the same as the ones who spend the most time. I concluded this from the fact that #3 in money spent is only #13 in time spent.
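For anyone who wants to reproduce the "top 10 that is really a top 12" effect outside KNIME, here is a minimal Python sketch. The participant names and totals are made up, the function name `top_n_with_ties` is hypothetical, and it ranks by money spent only (no time tie-breaker). It simply keeps everyone whose total is at least as large as the value at the cutoff position.

```python
def top_n_with_ties(totals, n):
    """Return (participant, total) pairs sorted by total, keeping all ties at the cutoff."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # value held by the n-th ranked participant
    return [kv for kv in ranked if kv[1] >= cutoff]


# Hypothetical totals, not taken from the challenge data set
spend = {"A": 50, "B": 50, "C": 40, "D": 30, "E": 30, "F": 30, "G": 10}

# Asking for a "top 4" returns six participants, because D, E and F tie at the cutoff
top4 = top_n_with_ties(spend, 4)
print(top4)
```

Whether the extra tied participants should stay in the list is still a question for the stakeholder; the sketch only makes the tie visible.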
One additional lesson learned (thank you, KNIME): instead of using the String to Date&Time node, I used the CSV Reader node’s Transformation tab → one node fewer.
I also wanted to work with the Column Expressions node, but couldn’t find a way to calculate the time difference with the functions provided (that would have saved another node). Curious how the official solution will solve this.
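As a point of comparison (this is plain Python, not Column Expressions syntax), the time difference itself is a one-liner once both timestamps are parsed. The timestamp format and the helper name `minutes_between` are assumptions for illustration; the actual column format in the challenge data may differ.

```python
from datetime import datetime


def minutes_between(start, end, fmt="%Y-%m-%dT%H:%M:%S"):
    """Return the duration between two timestamp strings in minutes."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Hypothetical check-in and check-out times
duration = minutes_between("2022-03-01T12:05:00", "2022-03-01T12:47:30")
print(duration)
```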
I found myself giggling when I saw the total expenses. I hope that the scale is in hundreds or thousands.
Nevertheless, KnimeIT_21 – KNIME Hub
Here’s my solution.
REF Challenge 21.knwf (639.1 KB)
In my opinion, the data set is difficult for a “standard” correlation analysis, which presupposes that the individual observations are statistically independent of each other. This is not the case for participants who appear several times in the data set. Among others, this paper deals with the topic of correlation analysis with repeated measures. A mixed-model approach seems to perform best here. However, this is a method that goes well beyond the pure KNIME capabilities.
Grouping of participants, as seen in many justknimeit solutions, would be one way to deal with this, but is not optimal, as one loses information in the process.
As an example, take a dataset with only one participant who eats twice in total: once for 5 minutes for 10 Euro and once for 20 minutes for 20 Euro, i.e. a positive correlation. In the aggregated analysis you would see only a single averaged point (12.5 minutes and 15 Euro), so that relationship disappears.
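To make the example concrete, here is a small Python sketch using exactly those two visits. The participant label "P1" and the helper `pearson` are mine, and this is only an illustration of the information loss, not a repeated-measures or mixed-model analysis.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# (participant, minutes, euro): the two visits from the example
visits = [("P1", 5.0, 10.0), ("P1", 20.0, 20.0)]

# Visit level: both observations are used
visit_level = pearson([v[1] for v in visits], [v[2] for v in visits])

# Aggregated: one mean point per participant
grouped = defaultdict(list)
for pid, minutes, euro in visits:
    grouped[pid].append((minutes, euro))
means = {
    pid: (sum(m for m, _ in rows) / len(rows), sum(e for _, e in rows) / len(rows))
    for pid, rows in grouped.items()
}
aggregated = pearson([m for m, _ in means.values()], [e for _, e in means.values()])

print(visit_level, means, aggregated)
```

At visit level the two points give a perfect positive correlation; after grouping there is a single point left and no correlation can be computed at all.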
What do you think? Do you have a suggestion? Or do I see the whole thing totally wrong?
Hi, this is my solution for Challenge #21
I produced two outcomes for the challenge: (1) total amounts, regardless of how often a participant took part, and (2) average amounts, i.e. total amounts divided by the number of participations.
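A rough Python sketch of the difference between the two outcomes, with invented numbers (the participant IDs, amounts and the function name `totals_and_averages` are all hypothetical):

```python
from collections import defaultdict


def totals_and_averages(visits):
    """Map each participant to (total amount, average amount per visit)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pid, amount in visits:
        totals[pid] += amount
        counts[pid] += 1
    return {pid: (totals[pid], totals[pid] / counts[pid]) for pid in totals}


# Hypothetical visits: P1 eats out three times, P2 only once
visits = [("P1", 10.0), ("P1", 20.0), ("P1", 30.0), ("P2", 40.0)]
summary = totals_and_averages(visits)
print(summary)
```

With these numbers P1 leads on total spend while P2 leads on average spend per visit, which is why the two outcomes can produce different rankings.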
Hi, here is my solution.
It’s my Day 15 with KNIME today and this is my first try at Just KNIME It!
(Don’t judge me too hard.)
I enjoyed it, thanks for the opportunity.
Here is my solution for the Just KNIME It! Challenge 21.