I am currently testing the Active Learning nodes available in KNIME. While doing so, I ran into some questions and possible problems:
1. I don’t see any way to get multi-label output. Am I missing something?
2. After stopping the End Loop node, I cannot re-execute it. It throws this error:
ERROR Active Learn Loop End 3:16 Caught “IllegalStateException”: No file store handler set on node
3. Within the loop I branch the model off to a second predictor that should evaluate the current performance on the fully labelled set. Ideally I would like to write the performance to disk automatically. The problem is that these nodes don’t get re-evaluated. They are not even reset, so they don’t notice that the model has changed at all.
4. Is there a way to incorporate the output of multiple models for one sample (to enable some kind of exploration of the hypothesis space, i.e. a “Bag of Models”)? I guess there is a way via PMML; I hadn’t looked into that at the time of writing.
5. If 4 is possible, would it be possible to remove or add models without resetting the nodes?
6. Closely related to 5: is there a way to change the unlabelled pool, such as adding or removing samples, replacing the pool completely, or enabling some stream-based pool? (Read: some kind of “Active Pool Management”.)
7. Is there a way to incorporate external input into the loop process? For example, I would like to develop a visualization for the data presented to the user (or maybe some network process that sends the data in question somewhere else) and feed the label back into the running system.
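To make the questions about multiple models, pool changes, and external labels more concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not KNIME code and uses no KNIME API. The names `ActivePool`, `vote_entropy`, and `feed_label` are made up for illustration, a "model" is assumed to be any callable that returns a label, and vote entropy is just one possible way to measure disagreement in a bag of models.

```python
import math
from collections import Counter
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Sample = Sequence[float]
Model = Callable[[Sample], Hashable]  # assumption: a model is any callable sample -> label


def vote_entropy(models: List[Model], sample: Sample) -> float:
    """Committee disagreement on one sample; 0.0 means all models agree."""
    votes = Counter(model(sample) for model in models)
    total = sum(votes.values())
    return sum(-(n / total) * math.log(n / total) for n in votes.values())


class ActivePool:
    """Hypothetical unlabelled pool that can be changed between iterations."""

    def __init__(self) -> None:
        self.unlabelled: Dict[str, Sample] = {}
        self.labelled: Dict[str, Tuple[Sample, Hashable]] = {}

    def add(self, key: str, sample: Sample) -> None:
        """Add a sample to the pool at any time (e.g. from a stream)."""
        self.unlabelled[key] = sample

    def remove(self, key: str) -> None:
        """Drop a sample from the pool without labelling it."""
        self.unlabelled.pop(key, None)

    def query(self, models: List[Model]) -> str:
        """Return the key of the sample the models disagree on most.

        Raises ValueError if the pool is empty.
        """
        return max(
            self.unlabelled,
            key=lambda k: vote_entropy(models, self.unlabelled[k]),
        )

    def feed_label(self, key: str, label: Hashable) -> None:
        """Store a label that arrived from outside (user, GUI, network service)."""
        sample = self.unlabelled.pop(key)
        self.labelled[key] = (sample, label)
```

Because `query` takes the list of models as an argument, models can be appended or removed between iterations without rebuilding the pool, and `feed_label` is the single entry point where an external visualization or service would hand a label back.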
You will find an example workflow attached (especially relevant for points 2 and 3).
I know, that's quite a few questions.
As you have probably noticed, I am aiming for interactive learning methods: basically adding visualizations, human-in-the-loop ideas, and so on.
Thanks in advance for reading and maybe answering!
Active_Test.zip (53.5 KB)