Active Learning Extension: Questions and Problems

mereep · October 29, 2018, 2:14pm

Hello,

I am right now trying to test the Active Learning Nodes available in KNIME. While doing that I encountered some questions and possible problems:

1. I don’t see any way to get multi-label output support. Am I just missing something?
2. When I stop the Loop End node, I cannot re-execute it. It throws this error:

ERROR Active Learn Loop End 3:16 Caught “IllegalStateException”: No file store handler set on node

3. Within the loop I split the model off to a second predictor, which should evaluate the current performance on the fully labelled set. Ideally I would like to write the performance out to disk automatically. The problem is that these nodes don’t get re-evaluated. They are not even reset, so they don’t notice that the downstream model has changed at all.
4. Is there a way to incorporate the output of multiple models for one sample (to enable some kind of exploration of the hypothesis space → “Bag of Models”)? I guess there is a way via PMML; I had not looked into that at the time of writing.
5. If 4 is possible, would it be possible to remove or add models without resetting the nodes?
6. Closely related to 5: Is there a way to change the unlabelled pool, like adding or removing samples, changing the unlabelled pool completely, or enabling some stream-based pool (read: some kind of “Active Pool Management”)?
7. Is there a way to incorporate external input into the loop process? For example, I would like to develop a visualization for the data presented to the user (or maybe some network process that sends the data in question somewhere else) and feed the label back into the running system.

You will find an example workflow attached (especially for points 2 and 3).
I know, quite some questions

You have probably already noticed: I am aiming for interactive learning methods, basically adding visualizations, human-in-the-loop ideas, etc.

Thanks in advance for reading and maybe answering!

Active_Test.zip (53.5 KB)

gab1one · October 29, 2018, 4:46pm

Hi @mereep,
I am very happy to answer your questions regarding the Active Learning Extension, as I am its author.

Do you want to give one datapoint several labels? This is not supported. What you can do is create a combined label, e.g. “label1;label2”, but that is probably not what you want.
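To illustrate the combined-label workaround, here is a minimal Python sketch (plain Python, not KNIME node code). The function names and the choice of ";" as separator are assumptions made for this example only.

```python
from typing import Iterable, List

SEPARATOR = ";"  # assumed separator; it must not occur inside any single label


def combine_labels(labels: Iterable[str], sep: str = SEPARATOR) -> str:
    """Merge several labels into one combined class label.

    Labels are de-duplicated and sorted so that the same set of labels
    always produces the same combined string.
    """
    unique = sorted(set(labels))
    for label in unique:
        if sep in label:
            raise ValueError(f"label {label!r} contains the separator {sep!r}")
    return sep.join(unique)


def split_labels(combined: str, sep: str = SEPARATOR) -> List[str]:
    """Recover the individual labels from a combined class label."""
    return combined.split(sep) if combined else []


print(combine_labels(["label2", "label1"]))
print(split_labels("label1;label2"))
```

One likely reason this is not what you want: every distinct combination becomes its own class, so the number of classes can grow very quickly with the number of base labels.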

How did you stop it? I was not able to reproduce that issue, so some additional details would help me out.

You need to convert that Metanode to a Wrapped Metanode. Then you can connect a flow variable from that node to one within the loop, and it will be executed in every iteration.

If you mean that you want to train several models on the same data and combine the outputs, then yes, that is possible: you just need to join the result tables of the predictor nodes. You can also take a look at the Ensemble Learning category in the Node Repository, where you will find nodes dedicated to combining models, as well as the Meta Learning example workflows: https://nodepit.com/server/public-server.knime.com:80/04_Analytics/13_Meta_Learning
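As a rough illustration of joining several prediction tables and combining them, here is a small Python sketch. It is not how the KNIME ensemble nodes are implemented; the dictionary-based "tables" (row ID to predicted label), the majority vote, and all function names are assumptions for this example.

```python
from collections import Counter
from typing import Dict, List


def join_predictions(tables: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Inner-join prediction tables on row ID: one list of votes per row."""
    if not tables:
        return {}
    common = set(tables[0])
    for table in tables[1:]:
        common &= set(table)
    return {row_id: [t[row_id] for t in tables] for row_id in sorted(common)}


def majority_vote(votes: List[str]) -> str:
    """Return the most frequent label; ties are broken alphabetically."""
    counts = Counter(votes)
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


def vote_disagreement(votes: List[str]) -> float:
    """Fraction of models that do not agree with the majority label."""
    counts = Counter(votes)
    return 1.0 - max(counts.values()) / len(votes)


tables = [
    {"r1": "cat", "r2": "dog"},  # predictions of model A
    {"r1": "cat", "r2": "cat"},  # predictions of model B
    {"r1": "dog", "r2": "cat"},  # predictions of model C
]
for row_id, votes in join_predictions(tables).items():
    print(row_id, majority_vote(votes), round(vote_disagreement(votes), 2))
```

The disagreement value is one possible way to use a bag of models for active learning: rows on which the models disagree most could be shown to the user first.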

You can use the switch and CASE nodes to disable and enable parts of a workflow dynamically (see the Switches category on NodePit).

That is currently not supported directly. You would need to embed the Active Learning loop in a bigger loop construct that re-executes the Active Learning loop with the changed dataset. You could configure this in a way that you do not lose the created annotations, e.g. via a Recursive Loop.
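The idea of an outer loop that re-runs the labeling with a changed pool while keeping earlier annotations can be sketched in plain Python as follows. This is only a conceptual illustration, not KNIME code; `run_with_changing_pool`, `ask_label`, and `budget_per_round` are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, Iterable, List


def run_with_changing_pool(
    pool_versions: Iterable[List[str]],
    ask_label: Callable[[str], str],
    budget_per_round: int,
) -> Dict[str, str]:
    """Re-run a labeling round for every version of the pool.

    The annotations dictionary plays the role of the table that is fed back
    into the next outer iteration, so labels given earlier are never lost
    and already-labelled samples are not shown again.
    """
    annotations: Dict[str, str] = {}
    for pool in pool_versions:
        # Only samples without an annotation are candidates in this round.
        unlabelled = [s for s in pool if s not in annotations]
        for sample in unlabelled[:budget_per_round]:
            annotations[sample] = ask_label(sample)
    return annotations


# Usage: the pool changes between rounds ("a" is removed, "d" is added).
labels = run_with_changing_pool(
    pool_versions=[["a", "b", "c"], ["b", "c", "d"]],
    ask_label=lambda sample: sample.upper(),  # stand-in for the human annotator
    budget_per_round=2,
)
print(labels)
```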

There are several ways in KNIME to include external tools, e.g. via REST, so this should be no problem. However, there is no Active Learning Loop End node that can handle this case correctly. When I created the Active Learning nodes for my Bachelor thesis, I developed a prototype for such a node, which never got released, though: knime-activelearning-js/org.knime.activelearning-js/src/org/knime/al/js/nodes/webportal/variableloopend at master · gab1one/knime-activelearning-js · GitHub. If you are interested, I can try releasing that node on an external update site.

best,
Gabriel

mereep · October 29, 2018, 5:40pm

Hi, and thanks for the fast answers!
Great work implementing this for “just” a bachelor thesis.

I have also just tried to implement a very simple node myself; it is not easy to get started with. I still cannot get the Image Extension to work within Eclipse, but I want to access org.knime.knip.base.data.img.ImgPlusValue. Well, that is another topic.

I just right-clicked the node and hit “cancel”. I checked again: you have to label at least one instance within the loop to get that error. If the state was saved correctly in the attachment, you can just try to run the Loop End node.

As for the other points, I will come back as soon as I have tried them out.

Thanks again

gab1one · October 30, 2018, 9:22am

If you want to develop Image Processing compatible extensions you need a different SDK setup:

OK, I reproduced that issue. You are supposed to end the training using the terminate button in the view, and you can pause it via the pause button. Anyway, this is an ugly bug, so it needs to be fixed.

best,
Gabriel

mereep · October 30, 2018, 11:32am

Hi, thanks for the answer again. I already have this in my workbench, but I cannot activate it. It throws a huge number of errors when I try. Nevertheless, the vanilla version is working.

Well, I see you are also a maintainer there and have already fixed the problem on GitHub; I will test it. Great!

gab1one · October 30, 2018, 2:17pm

I just released that node on an external update site for you to try out: https://dl.bintray.com/gab1one/active-learning-extras/
Be advised that it is still rough around the edges.
best,
Gabriel

zioludo · July 8, 2020, 6:36am

Hello Gabriel,

I am not sure if this is the right place to ask this question.
I am also trying to implement an active learning workflow for a simple classification of products based on their text descriptions (using Palladian), but I am lost with the basics. How do I stop the loop at the interactive labeller node and make sure I can input the labels? Should I close the labeller and apply the settings at the end? How do I get back into the interactive labeller in the next iteration?

See example

Thanks

Ludovico

gab1one · July 8, 2020, 8:06am

Hi @zioludo,
I recommend looking at the blog series by my colleague @paolotamag: https://www.knime.com/blog/guided-labeling-blog-series-model-uncertainty
It explains the active learning process and how to implement it in KNIME.
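For readers who want a feel for the core idea before reading the series, here is a generic Python sketch of uncertainty sampling, one common active learning strategy. It is an illustration only and is not taken from the blog series or from the KNIME nodes; the entropy-based score, the function names, and the example probabilities are assumptions.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy (in bits) of a predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(predictions: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k row IDs whose predictions have the highest entropy.

    Ties are broken by row ID so that the result is deterministic.
    """
    ranked = sorted(predictions, key=lambda rid: (-entropy(predictions[rid]), rid))
    return ranked[:k]


# Class probabilities predicted by the current model for the unlabelled rows.
predictions = {
    "r1": [0.9, 0.1],
    "r2": [0.5, 0.5],
    "r3": [0.7, 0.3],
}
# The rows returned here would be shown to the human annotator next.
print(most_uncertain(predictions, k=2))
```

In each iteration the model is retrained on the labels collected so far, the remaining unlabelled rows are scored again, and the most uncertain ones are presented for labeling.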

best,
Gabriel

paolotamag · July 8, 2020, 8:46am

Hi @zioludo,
KNIME Analytics Platform offers a command called “Do one loop step”.
The right approach is to select the loop end and use this button to perform only one iteration.

So:

1. Right-click the Labeling View > Execute and Open View.
2. Apply labels and then, in the bottom right corner of the view, click Apply and Close (or Close&Apply).
3. Select the Loop End node and click “Do one loop step” at the top of the toolbar twice.
4. Open the view again. You are now looking at the second iteration of the human-in-the-loop process.

In the second iteration you cannot label anymore unless you stop the execution of the loop.
The idea is that you cannot change the settings of any view or node once the loop has started.
The new Active Learning nodes released in KNIME 4.1 only work in the KNIME WebPortal, which comes with KNIME Server. You would need to create a Component on top of the Labeling View, deploy the workflow to KNIME Server, and open it in the KNIME WebPortal.

In that case the Labeling View is displayed and you can label just like in KNIME Analytics Platform, but instead of “Close&Apply” you will have a “Next” button that takes you to the next iteration and reloads the Labeling View with more data points, for any number of iterations.

If you do not have access to a KNIME Server, I would suggest using the legacy Active Learning nodes developed by @gab1one. You can find a good example here, with a comparison of the legacy Active Learning loop and the new Active Learning loop made for the WebPortal. With the old loop nodes you need to right-click and open the view of the Active Learning Loop End instead of a view in between the two loop nodes. This works great in KNIME Analytics Platform, but not in the KNIME WebPortal, where you can deploy your application to more users.

I hope this was useful.
Cheers
Paolo

paolotamag · July 8, 2020, 8:51am

Feel free to engage here as well:

zioludo · July 10, 2020, 7:54am

Dear Paolo,

Thanks for the clear and detailed reply: I got it.

I later noticed your workflow annotations: the answer was there, but I was too busy trying to understand your “wizardry”. The nodes and your examples are impressive.

Molte grazie!

Ludovico