Is there any way to perform hyperparameter tuning for string values? As far as I know from my limited experience with KNIME, the “Parameter Optimization Loop” node only supports numbers.
Welcome to the KNIME community!
There is no direct option in the Parameter Optimization Loop to use String values, but you could encode your Strings as numerical values. Let’s say you have 10 String values; then you would add a new parameter to the parameter optimization with a range from 0 to 9 and step size 1. Afterward, you use the flow variable that has been created for the current iteration to filter rows in a given table, which maps string parameters to integers. Doing so, you can select the string value in the desired learner node.
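Outside of KNIME, the same encoding idea can be sketched in a few lines of Python. This is only a rough illustration, not the workflow itself: the string values are placeholders, and the names CODE_TABLE and select_string are made up for this example.

```python
# Lookup table mapping integer codes to string values.
# The string values below are illustrative placeholders, not taken from the workflow.
STRING_VALUES = ["InformationGainRatio", "InformationGain", "Gini"]
CODE_TABLE = [{"code": i, "value": s} for i, s in enumerate(STRING_VALUES)]


def select_string(code: int) -> str:
    """Keep the single row whose integer code matches and return its string.

    This mirrors filtering the mapping table with the integer produced
    by the optimization loop for the current iteration.
    """
    rows = [row for row in CODE_TABLE if row["code"] == code]
    if len(rows) != 1:
        raise ValueError(f"expected exactly one row for code {code}, got {len(rows)}")
    return rows[0]["value"]


# Integer parameter range: start 0, stop len - 1, step size 1.
codes = list(range(0, len(STRING_VALUES), 1))
selected = [select_string(c) for c in codes]
print(selected)
```

The optimizer only ever sees the integer code; the string is looked up just before it is handed to the learner.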
I created a simple workflow to demonstrate this.
StringParameters.knwf (320.9 KB)
If you have more questions, feel free to ask!
Just to add to what Julian said - we have had a couple of recent requests about adding string support in the parameter optimization loop nodes, and have a ticket open for it (AP-11344). I’ve added a +1 from you on that ticket. Thanks for the feedback!
Thanks for your help. I understand the idea behind your solution, but I got lost at the level of the individual nodes in KNIME. If I understand your solution correctly:
You created a Table Creator node that maps each string value to an integer, and then a Row Filter node inside the loop that keeps only one row per iteration.
After this point, I got lost. Could you explain it to me, please?
I think it would be much more convenient if the node itself had an option dedicated to this case.
You are right, the Table Creator is just used to create a map. The Parameter Optimization Loop Start then transforms the first set of parameters into flow variables, and these variables are passed to the Row Filter. Don’t mind the settings in the Filter Criteria tab of the Row Filter node. They are just dummy values that make it possible to use flow variables for the range checking option of the Row Filter. The actual settings are made in the Flow Variables tab of the node (rowFilter->lowerBound/upperBound/IntCell). There we set the number that was passed by the PO Loop Start, so that only the specified row is kept.
In the next step, the remaining row is transformed into flow variables. If you check the output of the “Table Row To Variable” node, you can see that there is now a variable called ParameterName. These variables are then passed to the Random Forest Learner, which uses this parameter; again, this is set in the Flow Variables tab of the configuration dialog. The trained model is then passed to the predictor to predict the labels of the test set.
Afterward, we score the predictions with the Scorer. This node compares the original labels with the predictions and creates some basic statistics. Additionally, the node creates flow variables for these statistics. So, when you open the table of the Scorer node and look at the Flow Variables tab there, you can see variables for, e.g., Accuracy.
The variables are then passed to the loop end which gathers the defined score and triggers a new iteration of the loop (with new parameters). In the end you will have a table with all parameter combinations plus their scores (depending on the parameter optimization strategy you chose in the loop start node).
The Cell Replacer node at the end is just used to replace the integer values with the corresponding string values.
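For readers who think more easily in code, the whole loop can be approximated in Python. This is a minimal sketch and not what the workflow actually executes: the string values are placeholders, train_and_score is a hypothetical stand-in for the Learner, Predictor and Scorer nodes, and the accuracies it returns are made-up numbers that only serve to make the example runnable.

```python
# Integer code -> string value (the role of the Table Creator map).
# The string values are placeholders chosen for this sketch.
PARAM_MAP = {0: "alpha", 1: "beta", 2: "gamma"}


def train_and_score(parameter_name: str) -> float:
    """Hypothetical stand-in for Learner -> Predictor -> Scorer.

    Returns a made-up accuracy so the sketch runs without any data.
    """
    made_up_accuracy = {"alpha": 0.81, "beta": 0.86, "gamma": 0.79}
    return made_up_accuracy[parameter_name]


def run_loop(param_map):
    """Try every integer code and collect one score per iteration."""
    results = []
    for code in sorted(param_map):                 # loop start: next integer code
        parameter_name = param_map[code]           # row filter + row to variable
        accuracy = train_and_score(parameter_name)
        results.append({"IntCell": code, "Accuracy": accuracy})  # loop end
    return results


def replace_codes(results, param_map):
    """Swap the integer codes back to their strings (the Cell Replacer step)."""
    return [
        {"ParameterName": param_map[r["IntCell"]], "Accuracy": r["Accuracy"]}
        for r in results
    ]


results = replace_codes(run_loop(PARAM_MAP), PARAM_MAP)
best = max(results, key=lambda r: r["Accuracy"])
print(results)
print("best:", best["ParameterName"])
```

The sketch simply tries every code in order; in the workflow, the order and number of iterations depend on the search strategy selected in the loop start node.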
I hope it got a little bit clearer.
Thank you very much for your help. Yes, it is clear now.