Tuning SVM cost hyperparameter

MarcB · August 2, 2020, 8:37pm

Good evening,

I am using an SVM for binary classification in a setting with more predictors (367) than samples (on the order of 200). All of my predictors are quantitative and I have normalized them. To keep the model interpretable I do not want to use PCA, so I applied a high correlation filter instead (workflow attached, but I cannot upload it with the data because the file is too large). I want to tune the hyperparameters (cost and sigma), and I have set up two SVM configurations: the upper branch uses the SVM Learner and the lower branch uses the LIBSVM Learner. Specific questions:

In the SVM Learner, is the “Overlapping penalty” equivalent to the “Cost” parameter?
With both learners, I have problems configuring the flow variables (Cost and Sigma) that come from the Parameter Optimization Loop Start node. How can I do it properly? The flow variables do not appear in the Flow Variables section of the Learner.

Z_SVM example 1.knwf (56.7 KB)

I came across a KNIME video on YouTube (https://www.youtube.com/watch?v=IlqepyIba6Y), but I can’t locate the error. I am using KNIME version 4.1.3.

Thank you,
Marc

ScottF · August 4, 2020, 6:07pm

Hi @MarcB -

As best I can tell, the Overlapping penalty is equivalent to the cost parameter.
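
For reference, this is the textbook soft-margin SVM formulation in which a cost parameter appears. It is a sketch under the assumption that the SVM Learner's Overlapping penalty plays the role of C; the node description is the place to confirm that. The relation between sigma and LIBSVM's gamma shown at the end is one common convention, and I have not verified that KNIME's sigma follows exactly this parameterization.

```latex
% Soft-margin SVM: C penalizes the slack variables \xi_i, i.e. points that
% fall inside the margin or on the wrong side of it (the "overlap").
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .

% RBF kernel in the LIBSVM-style gamma form, and a common (assumed) link to sigma:
K(x, x') = \exp\!\left( -\gamma \lVert x - x' \rVert^{2} \right),
\qquad \gamma = \frac{1}{2\sigma^{2}} .
```

A larger C punishes misclassified training points more heavily. With more predictors than samples that tends to fit the training data very closely, so it is worth including small values of C in the search range.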

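As for your other problems: first, I think the failure of the LIBSVM Learner node stems from an inconsistent looping setup between the top branch and the bottom branch. Second, you need to make sure you are applying the existing flow variables in the Learner, not creating new ones. In the Flow Variables tab, that means selecting the variable in the drop-down next to the setting rather than typing a name into the empty text field, which creates a new flow variable.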

Also, you have to be a little careful about how you pass flow variables to the Loop End node so it can decide how to optimize (based on accuracy or AUC).
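
If it helps to see the logic outside KNIME, here is a minimal Python sketch of what a brute-force parameter optimization loop does conceptually. It is an analogy, not KNIME code: `parameter_grid`, `train_and_score` and `optimize` are hypothetical names, and the scoring function is a toy surface standing in for the Learner, Predictor and Scorer nodes, not a real SVM.

```python
from itertools import product


def parameter_grid(cost_values, sigma_values):
    """Loop Start analogue: yield one dict of 'flow variables' per iteration."""
    for cost, sigma in product(cost_values, sigma_values):
        yield {"cost": cost, "sigma": sigma}


def train_and_score(flow_vars):
    """Hypothetical stand-in for Learner + Predictor + Scorer.

    It reads the values supplied by the loop (it applies the existing
    variables) instead of defining its own. The returned value is a toy
    score that peaks at cost=1.0, sigma=0.5; it is not a real model.
    """
    cost = flow_vars["cost"]
    sigma = flow_vars["sigma"]
    return 1.0 / (1.0 + (cost - 1.0) ** 2 + (sigma - 0.5) ** 2)


def optimize(cost_values, sigma_values):
    """Loop End analogue: keep the parameter set with the highest score."""
    best_params, best_score = None, float("-inf")
    for flow_vars in parameter_grid(cost_values, sigma_values):
        score = train_and_score(flow_vars)
        if score > best_score:
            best_params, best_score = flow_vars, score
    return best_params, best_score


best_params, best_score = optimize([0.1, 1.0, 10.0], [0.25, 0.5, 1.0])
print(best_params, best_score)
```

The point of the sketch is the data flow: the loop start produces the values, the learner only consumes them, and the loop end receives a single objective value per iteration. If the learner defined its own `cost` instead of reading the one passed in, every iteration would train the same model, which is the symptom of creating a new flow variable rather than applying the existing one.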

I’ve attached your workflow with some changes I’ve made.

Z_SVM example_SF.knwf (3.3 MB)

There is still pretty poor performance here, likely due to the same issues discussed in your other threads.

MarcB · August 9, 2020, 5:59pm

Thank you, @ScottF. The providers are reviewing the data, but in the meantime I am preparing the workflow. I will carefully review your post.

Best regards,
Marc

joshuahoran · August 9, 2020, 6:39pm

Awesome. I’ve been using KNIME for years and am only now learning what those mysterious empty text fields do: they create new downstream flow variables. Thanks!

ipazin · August 10, 2020, 9:01am

Hi @joshuahoran,

It is all in the docs, but adding some headers probably wouldn’t do any harm:
https://docs.knime.com/latest/analytics_platform_flow_control_guide/index.html

Br,
Ivan
