NSGA-II algorithm and classification problems

I have been reading about the NSGA-II algorithm and its diverse applications in data mining.

However, the example on the KNIME server gives little indication of applications outside the bio- and cheminformatics area.

Has anybody applied the NSGA-II node to classification problems, especially in conjunction with an SVM?



Hi Bora, 

There is indeed such a node, although I have never used it.  It is part of KNIME Labs, which you will need to install if you haven't already.  See below for a link to the node description. 




Hi Aaron,

I have seen the node and the example on the server, but the example gives too little information about the diverse applications of the node (and the algorithm) as they are cited in the various papers I have read so far.

I am curious whether anybody has used KNIME's multiobjective subset selection node for classification problems such as the iris data set. Such an example workflow would help in understanding the NSGA-II algorithm from a wider perspective.



The NSGA-II node in KNIME is "only" suited for subset selection. I doubt that it can be (ab)used to perform some kind of classification because it optimizes certain criteria based on a row subset. I don't see how this can be used for classification because there is no model that is created.
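
To make "optimizes certain criteria based on a row subset" a bit more concrete, here is a minimal sketch of the Pareto-dominance test that NSGA-II uses to rank candidate solutions. It is plain Python and not KNIME's implementation; the function names and the score tuples are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (the first front in NSGA-II)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical scores for five candidate subsets: (objective_1, objective_2), both minimized.
candidates = [(0.10, 5), (0.20, 3), (0.15, 4), (0.20, 5), (0.30, 3)]
front = pareto_front(candidates)
print(front)
```

The output is a set of trade-off subsets, not a predictive model, which is why the node on its own does not classify anything.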

Genetic algorithms in general are usually not used for classification, but for optimization. Sometimes they are used inside traditional classification algorithms (e.g. neural networks) or around them, e.g. for feature selection. None of these are yet implemented in KNIME, though.

Hi Thor,

I may not have explained my intention well enough in the previous post regarding NSGA-II. How can we use NSGA-II as a feature selection tool for a classification problem when using an SVM or artificial neural networks?

In KNIME you currently can't. You would need a special set of nodes (similar to the existing backward feature elimination nodes). They can make use of the existing NSGA-II implementation but it will require writing new KNIME nodes.
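
Outside KNIME, the wrapper idea can be sketched roughly as follows. This is an assumption-laden illustration, not an existing node: each candidate is a bit mask over the features, and it is scored on two objectives to minimize (classification error, number of features). A leave-one-out 1-nearest-neighbour classifier stands in for the SVM or neural network, the toy data is made up, and because there are only three features every mask is enumerated where NSGA-II would evolve a population of masks.

```python
from itertools import product

# Made-up toy data: features 0 and 1 each separate the classes only partly,
# feature 2 is noise on a larger scale (a real pipeline would normalize first).
X = [
    [1.0, 1.0, 50.0], [1.2, 1.1, 10.0], [2.0, 0.9, 30.0], [0.9, 2.0, 40.0],
    [3.0, 3.0, 45.0], [3.2, 3.1, 15.0], [2.1, 2.9, 35.0], [2.9, 2.1, 5.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def loo_error(mask):
    """Leave-one-out error of a 1-nearest-neighbour classifier on the selected features.

    Stand-in for the cross-validated error of an SVM or neural network.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    errors = 0
    for i, row in enumerate(X):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in cols),
        )
        errors += y[nearest] != y[i]
    return errors / len(X)


def objectives(mask):
    """Two objectives to minimize: classification error and number of features used."""
    return (loo_error(mask), sum(mask))


def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


# Enumerate every non-empty feature mask and keep the non-dominated ones.
masks = [m for m in product([0, 1], repeat=3) if any(m)]
scored = {m: objectives(m) for m in masks}
front = [m for m in masks if not any(dominates(scored[o], scored[m]) for o in masks)]

for m in front:
    print(m, scored[m])
```

On this toy data the non-dominated masks are the two single-feature masks with some error and the two-feature mask with none, so you pick the trade-off yourself and then train the final classifier on the chosen features.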

Hi @thor -
Do you know of any updates since this old post to use a genetic algorithm (or any other approach) to optimize the result of a regression model? I’d like to build a regression ML model to predict a dollar value, then put an optimizer on top of that model to share the best feature values to create the highest profit $.


I’m not sure what exactly you want to optimize in a regression model. Fitting one is usually a deterministic procedure without any parameters (except maybe the degree of the polynomial, but there you have only very few choices).
Apart from that we didn’t add more GA optimization techniques.

When I say regression model, I just mean predicting a continuous value instead of a classification problem. I want to train a model to predict profit. Then have a user ‘simulate’ by entering new data to run through the model - for example age of the customer, whether or not an associate offered to help them in store, average dollar amount, etc. I can return the model’s prediction using the input values, but I’d also like to make recommendations about which levers (features) the user should increase/decrease to increase profit - rather than the values they’ve input. I’m trying to answer - what combination of feature values will create the maximum profit given some constraints?
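
One way to picture the "optimizer on top of the model" idea is the rough Python sketch below. Everything in it is an assumption for illustration: `predict_profit` is a hypothetical stand-in for the trained model's predict function, the two levers and their bounds are invented, and the search is a simple mutate-and-keep-improvements loop and not NSGA-II. Features the user cannot control (such as customer age) would be held fixed at their entered values.

```python
import random


def predict_profit(avg_spend, staff_help_rate):
    """Hypothetical stand-in for a trained regression model's prediction."""
    return 50 + 2.0 * avg_spend - 0.02 * avg_spend ** 2 + 30 * staff_help_rate


# Constraints: allowed range for each controllable lever (assumed values).
BOUNDS = {"avg_spend": (10.0, 80.0), "staff_help_rate": (0.0, 0.6)}


def clip(value, low, high):
    """Force a value back into its allowed range."""
    return max(low, min(high, value))


def optimize(start, generations=200, children=20, seed=0):
    """Evolutionary hill climbing: mutate the best point so far, keep any improvement."""
    rng = random.Random(seed)
    best = dict(start)
    best_profit = predict_profit(**best)
    for _ in range(generations):
        for _ in range(children):
            # Gaussian mutation scaled to 10% of each lever's range, clipped to the bounds.
            child = {
                name: clip(best[name] + rng.gauss(0, 0.1 * (hi - lo)), lo, hi)
                for name, (lo, hi) in BOUNDS.items()
            }
            profit = predict_profit(**child)
            if profit > best_profit:
                best, best_profit = child, profit
    return best, best_profit


user_input = {"avg_spend": 20.0, "staff_help_rate": 0.1}
best, best_profit = optimize(user_input)
print(best, best_profit)
```

Comparing `best` with `user_input` lever by lever then gives the "increase this, decrease that" recommendation. With several competing goals (for example profit versus cost), a multi-objective method such as NSGA-II would return a set of trade-offs instead of a single best point.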