Input data format for Model Acceptability Criteria node

evert.homan_scilifelab.se · February 12, 2017, 4:26pm

Hi,

I'm struggling with the input format for the Model Acceptability Criteria node, which seems to require vector-based input data.

I have a data set described by RDKit descriptors, which I convert to a bit vector using the Create Bit Vector node, but the Model Acceptability Criteria node keeps complaining that 'Inputs 1, 2, and 3 should be vectors'.

How should I format the input data?

Thank you,

Evert

evert.homan_scilifelab.se · February 17, 2017, 10:55am

It turns out that this node only needs the experimental and predicted data columns, no descriptors or anything else as input (thanks to Daniel Mucs, SweTox).

Cheers/Evert

novamechanics · June 16, 2017, 10:51am

This node is applicable only for continuous models (not classification).

This node has three input ports:

0    Values for the dependent variable, predicted by the model (ypred)
1    Values for the dependent variable for the test set (yexp)
2    Values for the dependent variable for the training set (ytr)

Each input should be a single vector (the dependent variable), not the descriptors.

Please note that you should pass only the values of the dependent variable to the node, not the whole data table (you can put a splitter before the node for this job).

The values of the dependent variable are also needed; please have a look at the attached paper, which includes Tropsha’s equations (eq. 3, 4 & 5).

Since there have been several questions about the training set: the training set (ytr) is used in equations 1 and 3 (what is actually used is the mean value of the dependent variable over the training set).
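To make the role of the three vectors concrete, here is a minimal Python sketch of validation statistics of the kind usually associated with Tropsha's criteria. It is an illustration only, not the node's actual implementation: the function names, the exact formulas, and the thresholds (R² > 0.6, external q² > 0.5, (R² - R0²)/R² < 0.1, 0.85 ≤ k ≤ 1.15) are assumptions taken from one commonly quoted formulation, and the equation numbering of the attached paper is not reproduced here.

```python
from statistics import fmean


def acceptability_statistics(ypred, yexp, ytr):
    """Compute illustrative external-validation statistics.

    ypred: predicted values for the test set (port 0)
    yexp:  experimental values for the test set (port 1)
    ytr:   experimental values for the training set (port 2)
    The formulas are one common formulation (an assumption), not the node's code.
    """
    if len(ypred) != len(yexp):
        raise ValueError("ypred and yexp must have the same length")
    if not ytr:
        raise ValueError("ytr must not be empty")

    mean_exp = fmean(yexp)
    mean_pred = fmean(ypred)
    mean_tr = fmean(ytr)  # only the training-set mean is used

    # Squared Pearson correlation between experimental and predicted values.
    s_ep = sum((e - mean_exp) * (p - mean_pred) for e, p in zip(yexp, ypred))
    s_ee = sum((e - mean_exp) ** 2 for e in yexp)
    s_pp = sum((p - mean_pred) ** 2 for p in ypred)
    r2 = s_ep ** 2 / (s_ee * s_pp)

    # External q2: prediction error relative to deviation from the training mean.
    press = sum((e - p) ** 2 for e, p in zip(yexp, ypred))
    q2_ext = 1 - press / sum((e - mean_tr) ** 2 for e in yexp)

    # Slope of the regression of experimental on predicted through the origin.
    k = sum(e * p for e, p in zip(yexp, ypred)) / sum(p * p for p in ypred)

    # Coefficient of determination for that through-origin regression.
    r0_2 = 1 - sum((e - k * p) ** 2 for e, p in zip(yexp, ypred)) / s_ee

    return {"r2": r2, "q2_ext": q2_ext, "k": k, "r0_2": r0_2}


def passes_criteria(stats):
    """Apply the assumed thresholds to the statistics computed above."""
    return (
        stats["r2"] > 0.6
        and stats["q2_ext"] > 0.5
        and (stats["r2"] - stats["r0_2"]) / stats["r2"] < 0.1
        and 0.85 <= stats["k"] <= 1.15
    )
```

In this sketch ytr enters only through its mean, which is why the training vector may have a different length from the two test-set vectors.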

30_chemintellab2013.pdf

baarzo · October 14, 2020, 1:54pm

Hi,

I have problems with the input of this node. I converted only the experimental and predicted values to vectors using the “Create Bit Vector” node, but after the conversion the “Model Acceptability Criteria” node doesn’t work: it requires the data to be either all “double” or all “integer”, yet my experimental data are all “double”.

evert.homan_scilifelab.se · October 14, 2020, 2:13pm

Hi,

There is no need to generate bit vectors; you should use double values as input for all ports.

Best regards/Evert

baarzo · October 14, 2020, 2:50pm

thanks!!!

But now, why should inputs 1 and 2 have equal length?

The test set is a different set from the training set.

baarzo · October 14, 2020, 2:54pm

OK, I understand: “inputs 1 and 2” in the message are ports 0 and 1. Thanks!!

evert.homan_scilifelab.se · October 14, 2020, 3:00pm

This is a bit misleading; what is meant is that you only need the column with the dependent variable as input for the Model Acceptability Criteria node.
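As a rough illustration of what "only the dependent variable column" means, the sketch below builds the three inputs from two small tables, outside KNIME. The column names `activity` and `predicted_activity`, the helper functions, and the list-of-dicts table layout are all hypothetical and chosen only for this example.

```python
def dependent_column(rows, column):
    """Return one column of a table (a list of dicts) as a list of doubles."""
    return [float(row[column]) for row in rows]


def build_port_inputs(test_rows, training_rows,
                      observed="activity", predicted="predicted_activity"):
    """Build the three vectors; descriptor columns are simply ignored."""
    ypred = dependent_column(test_rows, predicted)   # port 0: test-set predictions
    yexp = dependent_column(test_rows, observed)     # port 1: test-set experimental values
    ytr = dependent_column(training_rows, observed)  # port 2: training-set experimental values
    return ypred, yexp, ytr


# Hypothetical example tables; "logp" stands in for any descriptor column.
test_rows = [
    {"logp": 1.2, "activity": 5.1, "predicted_activity": 5.0},
    {"logp": 2.3, "activity": 6.4, "predicted_activity": 6.1},
]
training_rows = [
    {"logp": 0.8, "activity": 4.9},
    {"logp": 1.9, "activity": 5.8},
    {"logp": 3.1, "activity": 7.0},
]

ypred, yexp, ytr = build_port_inputs(test_rows, training_rows)
```

Because ypred and yexp come from the same test-set rows, they always have the same length, while ytr comes from the training set and can be longer or shorter.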