Matched Pairs Detector


I am having a problem using this node. Could someone attach a sample workflow, just so I can see that it works? I have tried with an SDF containing a structure (from RDKit), an ID and an activity, but it fails with:

Execute failed: ("ArrayIndexOutOfBoundsException"): -1


Check that the ID column is nominal/string, not integer, and that the activity column has double values. The activity column may also contain an undesirable or null cell, which could be the cause of the error.

If the number of rows is large, try reducing it to, say, only 10 rows.

Hi there,

The ID should be a string, the molecule should be in RDKit format, and the activity should be of type double and must not contain missing values.
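As a rough illustration of those requirements, the same checks can be sketched outside KNIME in plain Python. The column names `ID` and `Activity`, the list-of-dicts table and the helper `find_bad_rows` are all assumptions made up for this example; this is not what the node does internally.

```python
import math


def find_bad_rows(rows, id_col="ID", activity_col="Activity"):
    """Return (row_index, reason) pairs for rows that break the input rules."""
    problems = []
    for i, row in enumerate(rows):
        identifier = row.get(id_col)
        activity = row.get(activity_col)
        # The ID must be a string, not an integer.
        if not isinstance(identifier, str):
            problems.append((i, "ID is not a string"))
        # The activity must be a real double: no missing cells, no NaN.
        if activity is None:
            problems.append((i, "activity is missing"))
        elif not isinstance(activity, float):
            problems.append((i, "activity is not a double"))
        elif math.isnan(activity):
            problems.append((i, "activity is NaN"))
    return problems


# Hypothetical example table
rows = [
    {"ID": "mol-1", "Activity": 6.2},
    {"ID": 2, "Activity": 5.1},         # integer ID
    {"ID": "mol-3", "Activity": None},  # missing activity
]
for index, reason in find_bad_rows(rows):
    print(index, reason)
```

In KNIME itself the equivalent would be to inspect the column types in the table view and filter out rows with missing values before the node.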

I don't think it's a matter of table size, as the node has been used with > 100,000 molecules.

I hope this helps.



George P.

When does this node return the MolSanitizer error?

I can only speculate that this happens when RDKit, which is used as the chemical expert system in this node, struggles to parse one of the input molecules. One way to find the problematic structure is to put a chunking loop around the failing node. Of course, for matched pairs you should process all molecules at once, but this way you could at least identify the problematic structures.
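The chunking idea can be sketched in plain Python to show the logic. Here `find_failing_rows` and `toy_process` are hypothetical names: `toy_process` is only a stand-in for whatever step actually fails (in the workflow, the node inside the loop), and it simply rejects empty strings.

```python
def find_failing_rows(rows, process, chunk_size=100):
    """Run `process` on chunks of `rows`; return indices of rows that make it fail."""
    failing = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        try:
            process(chunk)
        except Exception:
            # The chunk contains at least one bad row: retry one row at a time.
            for offset, row in enumerate(chunk):
                try:
                    process([row])
                except Exception:
                    failing.append(start + offset)
    return failing


def toy_process(chunk):
    # Hypothetical stand-in for the real parsing step: reject empty structures.
    for smiles in chunk:
        if not smiles:
            raise ValueError("cannot parse structure")


structures = ["CCO", "c1ccccc1", "", "CC(=O)O", "CCN", ""]
print(find_failing_rows(structures, toy_process, chunk_size=2))
```

Note that the row-by-row retry only catches rows that fail on their own. A failure that depends on a combination of rows would show up as a failing chunk with no single failing row.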

You could try canonicalising the structures first (RDKit Canon Smiles), then converting back to RDKit (Molecule to RDKit), and then running the Matched Pairs node again. That way the SMILES will all be in one consistent format, which may remove the error.


Thanks guys, will try.

In addition to what the others said, make sure that you remove the salts and standardise the structures before you submit them to the RDKit Molecule node.