Generating Predictive Models with Fingerprint Data


I have used the predictive model facilities (Weka and Mining sections) several times to build models correlating PhysChem data of molecules, such as Mw, LogP, etc., with properties like Solubility.

However, I would like to use these facilities to build structural QSAR models, i.e. taking the fingerprints of molecules (using either the RDKit or Indigo nodes) and then trying to correlate these fingerprints with things like Metabolism or Activity. The trouble is that the Weka nodes do not seem to work when a Fingerprint column is present (is this a bug?). Also, I haven't found a good model in the Mining section that works with Fingerprint data.

Has anyone else had success building models using Fingerprint data? If not, are there any plans to build specific machine learning algorithms that work well with structural/fingerprint data?



I hope you are not using the FPs as is, and are instead converting them to 0s and 1s.

Thanks. I was just trying to use them as a Fingerprint column; I now have them as 0s and 1s in individual columns.
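
For anyone who wants to do the same conversion outside the workflow, here is a minimal Python sketch of the idea: each fingerprint is split so that every bit becomes its own 0/1 column. It assumes the fingerprints are already available as equal-length bit strings; the function name `expand_fingerprints`, the `bit` column prefix, and the toy 8-bit fingerprints are made up for illustration and are not part of any node or library.

```python
def expand_fingerprints(bitstrings, prefix="bit"):
    """Split equal-length fingerprint bit strings into one 0/1 column per bit.

    Returns (header, rows): header is a list of column names,
    rows is a list of lists of ints (0 or 1), one row per molecule.
    """
    if not bitstrings:
        return [], []

    n_bits = len(bitstrings[0])
    header = [f"{prefix}{i}" for i in range(n_bits)]

    rows = []
    for fp in bitstrings:
        # Learners expect a fixed number of columns, so reject ragged input.
        if len(fp) != n_bits:
            raise ValueError("all fingerprints must have the same length")
        # Only the characters '0' and '1' are valid in a bit string.
        if set(fp) - {"0", "1"}:
            raise ValueError(f"not a bit string: {fp!r}")
        rows.append([int(ch) for ch in fp])

    return header, rows


# Toy 8-bit fingerprints (hypothetical values, not real molecules).
fingerprints = ["10110000", "00110101", "11000010"]
header, rows = expand_fingerprints(fingerprints)

for name, row in zip(["mol1", "mol2", "mol3"], rows):
    print(name, row)
```

The resulting table (one row per molecule, one numeric column per bit) is the kind of input that learners expecting ordinary numeric or nominal columns can usually handle, whereas a single packed Fingerprint column may not be.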