Do any of the (free) cheminformatics nodes help in categorizing molecules as acids, bases or neutrals?
I suspect not, since pKa most likely needs to be calculated, and such a node isn't available in the free packages (or is it?). Commercially, ChemAxon/Infocom probably could?
What about salts? RDKit has a salt filter node; is it somehow possible to simply flag a compound as a salt without removing it, or alternatively to move it into another table?
One option, if the molecule is in a SMILES format, might be to use a Java snippet to look for a "." character in the SMILES. It is a bit crude, but it should identify multi-component SMILES, e.g., in a Java Snippet (Simple):
return ($SMILES$.indexOf(".")>=0) ? "Salt" : "Not Salt";
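For anyone who wants to test the same idea outside KNIME, here is a minimal standalone sketch. The class and method names (`SaltFlagger`, `componentCount`, `flag`) are made up for illustration and are not part of KNIME or RDKit. It assumes that any SMILES with more than one dot-separated component counts as a "salt", which will also catch solvates and mixtures, and it treats empty input as "Unknown".

```java
// Hypothetical helper for illustration only; not a KNIME or RDKit API.
final class SaltFlagger {
    private SaltFlagger() {}

    /** Counts the non-empty, dot-separated components of a SMILES string. */
    static int componentCount(String smiles) {
        if (smiles == null || smiles.trim().isEmpty()) {
            return 0; // nothing to inspect
        }
        int count = 0;
        for (String part : smiles.trim().split("\\.")) {
            if (!part.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Flags multi-component SMILES as "Salt" (crude: also hits solvates and mixtures). */
    static String flag(String smiles) {
        int n = componentCount(smiles);
        if (n == 0) {
            return "Unknown"; // assumption: missing input is not classified
        }
        return n > 1 ? "Salt" : "Not Salt";
    }
}
```

Calling `SaltFlagger.flag("[Na+].[Cl-]")` gives "Salt" and `SaltFlagger.flag("CCO")` gives "Not Salt". Because it only flags and never removes anything, the result can feed a splitter node to send salts to a second table.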
Thanks, that is helpful with regard to salts.
I've done this. You can calculate acidic and basic group counts using the PaDEL descriptor node from the National University of Singapore (NUS), and then categorise structures as acid, base, neutral or zwitterion. The node is free to use.
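A rough sketch of that categorisation step, assuming you already have the two counts as integers. The class name `IonClass` and the rule itself are my own assumptions, not something PaDEL provides, and the output column names (possibly something like nAcid and nBase) should be checked in your own table. Note that a molecule with both acidic and basic groups is labelled "zwitterion" here purely from the counts; whether it really exists as a zwitterion depends on the pKa values.

```java
// Hypothetical rule for illustration; PaDEL itself only supplies the counts.
final class IonClass {
    private IonClass() {}

    /**
     * Maps acidic and basic group counts to a coarse category.
     * Both counts must be non-negative.
     */
    static String classify(int acidCount, int baseCount) {
        if (acidCount < 0 || baseCount < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        if (acidCount > 0 && baseCount > 0) {
            return "zwitterion"; // assumption: both group types present
        }
        if (acidCount > 0) {
            return "acid";
        }
        if (baseCount > 0) {
            return "base";
        }
        return "neutral";
    }
}
```

The same four-way rule can be written as a nested ternary in a Java Snippet (Simple) node or as a Rule Engine node on the two count columns.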
One problem to be aware of, though, is that the 'Molecule to CDK' node they supply is (or was?) old, and newer equivalent nodes do not work directly with PaDEL. When I passed the molecules through a more up-to-date Molecule to CDK node, I then had to save them as SDF in a temporary directory and reread them before putting them through PaDEL.
Another danger with the PaDEL package is that its own 'Convert to CDK' node sometimes seems to get replaced by the newer one from CDK proper.
The folks at NUS were asked to rename the node to 'Convert to PaDEL' to avoid confusion and namespace problems, but they do not seem to be very active in resolving this problem, which was introduced in 2012 (!).
Thanks, I will have a look. On the first try, KNIME crashed when calculating a larger dataset, with all 8 cores of an i7 running at 100%...
Maybe you can avoid writing temporary files by keeping your original SMILES column: write the "old" CDK structure, calculate, remove the CDK column and then, if still needed, use the "new" CDK?