Salt stripping

Hi there,

I have just been using the KNIME Tripos extensions and have found them a big improvement on the CDK components, particularly the SD file reader. An SDF reader should work like the one you have supplied, returning each structure together with its associated data fields.

Anyhow, my real question: is there an easy way to strip salts from the structures in an SDF file? I can filter the columns, but I would appreciate an easy way to detect salt forms and strip them. Do you have anything that would serve this function?

Best regards,

Stanage.

Hi Stanage --

The SD File Reader that is part of the Tripos Extensions package should permit you to read in data fields as well as the structure information from your SDF. In the node dialog, after first supplying the name of a valid SD file, a table should populate with the data fields discovered in your SD file. If you do not see this table being populated with data fields that you know to be in your SD file, perhaps there is a problem. I would appreciate your sending me such an example SD file if the procedure I describe above does not result in the behaviour you expect/want.

Salt stripping of structures (whether from SDF or any other format) is a non-trivial procedure. Tripos's UNITY tool, dbstripsalt, performs exactly this task. We are working on a KNIME node for dbstripsalt but it is not yet available. Other tools, such as JOELib, offer some salt stripping functionality but appear to only return the largest fragment (which might not be what you want, as sometimes the salt is larger than the desired compound.) I am unfamiliar with any existing KNIME node that can provide the functionality you seek, but I can say that the dbstripsalt node is forthcoming.

If you believe your structures with salts are sufficiently similar to one another, you might consider writing a script in the JPython node to remove the salt from the SD structures. This requires a certain comfort level with writing Python, so I am unsure whether this suggestion is helpful.

Davin
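
If the counter-ions in a data set are known in advance, one alternative to keeping the largest fragment is to drop fragments that match a fixed list. The following is only a minimal sketch of that idea in Java, working on a dot-separated SMILES string. The class name, the method name and the list of counter-ions are all hypothetical and chosen for illustration. It is not how dbstripsalt or JOELib work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KnownSaltFilter {
    // Hypothetical counter-ion list, for illustration only; extend it for real data.
    static final Set<String> KNOWN_SALTS = new HashSet<>(Arrays.asList(
            "[Na+]", "[K+]", "[Cl-]", "[Br-]", "Cl", "Br"));

    /** Drops fragments that exactly match a listed counter-ion and rejoins the rest with '.'. */
    static String stripKnownSalts(String smiles) {
        List<String> kept = new ArrayList<>();
        for (String fragment : smiles.split("\\.")) {
            if (!fragment.isEmpty() && !KNOWN_SALTS.contains(fragment)) {
                kept.add(fragment);
            }
        }
        return String.join(".", kept);
    }

    public static void main(String[] args) {
        System.out.println(stripKnownSalts("CC(=O)[O-].[Na+]")); // prints CC(=O)[O-]
        System.out.println(stripKnownSalts("Cl.CCN"));           // prints CCN
    }
}
```

The match is on the exact text of each fragment, so the same ion written in a different way is not caught, and charges on the remaining fragment are left untouched. That is part of why proper salt stripping is non-trivial.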

Hi Davin,

Thanks for the reply. I have figured out a way to do this, but it is pretty long-winded and is best described as fragment splitting. The approach uses the Tripos nodes as well as a CDK node (String to SMILES). Unfortunately I cannot upload a screenshot of the pipeline, so here is a text representation of it:

SDF file (TRIPOS SD File Reader Node)
|
COLUMN FILTER (select columns from SDF file for use)
|
2D PROPERTIES (Tripos Node) - add column to calculate the number of fragments
|
ROW FILTER (KNIME node) - split rows depending on the number of fragments detected
| |
| |
| |
| SDF WRITER (Tripos) - all rows that show one fragment, i.e. no salt
|
OPENBABEL (TRIPOS node) - convert SDF structure to SMILES str
|
JAVA SNIPPET (KNIME Node) - look for SMILES Str delimiter ('.'), copy largest fragment
|
STRING TO SMILES (CDK Node) - convert normalised SMILES str from String to SMILES
|
SD WRITER (Tripos) - write out a new SDF file

The Java code is pretty simple once you get the hang of it. See below and copy and paste it into the Java Snippet node; this is a bit of code that Berhard had posted elsewhere on the forum.

// text before the first '.' in the SMILES string; stays empty if there is no '.'
String result = "";

int firstOccurence = $Converted molecules (smi)$.indexOf('.');

if (firstOccurence >= 0) {
    result = $Converted molecules (smi)$.substring(0, firstOccurence);
}
result

In the snippet node, select the following:
Append column (select)
Return type - String

The CDK node is only used to convert the column type to SMILES, so that the other Tripos components can be used to view the structures. As I said, it is a bit long-winded, but once built it can be made more flexible.

Stanage

Hi Stanage --

Cool. Sounds like a good solution and I especially like the use of the Java Snippet node.

Is it the behaviour of the OpenBabel node (not actually from Tripos, but from KNIME authors and openbabel.org) to always put the largest fragment first in the SMILES string? Looks like in your situation the largest fragment IS indeed what you want.

Davin
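
Whether OpenBabel orders fragments by size is not settled in this thread. What can be shown with plain string handling is why the order matters. The sketch below compares the first-fragment logic of the snippet above with a longest-fragment choice. The class and method names are hypothetical and not part of any KNIME node.

```java
class FirstVersusLongest {
    /** Mirrors the snippet above: text before the first '.', or "" if there is none. */
    static String firstFragment(String smiles) {
        int firstOccurrence = smiles.indexOf('.');
        return firstOccurrence >= 0 ? smiles.substring(0, firstOccurrence) : "";
    }

    /** Returns the fragment with the most characters; ties keep the earliest one. */
    static String longestFragment(String smiles) {
        String[] fragments = smiles.split("\\.");
        String longest = fragments[0];
        for (String fragment : fragments) {
            if (fragment.length() > longest.length()) {
                longest = fragment;
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        String saltFirst = "[Na+].CC(=O)[O-]";
        System.out.println(firstFragment(saltFirst));   // prints [Na+]
        System.out.println(longestFragment(saltFirst)); // prints CC(=O)[O-]
    }
}
```

If the salt happens to be written first, the first-fragment approach keeps the salt, so it is only safe when the fragment order is known.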

Hi Davin,

I will check that, but it should be easy to take the SMILES string and look for the largest fragment. If I have time, I will have a go and post the results.

Best regards,

stanage

Hi Davin,

As promised, here is a bit of Java that finds either the largest or the smallest detected fragment and returns the corresponding SMILES string. I am not a Java programmer, just an ex-bench scientist, so if anyone can do better, please feel free; just post the results for all to share.

Best regards,

stanage.

// column containing the SMILES string
String smiles = $SDF structure (translated)$;

// split the SMILES string into fragments on the '.' delimiter
// (the dot must be escaped because split() takes a regular expression)
String[] myArr = smiles.split("\\.");

// index of the chosen fragment - start with array element 0
int longest = 0;

// loop through the remaining array elements and compare element sizes
for (int i = 1; i < myArr.length; i++)
{
    // use > to return the largest fragment, < to return the smallest
    if (myArr[i].length() > myArr[longest].length())
    {
        longest = i;
    }
}

String result = myArr[longest];

result
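
One caveat with comparing string lengths: the length of a SMILES fragment is only a proxy for its size, because brackets, charges and ring digits add characters without adding atoms. A rough alternative is to compare approximate heavy-atom counts. This is only a sketch under simple assumptions: each bracket atom counts once, and outside brackets each upper-case element symbol or aromatic lower-case atom counts once. The class and method names are hypothetical, and a real toolkit should be preferred where one is available.

```java
class FragmentBySize {
    /** Rough atom count: bracket atoms count once; outside brackets, upper-case
        element symbols and aromatic lower-case atoms (b, c, n, o, p, s) count once. */
    static int atomCount(String fragment) {
        int count = 0;
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (c == '[') {
                int close = fragment.indexOf(']', i);
                if (close < 0) {
                    break; // malformed bracket atom: stop counting
                }
                count++;
                i = close; // skip the contents of the bracket atom
            } else if (Character.isUpperCase(c) || "bcnops".indexOf(c) >= 0) {
                count++; // the 'l' of Cl and the 'r' of Br are not counted again
            }
        }
        return count;
    }

    /** Returns the fragment with the highest rough atom count; ties keep the earliest. */
    static String largestFragment(String smiles) {
        String[] fragments = smiles.split("\\.");
        int best = 0;
        for (int i = 1; i < fragments.length; i++) {
            if (atomCount(fragments[i]) > atomCount(fragments[best])) {
                best = i;
            }
        }
        return fragments[best];
    }

    public static void main(String[] args) {
        // "[Na+]" is the longer string (5 characters) but "CCCC" has more atoms.
        System.out.println(largestFragment("[Na+].CCCC")); // prints CCCC
    }
}
```

For "[Na+].CCCC", a comparison by string length would pick the sodium ion, while the atom count picks the butane fragment.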