Hi there,
I have found a bug in the TRIPOS SD writer.
If you try to export an SDF file containing any string longer than 80 characters, a carriage return/line feed is inserted into the string. For example, a SMILES string longer than 80 characters has a line break inserted after every 80 characters, which damages the SMILES representation.
Best regards,
Stanage.
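To make the report above concrete, here is a minimal Python sketch of the kind of folding described. It is only an illustration of the symptom, not the actual writer code; the `fold` helper and the 100-character SMILES string are made up for the example.

```python
def fold(text, width=80):
    """Break text into chunks of at most `width` characters, joined by newlines.

    Hypothetical helper that mimics the reported behaviour.
    """
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))


# Hypothetical 100-character SMILES string (a chain of 100 carbons).
smiles = "C" * 100

folded = fold(smiles)
lines = folded.split("\n")

# The single SMILES string now spans two lines, so a reader that takes
# the first line as the whole value sees a truncated structure.
print(len(lines), [len(line) for line in lines])
```

The original string is only recovered if the reader joins the lines back together with nothing in between.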
Lines in a molfile cannot extend beyond 80 characters. So says the spec from MDL, in the statement "Molecule name. This line is unformatted, but like all other lines in a molfile may not extend beyond column 80." The reader should concatenate lines together to have a chance of this working portably. And don't even think about how to handle long lines that contain the string "> <" and fold over at just the wrong point.
Quote:
And don't even think about how to handle long lines that contain the string "> <" and fold over at just the wrong point.
Actually this shouldn't be an issue for data, since the data field is terminated with a blank line. Only then can you start a new data header (starting with the >).
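A rough Python sketch of a reader that follows this rule is below. It treats a line starting with ">" as a header only when it is not already inside a data item, ends the item at the first empty line, and joins the value lines with nothing in between. The function name `read_data_items` and the sample record are hypothetical, and joining the lines is an assumption that the value was a single folded string; a genuinely multi-line value would lose its line breaks.

```python
def read_data_items(lines):
    """Collect SD data items from an iterable of lines (without newlines).

    A header line starts with '>'; the value runs until the next empty line.
    Value lines are concatenated, assuming they are one folded string.
    """
    items = {}
    name = None
    parts = []
    for line in lines:
        if name is None:
            if line.startswith(">"):
                # The field name sits between '<' and the last '>'.
                start = line.find("<")
                end = line.rfind(">")
                if 0 <= start < end:
                    name = line[start + 1:end]
                else:
                    name = line[1:].strip()
                parts = []
        elif line == "":
            # An empty line terminates the current data item.
            items[name] = "".join(parts)
            name = None
        else:
            # Inside a data item, even a line starting with '>' is data.
            parts.append(line)
    if name is not None:
        items[name] = "".join(parts)
    return items


# Hypothetical record: a folded SMILES and a value whose fold starts with "> <".
record = [
    "> <SMILES>",
    "C" * 80,
    "C" * 20,
    "",
    "> <NOTE>",
    "x" * 80,
    "> <not a header>",
    "",
    "$$$$",
]
items = read_data_items(record)
```

Here the "> <not a header>" line stays part of the NOTE value because no blank line came before it.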
jdurant wrote: Actually this shouldn't be an issue for data, since the data field is terminated with a blank line. Only then can you start a new data header (starting with the >).
Yes, you're right. Hmm, okay, here's a less likely case. 80 spaces or so, followed by "> <". If the fold is just right, that will put a line full of spaces by itself, then the "> <", which might get confused with a new property definition.
While a line of spaces is not a blank line, some software will be confused by it. I see that OpenBabel does "Trim(line);" before it tests to see if the line is empty.
BTW, yes, this is esoteric. :) The main point is that, for portability, lines should be no more than 80 characters long.
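For anyone who wants to see the whitespace case in action, here is a small Python sketch. It is not OpenBabel's code, just a toy header counter with a `trim` switch that decides whether a line is stripped before the blank-line test. The function `count_headers` and the sample value are hypothetical.

```python
def count_headers(lines, trim):
    """Count lines treated as data headers.

    If `trim` is true, a line is stripped before testing whether it is
    blank, so a line made only of spaces ends the current data item.
    """
    headers = 0
    in_data = False
    for line in lines:
        blank = (line.strip() == "") if trim else (line == "")
        if not in_data:
            if line.startswith(">"):
                headers += 1
                in_data = True
        elif blank:
            in_data = False
    return headers


# Hypothetical field value: 80 spaces followed by "> <oops>".
value = " " * 80 + "> <oops>"

# Fold at 80 columns, as the writer in this thread is reported to do.
folded = [value[i:i + 80] for i in range(0, len(value), 80)]
record = ["> <COMMENT>"] + folded + [""]

strict = count_headers(record, trim=False)
trimmed = count_headers(record, trim=True)
print(strict, trimmed)
```

With the strict test the record has one header. With trimming, the line of spaces ends the COMMENT item early and "> <oops>" is counted as a second header.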