Parsing PDB file info

evert.homan_scilifelab.se · April 3, 2020, 1:22pm

Hi,

I want to manipulate a large number of PDB model files (PDB format) but cannot parse any information from them other than the coordinates. I have tried several PDB readers (e.g. Vernalis Load Local PDB Files --> PDB Property Extractor), but nothing is parsed. Presumably the files do not fully adhere to the PDB format.

I have attached an image of what the cell content looks like. I would like to parse the REMARK 1 lines.

Thanks/Evert

s.roughley · April 3, 2020, 2:41pm

The space between ‘REMARK’ and ‘1’ looks bigger than I might expect off the top of my head. Could you copy one of the cells and paste into a text editor and let me know how many spaces there are?

I’ve got a very poor internet connection at the moment and so can’t check the format specification or source code, but if you can reply with the answer to the above then I will check once I have my connection back.

Steve

evert.homan_scilifelab.se · April 3, 2020, 6:15pm

REMARK    1 MODEL FOR 5ht1a_human

Looks like there are 4 spaces…

Best/Evert

s.roughley · April 4, 2020, 10:12am

This is from the format specification (http://www.wwpdb.org/documentation/file-format), pg47:

and particularly for REMARK 1 (pg 49):

So there should be only 3 spaces. This can be seen in, e.g., PDB ID 2wi7:

REMARK   2                                                                      
REMARK   2 RESOLUTION.    2.50 ANGSTROMS.                                       
REMARK   3                                                                      
REMARK   3 REFINEMENT.                                                          
REMARK   3   PROGRAM     : REFMAC 5.5.0066                                      
REMARK   3   AUTHORS     : MURSHUDOV,VAGIN,DODSON                               
REMARK   3                                                                      
REMARK   3    REFINEMENT TARGET : MAXIMUM LIKELIHOOD

I guess the simplest fix is to ‘tweak’ the PDB column you have in a Java Snippet so that it complies with the format specification, as shown in the attached example, PDB REMARK 1.knwf (18.4 KB).
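For readers without the attachment, here is a minimal sketch of what such a tweak could look like in plain Java. It is not the content of the attached workflow. It assumes the REMARK layout from the wwPDB format specification (record name in columns 1-6, remark number right-justified in columns 8-10, text from column 12), and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: re-aligns REMARK records to the fixed-column layout.
class RemarkSpacingFix {
    // "REMARK", any run of spaces/tabs, a 1-3 digit remark number,
    // then optionally one separator character and the rest of the line.
    private static final Pattern REMARK =
            Pattern.compile("^REMARK[ \\t]+(\\d{1,3})(?:[ \\t](.*))?$");

    /** Rewrites a single line; lines that are not REMARK records are returned unchanged. */
    static String fixLine(String line) {
        Matcher m = REMARK.matcher(line);
        if (!m.matches()) {
            return line;
        }
        int num = Integer.parseInt(m.group(1));
        // Column 7 blank, then the number right-justified in columns 8-10.
        String head = String.format(Locale.ROOT, "REMARK %3d", num);
        // Column 11 blank, text from column 12 (kept exactly as it was).
        return m.group(2) == null ? head : head + " " + m.group(2);
    }

    /** Applies fixLine to every line of a PDB cell and rejoins with newlines. */
    static String fixAll(String pdb) {
        String[] lines = pdb.split("\\r?\\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(fixLine(lines[i]));
        }
        return out.toString();
    }
}
```

In a Java Snippet node, the two methods could be pasted into the snippet body and called on the incoming PDB string column. Lines that already comply, such as the 2wi7 records above, pass through unchanged.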

One caveat about these nodes is that they take REMARKs 1-3 and extract them as a single line of text, although some of the other outputs are created by a slightly ‘smarter’ handling of those REMARK fields. This is because historically these fields were sometimes used internally for descriptive notes rather than as defined in the file format. If you think an option in each case to retain line-breaks would be useful, then do let me know.
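To illustrate the line-break point, here is a small Java sketch that reads the text of one REMARK number by fixed columns and returns one entry per line, so the caller can join with a newline to keep the breaks or with a space to get a single line. The names are hypothetical and this is not the Vernalis implementation. It only assumes the column layout from the format specification (number in columns 8-10, text from column 12).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the text of all REMARK records with a given number.
class RemarkTextExtractor {
    static List<String> remarkText(String pdb, int remarkNum) {
        List<String> out = new ArrayList<>();
        String wanted = Integer.toString(remarkNum);
        for (String line : pdb.split("\\r?\\n")) {
            if (line.length() < 10 || !line.startsWith("REMARK")) {
                continue;
            }
            // Columns 8-10 (0-based indices 7..9) hold the remark number.
            String numField = line.substring(7, 10).trim();
            if (!numField.equals(wanted)) {
                continue;
            }
            // Text starts at column 12 (index 11); drop trailing padding only,
            // so indentation inside the remark is preserved.
            String text = line.length() > 11 ? line.substring(11) : "";
            text = text.replaceAll("\\s+$", "");
            if (!text.isEmpty()) {
                out.add(text);
            }
        }
        return out;
    }
}
```

`String.join("\n", RemarkTextExtractor.remarkText(pdb, 1))` would retain the line breaks, while joining with `" "` gives a single line. A reader written this way finds nothing in a line with four spaces after REMARK, because the number then sits in column 11. Whether the PDB nodes parse by fixed columns in the same way is an assumption here, not something confirmed above.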

Steve

evert.homan_scilifelab.se · April 4, 2020, 2:54pm

Brilliant Steve, many thanks, very educational in many aspects.

Happy Easter/Evert
