org.biojava.bio.structure.io
Class PDBFileParser

java.lang.Object
  extended by org.biojava.bio.structure.io.PDBFileParser

public class PDBFileParser
extends java.lang.Object

This class implements the actual PDB file parsing. Do not access it directly, but via the PDBFileReader class.

Parsing

During PDB file parsing, several flags can be set (see setParseCAOnly, setAlignSeqRes and setParseSecStruc in the Method Summary).

To prevent excessive memory usage for large PDB files, there is the ATOM_CA_THRESHOLD. If more atoms than this threshold are parsed from a PDB file, the parser automatically switches to a C-alpha only representation.
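The threshold behaviour can be illustrated with a small standalone sketch. It does not use BioJava and is not the parser's actual implementation: the class CaThresholdSketch, the helper atomLine and the method collectAtomNames are hypothetical names, and the threshold is passed in as a parameter rather than read from ATOM_CA_THRESHOLD. The sketch assumes that, once the threshold is exceeded, atoms collected so far are reduced to C-alpha atoms and only C-alpha atoms are kept afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CaThresholdSketch {

    /** Hypothetical helper: builds the first columns of a PDB ATOM record. */
    public static String atomLine(int serial, String atomName, String resName) {
        // Columns 1-6 record name, 7-11 serial, 13-16 atom name, 18-20 residue name
        return String.format(Locale.ROOT, "ATOM  %5d %-4s %3s", serial, atomName, resName);
    }

    /**
     * Collects atom names from ATOM records. Once more than `threshold`
     * atoms have been seen, only C-alpha ("CA") atoms are kept.
     */
    public static List<String> collectAtomNames(List<String> pdbLines, int threshold) {
        List<String> kept = new ArrayList<>();
        boolean caOnly = false;
        int seen = 0;
        for (String line : pdbLines) {
            if (line.length() < 16 || !line.startsWith("ATOM")) {
                continue; // not an ATOM record, or too short to hold an atom name
            }
            String atomName = line.substring(12, 16).trim();
            seen++;
            if (!caOnly && seen > threshold) {
                // Assumed behaviour: switch mode and discard non-CA atoms read so far
                caOnly = true;
                kept.removeIf(name -> !name.equals("CA"));
            }
            if (!caOnly || atomName.equals("CA")) {
                kept.add(atomName);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        String[] names = {" N  ", " CA ", " C  ", " O  ", " N  ", " CA "};
        for (int i = 0; i < names.length; i++) {
            lines.add(atomLine(i + 1, names[i], "ALA"));
        }
        System.out.println(collectAtomNames(lines, 10)); // threshold not reached
        System.out.println(collectAtomNames(lines, 3));  // threshold exceeded
    }
}
```

With the real parser the same effect is requested explicitly through setParseCAOnly(true), which avoids reading the full atom set in the first place.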

The result of the parsing of the PDB file is a new Structure object.

For more documentation on how to work with the Structure API please see http://biojava.org/wiki/BioJava:CookBook#Protein_Structure

Example

Q: How can I get a Structure object from a PDB file?

A:

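 import java.io.IOException;
 import org.biojava.bio.structure.Structure;
 import org.biojava.bio.structure.io.PDBFileReader;

 public Structure loadStructure(String pathToPDBFile) {
     // The PDBFileParser is wrapped by the PDBFileReader
     PDBFileReader pdbreader = new PDBFileReader();

     Structure structure = null;
     try {
         structure = pdbreader.getStructure(pathToPDBFile);
         System.out.println(structure);
     } catch (IOException e) {
         // Report the failure; the method then returns null
         e.printStackTrace();
     }
     return structure;
 }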
 

Since:
1.4
Author:
Andreas Prlic, Jules Jacobsen

Field Summary
static int ATOM_CA_THRESHOLD
          The maximum number of atoms that will be parsed before the parser switches to a CA-only representation of the PDB file.
static java.lang.String HELIX
          Helix secondary structure assignment.
static int MAX_ATOMS
          The maximum number of atoms that will be added to a structure; this protects against memory overflows in the few really big protein structures.
 boolean parseCAOnly
          Set the flag to read in only C-alpha atoms. This is useful for parsing large structures like 1htq.
static java.lang.String PDB_AUTHOR_ASSIGNMENT
          Secondary structure assigned by the PDB author.
static java.lang.String STRAND
          Strand secondary structure assignment.
static java.lang.String TURN
          Turn secondary structure assignment.
 
Constructor Summary
PDBFileParser()
           
 
Method Summary
protected  java.lang.String getTimeStamp()
          Returns a time stamp.
 boolean isAlignSeqRes()
          Flag if the SEQRES amino acids should be aligned with the ATOM amino acids.
 boolean isParseCAOnly()
          Returns the flag indicating whether only the C-alpha atoms of the structure should be parsed.
 boolean isParseSecStruc()
          Is the secondary structure assignment being parsed from the file? The default is false.
 void linkChains2Compound(Structure s)
          After the parsing of a PDB file the Chain and Compound objects need to be linked to each other.
 Structure parsePDBFile(java.io.BufferedReader buf)
          Parses a PDB file and returns a data structure implementing the Structure interface.
 Structure parsePDBFile(java.io.InputStream inStream)
          Parses a PDB file and returns a data structure implementing the Structure interface.
 void setAlignSeqRes(boolean alignSeqRes)
          Defines whether the SEQRES in the structure should be aligned with the ATOM records. If yes, the AminoAcids in structure.getSeqRes will have the coordinates set.
 void setParseCAOnly(boolean parseCAOnly)
          Sets the flag indicating whether only the C-alpha atoms of the structure should be parsed.
 void setParseSecStruc(boolean parseSecStruc)
          A flag telling the parser to parse the author's secondary structure assignment from the file. The default is false, i.e. do NOT parse.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PDB_AUTHOR_ASSIGNMENT

public static final java.lang.String PDB_AUTHOR_ASSIGNMENT
Secondary structure assigned by the PDB author.

See Also:
Constant Field Values

HELIX

public static final java.lang.String HELIX
Helix secondary structure assignment.

See Also:
Constant Field Values

STRAND

public static final java.lang.String STRAND
Strand secondary structure assignment.

See Also:
Constant Field Values

TURN

public static final java.lang.String TURN
Turn secondary structure assignment.

See Also:
Constant Field Values

ATOM_CA_THRESHOLD

public static final int ATOM_CA_THRESHOLD
The maximum number of atoms that will be parsed before the parser switches to a CA-only representation of the PDB file. If this limit is exceeded, the SEQRES groups will also be ignored.

See Also:
Constant Field Values

MAX_ATOMS

public static final int MAX_ATOMS
The maximum number of atoms that will be added to a structure; this protects against memory overflows in the few really big protein structures.

See Also:
Constant Field Values

parseCAOnly

public boolean parseCAOnly
Set the flag to read in only C-alpha atoms. This is useful for parsing large structures like 1htq.

Constructor Detail

PDBFileParser

public PDBFileParser()
Method Detail

isParseCAOnly

public boolean isParseCAOnly()
Returns the flag indicating whether only the C-alpha atoms of the structure should be parsed.

Returns:
the flag

setParseCAOnly

public void setParseCAOnly(boolean parseCAOnly)
Sets the flag indicating whether only the C-alpha atoms of the structure should be parsed.

Parameters:
parseCAOnly - boolean flag to enable or disable C-alpha only parsing

isAlignSeqRes

public boolean isAlignSeqRes()
Flag if the SEQRES amino acids should be aligned with the ATOM amino acids.

Returns:
true if the SEQRES-ATOM amino acid alignment is enabled

setAlignSeqRes

public void setAlignSeqRes(boolean alignSeqRes)
Defines whether the SEQRES in the structure should be aligned with the ATOM records. If yes, the AminoAcids in structure.getSeqRes will have the coordinates set.

Parameters:
alignSeqRes - true if the SEQRES amino acids should be aligned with the ATOM records
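To make the idea of a SEQRES-ATOM alignment concrete, here is a simplified standalone sketch. It is not the algorithm BioJava uses: it assumes residues are compared by their three-letter names only and uses a plain longest-common-subsequence alignment. The class SeqResAlignSketch and the method alignSeqResToAtom are hypothetical names. The result tells, for every SEQRES position, which ATOM residue (if any) could supply its coordinates.

```java
import java.util.Arrays;
import java.util.List;

public class SeqResAlignSketch {

    /**
     * Maps each SEQRES residue to the index of the matching ATOM residue,
     * or -1 if the residue has no counterpart in the ATOM records
     * (for example because it was not observed in the experiment).
     */
    public static int[] alignSeqResToAtom(List<String> seqRes, List<String> atomRes) {
        int n = seqRes.size();
        int m = atomRes.size();

        // lcs[i][j] = length of the longest common subsequence of
        // seqRes[i..] and atomRes[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (seqRes.get(i).equals(atomRes.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }

        // Walk the table from the start to recover one optimal pairing
        int[] mapping = new int[n];
        Arrays.fill(mapping, -1);
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (seqRes.get(i).equals(atomRes.get(j))) {
                mapping[i] = j;
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                i++; // SEQRES residue without coordinates
            } else {
                j++; // ATOM residue not listed in SEQRES
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> seqRes = Arrays.asList("MET", "ALA", "GLY", "SER", "ALA");
        List<String> atomRes = Arrays.asList("ALA", "GLY", "ALA");
        System.out.println(Arrays.toString(alignSeqResToAtom(seqRes, atomRes)));
    }
}
```

In this picture, SEQRES positions mapped to -1 correspond to amino acids that would remain without coordinates after the alignment.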

isParseSecStruc

public boolean isParseSecStruc()
Is the secondary structure assignment being parsed from the file? The default is false.

Returns:
true if the HELIX, STRAND and TURN fields are being parsed

setParseSecStruc

public void setParseSecStruc(boolean parseSecStruc)
A flag telling the parser to parse the author's secondary structure assignment from the file. The default is false, i.e. do NOT parse.

Parameters:
parseSecStruc - true if the HELIX, STRAND and TURN fields should be parsed
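As an illustration of what an author-assigned secondary structure entry looks like, the following standalone sketch reads the residue range from a single HELIX record using the fixed column layout of the PDB format. It is not BioJava code: the class HelixRecordSketch, the value holder HelixRange and the methods helixLine and parseHelix are hypothetical names introduced for this example.

```java
import java.util.Locale;

public class HelixRecordSketch {

    /** Hypothetical value holder for the fields read from one HELIX record. */
    public static class HelixRange {
        public final String helixId;
        public final char chainId;
        public final int startSeqNum;
        public final int endSeqNum;

        public HelixRange(String helixId, char chainId, int startSeqNum, int endSeqNum) {
            this.helixId = helixId;
            this.chainId = chainId;
            this.startSeqNum = startSeqNum;
            this.endSeqNum = endSeqNum;
        }
    }

    /** Hypothetical helper: formats the first 40 columns of a HELIX record. */
    public static String helixLine(int serial, String helixId,
                                   String initRes, char initChain, int initSeq,
                                   String endRes, char endChain, int endSeq,
                                   int helixClass) {
        return String.format(Locale.ROOT, "HELIX  %3d %3s %3s %c %4d  %3s %c %4d %2d",
                serial, helixId, initRes, initChain, initSeq,
                endRes, endChain, endSeq, helixClass);
    }

    /** Reads helix ID, chain and residue range; returns null for other records. */
    public static HelixRange parseHelix(String line) {
        if (line.length() < 37 || !line.startsWith("HELIX")) {
            return null;
        }
        String helixId = line.substring(11, 14).trim();            // columns 12-14
        char chainId = line.charAt(19);                            // column 20
        int start = Integer.parseInt(line.substring(21, 25).trim()); // columns 22-25
        int end = Integer.parseInt(line.substring(33, 37).trim());   // columns 34-37
        return new HelixRange(helixId, chainId, start, end);
    }

    public static void main(String[] args) {
        String line = helixLine(1, "HA", "GLY", 'A', 86, "GLY", 'A', 94, 1);
        HelixRange helix = parseHelix(line);
        System.out.println(helix.helixId + " " + helix.chainId + " "
                + helix.startSeqNum + "-" + helix.endSeqNum);
    }
}
```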

getTimeStamp

protected java.lang.String getTimeStamp()
Returns a time stamp.

Returns:
a String representing the time stamp value

parsePDBFile

public Structure parsePDBFile(java.io.InputStream inStream)
                       throws java.io.IOException
Parses a PDB file and returns a data structure implementing the Structure interface.

Parameters:
inStream - an InputStream object
Returns:
a Structure object
Throws:
java.io.IOException

parsePDBFile

public Structure parsePDBFile(java.io.BufferedReader buf)
                       throws java.io.IOException
Parses a PDB file and returns a data structure implementing the Structure interface.

Parameters:
buf - a BufferedReader object
Returns:
the Structure object
Throws:
java.io.IOException - ...

linkChains2Compound

public void linkChains2Compound(Structure s)
After the parsing of a PDB file the Chain and Compound objects need to be linked to each other.

Parameters:
s - the structure
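The kind of two-way linking described here can be sketched with plain Java objects. This is only an illustration under assumptions, not the BioJava implementation: SimpleChain and SimpleCompound are hypothetical stand-ins for the library's Chain and Compound types, and the sketch assumes each compound knows the IDs of the chains that belong to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainCompoundLinkSketch {

    /** Hypothetical stand-in for a parsed chain. */
    public static class SimpleChain {
        public final String id;
        public SimpleCompound compound; // filled in by link()

        public SimpleChain(String id) {
            this.id = id;
        }
    }

    /** Hypothetical stand-in for a compound (molecule) entry. */
    public static class SimpleCompound {
        public final String name;
        public final List<String> chainIds;
        public final List<SimpleChain> chains = new ArrayList<>(); // filled in by link()

        public SimpleCompound(String name, String... chainIds) {
            this.name = name;
            this.chainIds = Arrays.asList(chainIds);
        }
    }

    /** Connects every chain to its compound and every compound to its chains. */
    public static void link(List<SimpleChain> chains, List<SimpleCompound> compounds) {
        for (SimpleCompound compound : compounds) {
            for (SimpleChain chain : chains) {
                if (compound.chainIds.contains(chain.id)) {
                    chain.compound = compound;
                    compound.chains.add(chain);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<SimpleChain> chains = Arrays.asList(
                new SimpleChain("A"), new SimpleChain("B"), new SimpleChain("C"));
        List<SimpleCompound> compounds = Arrays.asList(
                new SimpleCompound("alpha chain", "A", "C"),
                new SimpleCompound("beta chain", "B"));
        link(chains, compounds);
        for (SimpleChain chain : chains) {
            System.out.println(chain.id + " -> " + chain.compound.name);
        }
    }
}
```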