Module wwpdb

PDB structure parsers, format builders and database providers.

The most basic usage is:

>>> from csb.bio.io.wwpdb import StructureParser
>>> parser = StructureParser('structure.pdb')
>>> parser.parse_structure()
<Structure>     # a Structure object (model)

or if this is an NMR ensemble:

>>> parser.parse_models()
<Ensemble>      # an Ensemble object (collection of alternative Structure-s)

This module introduces a family of PDB file parsers. The common interface of all parsers is defined in AbstractStructureParser. This class has several implementations, including RegularStructureParser and LegacyStructureParser.

Unless you have a special reason, you should use the StructureParser factory, which returns a proper AbstractStructureParser implementation, depending on the input PDB file. If the input file looks like a regular PDB file, the factory returns a RegularStructureParser, otherwise it instantiates LegacyStructureParser. StructureParser is in fact an alias for AbstractStructureParser.create_parser.

Another important abstraction in this module is StructureProvider. It has several implementations which can be used to retrieve PDB Structures from various sources: file system directories, remote URLs, etc. You can easily create your own provider as well. See StructureProvider for details.

Finally, this module gives you some FileBuilders, used for text serialization of Structures and Ensembles:

>>> from csb.bio.io.wwpdb import PDBFileBuilder
>>> builder = PDBFileBuilder(stream)
>>> builder.add_header(structure)
>>> builder.add_structure(structure)

where stream is any Python stream, e.g. an open file or sys.stdout.

See Ensemble and Structure from csb.bio.structure for details on these objects.

Classes
  AbstractStructureParser
A base PDB structure format-aware parser.
  AsyncParseResult
  AsyncStructureParser
Wraps StructureParser in an asynchronous call.
  CustomStructureProvider
A custom PDB data source.
  DegenerateID
Looks like a StandardID, except that the accession number may have arbitrary length.
  EntryID
Represents a PDB Chain identifier.
  FileBuilder
Abstract base class for all structure file formatters.
  FileSystemStructureProvider
Simple file system based PDB data source.
  HeaderFormatError
  InvalidEntryIDError
  LegacyStructureParser
This is a customized PDB parser, which is designed to read both sequence and atom data from the ATOM section.
  PDBEnsembleFileBuilder
Supports serialization of NMR ensembles.
  PDBFileBuilder
PDB file format builder.
  PDBHeaderParser
Ultra fast PDB HEADER parser.
  PDBParseError
  RegularStructureParser
This is the de facto PDB parser, which is designed to read the SEQRES and ATOM sections separately and then map them.
  RemoteStructureProvider
Retrieves PDB structures from a specified remote URL.
  SecStructureFormatError
  SeqResID
Same as a StandardID, but contains an additional underscore between the accession number and the chain identifier.
  StandardID
Standard PDB ID in the following form: xxxxY, where xxxx is the accession number (lower case) and Y is an optional chain identifier.
  StructureFormatError
  StructureNotFoundError
  StructureProvider
Base class for all PDB data source providers.
  UnknownPDBResidueError
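
The ID forms described for StandardID and SeqResID can be illustrated with a minimal sketch. The patterns and the split_entry_id helper below are hypothetical: they are inferred only from the class descriptions above (four-character lower-case accession, optional chain, optional underscore) and are not the implementation used by csb.

```python
import re

# Hypothetical patterns inferred from the class descriptions; csb's own
# EntryID classes may accept or reject different inputs.
STANDARD_ID = re.compile(r'(?P<accession>[0-9a-z]{4})(?P<chain>[A-Za-z0-9]?)')
SEQRES_ID = re.compile(r'(?P<accession>[0-9a-z]{4})_(?P<chain>[A-Za-z0-9]?)')


def split_entry_id(entry_id):
    """Return (accession, chain) for an 'xxxxY' or 'xxxx_Y' style ID."""
    for pattern in (SEQRES_ID, STANDARD_ID):
        # fullmatch ensures the whole string fits the pattern
        match = pattern.fullmatch(entry_id)
        if match:
            return match.group('accession'), match.group('chain')
    raise ValueError('unrecognized entry ID: {0!r}'.format(entry_id))
```

Under these assumptions, split_entry_id('1abcA') and split_entry_id('1abc_A') both yield the accession '1abc' and the chain 'A', while an ID without a chain yields an empty chain string.
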
Functions
  StructureParser(structure_file, check_ss=False)
A StructureParser factory, which instantiates and returns the proper parser object based on the contents of the PDB file. Returns AbstractStructureParser.
  find(id, paths)
Try to discover a PDB file for PDB id in paths. Returns str.
  get(accession, model=None, prefix='http://www.rcsb.org/pdb/files/pdb')
Download and parse a PDB entry. Returns Structure.
Variables
  PDB_AMINOACIDS = {'2AS': 'ASP', '3AH': 'HIS', '5HP': 'GLU', 'A...
  PDB_NUCLEOTIDES = {' M': 'Amino', 'A': 'Adenine', 'B': 'NotA'...
  __package__ = 'csb.bio.io'
Function Details

StructureParser(structure_file, check_ss=False)

A StructureParser factory, which instantiates and returns the proper parser object based on the contents of the PDB file.

If the file contains a SEQRES section, RegularStructureParser is returned, otherwise LegacyStructureParser is instantiated. In the latter case LegacyStructureParser will read the sequence data directly from the ATOMs.

Parameters:
  • structure_file (str) - the PDB file to parse
Returns: AbstractStructureParser
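
The selection rule stated above (a SEQRES section means RegularStructureParser, otherwise LegacyStructureParser) can be sketched as follows. The choose_parser_name function is a hypothetical stand-in that only returns the class name; how csb actually inspects the file is not shown in this documentation and may differ.

```python
import io


def choose_parser_name(stream):
    """Name the parser class the stated rule would select for this PDB text."""
    for line in stream:
        # A PDB record is identified by the keyword at the start of the line.
        if line.startswith('SEQRES'):
            return 'RegularStructureParser'
    # No SEQRES record found: the sequence must come from the ATOM records.
    return 'LegacyStructureParser'


regular = io.StringIO(
    'HEADER    EXAMPLE\n'
    'SEQRES   1 A    2  ALA GLY\n'
    'ATOM      1  N   ALA A   1\n'
)
legacy = io.StringIO('ATOM      1  N   ALA A   1\n')

print(choose_parser_name(regular))
print(choose_parser_name(legacy))
```
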

find(id, paths)

Try to discover a PDB file for PDB id in paths.

Parameters:
  • id (str) - PDB ID of the entry
  • paths (list of str) - a list of directories to scan
Returns: str
path and file name on success, None otherwise
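
A directory scan of this kind might look like the sketch below. The find_pdb_file name, the file naming scheme (ID followed by an extension) and the extensions tuple are all assumptions made for illustration; the patterns that find actually tries are not documented here.

```python
import os


def find_pdb_file(entry_id, paths, extensions=('.pdb', '.ent')):
    """Return the first matching path and file name, or None if nothing is found."""
    for directory in paths:
        for extension in extensions:
            # Assumed naming scheme: <id><extension> directly in the directory.
            candidate = os.path.join(directory, entry_id + extension)
            if os.path.isfile(candidate):
                return candidate
    return None
```

Like the documented function, the sketch returns None instead of raising when no directory contains a matching file, so callers need to check the result before opening it.
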

get(accession, model=None, prefix='http://www.rcsb.org/pdb/files/pdb')

Download and parse a PDB entry.

Parameters:
  • accession (str) - accession number of the entry
  • model (str) - model identifier
  • prefix (str) - download URL prefix
Returns: Structure
object representation of the selected model

Variables Details

PDB_AMINOACIDS

Value:
{'2AS': 'ASP',
 '3AH': 'HIS',
 '5HP': 'GLU',
 'ACL': 'ARG',
 'AGM': 'ARG',
 'AIB': 'ALA',
 'ALA': 'ALA',
 'ALM': 'ALA',
...
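
Judging from the visible entries, the table appears to map PDB residue names, including modified residues, to standard three-letter codes. A minimal lookup sketch, using only the entries shown in the truncated listing above and a hypothetical standard_residue helper:

```python
# Only the entries visible in the truncated listing; the real
# PDB_AMINOACIDS dictionary contains more.
VISIBLE_AMINOACIDS = {
    '2AS': 'ASP', '3AH': 'HIS', '5HP': 'GLU', 'ACL': 'ARG',
    'AGM': 'ARG', 'AIB': 'ALA', 'ALA': 'ALA', 'ALM': 'ALA',
}


def standard_residue(name, table=VISIBLE_AMINOACIDS):
    """Map a PDB residue name to a standard code, or None if unknown."""
    # Normalizing whitespace and case is an assumption of this sketch.
    return table.get(name.strip().upper())
```

Returning None for unknown names is a choice made for this sketch only; the module also lists an UnknownPDBResidueError class, whose exact use is not described here.
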

PDB_NUCLEOTIDES

Value:
{'  M': 'Amino',
 'A': 'Adenine',
 'B': 'NotA',
 'C': 'Cytosine',
 'D': 'NotC',
 'DA': 'Adenine',
 'DC': 'Cytosine',
 'DG': 'Guanine',
...