Molecule coordinate commands

The MolScript coordinate system
Molecule coordinate data
read
transform
store-matrix
copy
delete

The MolScript coordinate system

There is one single fixed coordinate system in MolScript: The viewer looks down from the positive z axis towards the origin. The x-y plane is perpendicular to the view direction. The coordinate system is right-handed; x increases from left to right, and y increases from the bottom upwards. The distance unit is Ångström.

The viewpoint cannot be changed (however, see the viewpoint command). To view the molecule from another angle, the atomic coordinates have to be transformed (i.e. rotated and/or translated) within the coordinate system.

The copy and delete commands allow manipulation of the coordinates read into MolScript in powerful ways. For instance, one may have available the transformation matrices that define a symmetrical oligomer of a protein structure. In MolScript, it is easy to create the constituent subunits from the original coordinates using the copy and transform commands.

Molecule coordinate data

The molecule coordinates are read using the read command, and are then stored internally. The various atom and residue selection expressions operate on the molecule data and yield selections for the graphics commands to render. Currently, MolScript reads only PDB format coordinate files.

Name and type strings

When the coordinate data has been read, some processing is done on the strings in it: the strings for the residue names (=numbers), residue types, and atom names are left-shifted, and blanks are squeezed out, to facilitate name comparisons involving these items. As a special case, any '*' (star) characters in atom names are changed to ''' (single-quote) to avoid a clash with the X-PLOR type wildcard character used in MolScript. This is relevant for nucleic acids in PDB format; there, primes are represented by '*' characters.

Atomic radii and colours

Default atomic radii and colours (see parameters atomcolour and atomradius) as well as residue colours (parameter residuecolour) are automatically set by MolScript after the coordinate file has been read.

Segment identifiers

The current version of MolScript handles the segment identifier (segid) in the new (v2.0) PDB format. The segid was introduced as the chain identifier in X-PLOR (Brünger 1992). If present, it can be accessed in the residue selection segid.

Atomic elements

The current version of MolScript interprets the atomic element as given in the new (v2.0) PDB format. If the PDB file is of the older format, then a reasonable guess as to the element type based on the atom name is made. This value is accessible as the atom selection element.

MODEL

PDB coordinate files in the new format (v2.0 and higher) may contain a keyword MODEL indicating that the file has several coordinate sets for the same molecule. This is for NMR protein structures, which are most often determined as sets of 3D structures. The models are stored in MolScript as separate molecules, having the same molecule name, but different model numbers, which can be used in the residue selection model. Molecules read from files containing no MODEL records (e.g. an X-ray diffraction structure) will have the model number 0.

read molname PDB-code-string ;
read molname filename-string ;
read molname inline-PDB ;

The read command has three forms, all of which need two arguments. The first argument is the molecule name to assign to the coordinate data. It may be any valid identifier. The molecule name can be used in the residue selection molecule.

PDB code

The first form of the read command is used if the second argument is a valid PDB data set code, having four characters where the first character is a digit. The full PDB file name is constructed from the environment variable MOL3D_PDB_DIR, which must be defined. Three different directory and file name conventions are tested consecutively, and the first that works is used:

PDB standard hierarchical organisation and file name:
for code 1XYZ the file is $MOL3D_PDB_DIR/distr/xy/pdb1xyz.ent
PDB flat directory, standard file name:
for code 1XYZ the file is $MOL3D_PDB_DIR/pdb1xyz.ent
PDB flat directory, simplified file name:
for code 1XYZ the file is $MOL3D_PDB_DIR/1xyz.pdb

The given PDB code is changed to lower case characters, and the other strings used to construct the file name are all in lower case, except the environment variable value which is used as is.

file name

The second form of the read command is used when the second argument is not a valid PDB data set code. The second argument (optionally enclosed by double-quotes '"') is then interpreted as a complete coordinate file name. In the current version of MolScript, only PDB files are supported.

inline PDB

The keyword inline-PDB (no double-quotes!) as the second argument in the command instructs MolScript to read the subsequent lines of the input file (after the semi-colon ';') as a PDB file, until a PDB record with the keyword END appears.

This inline PDB part of the input file is unusual in that it is not free-format, but must follow the PDB format rules exactly. The usual MolScript interpretation of the input file is resumed after the line containing the PDB END record. If there is no END record in the PDB data, then an error will (most likely) occur as MolScript tries to interpret the rest of the MolScript input file as a PDB format file.

This feature allows the creation of entirely self-contained MolScript input files, which contain both coordinate data as well as transformation and graphics commands. It is intended mainly as a format that other programs (such as molecule display or analysis programs) could generate.

transform atom-selection { by operation } ;

The coordinates of the selected atoms are transformed by a matrix. The matrix is defined by a set of one or more operations, each preceded by the by keyword. The operations are conceptually applied to the atom coordinates in the specified order, one operation after the other.

Any number of operations may be specified. This means that it is not necessary for the user to modify any previous operations if a modification of the transformation is needed; one can just add the operations needed to achieve the desired transformation. The available operations are listed below.

Each operation generates a 4x4 transformation matrix. These matrices are sequentially concatenated into one single matrix before the transformation is actually applied.

Note that it is valid to transform a selection containing no atoms. This may be useful when a current transformation matrix needs to be established for the object command, without changing the coordinates of any atoms.

centre vector

Transform the atom coordinates so that the point specified by the given vector is moved to the origin.

This is most often used with the position command, which produces the geometrical centre-of-gravity of its atom selection. Note that this will refer to the atom coordinates as they are before the operations of this particular transform command have been applied. Therefore, this particular operation should be used as the first one in a transform command.

translation vector

Translate the atom coordinates by the specified vector.

rotation x number
rotation y number
rotation z number

Rotate the atom coordinates around the specified x-, y- or z-axis of the fixed coordinate system. The value is given as an angle in units of degrees, where a positive value rotates counter-clockwise around the given axis when viewed from the positive axis down towards the origin of the fixed coordinate system.

rotation axis vector number

Rotate the atom coordinates around the axis specified by the given vector. The value is given as an angle in units of degrees, where a positive value rotates counter-clockwise around the given axis when viewed from the positive axis down towards the origin of the fixed coordinate system.

rotation
number number number
number number number
number number number

Rotate the atom coordinates by the explicit specified 3x3 rotation matrix. The validity of this matrix is not checked by MolScript. If it is incorrect, the coordinates will be corrupted.

This feature is used most often together with the interactive OpenGL mode, which allows interactive rotation of the graphics objects, and can output the value of the rotation matrix needed to achieve the current view. This matrix can then simply be cut-and-pasted into the appropriate transform command in the input file.

recall-matrix

Use the matrix stored by the most recent store-matrix command. If no such command has been given, then this operation has no effect.

store-matrix ;

Store the transformation matrix defined by the most recent transform command. The stored matrix can then be used in a subsequent transform command by the recall-matrix operation. Only one matrix can be stored.

The stored matrix is not affected when the current plot is finished. This means that for the output formats in which more than one plot can be specified in one single input file (currently only PostScript), it is possible to defined a transformation in one plot and carry it over into another plot.

This feature is used for example in cases where the exactly same view is needed for different coordinate data sets. One such application is the display of multiple structures determined from NMR data, where the proper least-square fit has already been determined using some analysis software, and it is essential that the relative orientations of the coordinates are not disturbed by MolScript.

copy molname atom-selection ;

The copy command operates on an atom selection. A new molecule with the given name is created, and all atoms selected, and all residues which these atoms are part of, are copied to the new molecule. The new molecule is a completely self-contained entity which in no way is dependent on the atoms it was copied from. Since an atom selection is used as the argument in the command, it is possible to create a new molecule containing any subset of atoms and residues from another.

delete molname ;

Delete the entire molecule defined by the given molecule name. The molecule name is freed for use in a subsequent read or copy command.

Top page