How to use MolScript

Running MolScript

The MolScript executable (molscript) should be directly available as a command, that is, it should be located in your path. To test this, try the following command, which will just output all command-line options available:
% molscript -h
If the program does not seem available, ask your system administrator or whoever installed it.

To get a first input file, use the MolAuto program. This takes a coordinate file as input, and produces a first-approximation input file for MolScript:

% molauto pdb1crn.ent > input.file
By default, MolScript reads the input file from stdin (standard input) and writes the result to stdout (standard output), while execution messages and error messages are written to stderr (standard error output). For example, to produce a PostScript file, give the command:
% molscript -ps < input.file > output.ps

The interactive OpenGL mode produces no output, so there is no need to specify an output file:

% molscript -gl < input.file
The best way to use the interactive OpenGL mode is to use an explicitly named input file (using the -in command-line option). This allows re-reading the input file after the user has modified it with a text editor, which speeds up preparation of input files. See the documentation on the mouse button functions.
% molscript -gl -in input.file

Using UNIX pipes, one can get quick visualizations of a coordinate file using MolAuto and the interactive OpenGL mode:

% molauto protein.pdb | molscript -gl

Similarly, a VRML 2.0 file can be created quickly with MolAuto:

% molauto protein.pdb | molscript -vrml

Command-line options

Only a single option controlling the output mode may be specified. An output or input file different from the default may be defined only once in the command line. The other options may be specified in any combination. If conflicting options are used, then the last one on the command line determines the result.

Note that in previous versions of MolScript (v1.4 and older) it was possible to specify the output format by giving a keyword in the input file. This is no longer possible.

command-line option action
-ps
-postscript
Choose the PostScript output mode. This is the default mode, so these options are actually rather useless.
-r [alias]
-raster3d [alias]
Choose the Raster3D output mode. The optional aliasing parameter controls the aliasing scheme: 1 = no antialiasing, 2 = compute 2x2 pixels for each pixel output, 3 = compute 3x3 pixels for 2x2 pixels output. The default value is 3. The file produced is suitable as input for the render program in the Raster3D package.
-vrml Choose the VRML 2.0 output mode.
-gl
-opengl
Choose the interactive OpenGL mode. This may or may not be available in your executable, depending on whether your installation includes the OpenGL implementation or not.
-eps [scale] Choose the Encapsulated PostScript (EPS) image file output mode. The optional scaling parameter value affects the conversion of the OpenGL rendering image size in pixels to the PostScript default units. The default value is 1.0. Any positive, non-zero value is valid. This mode is available only if the OpenGL mode is.
-epsbw [scale] The same as the -eps option, except that a black-and-white image is produced, by interpreting the pixel colour values in the OpenGL rendering image as luminance values. This mode is available only if the OpenGL mode is.
-sgi
-rgb
Choose the SGI (aka RGB) image file output mode. This mode is available only if the OpenGL mode is.
-jpeg [quality] Choose the JPEG image file output mode. The optional quality parameter can be any integer in the range 1-100 inclusive, and controls the degree of compression used for the image: lower values use more compression, giving lower image quality. The default value is 90. This mode is available only if the OpenGL mode is.
-png [compress] Choose the PNG image file output mode. The optional compress parameter can be one of the literal values default, speed, size or none and determines the algorithm used by the PNG library routines when creating the image file. The default value is default. This mode is available only if the OpenGL mode is.
-accum number Use the accumulation buffer for OpenGL rendering, i.e. for the interactive OpenGL mode and all image file output modes. The image is rendered multiple times (with a slight shift each time), summed and averaged, thus producing an antialiased image. The quality improves with the number, but so does the execution time. The parameter value can be any positive, non-zero integer, but is actually always set to the immediately higher value in this list: 2, 3, 4, 8, 15, 24 or 66 (maximum).
-pretty The output file is formatted (indented) to increase the readability. This is useful when the file is to be modified manually. Currently this affects only the VRML 2.0 output mode.
-size width height Set the size of the image for the Raster3D, OpenGL, EPS, SGI, JPEG and PNG modes. The values are the width and height of the image, in pixels. The default value is 500 by 500 pixels. The maximum value allowed is 4096 by 4096 pixels.
-s
-silent
Do not produce any run-time diagnostic messages. This may be useful when no output except the graphics is wanted, and the input file needs no debugging.
-out filename The output from the program is written to a file with the given name, instead of to stdout (standard output).
-in filename The input file (script) is read from the file with the given name, instead of from stdin (standard input). This is particularly useful with the interactive OpenGL mode.
-log filename The run-time diagnostics are written to a file with the given name, instead of to stderr (standard error output).
-tmp filename Use a temporary scratch file with the given name. The execution process must be allowed to create, write to, read from, and delete the file. The default is the file used by the ANSI C library standard tmpfile routine. Currently, this is relevant only for the Raster3D output mode.
-h Output a listing of the available command-line options to stderr (standard error output). No other output is produced.

Environment variables

The MolScript program can use one environment variable: MOL3D_PDB_DIR. If this is set to the name of the directory where the local copy of the PDB database distribution is located, it allows one to use just the PDB code in place of the complete file name in the read command. See this command for more information.

A minimal input file

Here is a minimal input file, which will show the trace of the amino-acid chain for the molecule in the coordinate set, after centering all atoms.
   ! --- This is a minimal MolScript input file; this line is a comment.

   plot                           ! Start of plot; always required.

     read mol "protein.pdb";      ! Read the coordinate file.
     transform atom *             ! Xform all atoms so that the centre-
       by centre position atom *; ! of-gravity is placed at the origin.
     trace amino-acids;           ! Output a CA trace of the chain.

   end_plot                       ! This is the end of the plot.

   ! --- Here the minimal input file ends.

Some useful tips

The MolAuto program is almost always the best starting point for making a good image of a molecule. The input file produced by the MolAuto program will contain commands to render both the secondary structure and the ligands. However, there will be no transformation to get a good view.

To prepare a good image of a molecule, the main problem is to find a good view. It is usually not a good idea to work on details such as label positions or colours until at least an approximate view has been found. With the interactive OpenGL mode, it is fairly easy to find a good view; the program outputs this view, and the user then incorporates it into the input file.

If you do not have the OpenGL implementation available, you must adopt an iterative method instead: Use a PostScript display program such as xpsview or ghostview to view the output from the MolScript program. Then edit the input file with an ordinary text file editor, changing the transformation by adding rotations and/or translations. Note that any number of rotations or translations may be used in the transform command. Re-run MolScript, and view the results. Continue until a good view has been found.

Jane Richardson has listed some general points on the design of schematic images of proteins (Richardson 1985). These guidelines are still perfectly valid, although they were formulated for hand-drawn images. Look at her drawings for inspiration and guidance (Richardson 1981, 1985).

An important point to make is that 90 degree views of a structure, if carefully selected, can be very instructive. In fact, such views can often bring home the message better than stereo plots. With MolScript, 90 degree views are ridiculously easy to prepare. Stereo plots, however, are trickier to make.
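As a sketch of how little is needed, a second view rotated by 90 degrees can be obtained by appending a single rotation to the transformation that produced the first view. The file name protein.pdb is a placeholder, and the choice of the x axis is an assumption; pick whichever axis suits the structure at hand.

   plot
     read mol "protein.pdb";
     transform atom *
       by centre position atom *
       by rotation x 90.0;        ! The only change from the first view.
     trace amino-acids;
   end_plot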

When displaying superpositions of structures (for instance, sets of structures derived from NMR data), the store-matrix and recall-matrix commands must be used, rather than the ordinary ways of getting a good view. The reason is that if the position command is used, this will give a slightly different transformation for each coordinate set, thus destroying the least-squares fit that (almost certainly) has been done previously using some other program. In the display of superimposed NMR structures, this will give plots where it looks as if the structures superimpose better at the centre of the plot than they actually do. By using the command store-matrix after the first coordinate set has been transformed, and then recall-matrix for the other coordinate sets, you ensure that the original least-squares fit is kept intact.

How do I...?

Disulfide bonds

Why are disulfide bonds not output by default?

The reason is that the sulfur atoms in disulfide bonds typically have a covalent bond distance of about 2.0-2.1 Ångström, while the default value for the bonddistance parameter is only 1.9 Ångström. So in order to get a bond between two sulfur atoms, this parameter has to be increased to, say, 2.5. Note that this may cause spurious bonds in other parts of your structure, so reset the parameter back to its lower value once the disulfide bonds have been created.

   bonds in amino-acids;
   set bonddistance 2.5;
   bonds atom SG;
   set bonddistance 1.9;
Why is the default value of the bonddistance parameter so low? It is a compromise value intended to avoid problems of spurious bonds in e.g. the five-membered rings of heme groups, while giving reasonable results for other moieties.

Per-residue colours

Per-residue colours were introduced in MolScript v2.0. By switching on the colourparts parameter, all schematic residue-based graphics objects (for example helix and strand) will be coloured according to the current residue colours. These are all white by default. They can be changed with the residuecolour parameter.
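As a minimal sketch of how these two parameters work together, the fragment below switches on colourparts and assigns different colours to two residue ranges before drawing a single helix. The residue numbers and the RGB values are illustrative assumptions, not taken from any particular structure.

   set colourparts on;
   set residuecolour from 1 to 5 rgb 1.0 0.0 0.0;    ! Residues 1-5 red.
   set residuecolour from 6 to 10 rgb 0.0 0.0 1.0;   ! Residues 6-10 blue.
   helix from 1 to 10;    ! One helix object, coloured residue by residue.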

Rainbow residue colours

You want to colour the schematic residue graphics objects from blue at the N-terminus along the rainbow colours to red at the C-terminus.

Solution 1:

set residuecolour amino-acids rainbow, colourparts on;
helix from 1 to 10; ! or whatever

Solution 2:

Use the MolAuto program to create the commands for the schematic residue graphics objects. By default, these will be coloured in a rainbow fashion, but on an object-to-object basis, not residue-by-residue.

Select residues close to a residue

If you want to display all residues close to another residue, then use the close selection expression. However, since this expression takes an atom selection as argument, and produces an atom selection, the result must be converted to a residue selection using the contains expression.

As an example, to draw bonds for all atoms in all residues having an atom within 5.0 Ångström of a residue of type RTL, use the following:

   bonds in contains close in type RTL 5.0;

Stereo plots

There is no specific command to make MolScript create stereo plots. Instead, a stereo plot can be made by creating two similar plots in two different output images, the only difference being a slight rotation of all coordinates (about 4-6 degrees) in one of them.

Alternatively, in PostScript output it is possible to make two plots on the same paper (position them with the area command) and draw the same objects in both, adding a small rotation to one of them.
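The side-by-side approach might look like the sketch below. It assumes that the area command takes the lower-left and upper-right corners of the plot area in PostScript units; the corner values, the 5 degree angle and the file name are illustrative only. The sign of the rotation determines whether the pair suits wall-eyed or cross-eyed viewing, so flip it if the depth appears inverted.

   ! --- Left image.
   plot
     area 50.0 300.0 290.0 540.0;
     read mol "protein.pdb";
     transform atom * by centre position atom *;
     trace amino-acids;
   end_plot

   ! --- Right image: same objects, plus a small rotation about y.
   plot
     area 310.0 300.0 550.0 540.0;
     read mol "protein.pdb";
     transform atom *
       by centre position atom *
       by rotation y 5.0;
     trace amino-acids;
   end_plot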

Common problems

"Error: parse error"

You may have forgotten the semi-colon ';' that must be present to finish the previous command, or you may have used an invalid or misspelled command, or you are trying to use a keyword or reserved word as an identifier without using quotation marks. Check the input file at or before the line and item indicated by the error message.

Weird helices or strands

If you use too generous a definition of helices and strands, then most likely there will be strange effects at the termini. To reduce or remove the problem, just shorten the secondary structure elements by one or two residues. The secondary structure determined by the -ss_hb command-line option in the MolAuto program is usually good, while the command-line option -ss_ca may occasionally be too generous.

The algorithm for helix graphical object generation assumes reasonably good alpha-helical geometry, but can usually handle deviations like proline bends gracefully. Of course, it cannot take care of all kinds of irregularities.

Sometimes, a beta-strand may flip over on its back. This is most often due to prolines or cross-over in the strand, and is difficult to do anything about. Sometimes it may be best to split the strand up to handle the problem.

Difficulty positioning label strings

Since parameters such as labelcentre and labeloffset are part of the graphics state, and hence persistent, it is a common mistake to set an offset for a particular label but forget that all subsequent labels in the input file will also be affected. Set these parameters again to the values you want before the later label commands.

Coils/turns not connected to helices/strands

This is most likely due to not giving the proper residue selection for the relevant commands. If residues 1-4 and 7-10 are strands, then the turn in between should be given as "from 4 to 7", not "from 5 to 6". The graphical objects begin and end at the given residues.
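Written out as input commands, the case described above would read as follows (the residue numbers are those of the example, not of a real structure):

   strand from 1 to 4;
   turn from 4 to 7;      ! Starts at the last residue of the first strand
                          ! and ends at the first residue of the next.
   strand from 7 to 10;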

Coil radius is not depth-cued

In the PostScript output mode, the coil radius is depth-cued only when slab is set explicitly in the input file.
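A sketch of an input file with an explicit slab setting is given below. It assumes that slab is given as a header command directly after plot and takes a single depth value; the value 30.0, the file name and the residue range are illustrative, so check the command reference for the exact form.

   plot
     slab 30.0;                   ! Explicit slab depth (illustrative value).
     read mol "protein.pdb";
     transform atom * by centre position atom *;
     coil from 1 to 10;
   end_plot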
