Stockholm Bioinformatics Center, SBC
Lecture notes: Molecular Bioinformatics 2001,
Uppsala University
Lecture 26 Jan 2001
Per Kraulis
4. Analysing a genome
So, what does one do with a complete genome? After all, a sequenced
genome consists only of so many bases in a defined order. Analysis is
obviously necessary in order to obtain biologically interesting
information.
The analysis of a genome covers many different aspects. Here follows a
list of the most common ones, but entirely novel ways of analysing a
complete genome can certainly be invented. The potential for
interesting discoveries in complete genomes is great; we have
probably only scratched the surface so far.
- Define the location of genes (coding sequences, regulatory
regions): gene prediction (identification).
  - Gene prediction ab initio using software based on rules
    and patterns. Find Open Reading Frames (ORFs), with additional
    criteria for a good start sequence for a gene. This is considered
    reasonably easy for bacteria, but is very difficult for eukaryotes.
  - Gene identification through alignment with known proteins and
    EST sequences (Expressed Sequence Tags; mRNA sequences).
  - Gene prediction through similarity with proteins or ESTs
    in other organisms.
  - Gene prediction through comparison with other genomes;
    conserved regions are probably coding or regulatory regions.
    Conservation of gene order between genomes is called synteny.
    This approach is very promising for analysis of higher
    eukaryote genomes.
- Annotation of the genes: Compare with genes/proteins of known
function in other organisms. This is essentially the same as labelling
the gene.
- Functional classification. Broad groups of functional
characterization, such as 'ribosomal proteins', 'nucleotide
metabolism', 'signal transduction'.
- Metabolic pathways.
  - Are any common pathways missing?
  - Are there 'gaps' (missing enzymes) in some pathways?
  - Compare identified pathways with the life style of the
    organism.
- Evolutionary history
  - Internal genome duplications can sometimes be detected.
  - Gene decay can sometimes be characterized: genes that are on
    their 'way out' after duplication, or because the life style of the
    organism has changed.
  - Horizontal gene transfer: genes that have been acquired from
    another organism.
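
The ORF search mentioned under ab initio gene prediction in the list
above can be illustrated with a minimal sketch. It scans all six reading
frames for stretches that begin with ATG and end at the first in-frame
stop codon. The function names, the min_codons threshold and the
coordinate convention are assumptions made for this illustration; real
gene finders add further criteria (for instance start-site signals and
codon usage statistics), which is why the problem is far harder for
eukaryotes than this sketch suggests.

```python
# Simplified six-frame ORF scan. All names here are illustrative
# assumptions, not part of any particular gene-finding program.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return a list of (strand, frame, start, end) tuples.

    Coordinates are 0-based, end-exclusive, and refer to the strand that
    was scanned (for '-' they index the reverse complement).
    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon; min_codons counts the stop codon as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Step through the frame one codon at a time.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


orfs = find_orfs("CCATGAAATGACC")
```

An ATG that is never followed by an in-frame stop codon is not reported,
so ORFs running off the end of the sequence are ignored in this sketch.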
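
The question of 'gaps' in metabolic pathways raised in the list above
amounts, in its simplest form, to a set comparison: which enzymes that a
reference pathway requires have no annotated gene in the genome? The
sketch below assumes that annotation has already assigned EC numbers to
genes. The pathway table (only the first steps of glycolysis) and the
example annotation set are toy data invented for illustration, and the
function name pathway_gaps is hypothetical.

```python
# Toy reference data: an illustrative assumption, not a curated
# pathway database.
REFERENCE_PATHWAYS = {
    "glycolysis (first steps)": {
        "2.7.1.1",   # hexokinase
        "5.3.1.9",   # glucose-6-phosphate isomerase
        "2.7.1.11",  # 6-phosphofructokinase
        "4.1.2.13",  # fructose-bisphosphate aldolase
    },
}


def pathway_gaps(annotated_ec, pathways=REFERENCE_PATHWAYS):
    """Map each pathway name to the sorted EC numbers missing from the genome."""
    report = {}
    for name, required in pathways.items():
        # Set difference: required enzymes with no annotated gene.
        report[name] = sorted(required - annotated_ec)
    return report


# Hypothetical annotation result for some genome.
genome_annotation = {"2.7.1.1", "5.3.1.9", "4.1.2.13"}
gaps = pathway_gaps(genome_annotation)
```

A reported gap is only a hint: the enzyme may truly be absent, the gene
may have been missed by gene prediction, or the organism may use an
alternative enzyme for that step.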
Copyright © 2001
Per Kraulis
$Date: 2001/01/26 15:00:04 $