Bioinformatics toolkit
www.cardiff.ac.uk/biosi/research/biosoft/

Partial Sequences


Mallard can handle 16S rRNA gene sequences of varying length; however, if partial gene sequences are used, ensure that they are all of roughly the same length and cover roughly the same region of the gene (Fig. 1).
Figure 1. An example of partial sequences of roughly the same length and region of the 16S rRNA gene sequence.
Avoid datasets in which the partial sequences share too few homologous base positions, that is, where the sequences barely overlap (Fig. 2).
Figure 2. An example of a dataset with insufficient overlap between sequences.
Similarly, avoid too large a variation in sequence length, for example a mixture of partial and full-length sequences (Fig. 3).
Figure 3. An example of sequences varying too greatly in size.
In both cases the resulting alignment will be inaccurate, which in turn will make the results produced by Mallard unreliable.
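Both problems can be screened for with a short script. The following is a minimal sketch only and is not part of Mallard or OrientationChecker: the function names (read_fasta, length_outliers, shared_columns) and the 20% length tolerance are illustrative assumptions, and sensible thresholds will depend on the dataset. The first check flags sequences whose length is far from the median; the second counts the alignment columns in which every sequence has a base, which gives a rough measure of the overlap shown in Fig. 2.

```python
from statistics import median

GAP_CHARS = "-."  # characters treated as alignment gaps (an assumption)


def read_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def ungapped_length(seq):
    """Length of a sequence after removing gap characters."""
    return sum(1 for c in seq if c not in GAP_CHARS)


def length_outliers(records, tolerance=0.2):
    """Names of sequences whose ungapped length differs from the
    median length by more than `tolerance` (a fraction of the median)."""
    lengths = {n: ungapped_length(s) for n, s in records.items()}
    mid = median(lengths.values())
    return sorted(n for n, length in lengths.items()
                  if abs(length - mid) > tolerance * mid)


def shared_columns(aligned):
    """Number of alignment columns in which every sequence has a base.
    Expects aligned sequences of equal length; zip() stops at the
    shortest sequence if they are not."""
    seqs = list(aligned.values())
    return sum(all(c not in GAP_CHARS for c in col) for col in zip(*seqs))
```

To use it, pass an open FASTA file to read_fasta, for example records = read_fasta(open("my_sequences.fasta")), where the file name is a placeholder. A non-empty result from length_outliers(records) suggests a mixture of lengths like that in Fig. 3, and a small value from shared_columns on the aligned sequences suggests insufficient overlap. Note that the length check cannot tell whether sequences cover the same region of the gene; that can only be judged from the alignment.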

A quick way of checking a dataset of 16S rRNA gene sequences prior to alignment is to use the OrientationChecker tool, available from www.cardiff.ac.uk/biosi/research/biosoft/.



Dr K.E. Ashelford. © 2006, Cardiff University