Mallard can handle 16S rRNA gene sequences of varying length;
however, if partial gene sequences are used, ensure that all are of
roughly the same length and cover roughly the same region (Fig. 1).
Figure 1. An example of partial sequences of roughly the same length
and region of the 16S rRNA gene sequence.
Avoid datasets containing partial sequences with insufficient
homologous base positions (Fig. 2).
Figure 2. An example of a dataset with insufficient overlap between
sequences.
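One way to check for insufficient overlap before running Mallard is to
count the alignment columns in which every sequence has a base. The
following is a minimal Python sketch and is not part of Mallard. The
function name shared_columns, the gap characters and the toy alignment
are all assumptions made for illustration, and how many shared columns
count as "sufficient" is left for the user to judge.

```python
def shared_columns(aligned, gap_chars="-."):
    """Count alignment columns in which every sequence has a base (no gap).

    `aligned` maps sequence names to aligned sequences of equal length.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*...) walks the alignment column by column.
    return sum(
        all(base not in gap_chars for base in column)
        for column in zip(*aligned.values())
    )


# Hypothetical toy alignment: three partial sequences covering
# different regions, with only a short region shared by all three.
alignment = {
    "seqA": "ACGTACGT----",
    "seqB": "--GTACGTAC--",
    "seqC": "----ACGTACGT",
}

print(shared_columns(alignment))  # prints 4
```

A small count relative to the alignment length suggests that the
dataset resembles the situation shown in Fig. 2.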
Similarly, avoid too large a variation in sequence length, for
example a mixture of partial and full-length sequences (Fig. 3).
Figure 3. An example of sequences varying too greatly in size.
In both cases the resulting alignment will be inaccurate, which in
turn will lead to inaccurate results from the Mallard program.
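Sequence length variation can be screened before alignment. The
following Python sketch is not part of Mallard: it reads FASTA-formatted
lines and reports sequences whose length differs from the median by more
than a chosen fraction. The function names read_fasta and
length_outliers, the clone names and the 10% tolerance are assumptions
made for illustration, not thresholds used by Mallard.

```python
import statistics


def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def length_outliers(records, tolerance=0.1):
    """Return names of sequences whose length deviates from the median
    length by more than `tolerance` (a fraction of the median)."""
    median = statistics.median(len(seq) for seq in records.values())
    return [
        name
        for name, seq in records.items()
        if abs(len(seq) - median) > tolerance * median
    ]


# Hypothetical dataset: two partial sequences (500 and 512 bases)
# mixed with one near full-length sequence (1500 bases).
fasta_lines = [
    ">clone1", "ACGT" * 125,
    ">clone2", "ACGT" * 128,
    ">clone3", "ACGT" * 375,
]

records = read_fasta(fasta_lines)
print(length_outliers(records))  # prints ['clone3']
```

Sequences reported here could be trimmed to the common region or
removed before the alignment is built.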