2009-07-01

F: {alignments} -> {tree topologies} is continuous

If the title is not clear enough, it means that the function that maps alignments of genetic information to phylogenetic tree topologies is continuous. Well, kind of. It's a scientific proof, not a mathematical one, so it's based on evidence. But as far as evidence goes, it makes a compelling argument in favor of the thesis.

The paper is Wong et al., 2008 (Science), "Alignment Uncertainty and Genomic Analysis". DOI: 10.1126/science.1151532.

The relevance of this paper, IMHO, is not that it establishes that 'common sense' is usually correct (although it is important in its own right to know when it's possible to infer things solely from intuition). The authors make the case for interpreting alignments themselves as random variables, and in doing so, they conclude that (in a very precise way) small variations in alignments produce small variations in the tree topologies inferred from those alignments. Moreover, they indicate that this result is robust with respect to the method used to infer the phylogenetic trees.

For the mathematically inclined, they defined a metric on the set of alignments and a metric on the set of tree topologies, generated alignments (via MCMC) that were near the reference alignment, and observed that the resulting trees all had topologies near the reference topology in the given metric.
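To make "a metric on tree topologies" concrete, here is a minimal sketch of one well-known choice, a clade-based (rooted) Robinson-Foulds distance. This is my own illustration and an assumption: I am not claiming it is the metric the authors used. The nested-tuple tree encoding and the names `clades` and `rf_distance` are made up for the example.

```python
def clades(tree):
    """Collect the clades of a rooted tree given as nested tuples of leaf names.

    A clade is the set of leaves below an internal node (root included).
    The nested-tuple encoding is a hypothetical one chosen for this sketch.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            # A leaf contributes only itself and defines no clade of its own.
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Count the clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


reference = ((("A", "B"), "C"), ("D", "E"))
neighbor = ((("A", "C"), "B"), ("D", "E"))   # one local rearrangement
distant = ((("A", "D"), ("B", "E")), "C")    # most groupings changed

print(rf_distance(reference, reference))  # 0
print(rf_distance(reference, neighbor))   # 2
print(rf_distance(reference, distant))    # 6
```

With a distance like this in hand, "nearby alignments give nearby topologies" becomes something you can actually measure: infer a tree from each perturbed alignment and check that its distance to the reference tree stays small.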

This CPU-intensive work was done for alignments of whole genomes, with several alignment techniques and tree-inference algorithms, with similar results in most cases. Definitely worth a read.

(UPDATE: fixed title typo)
