Phylogenetics Homework 5

Consider the protein and DNA sequence evolution of the 6-phosphogluconate dehydrogenase, decarboxylating (gnd) gene in the following organisms:

Encephalitozoon cuniculi (Fungus)
Dictyostelium discoideum (Slime mold)
Synechococcus elongatus (Cyanobacteria)
Cronobacter dublinensis (Proteobacteria)
Candidatus Solibacter usitatus (Acidobacteria)
Leptotrichia goodfellowii (Fusobacteria)
Borrelia garinii (Spirochaetes)
Phycisphaera mikurensis (Planctomycetes)
Mycobacterium tuberculosis (Actinobacteria)
Anaerolinea thermophila (Chloroflexi)
Chlamydia trachomatis (Chlamydiae)
Lactococcus lactis (Firmicutes)

  1. Using the two best-fitting DNA models and the two best-fitting protein models, find the corresponding distance matrices, if possible. If this is not possible, find the closest models for which distance matrices can be formed. Build phylogenies for the species listed above by applying the neighbor-joining algorithm to the distances. Also build a phylogeny using LogDet distances. State which software package you used to conduct these analyses. MEGA, ape in R, and PHYLIP are all possibilities.
  2. Are all your phylogenies the same? Are there fundamental differences among the phylogenies that you built? If there are differences, how might they have come about in this analysis? (For instance, a model that counts all changes the same and a model that says certain changes are much more likely than others might well give very different distances between species. DNA alignments that differ greatly from their protein counterparts might also give very different distances. What reasons do you think exist in your data and analysis for the differences you see?)
  3. Is there anything "strange" about the phylogenies? If so, what is it and how might it have occurred?
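
As a rough illustration of the parenthetical in question 2, the sketch below computes two DNA distances for a single pair of aligned sequences: Jukes-Cantor (JC69), which counts all changes the same, and Kimura two-parameter (K2P), which treats transitions and transversions separately. The function names and the toy sequences are assumptions made for illustration only and are not part of the assignment data. MEGA, ape, and PHYLIP provide their own implementations, which should be used for the actual analysis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def site_proportions(seq1, seq2):
    """Return (P, Q): proportions of transition and transversion differences.

    Sites containing a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def jc69_distance(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3), with p = P + Q."""
    P, Q = site_proportions(seq1, seq2)
    p = P + Q
    if p >= 0.75:
        return math.inf  # sequences are saturated; the distance is undefined
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    P, Q = site_proportions(seq1, seq2)
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        return math.inf  # sequences are saturated; the distance is undefined
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# Toy pair (hypothetical): ten sites, one transition (A<->G), no transversions.
toy_1 = "AAAAAAAAAA"
toy_2 = "GAAAAAAAAA"
jc = jc69_distance(toy_1, toy_2)
k2p = k2p_distance(toy_1, toy_2)
print(f"JC69 = {jc:.4f}, K2P = {k2p:.4f}")
```

For this toy pair the only difference is a transition, so K2P gives a slightly larger distance than JC69. Small discrepancies of this kind, accumulated over all pairs of species, are one way that two models can lead neighbor joining to different trees.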