The importance of being genomic: Non-coding and coding sequences suggest different models of toxin multi-gene family evolution
Research output: Contribution to journal › Article › peer-review
In: Toxicon, Vol. 107, No. Part B, 07.09.2015, p. 344-358.
RIS
TY - JOUR
T1 - The importance of being genomic: Non-coding and coding sequences suggest different models of toxin multi-gene family evolution
AU - Malhotra, A.
AU - Creer, S.
AU - Harris, J.B.
AU - Thorpe, R.S.
PY - 2015/9/7
Y1 - 2015/9/7
N2 - Studies of multi-gene protein families, including many toxins, are crucial for understanding the role of gene duplication in generating protein diversity in general. However, many evolutionary analyses of gene families are based on coding sequences, and do not take into account many potentially confounding evolutionary factors, such as recombination and convergence due to selection. We illustrate this using snake venom gene sequences from the Phospholipase A2 (PLA2) subfamily. Novel gene sequences from 20 species of understudied Asian pitvipers were analyzed alongside available genomic PLA2 sequences from another four crotaline and several viperine species. In contrast to previous analyses of this toxin family based on cDNA sequences, we find that duplication events are concentrated at the tips of the tree, suggesting that major functions such as presynaptic neurotoxicity have evolved convergently multiple times in pitvipers. We provide evidence that this discrepancy is due to differing evolutionary patterns between introns and exons. The effects of several well-known sources of bias on the phylogeny were small, compared to the effect of analyses based on different partitions of the gene (whole gene sequence, non-coding regions, cDNA sequence). Switches of function were found to be largely associated with strong selection, and with duplication events. Use of coding sequences for phylogeny estimation potentially produces incorrect inferences about the action of selection on individual lineages and sites. Our results have major implications for phylogenomic methods of functional inference as well as for our understanding of the evolution of multigene families.
U2 - 10.1016/j.toxicon.2015.08.009
DO - 10.1016/j.toxicon.2015.08.009
M3 - Article
VL - 107
SP - 344
EP - 358
JO - Toxicon
JF - Toxicon
SN - 0041-0101
IS - Part B
ER -