Standard

Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models. / Mynard, Poppy; Algar, Adam; Lancaster, Lesley T. et al.
In: Systematic Biology, Vol. 72, No. 1, 19.05.2023, p. 106-119.

Research output: Contribution to journal › Article › peer-review

Harvard

Mynard, P, Algar, A, Lancaster, LT, Bocedi, G, Fahri, F, Gubry-Rangin, C, Lupiyaningdyah, P, Nangoy, M, Osborne, O, Papadopulos, AST, Sudiana, IM, Juliandi, B, Travis, J & Herrera-Alsina, L 2023, 'Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models', Systematic Biology, vol. 72, no. 1, pp. 106-119. https://doi.org/10.1093/sysbio/syad001

APA

Mynard, P., Algar, A., Lancaster, L. T., Bocedi, G., Fahri, F., Gubry-Rangin, C., Lupiyaningdyah, P., Nangoy, M., Osborne, O., Papadopulos, A. S. T., Sudiana, I. M., Juliandi, B., Travis, J., & Herrera-Alsina, L. (2023). Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models. Systematic Biology, 72(1), 106-119. https://doi.org/10.1093/sysbio/syad001

CBE

Mynard P, Algar A, Lancaster LT, Bocedi G, Fahri F, Gubry-Rangin C, Lupiyaningdyah P, Nangoy M, Osborne O, Papadopulos AST, et al. 2023. Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models. Systematic Biology. 72(1):106-119. https://doi.org/10.1093/sysbio/syad001

MLA

Vancouver

Mynard P, Algar A, Lancaster LT, Bocedi G, Fahri F, Gubry-Rangin C et al. Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models. Systematic Biology. 2023 May 19;72(1):106-119. Epub 2023 Jan 16. doi: 10.1093/sysbio/syad001

Author

Mynard, Poppy ; Algar, Adam ; Lancaster, Lesley T. et al. / Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models. In: Systematic Biology. 2023 ; Vol. 72, No. 1. pp. 106-119.

RIS

TY - JOUR

T1 - Impact of Phylogenetic Tree Completeness and Misspecification of Sampling Fractions on Trait Dependent Diversification Models

AU - Mynard, Poppy

AU - Algar, Adam

AU - Lancaster, Lesley T.

AU - Bocedi, Greta

AU - Fahri, Fahri

AU - Gubry-Rangin, Cecile

AU - Lupiyaningdyah, Pungki

AU - Nangoy, Meis

AU - Osborne, Owen

AU - Papadopulos, Alexander S. T.

AU - Sudiana, I Made

AU - Juliandi, Berry

AU - Travis, Justin

AU - Herrera-Alsina, Leonel

N1 - © The Author(s) 2023. Published by Oxford University Press on behalf of the Society of Systematic Biologists.

PY - 2023/5/19

Y1 - 2023/5/19

N2 - Understanding the origins of diversity and the factors that drive some clades to be more diverse than others are important issues in evolutionary biology. Sophisticated SSE (state-dependent speciation and extinction) models provide insights into the association between diversification rates and the evolution of a trait. The empirical data used in SSE models and other methods is normally imperfect, yet little is known about how this can affect these models. Here, we evaluate the impact of common phylogenetic issues on inferences drawn from SSE models. Using simulated phylogenetic trees and trait information, we fitted SSE models to determine the effects of sampling fraction (phylogenetic tree completeness) and sampling fraction mis-specification on model selection and parameter estimation (speciation, extinction, and transition rates) under two sampling regimes (random and taxonomically biased). As expected, we found that both model selection and parameter estimate accuracies are reduced at lower sampling fractions (i.e., low tree completeness). Furthermore, when sampling of the tree is imbalanced across sub-clades and tree completeness is ≤ 60%, rates of false positives increase and parameter estimates are less accurate, compared to when sampling is random. Thus, when applying SSE methods to empirical datasets, there are increased risks of false inferences of trait dependent diversification when some sub-clades are heavily under-sampled. Mis-specifying the sampling fraction severely affected the accuracy of parameter estimates: parameter values were over-estimated when the sampling fraction was specified as lower than its true value, and under-estimated when the sampling fraction was specified as higher than its true value. Our results suggest that it is better to cautiously under-estimate sampling efforts, as false positives increased when the sampling fraction was over-estimated. We encourage SSE studies where the sampling fraction can be reasonably estimated and provide recommended best practices for SSE modeling. [Trait dependent diversification; SSE models; phylogenetic tree completeness; sampling fraction.].

AB - Understanding the origins of diversity and the factors that drive some clades to be more diverse than others are important issues in evolutionary biology. Sophisticated SSE (state-dependent speciation and extinction) models provide insights into the association between diversification rates and the evolution of a trait. The empirical data used in SSE models and other methods is normally imperfect, yet little is known about how this can affect these models. Here, we evaluate the impact of common phylogenetic issues on inferences drawn from SSE models. Using simulated phylogenetic trees and trait information, we fitted SSE models to determine the effects of sampling fraction (phylogenetic tree completeness) and sampling fraction mis-specification on model selection and parameter estimation (speciation, extinction, and transition rates) under two sampling regimes (random and taxonomically biased). As expected, we found that both model selection and parameter estimate accuracies are reduced at lower sampling fractions (i.e., low tree completeness). Furthermore, when sampling of the tree is imbalanced across sub-clades and tree completeness is ≤ 60%, rates of false positives increase and parameter estimates are less accurate, compared to when sampling is random. Thus, when applying SSE methods to empirical datasets, there are increased risks of false inferences of trait dependent diversification when some sub-clades are heavily under-sampled. Mis-specifying the sampling fraction severely affected the accuracy of parameter estimates: parameter values were over-estimated when the sampling fraction was specified as lower than its true value, and under-estimated when the sampling fraction was specified as higher than its true value. Our results suggest that it is better to cautiously under-estimate sampling efforts, as false positives increased when the sampling fraction was over-estimated. We encourage SSE studies where the sampling fraction can be reasonably estimated and provide recommended best practices for SSE modeling. [Trait dependent diversification; SSE models; phylogenetic tree completeness; sampling fraction.].

KW - Genetic Speciation

KW - Phenotype

KW - Phylogeny

U2 - 10.1093/sysbio/syad001

DO - 10.1093/sysbio/syad001

M3 - Article

C2 - 36645380

VL - 72

SP - 106

EP - 119

JO - Systematic Biology

JF - Systematic Biology

SN - 1063-5157

IS - 1

ER -