Tajima's D and Site-specific Nucleotide Frequency in a Population During an Infectious Disease Outbreak

Omori R, Wu J
SIAM J. Appl. Math., 77(6), 2156-2171. doi: 10.1137/17M1114946

Abstract

Tajima’s D measures the difference between two estimates of genetic diversity in a given set of nucleic acid sequences. Here we show how Tajima’s D can be calculated or estimated by developing an inductive algorithm for calculating the site-specific nucleotide frequencies from a standard multistrain susceptible-infective-removed (SIR) model (both deterministic and stochastic). We show that these frequencies are fully determined by the mutation rate and the initial condition of the frequencies. We prove that the sign of Tajima’s D is independent of the disease population dynamics and that a negative sign does not imply an expansion of the infected population in the deterministic model. Using individual-based simulations, we also show that the dependence of Tajima’s D on the disease transmission and evolution dynamics is a result of the stochasticity of those dynamics. The same is true for the dependence related to genetic diversity of a pathogen.
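
To make the quantity concrete, the following is a minimal Python sketch of the standard Tajima's D statistic computed from site-specific nucleotide frequencies in a set of aligned sequences. It is not the inductive algorithm developed in the paper, which derives the frequencies from the multistrain SIR model. The function names `site_frequencies` and `tajimas_d`, the plain-string input format, and the requirement of at least four sequences are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def site_frequencies(seqs):
    """Return, for each aligned site, a dict mapping nucleotide -> frequency."""
    n = len(seqs)
    return [
        {base: count / n for base, count in Counter(column).items()}
        for column in zip(*seqs)
    ]


def tajimas_d(seqs):
    """Standard Tajima's D for equal-length aligned sequences.

    Returns None when there are no segregating sites (D is undefined).
    Assumption of this sketch: at least 4 sequences, because the variance
    term vanishes for smaller samples.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("this sketch needs at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    freqs = site_frequencies(seqs)

    # S: number of segregating sites (more than one nucleotide observed).
    S = sum(1 for f in freqs if len(f) > 1)
    if S == 0:
        return None

    # pi: mean number of pairwise differences. At each site, the chance that
    # two distinct sampled sequences differ is n/(n-1) * (1 - sum of p^2).
    pi = sum(n / (n - 1) * (1.0 - sum(p * p for p in f.values())) for f in freqs)

    # Normalising constants of the standard statistic.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # D compares the two diversity estimates: pi and S / a1.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
```

In this sketch, a sample dominated by rare variants such as `["AAAA", "TAAA", "ATAA", "AATA"]` gives a negative value, while a sample with variants at intermediate frequency such as `["AA", "AA", "TT", "TT"]` gives a positive one.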