The phylogeography, epidemiology and determinants of Maize streak virus dispersal across Africa and the adjacent Indian Ocean Islands
Maize streak disease (MSD), caused by variants of the Maize streak virus (MSV) A strain, is the third most important maize foliar disease worldwide and the most important in Africa. Outbreaks occur frequently and erratically across Africa and the islands of the Indian Ocean, causing devastating yield losses, so that the emergence, resurgence and rapid diffusion of MSV-A variants in this region present a serious threat to maize production, farmer livelihoods and food security. To complement current MSD management systems, a total of 689 MSV-A full genomes, 286 of which were novel, sampled over a 32-year period (1979-2011) from 20 countries across Africa and the adjacent Indian Ocean Islands, were used to: (i) estimate the levels of genetic diversity using MEGA and the Sequence Demarcation Tool v1.2 (SDT); (ii) estimate the times of occurrence and the distribution of recombination events using the Recombination Detection Program (RDP v.4) and the Genetic Algorithm for Recombination Detection (GARD); (iii) estimate selection pressure on codon positions using the PARRIS and FUBAR methods implemented on the DATAMONKEY web server; (iv) reconstruct the history of spatio-temporal diffusion of MSV-A using the discrete phylogeographic models implemented in BEAST v1.8.1; and (v) characterize source-sink dynamics and identify predictor variables driving MSV-A dispersal using the generalized linear models also implemented in BEAST v1.8.1. The isolates displayed low levels of genetic diversity (0.017 mean pairwise distance and ≥ 98% nucleotide sequence identity) and a well-structured geographical distribution in which all of the 233 novel isolates clustered together with the MSV-A1 strains. A total of 34 MSV inter-strain recombination events and 33 MSV-A intra-strain recombination events were detected, 15 of which have not been reported in previous analyses (Owor et al., 2007; Varsani et al., 2008; Monjane et al., 2011).
The majority of the intra-strain MSV-A recombination events detected were inferred to have occurred within the last six decades; the oldest and most conserved of these were events 19, 26 and 28, whereas the most recent were events 8, 16, 17, 21, 23 and 29. Intra-strain recombination events 20, 25 and 33 were widely distributed amongst East African MSV-A samples, whereas events 16, 21 and 23 occurred more frequently within West African MSV-A samples. Events 1, 4, 8, 10, 14, 17, 19, 22, 24, 25, 26, 28 and 29 were more widely distributed across East, West and Southern Africa and the adjacent Indian Ocean Islands. Codon positions 12 and 19 within motif I of the coat protein transcript, and four of the seven codon positions (147, 166, 195, 203, 242, 262, 267) in the Rep transcript (codons 195 and 203 in the Rb motif and codons 262 and 267 in site B of motif IV), evolved under strong positive selection pressure, whereas codons in the movement protein (MP) and RepA protein encoding genes evolved neutrally and under negative selection pressure, respectively. Phylogeographic analyses revealed that MSV-A first emerged in Zimbabwe around 1938 (95% HPD 1904-1956), and that its dispersal across Africa and the adjacent Indian Ocean Islands was achieved through approximately 34 migration events, 19 of which were statistically supported by Bayes factor (BF) tests. The mean nucleotide substitution rate estimated for the full-genome, recombination-free MSV-A dataset H [9.922 × 10⁻⁴ (95% HPD 8.54 × 10⁻⁴ to 1.1317 × 10⁻³) substitutions per site per year] was higher than previously reported, possibly because high nucleotide substitution rates are conserved among geminiviruses such as MSV, as previously suggested. Persistence of MSV-A was highest in source locations, led by Zimbabwe and followed by South Africa, Uganda and Kenya.
These locations were characterized by high average annual precipitation; moderately high average annual temperatures; high seasonal changes; high maize yield; high prevalence of undernourishment; low trade imports and exports; high GDP per capita; low vector control pesticide usage; high percentage forest land area; low percentage arable land; high population densities; and close proximity to sink locations. Dispersal of MSV-A was frequent between locations that received high average annual rainfall, had a high percentage of forest land area, occupied high latitudes, experienced similar climatic seasons, had high GDP per capita, had balanced maize import-to-export ratios, and were in close geographical proximity.
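
As an illustration of the diversity summary reported above (mean pairwise distance and pairwise nucleotide identity), the following is a minimal Python sketch. It assumes an uncorrected p-distance with pairwise deletion of gaps and ambiguous sites, which may differ from the distance model and settings used in MEGA and SDT for this study; the function names and the toy alignment are hypothetical and are not data from the thesis.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def mean_pairwise_distance(sequences):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def percent_identity(seq_a, seq_b):
    """Pairwise identity as a percentage (100 * (1 - p-distance))."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACG-AC"]
toy_mean = mean_pairwise_distance(toy_alignment)
```

Under these assumptions a mean pairwise distance of 0.017 corresponds to an average identity of about 98.3%, which is consistent with the ≥ 98% identities reported.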