Population structure, phylogenetic signal, recombination vs. mutations

While revising my GeneMates manuscript, I recently reviewed several concepts relating to recombination and mutation in bacterial genomes. In this post, I summarise my understanding of two groups of terms and two measures (r/m and ρ/θ) that are relevant to these biological events, and tabulate values of these measures for six bacterial species.


Relatedness between bacterial isolates

  • Population structure or population stratification: systematic genetic variation between groups of individuals due to different ancestry [1–3]. This genetic variation arises from mutations and recombination that occurred during bacterial evolutionary history [4]. Notably, recombination is not accounted for by RAxML, PhyML or FastTree [5], so trees estimated from recombining sequences can be biased [6].

  • Phylogenetic signal or phylogenetic relatedness: the tendency for phylogenetically related organisms to resemble each other, and vice versa [7]. For bacterial genomes, these two concepts are similar to population structure.

Relative impacts of recombination and point mutations on genome divergence

  • r/m: the rate of nucleotide substitutions due to recombination relative to the rate due to point mutations [8], or equivalently the ratio of the probability that a nucleotide substitution results from recombination to the probability that it results from a point mutation. It directly measures the relative impact of recombination on genetic variation [9, 10]. Symbols: r (rate of nucleotide substitutions resulting from recombination), m (rate of nucleotide substitutions resulting from point mutations). r/m can be calculated using LDhat, ClonalFrame and eBURST [8, 9].

  • ρ/θ = γ/μ: the frequency at which recombination events occur relative to point mutations [9], which can be estimated using ClonalFrame [5, 11]. Because it counts events only, this measure ignores the lengths and nucleotide content of imported DNA fragments. Symbols: ρ (recombination parameter 4Neγ), θ (mutation parameter 4Neμ), where Ne is the effective population size, γ is the recombination rate and μ is the point mutation rate, both per site per generation [4, 12]. The level of recombination can be considered moderate when ρ/θ = 0.5 [13].
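The two ratios defined above differ in whether the size and divergence of imported fragments are counted. As a minimal sketch of that link, the snippet below assumes the relationship used by ClonalFrameML, r/m = (ρ/θ) × δ × ν, where δ is the mean length of an imported fragment and ν is the mean divergence of imported DNA. This formula, the function name and the input values are assumptions added for illustration; they are not taken from the estimates discussed in this post.

```python
def r_over_m(rho_over_theta: float, mean_import_length: float,
             import_divergence: float) -> float:
    """Convert an event ratio (rho/theta) into a substitution ratio (r/m).

    Assumed relationship (as in ClonalFrameML): r/m = (rho/theta) * delta * nu
      rho_over_theta     -- recombination events per mutation event
      mean_import_length -- delta, mean length of an imported fragment (bp)
      import_divergence  -- nu, mean per-site divergence of imported DNA
    """
    if min(rho_over_theta, mean_import_length, import_divergence) < 0:
        raise ValueError("all inputs must be non-negative")
    return rho_over_theta * mean_import_length * import_divergence


# Hypothetical inputs: recombination events are ten times rarer than
# mutations, but each import spans 500 bp with 2% divergence.
example = r_over_m(rho_over_theta=0.1, mean_import_length=500.0,
                   import_divergence=0.02)
print(f"r/m = {example:.2f}")
```

Under these made-up inputs, rare recombination events still account for about as many substitutions as point mutations, which is why a low ρ/θ does not by itself imply a low r/m.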

Table: Overall or group-specific r/m and ρ/θ estimates for six bacterial species, with citations. "MLST" and "genome-wide" indicate the source of the DNA sequences from which each ratio was estimated.

| Species | r/m | ρ/θ |
| --- | --- | --- |
| Salmonella enterica | 0.2 – 2.95 (genome-wide, lineage specific) [11] | 0.37 (genome-wide) [11] |
| Neisseria meningitidis | 7.1 (MLST) [9] | 0.08 – 2.66 (MLST) [12] |
| Listeria monocytogenes | 0.66 – 4.42 (MLST) [14] | 0.13 – 0.71 (MLST) [14] |
| Escherichia coli | 0.91 – 12 (MLST, clone specific) [9] | 0.33 – 5.55 (MLST, lineage specific) [15] |
| Klebsiella pneumoniae | 0.3 (MLST) [9] | 0.42 (MLST) [16] |
| Staphylococcus aureus | 0.1 (MLST) [9], 0.83 (genome-wide) [8] | 0.49 (MLST) [13] |
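As a rough way to read the r/m column of the table above, the sketch below stores the lowest and highest reported r/m per species and flags those where the highest value exceeds 1, that is, where recombination is estimated to contribute more substitutions than point mutation in at least one lineage or dataset. The names `Estimate`, `R_OVER_M` and `recombination_can_dominate` are hypothetical, and treating the two S. aureus point estimates as a range is a simplification made here.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    species: str
    r_m_low: float   # lowest reported r/m
    r_m_high: float  # highest reported r/m (equals low for a single estimate)


# Values copied from the r/m column of the table; single estimates are
# stored with identical bounds. S. aureus combines its MLST (0.1) and
# genome-wide (0.83) estimates into one range.
R_OVER_M = [
    Estimate("Salmonella enterica", 0.2, 2.95),
    Estimate("Neisseria meningitidis", 7.1, 7.1),
    Estimate("Listeria monocytogenes", 0.66, 4.42),
    Estimate("Escherichia coli", 0.91, 12.0),
    Estimate("Klebsiella pneumoniae", 0.3, 0.3),
    Estimate("Staphylococcus aureus", 0.1, 0.83),
]


def recombination_can_dominate(estimate: Estimate, threshold: float = 1.0) -> bool:
    """True if the highest reported r/m exceeds the threshold."""
    return estimate.r_m_high > threshold


dominated = [e.species for e in R_OVER_M if recombination_can_dominate(e)]
for name in dominated:
    print(name)
```

The threshold of 1 follows directly from the definition of r/m as a ratio of substitution rates; it says nothing about the uncertainty of the individual estimates.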


In general, both r/m and ρ/θ vary extensively between species and between lineages of the same species [13]. Recombination drives genome innovation more efficiently than point mutations in E. coli [17], N. meningitidis [18], several lineages of S. enterica and L. monocytogenes [11, 14], and some E. coli clones [15]. Therefore, when the effect of recombination is prominent, a phylogenetic tree has limited ability to capture all the genetic variation that results from the whole evolutionary history.


  1. Cardon LR, Palmer LJ: Population stratification and spurious allelic association. Lancet 2003, 361:598–604.
  2. Freedman ML, Reich D, Penney KL, McDonald GJ, Mignault AA, Patterson N, Gabriel SB, Topol EJ, Smoller JW, Pato CN, Pato MT, Petryshen TL, Kolonel LN, Lander ES, Sklar P, Henderson B, Hirschhorn JN, Altshuler D: Assessing the impact of population stratification on genetic association studies. Nat Genet 2004, 36:388.
  3. Astle W, Balding D: Population Structure and Cryptic Relatedness in Genetic Association Studies. Stat Sci 2009, 24:451–471.
  4. Morrell PL, Toleno DM, Lundy KE, Clegg MT: Estimating the contribution of mutation, recombination and gene conversion in the generation of haplotypic diversity. Genetics 2006, 173:1705–1723.
  5. Didelot X, Wilson DJ: ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. PLoS Comput Biol 2015, 11:e1004041.
  6. Schierup MH, Hein J: Consequences of recombination on traditional phylogenetic analysis. Genetics 2000, 156:879–891.
  7. Blomberg SP, Garland T Jr, Ives AR: Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 2003, 57:717–745.
  8. Everitt RG, Didelot X, Batty EM, Miller RR, Knox K, Young BC, Bowden R, Auton A, Votintseva A, Larner-Svensson H, Charlesworth J, Golubchik T, Ip CLC, Godwin H, Fung R, Peto TEA, Walker AS, Crook DW, Wilson DJ: Mobile elements drive recombination hotspots in the core genome of Staphylococcus aureus. Nat Commun 2014, 5:3956.
  9. Vos M, Didelot X: A comparison of homologous recombination rates in bacteria and archaea. ISME J 2008, 3:199.
  10. Feil EJ, Holmes EC, Bessen DE, Chan M-S, Day NPJ, Enright MC, Goldstein R, Hood DW, Kalia A, Moore CE, Zhou J, Spratt BG: Recombination within natural populations of pathogenic bacteria: Short-term empirical estimates and long-term phylogenetic consequences. Proc Natl Acad Sci 2001, 98:182–187.
  11. Didelot X, Bowden R, Street T, Golubchik T, Spencer C, McVean G, Sangal V, Anjum MF, Achtman M, Falush D, Donnelly P: Recombination and Population Structure in Salmonella enterica. PLOS Genet 2011, 7:e1002191.
  12. Wilson DJ, McVean G, Jolley KA, Maiden MCJ, Kriz P: The Influence of Mutation, Recombination, Population History, and Selection on Patterns of Genetic Diversity in Neisseria meningitidis. Mol Biol Evol 2004, 22:562–569.
  13. Hanage WP, Fraser C, Spratt BG: The impact of homologous recombination on the generation of diversity in bacteria. J Theor Biol 2006, 239:210–219.
  14. den Bakker HC, Didelot X, Fortes ED, Nightingale KK, Wiedmann M: Lineage specific recombination rates and microevolution in Listeria monocytogenes. BMC Evol Biol 2008, 8:277.
  15. Chattaway MA, Jenkins C, Rajendram D, Cravioto A, Talukder KA, Dallman T, Underwood A, Platt S, Okeke IN, Wain J: Enteroaggregative Escherichia coli Have Evolved Independently as Distinct Complexes within the E. coli Population with Varying Ability to Cause Disease. PLoS One 2014, 9:e112967.
  16. Guo C, Yang X, Wu Y, Yang H, Han Y, Yang R, Hu L, Cui Y, Zhou D: MLST-based inference of genetic diversity and population structure of clinical Klebsiella pneumoniae, China. Sci Rep 2015, 5:7612.
  17. Guttman DS, Dykhuizen DE: Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science 1994, 266:1380–1383.
  18. Spratt BG, Feil EJ, Achtman M, Maiden MC: The relative contributions of recombination and mutation to the divergence of clones of Neisseria meningitidis. Mol Biol Evol 1999, 16:1496–1502.