Fiveable

👨‍👩‍👦‍👦General Genetics Unit 12 Review

12.3 Genome Evolution and Speciation

Written by the Fiveable Content Team • Last updated August 2025

Genome evolution shapes the genetic makeup of organisms over time. Mutations, duplications, and rearrangements drive changes in DNA sequences and structure, while natural selection and genetic drift determine how those changes spread through populations. Together, these forces lead to adaptation and, eventually, speciation.

Comparative genomics helps unravel this evolutionary history by examining similarities and differences between species' genomes. By analyzing orthologous genes, synteny, and phylogenetic relationships, scientists reconstruct evolutionary trees and estimate when lineages diverged. This unit ties together the molecular mechanisms of genome change with the larger-scale process of how new species arise.

Genome Evolution

Mechanisms of genome evolution

Three broad categories of change drive genome evolution: mutations, duplications, and rearrangements. Each operates at a different scale, from single nucleotides to entire genomes.

Mutations introduce changes in DNA sequence.

  • Point mutations alter single nucleotides:
    • Substitutions replace one nucleotide with another. Transitions swap a purine for a purine (A↔G) or a pyrimidine for a pyrimidine (C↔T). Transversions swap a purine for a pyrimidine or vice versa. Transitions are observed more often than transversions, even though twice as many transversions are possible.
    • Insertions and deletions (indels) add or remove nucleotides. When the number of inserted or deleted bases isn't a multiple of three, they cause frameshift mutations that alter the entire downstream reading frame.
  • Chromosomal mutations rearrange larger segments of DNA:
    • Inversions reverse the orientation of a segment. A paracentric inversion doesn't include the centromere; a pericentric inversion does.
    • Translocations move segments between non-homologous chromosomes. Reciprocal translocations exchange segments between two chromosomes. Robertsonian translocations fuse two acrocentric chromosomes at their centromeres, reducing chromosome number by one.
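
The point-mutation rules above can be expressed as a short sketch. The function names (classify_substitution, is_frameshift) are illustrative assumptions, not part of any standard library, and the frameshift check assumes the indel falls inside a protein-coding region.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("expected two different DNA bases (A, C, G, T)")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) = transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def is_frameshift(indel_length):
    """An indel in a coding sequence shifts the reading frame unless its
    length is a multiple of three."""
    if indel_length <= 0:
        raise ValueError("indel length must be positive")
    return indel_length % 3 != 0


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "C"))  # transversion
print(is_frameshift(2), is_frameshift(3))  # True False
```

A 3-base indel adds or removes exactly one codon, so the downstream frame is preserved; any other length shifts every codon after the change.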

Duplications increase the copy number of DNA sequences.

  • Gene duplications create extra copies of individual genes. Tandem duplications place the new copy adjacent to the original (in head-to-tail, head-to-head, or tail-to-tail orientation). These are a major source of raw material for evolving new gene functions.
  • Whole-genome duplications (polyploidy) duplicate the entire genome. Autopolyploidy results from duplication within one species; allopolyploidy results from hybridization between two species followed by genome doubling. Polyploidy has been especially important in plant evolution.
  • Segmental duplications copy large DNA regions (>1 kb) to new chromosomal locations. These can predispose regions to further rearrangement through unequal crossing over.

Rearrangements alter genome structure and organization.

  • Transposable elements (TEs) are mobile DNA sequences that move within genomes. They make up a huge fraction of many eukaryotic genomes (about 45% of the human genome).
    • DNA transposons move via a cut-and-paste mechanism using transposase. They can be autonomous (encode their own transposase) or non-autonomous (depend on another element's transposase).
    • Retrotransposons move via a copy-and-paste mechanism through an RNA intermediate. LTR retrotransposons have long terminal repeats flanking the element. Non-LTR retrotransposons include LINEs (long interspersed nuclear elements, which are autonomous) and SINEs (short interspersed nuclear elements, which are non-autonomous and depend on LINE-encoded machinery).
  • Recombination events exchange DNA between chromosomes:
    • Homologous recombination occurs between similar sequences, most notably during crossing over in meiosis. It shuffles alleles but generally preserves genome structure.
    • Non-homologous end joining (NHEJ) repairs double-strand breaks by ligating the broken ends directly, with little or no sequence homology between them. Because it doesn't use a homologous template, NHEJ is error-prone and can introduce small deletions or insertions at the repair site.
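
To make the difference between cut-and-paste and copy-and-paste transposition concrete, here is a toy sketch. It assumes a genome can be modeled as a simple list of labels, and the function names are hypothetical. It ignores real details such as target-site duplications and the enzymes involved; it only shows the effect on copy number.

```python
def cut_and_paste(genome, src, dst):
    """DNA-transposon style: excise the element at index src and reinsert it
    at index dst (dst is counted after the excision)."""
    new_genome = list(genome)
    element = new_genome.pop(src)
    new_genome.insert(dst, element)
    return new_genome


def copy_and_paste(genome, src, dst):
    """Retrotransposon style: the original stays in place and a new copy
    is inserted at index dst."""
    new_genome = list(genome)
    new_genome.insert(dst, genome[src])
    return new_genome


genome = ["geneA", "TE", "geneB", "geneC"]
moved = cut_and_paste(genome, src=1, dst=3)
copied = copy_and_paste(genome, src=1, dst=4)

print(moved, moved.count("TE"))    # the element moved; still one copy
print(copied, copied.count("TE"))  # the element was duplicated; two copies
```

In this simplified picture, copy-and-paste elements increase in number with every transposition event, which is consistent with retrotransposons making up such a large share of many eukaryotic genomes.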

Natural selection and genetic drift

Once mutations arise, two main forces determine their fate in a population: natural selection (non-random) and genetic drift (random).

Natural selection favors or disfavors genotypes based on their effect on fitness.

  • Positive selection increases the frequency of advantageous alleles. A classic example is the evolution of antibiotic resistance in bacteria, where exposure to a drug strongly favors resistant mutants. A selective sweep occurs when a beneficial allele rises rapidly in frequency, dragging nearby linked variants along with it. Lactase persistence in human populations with a history of dairy farming is a well-studied selective sweep.
  • Negative (purifying) selection removes deleterious alleles. Most new mutations that affect protein function are harmful and are gradually eliminated. This is why essential genes tend to be highly conserved across species.
  • Balancing selection maintains multiple alleles in a population. Heterozygote advantage is one mechanism: individuals heterozygous for the sickle cell allele (HbAS) have increased resistance to malaria, so both the normal and sickle alleles persist in malaria-endemic regions. Frequency-dependent selection is another mechanism, where rare alleles have a fitness advantage precisely because they are rare (e.g., self-incompatibility alleles in plants, which prevent inbreeding).
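
Heterozygote advantage can be illustrated with the standard one-locus selection recursion. This is a minimal sketch: the fitness values below are hypothetical numbers chosen for illustration, not measured values for the sickle cell locus, and the model assumes random mating in an infinitely large population (no drift).

```python
def next_allele_freq(q, w_AA, w_AS, w_SS):
    """One generation of selection at a single locus with random mating.
    q is the frequency of the S allele; p = 1 - q is the frequency of A."""
    p = 1.0 - q
    mean_fitness = p * p * w_AA + 2 * p * q * w_AS + q * q * w_SS
    return (q * q * w_SS + p * q * w_AS) / mean_fitness


def iterate(q0, w_AA, w_AS, w_SS, generations):
    """Apply the recursion repeatedly and return the final S-allele frequency."""
    q = q0
    for _ in range(generations):
        q = next_allele_freq(q, w_AA, w_AS, w_SS)
    return q


# Hypothetical fitnesses: heterozygotes are fittest.
w_AA, w_AS, w_SS = 0.9, 1.0, 0.2
s = 1.0 - w_AA  # selection against AA
t = 1.0 - w_SS  # selection against SS
predicted = s / (s + t)  # textbook equilibrium frequency of S

print(round(predicted, 4))
print(round(iterate(0.01, w_AA, w_AS, w_SS, 500), 4))
print(round(iterate(0.60, w_AA, w_AS, w_SS, 500), 4))
```

Whether the S allele starts rare or common, the recursion converges to the same intermediate frequency s / (s + t), which is why both alleles persist under heterozygote advantage.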

Genetic drift causes random fluctuations in allele frequencies, especially in small populations.

  • The founder effect occurs when a small group colonizes a new area, carrying only a subset of the original population's genetic diversity. This can make rare alleles common in the new population.
  • The bottleneck effect occurs when a population crashes temporarily, randomly eliminating alleles regardless of their fitness value. Cheetahs, for example, show extremely low genetic diversity likely due to a historical bottleneck.
  • Effective population size ($N_e$) determines how strongly drift operates. Smaller $N_e$ means stronger drift. $N_e$ is often much smaller than the census population size because of unequal sex ratios, variation in reproductive success, and fluctuating population size.
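
The sketch below shows two standard textbook approximations for effective population size (unequal sex ratio, and the harmonic mean across generations of fluctuating size) together with a simple Wright-Fisher drift simulation. Function names, the seed, and all example numbers are assumptions for illustration only.

```python
import random


def ne_unequal_sex_ratio(n_males, n_females):
    """Effective size with unequal numbers of breeding males and females:
    Ne = 4 * Nm * Nf / (Nm + Nf)."""
    return 4 * n_males * n_females / (n_males + n_females)


def ne_fluctuating(sizes):
    """Effective size over several generations is approximately the harmonic
    mean of the per-generation sizes, so bottlenecks dominate."""
    return len(sizes) / sum(1 / n for n in sizes)


def wright_fisher(p0, n_e, generations, seed=0):
    """Simulate drift at one neutral locus: each generation, 2 * n_e allele
    copies are sampled at random from the previous generation."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(2 * n_e))
        p = copies / (2 * n_e)
        trajectory.append(p)
    return trajectory


print(ne_unequal_sex_ratio(10, 90))      # 100 breeders, but Ne = 36.0
print(ne_fluctuating([1000, 10, 1000]))  # pulled down toward the bottleneck size
print(wright_fisher(0.5, n_e=10, generations=20)[-1])
print(wright_fisher(0.5, n_e=1000, generations=20)[-1])
```

Running the simulation with different seeds shows the pattern described above: with a small effective size, allele frequencies wander widely and often reach 0 or 1 (loss or fixation), while with a large one they stay close to the starting value over the same number of generations.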

Speciation and Comparative Genomics

Genome evolution and speciation

Speciation occurs when populations become reproductively isolated and accumulate enough genetic differences that they can no longer interbreed. Two categories of barriers drive this process.

Prezygotic barriers prevent hybrid zygotes from forming in the first place:

  • Ecological isolation: populations occupy different habitats or niches (e.g., insect species that specialize on different host plants)
  • Behavioral isolation: populations differ in mating signals or preferences (e.g., distinct courtship songs in bird species)
  • Temporal isolation: populations breed at different times (e.g., plant species with non-overlapping flowering seasons)

Postzygotic barriers reduce the fitness of hybrids that do form:

  • Hybrid inviability: hybrids fail to develop normally or have reduced survival (e.g., certain Drosophila species crosses)
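  • Hybrid sterility: hybrids survive but cannot reproduce. The mule (horse × donkey) is the textbook example: it's viable but sterile because horses (2n = 64) and donkeys (2n = 62) contribute chromosome sets that differ in number and structure, so the mule's 63 chromosomes can't pair properly during meiosis.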

Speciation modes describe the geographic context in which divergence occurs.

  • Allopatric speciation happens when a physical barrier (mountain range, river, ocean) splits a population. Gene flow stops, and the isolated populations diverge independently through mutation, drift, and selection. Darwin's finches on the Galápagos Islands are a classic example: different islands provided the geographic separation that allowed distinct species to evolve.
  • Sympatric speciation happens within a single geographic area, without physical separation. Polyploidy can create instant reproductive isolation because a newly polyploid individual can't produce fertile offspring with diploid members of the parent species. This has been documented in wheat, cotton, and many other plant lineages. Ecological speciation occurs when populations adapt to different niches in the same area. The apple maggot fly (Rhagoletis pomonella) is a well-known case: some populations shifted from hawthorn fruits to apples, and the two host races now show partial reproductive isolation due to different fruiting times.

Genomic divergence accumulates as isolation continues.

  • DNA sequence differences build up through mutation, drift, and divergent selection.
  • Dobzhansky-Muller incompatibilities arise when genes that interact with each other evolve independently in separate populations. Each change may be fine in its own genetic background, but when the two populations hybridize, the novel combination of alleles causes problems (reduced fitness or sterility in hybrids).
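  • Reinforcement occurs when natural selection strengthens prezygotic barriers in zones where two partially isolated populations come back into contact. If hybrids are less fit, selection favors individuals that preferentially mate with their own type, so mating traits end up differing more where the populations overlap than where they live apart (reproductive character displacement). Beak-size divergence in Galápagos finches is often cited in this context, although it is usually interpreted as ecological character displacement driven by competition for food rather than as reinforcement.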
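
The logic of a Dobzhansky-Muller incompatibility can be sketched with a two-locus toy model. The assumptions here are deliberately simple and hypothetical: the ancestor is aabb, one population fixes a derived allele A, the other fixes a derived allele B, and a fitness cost appears only when A and B occur in the same individual. The cost of 0.5 is an arbitrary illustrative value.

```python
def fitness(genotype, incompatibility_cost=0.5):
    """Fitness under a toy two-locus Dobzhansky-Muller model.
    genotype is a string such as 'AAbb'; uppercase letters are derived alleles.
    The derived alleles A and B are harmless alone but incompatible together."""
    has_derived_A = "A" in genotype
    has_derived_B = "B" in genotype
    if has_derived_A and has_derived_B:
        return 1.0 - incompatibility_cost
    return 1.0


for label, genotype in [
    ("ancestor", "aabb"),
    ("population 1", "AAbb"),
    ("population 2", "aaBB"),
    ("F1 hybrid", "AaBb"),
]:
    print(label, genotype, fitness(genotype))
```

Neither population ever passes through a low-fitness genotype on its own, because A and B never meet until hybridization. That is how an incompatibility can evolve without selection opposing any single step.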

Comparative genomics for evolutionary history

Comparative genomics provides the tools to trace how genomes have changed since species diverged. Several key concepts underpin this work.

Orthologs vs. paralogs: distinguishing these is fundamental to comparative genomics.

  • Orthologs are genes in different species that descended from a single gene in their last common ancestor, separated by a speciation event. Orthologs typically retain similar functions. For example, the β-globin genes of humans and mice are orthologs. Comparing orthologs is how you reconstruct phylogenetic trees and identify conserved (functionally important) regions.
  • Paralogs are genes within the same genome that arose from a duplication event. After duplication, paralogs are free to diverge. One copy may keep the original function while the other evolves a new one (neofunctionalization), or both copies may split the original function between them (subfunctionalization). The globin gene family (with its α-globin and β-globin clusters) is a classic example of paralog diversification.

To keep these straight: orthologs = separated by speciation; paralogs = separated by duplication.
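
That rule can be applied mechanically to a gene tree whose internal nodes are labeled with the event that created them. The sketch below assumes a hypothetical nested-tuple representation, (event, left, right), and a heavily simplified globin tree with made-up leaf names; real gene trees contain many more duplications and losses.

```python
def leaves(node):
    """Return the set of gene names at or below a node."""
    if isinstance(node, str):
        return {node}
    _event, left, right = node
    return leaves(left) | leaves(right)


def relationship(tree, gene_a, gene_b):
    """Classify two genes by the event at their most recent common ancestor:
    speciation -> orthologs, duplication -> paralogs."""
    if gene_a == gene_b:
        raise ValueError("pick two different genes")
    node = tree
    while not isinstance(node, str):
        event, left, right = node
        in_left = {gene_a, gene_b} & leaves(left)
        in_right = {gene_a, gene_b} & leaves(right)
        if len(in_left) == 2:
            node = left      # both genes are deeper in the left subtree
        elif len(in_right) == 2:
            node = right     # both genes are deeper in the right subtree
        elif in_left and in_right:
            return "orthologs" if event == "speciation" else "paralogs"
        else:
            break
    raise ValueError("gene not found in tree")


# Simplified, illustrative tree: an ancient duplication produced the alpha and
# beta lineages, and each lineage later split at the human-mouse speciation.
GLOBIN_TREE = (
    "duplication",
    ("speciation", "human_alpha", "mouse_alpha"),
    ("speciation", "human_beta", "mouse_beta"),
)

print(relationship(GLOBIN_TREE, "human_beta", "mouse_beta"))   # orthologs
print(relationship(GLOBIN_TREE, "human_alpha", "human_beta"))  # paralogs
print(relationship(GLOBIN_TREE, "human_alpha", "mouse_beta"))  # paralogs
```

The last case is worth noting: genes in different species can still be paralogs if the duplication that separates them happened before the species split.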

Synteny refers to the conservation of gene order and orientation between species.

  • Syntenic regions were inherited from a common ancestor and have remained intact. Comparing human and mouse genomes, for instance, reveals large blocks of synteny despite ~80 million years of divergence.
  • Synteny breakpoints mark where chromosomal rearrangements (inversions, translocations) disrupted the ancestral gene order. Mapping these breakpoints across species can reveal which rearrangements occurred on which lineages, and some breakpoints have been associated with speciation events (well-studied in Drosophila species).
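
One simple way to locate synteny breakpoints is to compare gene adjacencies between two genomes. The sketch below assumes each chromosome is written as a list of signed integers (the number identifies the gene, the sign its orientation); the gene orders are invented for illustration and the function name is hypothetical.

```python
def count_breakpoints(reference, other):
    """Count adjacencies in `other` that are not conserved in `reference`.
    An adjacency (a, b) is also conserved if it appears reversed and
    flipped as (-b, -a), since that is the same neighborhood read from
    the opposite strand."""
    conserved = set()
    for a, b in zip(reference, reference[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))
    return sum((a, b) not in conserved for a, b in zip(other, other[1:]))


ancestral = [1, 2, 3, 4, 5, 6]
inverted = [1, 2, -4, -3, 5, 6]  # genes 3-4 inverted as a block

print(count_breakpoints(ancestral, ancestral))  # 0
print(count_breakpoints(ancestral, inverted))   # 2
```

A single inversion creates two breakpoints, one at each end of the inverted segment, while the adjacency inside the segment (3 next to 4) is still counted as conserved.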

Phylogenetic analysis reconstructs evolutionary relationships from genetic (or morphological) data. Three major approaches are used:

  1. Maximum parsimony finds the tree requiring the fewest evolutionary changes. It works well when evolution is slow relative to speciation, but can be misled by convergent evolution (homoplasy).
  2. Maximum likelihood finds the tree that maximizes the probability of the observed data under a specific model of sequence evolution. It incorporates branch lengths and handles rate variation better than parsimony.
  3. Bayesian inference calculates the posterior probability of each possible tree given the data and prior assumptions. It naturally incorporates uncertainty and produces probability values for each branch, making it popular for modern phylogenomic studies.
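
To show what "fewest evolutionary changes" means in practice, here is a minimal sketch of Fitch's small-parsimony algorithm, which scores one alignment column on a fixed tree. Trees are written as nested pairs of taxon names, and the taxa and character states are invented for illustration. A full maximum parsimony search would also need to explore many candidate trees and sum scores over all sites.

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum number of changes) for one
    site on a rooted binary tree given as nested (left, right) tuples."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one substitution must have happened on a child branch.
    return left_set | right_set, left_cost + right_cost + 1


site = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(fitch(tree_1, site)[1])  # 1 change
print(fitch(tree_2, site)[1])  # 2 changes
```

For this site, parsimony prefers tree_1 because it explains the data with a single substitution. If the G in taxa A and B had actually arisen twice independently (homoplasy), that preference would be misleading, which is the weakness noted above.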

Molecular clocks estimate when evolutionary events occurred based on the rate at which sequences change.

  • You calibrate a molecular clock by anchoring it to events with known dates, such as fossils or well-dated geological events. For example, the radiation of modern mammalian orders has been timed using the K-Pg (Cretaceous-Paleogene) mass extinction at ~66 million years ago as a calibration point.
  • The basic assumption is that mutations accumulate at a roughly constant rate over time. In reality, rates vary across lineages and genes, so relaxed clock models have been developed to allow rate variation. Always consider whether the constant-rate assumption holds for your dataset.
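
Under the strict (constant-rate) clock assumption, calibration and dating reduce to simple arithmetic. In the sketch below, the genetic distances are hypothetical values picked to give round numbers, and only the ~66 million year calibration date comes from the text above. The factor of 2 appears because substitutions accumulate along both lineages after a split.

```python
def calibrate_rate(distance, split_time_my):
    """Substitutions per site per million years on a single lineage, from a
    pair of species whose divergence time is known."""
    return distance / (2 * split_time_my)


def divergence_time(distance, rate):
    """Estimated time since two lineages split, in millions of years,
    assuming the same constant rate applies to them."""
    return distance / (2 * rate)


# Hypothetical calibration pair: 0.132 substitutions per site between two
# lineages assumed to have split 66 million years ago.
rate = calibrate_rate(distance=0.132, split_time_my=66)

# Hypothetical query pair with 0.05 substitutions per site.
estimate = divergence_time(distance=0.05, rate=rate)

print(round(rate, 6))      # per-site rate per million years
print(round(estimate, 2))  # estimated divergence time in million years
```

With these numbers the rate works out to 0.001 substitutions per site per million years and the query pair to about 25 million years. If the query lineages actually evolve faster or slower than the calibration lineages, this estimate is biased, which is the motivation for relaxed clock models.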