🧬 Molecular Biology Unit 10 – Molecular Evolution and Phylogenetics

Molecular evolution and phylogenetics explore how DNA, RNA, and proteins change over time. These fields study the forces that shape genetic variation, such as mutation and natural selection, and use tools such as DNA sequencing to uncover evolutionary relationships. Researchers build phylogenetic trees, which represent how species are related, with methods ranging from distance-based approaches to model-based statistical inference. These techniques help scientists reconstruct evolutionary history, track emerging diseases, and engineer proteins for industrial use.

Key Concepts in Molecular Evolution

  • Molecular evolution studies evolutionary processes and patterns at the molecular level, including changes in DNA, RNA, and proteins over time
  • Neutral theory of molecular evolution proposes that most evolutionary changes at the molecular level are fixed by genetic drift acting on selectively neutral mutant alleles (alleles that do not affect fitness), rather than by natural selection
  • Molecular clock hypothesis suggests that the rate of molecular evolution is approximately constant over time, allowing for estimating divergence times between species
    • Molecular clocks can be used to date phylogenetic trees and infer evolutionary timescales (e.g., estimating the divergence time between humans and chimpanzees)
  • Evolutionary forces shaping molecular evolution include mutation, natural selection, genetic drift, and gene flow
  • Positive selection occurs when a mutation confers a fitness advantage and increases in frequency within a population (e.g., antibiotic resistance in bacteria)
  • Purifying selection removes deleterious mutations from a population, maintaining the functionality of genes and proteins
  • Molecular markers, such as single nucleotide polymorphisms (SNPs) and microsatellites, are used to study genetic variation and evolutionary relationships
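
The molecular clock idea above can be illustrated with a small calculation. This is a minimal sketch that assumes a strict clock (one constant rate on both lineages); the function name and the numbers are hypothetical and chosen only to show the arithmetic, not taken from any dataset. If each lineage accumulates substitutions at rate r per site per year, two sequences that split t years ago differ by about d = 2rt substitutions per site, so t = d / (2r).

```python
def divergence_time(distance: float, rate: float) -> float:
    """Estimate divergence time in years under a strict molecular clock.

    distance: substitutions per site separating the two sequences
    rate: substitutions per site per year along a single lineage
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate changes after the split, hence the factor of 2.
    return distance / (2.0 * rate)


# Hypothetical inputs, used only to illustrate the calculation.
example_years = divergence_time(distance=0.012, rate=1e-9)
print(f"Estimated divergence: {example_years:.2e} years ago")
```

Real analyses usually calibrate the rate with fossils or dated samples and often relax the constant-rate assumption, so this should be read as the simplest possible case.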

Mechanisms of Molecular Change

  • Mutations are changes in the DNA sequence and serve as the primary source of genetic variation
    • Point mutations involve single nucleotide changes and can be classified as transitions (purine to purine or pyrimidine to pyrimidine) or transversions (purine to pyrimidine or vice versa)
    • Insertions and deletions (indels) add or remove nucleotides from the DNA sequence
  • Recombination shuffles genetic material between homologous chromosomes during meiosis, creating new combinations of alleles
  • Gene duplication events create additional copies of genes, allowing for the evolution of new functions or specialization of existing functions (e.g., globin gene family in vertebrates)
  • Horizontal gene transfer involves the transfer of genetic material between organisms, often in prokaryotes, leading to the acquisition of new traits (e.g., antibiotic resistance genes in bacteria)
  • Transposable elements (transposons) are mobile genetic elements that can move within genomes, contributing to genomic rearrangements and gene regulation
  • Epigenetic modifications, such as DNA methylation and histone modifications, can influence gene expression without changing the underlying DNA sequence
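
The transition/transversion distinction described above can be sketched in a few lines. The helper names below are hypothetical, and the code assumes two DNA sequences that are already aligned to the same length, with `-` marking gaps.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-nucleotide difference as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"unexpected nucleotide pair: {ref!r}, {alt!r}")
    if ref == alt:
        return "identical"
    # A transition stays within one chemical class (A<->G or C<->T).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_ts_tv(seq_a: str, seq_b: str) -> dict:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # gap columns reflect indels, not point substitutions
        kind = classify_substitution(a, b)
        if kind in counts:
            counts[kind] += 1
    return counts


print(count_ts_tv("ACGT", "GCTT"))
```

Counts like these feed directly into substitution models that treat the two classes of change differently.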

DNA Sequencing and Comparative Genomics

  • DNA sequencing technologies, such as Sanger sequencing and next-generation sequencing (NGS), enable the determination of nucleotide sequences
    • Sanger sequencing is a traditional method that uses dideoxy chain termination and is suited to sequencing individual DNA fragments of up to roughly 1,000 bases
    • NGS platforms such as Illumina allow high-throughput, parallel sequencing of millions of short DNA fragments simultaneously, while long-read platforms (PacBio, Oxford Nanopore) read much longer single molecules
  • Whole genome sequencing provides the complete DNA sequence of an organism's genome, facilitating comparative genomic analyses
  • Comparative genomics involves comparing genomes across different species to identify conserved regions, gene families, and evolutionary relationships
    • Orthologous genes are homologous genes that diverged due to speciation and often retain similar functions across species
    • Paralogous genes arise from gene duplication events within a species and may evolve new or specialized functions
  • Sequence alignment tools are used to identify similarities and differences between DNA or protein sequences: BLAST searches databases for local matches to a query sequence, while MUSCLE and MAFFT build multiple sequence alignments
  • Synteny analysis examines the conservation of gene order and orientation across genomes, providing insights into genome evolution and rearrangements
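
To make the idea of sequence alignment concrete, here is a minimal sketch of global pairwise alignment scoring by dynamic programming (the textbook Needleman-Wunsch recurrence). It is not how BLAST, MUSCLE, or MAFFT work internally; those tools use faster heuristics. The function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Return the optimal global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # against the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr[j] = max(diagonal, up, left)
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("ACGT", "AGT"))
```

Only the score is returned here to keep the sketch short; recovering the alignment itself requires storing the full table and tracing back from the final cell.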

Phylogenetic Tree Construction

  • Phylogenetic trees represent evolutionary relationships among organisms or genes, with branches indicating divergence from a common ancestor
  • Sequence alignment is the first step in phylogenetic analysis, where homologous sequences are arranged to identify conserved and variable regions
  • Distance-based methods (UPGMA, neighbor-joining) construct phylogenetic trees based on pairwise genetic distances between sequences
    • UPGMA (Unweighted Pair Group Method with Arithmetic Mean) assumes a constant rate of evolution and produces rooted trees
    • Neighbor-joining is a bottom-up clustering algorithm that does not assume a constant rate of evolution and produces unrooted trees
  • Character-based methods (maximum parsimony, maximum likelihood) use discrete character states (nucleotides or amino acids) to infer phylogenetic relationships
    • Maximum parsimony seeks the tree that requires the fewest evolutionary changes to explain the observed character states
    • Maximum likelihood computes the probability of the observed data given a tree and a specific evolutionary model (the likelihood) and selects the tree that maximizes it
  • Bayesian inference combines prior knowledge with the likelihood and approximates the posterior probability of phylogenetic trees using Markov chain Monte Carlo (MCMC) algorithms
  • Bootstrapping is a statistical technique used to assess the reliability of phylogenetic tree branches by resampling alignment columns with replacement and re-inferring the tree for each replicate
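
The distance-based approach can be illustrated with a compact UPGMA sketch. The function name, the nested-tuple tree representation, and the distance values are all hypothetical choices for this example. At each step the two closest clusters are merged, the new node is placed at half the distance between them, and distances to the merged cluster are size-weighted averages.

```python
from itertools import combinations


def upgma(names, distances):
    """Build a rooted tree from pairwise distances using UPGMA.

    names: list of taxon labels
    distances: dict mapping (label_a, label_b) pairs to distances
    Returns (tree, heights): the tree as nested tuples and a dict giving
    the height of each internal node.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in distances.items()}
    heights = {}
    while len(sizes) > 1:
        # Pick the pair of current clusters with the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        heights[merged] = d[frozenset((a, b))] / 2.0
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (root,) = sizes
    return root, heights


# Hypothetical distance matrix for four taxa.
example_distances = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
example_tree, example_heights = upgma(["A", "B", "C", "D"], example_distances)
print(example_tree)
```

Because UPGMA assumes a constant rate, it recovers the right tree only when the distances are close to clock-like; neighbor-joining drops that assumption at the cost of producing an unrooted tree.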

Evolutionary Models and Algorithms

  • Evolutionary models describe the process of nucleotide or amino acid substitution over time and are used in phylogenetic inference
    • Jukes-Cantor (JC) model assumes equal rates of substitution between all nucleotides and is the simplest substitution model
    • Kimura two-parameter (K2P) model distinguishes between transition and transversion rates
    • General time-reversible (GTR) model allows for different rates of substitution between all pairs of nucleotides
  • Amino acid substitution matrices (PAM, BLOSUM) score how likely one amino acid is to replace another, based on empirical data from protein alignments
  • Markov models are probabilistic models that assume the future state depends only on the current state and not on the past states
    • Hidden Markov models (HMMs) are used to identify sequence motifs, protein domains, and gene structure in DNA or protein sequences
  • Heuristic algorithms (hill-climbing, simulated annealing) are used to search for optimal tree topologies when the number of possible trees is large
  • Genetic algorithms mimic the process of natural selection to optimize phylogenetic tree inference by evolving a population of candidate trees
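
The substitution models above are often used to correct observed differences for multiple hits at the same site. The sketch below applies the standard Jukes-Cantor and Kimura two-parameter distance formulas; the function names are hypothetical, and the inputs are assumed to be proportions of differing sites computed from an alignment.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """JC69 distance from p, the proportion of sites that differ."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC69 correction")
    # d = -(3/4) * ln(1 - 4p/3)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p_distance(transitions: float, transversions: float) -> float:
    """K2P distance from proportions of transitional (P) and transversional (Q) sites."""
    a = 1.0 - 2.0 * transitions - transversions
    b = 1.0 - 2.0 * transversions
    if a <= 0.0 or b <= 0.0:
        raise ValueError("sequences are too divergent for the K2P correction")
    # d = -(1/2) * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log(a * math.sqrt(b))


# Hypothetical proportions: 10% of sites differ overall.
print(round(jukes_cantor_distance(0.10), 4))
print(round(kimura_2p_distance(0.07, 0.03), 4))
```

Both corrected distances exceed the raw proportion of differences because some sites have changed more than once. When transversions are exactly twice as frequent as transitions, the two formulas give the same value.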

Applications in Molecular Biology

  • Molecular phylogenetics is used to study the evolutionary relationships among organisms, genes, or proteins
    • Phylogenetic analysis can help identify the origin and spread of viral outbreaks (e.g., tracing the evolutionary history of SARS-CoV-2)
    • Phylogenetic trees can be used to infer the evolutionary history of gene families and identify gene duplication and loss events
  • Molecular evolution studies provide insights into the mechanisms of adaptation and the genetic basis of complex traits
    • Comparative genomics can identify genes and regulatory elements associated with specific phenotypes or adaptations (e.g., genes involved in human brain evolution)
  • Evolutionary analysis aids in the development of vaccines and drugs by identifying conserved targets and predicting the emergence of resistance
  • Molecular clock dating is used to estimate the timing of evolutionary events, such as species divergences or the emergence of new traits
    • Molecular clock analysis has been used to date the origin of major taxonomic groups (e.g., estimating the divergence time between birds and mammals)
  • Evolutionary principles are applied in protein engineering and directed evolution to create novel proteins with desired functions (e.g., designing enzymes for industrial applications)

Challenges and Future Directions

  • Incomplete or biased sampling of taxa can lead to inaccurate phylogenetic inferences, emphasizing the need for comprehensive taxonomic sampling
  • Horizontal gene transfer and hybridization events can complicate phylogenetic reconstruction by introducing non-tree-like evolutionary patterns
  • Heterogeneous evolutionary rates across lineages or sites violate the assumptions of some phylogenetic methods and require the use of more complex models
  • Large-scale genomic data poses computational challenges for phylogenetic inference, necessitating the development of efficient algorithms and high-performance computing resources
  • Integration of different data types (e.g., genomic, transcriptomic, proteomic) can provide a more comprehensive understanding of evolutionary processes
  • Advances in long-read sequencing technologies (PacBio, Oxford Nanopore) enable the resolution of complex genomic regions and improve the accuracy of phylogenetic inferences
  • Machine learning approaches, such as deep learning, offer new opportunities for analyzing large-scale genomic data and inferring complex evolutionary patterns

Key Takeaways and Study Tips

  • Understand the central concepts of molecular evolution, including neutral theory, molecular clock hypothesis, and evolutionary forces shaping genetic variation
  • Be familiar with the mechanisms of molecular change, such as mutations, recombination, gene duplication, and horizontal gene transfer
  • Know the principles and applications of DNA sequencing technologies and comparative genomics in studying molecular evolution
  • Understand the methods and algorithms used in phylogenetic tree construction, including distance-based, character-based, and Bayesian approaches
  • Be able to interpret and evaluate phylogenetic trees, considering factors such as branch support, evolutionary models, and potential biases
  • Recognize the applications of molecular evolution in various fields of molecular biology, such as studying the origin of species, identifying genes associated with adaptations, and developing vaccines and drugs
  • Be aware of the challenges and future directions in molecular evolution research, including the need for comprehensive sampling, addressing non-tree-like evolution, and integrating different data types
  • Practice solving problems related to sequence alignment, phylogenetic tree construction, and evolutionary model selection
  • Engage in active learning by discussing concepts with classmates, creating concept maps, and applying knowledge to real-world examples


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.