Fiveable
🚗 Autonomous Vehicle Systems Unit 5 Review

5.5 Decision-making algorithms

Written by the Fiveable Content Team • Last updated August 2025
Decision-making algorithms are the brain of autonomous vehicles, enabling them to navigate complex environments and make real-time choices. These algorithms process sensor data, interpret surroundings, and determine appropriate actions, forming the core of self-driving technology.

From rule-based approaches to advanced machine learning techniques, decision-making algorithms come in various forms. Understanding these methods is crucial for designing robust autonomous systems that can handle the unpredictable nature of real-world driving scenarios.

Types of decision-making algorithms

  • Decision-making algorithms form the core of autonomous vehicle systems, enabling vehicles to navigate complex environments and make real-time choices
  • These algorithms process sensor data, interpret the surrounding environment, and determine appropriate actions for the vehicle to take
  • Understanding different types of decision-making algorithms helps in designing robust and adaptable autonomous systems

Rule-based vs learning-based approaches

  • Rule-based approaches use predefined sets of if-then statements to make decisions
    • Advantages include interpretability and predictability
    • Limitations involve difficulty in handling complex or unforeseen scenarios
  • Learning-based approaches utilize machine learning techniques to learn decision-making patterns from data
    • Advantages include adaptability and ability to handle complex scenarios
    • Challenges include the need for large amounts of training data and potential unpredictability
  • Hybrid approaches combine rule-based and learning-based methods to leverage strengths of both

Deterministic vs probabilistic methods

  • Deterministic methods produce the same output for a given input every time
    • Useful for well-defined scenarios with clear rules (traffic light responses)
    • Limited in handling uncertainty or ambiguous situations
  • Probabilistic methods incorporate uncertainty into the decision-making process
    • Use probability distributions to model various possible outcomes
    • Better suited for real-world scenarios with inherent uncertainties (pedestrian behavior)
  • Bayesian methods combine prior knowledge with new observations to update probabilities

Reactive vs deliberative algorithms

  • Reactive algorithms make immediate decisions based on current sensor inputs
    • Fast response times suitable for low-level control (obstacle avoidance)
    • Limited ability to plan for long-term goals or complex scenarios
  • Deliberative algorithms consider multiple future states and plan accordingly
    • Capable of long-term planning and optimization (route planning)
    • Require more computational resources and may have slower response times
  • Hybrid architectures combine reactive and deliberative elements for balanced decision-making

Markov decision processes

  • Markov decision processes (MDPs) provide a mathematical framework for modeling decision-making in uncertain environments
  • MDPs are fundamental to many reinforcement learning and planning algorithms used in autonomous vehicles
  • They allow for the formulation of sequential decision problems under uncertainty, crucial for navigating dynamic traffic scenarios

State representation

  • States in MDPs capture all relevant information about the environment and vehicle
    • Include vehicle position, speed, orientation, and surrounding objects
    • May also incorporate higher-level information (traffic rules, road conditions)
  • Continuous state spaces require discretization or function approximation techniques
  • Partial observability leads to POMDPs (Partially Observable Markov Decision Processes)
    • Account for sensor limitations and hidden state variables

Action space definition

  • Actions represent possible decisions the autonomous vehicle can make
    • Include steering angles, acceleration/deceleration, lane changes
  • Discrete action spaces simplify decision-making but may limit fine control
  • Continuous action spaces allow for more precise control but increase complexity
  • Action constraints ensure physical limitations and safety requirements are met

Transition probabilities

  • Transition probabilities model the likelihood of moving from one state to another given an action
    • Capture uncertainties in vehicle dynamics and environmental factors
    • May be learned from data or derived from physical models
  • Stochastic transitions account for unpredictable elements (other drivers, pedestrians)
  • Deterministic special cases simplify computations but may not fully represent reality

Reward function design

  • Reward functions quantify the desirability of state-action pairs
    • Encourage safe, efficient, and comfortable driving behaviors
    • Balance multiple objectives (safety, speed, fuel efficiency, passenger comfort)
  • Sparse rewards provide feedback only at key events (reaching destination, collision)
  • Dense rewards offer continuous feedback but may be harder to design effectively
  • Inverse reinforcement learning techniques can infer reward functions from expert demonstrations
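The MDP components above (states, actions, transition probabilities, rewards) can be tied together with value iteration. The following is a minimal sketch on a hypothetical three-state "reach the destination" problem; all states, probabilities, and reward values are illustrative, not taken from any real vehicle system.

```python
# Minimal value iteration on a toy MDP: a vehicle in positions 0..2 must
# reach position 2 ("destination"). All numbers are illustrative.
GAMMA = 0.9                          # discount factor

states = [0, 1, 2]                   # 2 is terminal (destination reached)
actions = ["stay", "advance"]

def transitions(s, a):
    """Return a list of (probability, next_state, reward) tuples."""
    if s == 2:                       # terminal: self-loop with no reward
        return [(1.0, 2, 0.0)]
    if a == "stay":
        return [(1.0, s, -1.0)]      # time penalty for idling
    # "advance" succeeds 80% of the time, slips 20% (stochastic transition)
    succ = s + 1
    reward = 10.0 if succ == 2 else -1.0
    return [(0.8, succ, reward), (0.2, s, -1.0)]

V = {s: 0.0 for s in states}
for _ in range(100):                 # repeated Bellman optimality backups
    V = {s: max(sum(p * (r + GAMMA * V[ns]) for p, ns, r in transitions(s, a))
                for a in actions)
         for s in states}

# greedy policy with respect to the converged value function
policy = {s: max(actions,
                 key=lambda a: sum(p * (r + GAMMA * V[ns])
                                   for p, ns, r in transitions(s, a)))
          for s in states}
```

As expected, the greedy policy advances from every non-terminal state, and states closer to the goal have higher value.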

Reinforcement learning

  • Reinforcement learning (RL) enables autonomous vehicles to learn optimal decision-making policies through interaction with the environment
  • RL algorithms balance exploration of new actions with exploitation of known good actions
  • RL approaches can adapt to changing environments and learn complex behaviors without explicit programming

Q-learning algorithm

  • Q-learning estimates the value of state-action pairs through iterative updates
    • Uses temporal difference learning to propagate rewards backwards in time
    • Learns an optimal policy without requiring a model of the environment
  • Q-table stores estimated values for each state-action pair
    • Suffers from curse of dimensionality in large state spaces
  • ε-greedy exploration strategy balances exploration and exploitation
    • Chooses random actions with probability ε, otherwise selects best-known action
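The update rule and ε-greedy exploration described above can be sketched on a toy one-dimensional "road" with five positions; the environment, reward values, and hyperparameters are illustrative assumptions, not a real driving setup.

```python
import random

random.seed(0)

# Tabular Q-learning on a toy 1-D road: states 0..4, reward for reaching 4.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1
ACTIONS = [-1, +1]                      # move left / move right

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    ns = min(max(s + a, 0), GOAL)
    return ns, (10.0 if ns == GOAL else -1.0), ns == GOAL

for _ in range(500):                    # episodes
    s, done = 0, False
    while not done:
        # ε-greedy: explore with probability EPS, otherwise exploit
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        ns, r, done = step(s, a)
        # temporal-difference update toward the bootstrapped target
        target = r + (0.0 if done else GAMMA * max(Q[(ns, x)] for x in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = ns

greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

After training, the greedy policy moves right (toward the goal) from every state, learned without any model of `step`.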

Policy gradient methods

  • Policy gradient algorithms directly optimize the policy function
    • Learn a probability distribution over actions for each state
    • Can handle continuous action spaces more naturally than value-based methods
  • REINFORCE algorithm uses Monte Carlo sampling to estimate policy gradients
    • High variance in gradient estimates can lead to slow learning
  • Actor-Critic methods combine value function approximation with policy optimization
    • Reduce variance in gradient estimates for more stable learning
    • Actor network learns policy, critic network estimates value function
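A minimal REINFORCE-style sketch makes the gradient update concrete. This hypothetical example uses a single-state, two-action bandit with a softmax policy and a running baseline to reduce variance; the reward values and learning rates are illustrative.

```python
import math
import random

random.seed(1)

# REINFORCE on a 2-action bandit: the policy is a softmax over two logits,
# and action 1 pays more on average (1.0 vs 0.2, both illustrative).
theta = [0.0, 0.0]                     # policy parameters (logits)
LR = 0.1

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def reward(a):
    return (1.0 if a == 1 else 0.2) + random.gauss(0, 0.1)

baseline = 0.0
for _ in range(2000):
    p = softmax(theta)
    a = 0 if random.random() < p[0] else 1   # sample from the policy
    r = reward(a)
    baseline += 0.01 * (r - baseline)        # running baseline reduces variance
    # grad of log pi(a) w.r.t. theta_k is 1{k == a} - p_k for a softmax policy
    for k in range(2):
        grad = (1.0 if k == a else 0.0) - p[k]
        theta[k] += LR * (r - baseline) * grad

probs = softmax(theta)
```

The learned policy concentrates probability on the higher-reward action; an Actor-Critic method would replace the running baseline with a learned value function.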

Deep reinforcement learning

  • Deep RL combines deep neural networks with reinforcement learning algorithms
    • Enables learning from high-dimensional input spaces (camera images, lidar data)
    • Can learn complex non-linear decision policies
  • Deep Q-Networks (DQN) use convolutional neural networks to approximate Q-functions
    • Experience replay and target networks stabilize learning
  • Proximal Policy Optimization (PPO) improves policy gradient methods
    • Clips policy updates to prevent destructively large changes
  • Soft Actor-Critic (SAC) incorporates entropy maximization for improved exploration
    • Learns stochastic policies that can capture multiple modes of optimal behavior
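Of the stabilization tricks above, experience replay is easy to sketch without a neural network: transitions are stored and sampled uniformly, breaking the temporal correlation of consecutive experiences. The capacity and transition format below are illustrative.

```python
import random
from collections import deque

random.seed(0)

# A minimal experience-replay buffer as used by DQN-style agents: store
# (state, action, reward, next_state, done) tuples and sample uniformly.
class ReplayBuffer:
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # oldest transitions evicted

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(150):                     # overfill: only the last 100 remain
    buf.push((t, 0, -1.0, t + 1, False))
batch = buf.sample(8)                    # uncorrelated minibatch for training
```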

Planning algorithms

  • Planning algorithms enable autonomous vehicles to generate and evaluate sequences of actions to achieve goals
  • These algorithms consider future states and potential outcomes to make informed decisions
  • Planning is crucial for navigation, obstacle avoidance, and optimizing vehicle trajectories

A* search algorithm

  • A* algorithm finds optimal paths in discrete state spaces
    • Combines cost-to-come and heuristic cost-to-go estimates
    • Efficiently explores promising paths while avoiding unnecessary computations
  • Heuristic function guides search towards goal
    • Admissible heuristics guarantee optimal solutions
    • Consistent heuristics improve search efficiency
  • Variations like Anytime A* provide suboptimal solutions quickly and improve over time
    • Useful for real-time planning in dynamic environments
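The cost-to-come plus heuristic cost-to-go idea can be sketched on a small occupancy grid; the grid, start, and goal below are illustrative, and Manhattan distance serves as the admissible heuristic for 4-connected motion.

```python
import heapq

# A* on a small occupancy grid (0 = free, 1 = obstacle).
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # cost-to-go estimate
    open_set = [(h(start), 0, start, [start])]   # (f = g + h, g, node, path)
    seen = {}
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen and seen[node] <= g:     # already expanded cheaper
            continue
        seen[node] = g
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                       # cost-to-come
                heapq.heappush(open_set,
                               (ng + h((nr, nc)), ng, (nr, nc), path + [(nr, nc)]))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))   # must detour around the wall in row 1
```

Because the heuristic is admissible, the returned detour around the obstacle row is the optimal 7-node path.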

Rapidly-exploring random trees

  • Rapidly-exploring Random Trees (RRT) efficiently explore high-dimensional continuous spaces
    • Randomly sample configurations and connect them to form a tree structure
    • Biased towards unexplored regions of the state space
  • RRT* variant guarantees asymptotic optimality
    • Rewires tree connections to improve path quality over time
  • Kinodynamic RRT incorporates vehicle dynamics constraints
    • Generates feasible trajectories considering acceleration and steering limits
  • Bidirectional RRT grows trees from both start and goal configurations
    • Improves planning efficiency in complex environments
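A bare-bones RRT in a 2-D workspace shows the sample-nearest-steer loop; the square workspace, circular obstacle, step size, and goal bias below are illustrative assumptions with no vehicle dynamics (a kinodynamic variant would replace the straight-line steering).

```python
import math
import random

random.seed(2)

STEP = 0.5
OBST = (5.0, 5.0, 1.5)      # circular obstacle: (cx, cy, radius)

def collision_free(p):
    cx, cy, r = OBST
    return math.hypot(p[0] - cx, p[1] - cy) > r

def rrt(start, goal, iters=4000):
    nodes = {start: None}   # node -> parent
    for _ in range(iters):
        # goal bias: occasionally sample the goal to pull the tree toward it
        sample = goal if random.random() < 0.1 else \
                 (random.uniform(0, 10), random.uniform(0, 10))
        nearest = min(nodes, key=lambda n: math.dist(n, sample))
        d = math.dist(nearest, sample)
        if d == 0:
            continue
        # steer from nearest toward the sample by at most STEP
        t = min(STEP / d, 1.0)
        new = (nearest[0] + t * (sample[0] - nearest[0]),
               nearest[1] + t * (sample[1] - nearest[1]))
        if not collision_free(new):
            continue
        nodes[new] = nearest
        if math.dist(new, goal) < STEP:      # close enough: extract path
            path, n = [goal], new
            while n is not None:
                path.append(n)
                n = nodes[n]
            return path[::-1]
    return None

path = rrt((1.0, 1.0), (9.0, 9.0))
```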

Model predictive control

  • Model Predictive Control (MPC) optimizes control actions over a finite time horizon
    • Uses a model of vehicle dynamics to predict future states
    • Recomputes optimal control sequence at each time step
  • Receding horizon approach adapts to changing environments and disturbances
    • Implements only first control action, then re-plans
  • Handles constraints on states and control inputs explicitly
    • Ensures vehicle stays within physical and safety limits
  • Non-linear MPC accounts for complex vehicle dynamics
    • Computationally intensive but more accurate for high-speed or extreme maneuvers
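A toy receding-horizon loop makes the "optimize, apply the first action, re-plan" cycle concrete. This sketch assumes a 1-D vehicle (position, velocity) that must stop at position 10, a discrete acceleration set, and brute-force enumeration in place of a real optimizer; all dynamics and costs are illustrative simplifications.

```python
from itertools import product

DT, HORIZON = 0.5, 4
ACCELS = (-2.0, 0.0, 2.0)               # discrete acceleration choices (m/s^2)

def simulate(state, a):
    pos, vel = state
    vel = max(vel + a * DT, 0.0)        # simple constraint: no reversing
    return (pos + vel * DT, vel)

def cost(state):
    pos, vel = state
    return (pos - 10.0) ** 2 + vel ** 2  # penalize distance to stop point + speed

def mpc_action(state):
    # enumerate every length-HORIZON action sequence, keep the best first action
    best, best_a = float("inf"), 0.0
    for seq in product(ACCELS, repeat=HORIZON):
        s, c = state, 0.0
        for a in seq:
            s = simulate(s, a)
            c += cost(s)
        if c < best:
            best, best_a = c, seq[0]    # receding horizon: apply only seq[0]
    return best_a

state = (0.0, 0.0)
for _ in range(30):                      # closed loop: re-plan at every step
    state = simulate(state, mpc_action(state))
```

The closed loop accelerates, then brakes, settling near the stop point with near-zero speed; a real MPC would use gradient-based optimization over continuous inputs rather than enumeration.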

Behavior prediction

  • Behavior prediction algorithms anticipate the actions of other road users
  • These predictions inform decision-making and planning processes for autonomous vehicles
  • Accurate behavior prediction is crucial for safe and efficient navigation in dynamic environments

Trajectory forecasting

  • Trajectory forecasting predicts future positions and velocities of other road users
    • Uses historical motion data and current state information
    • Accounts for road geometry and traffic rules
  • Physics-based models use kinematic equations for short-term predictions
    • Accurate for well-behaved vehicles but struggle with complex interactions
  • Machine learning approaches learn patterns from large datasets
    • Recurrent Neural Networks (RNNs) capture temporal dependencies
    • Generative models produce multiple plausible future trajectories
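The physics-based baseline mentioned above can be written in a few lines: a constant-velocity kinematic model propagated at a fixed rate. The pedestrian state and prediction horizon are illustrative.

```python
# Physics-based trajectory forecast: propagate a constant-velocity (CV)
# kinematic model for a tracked road user over a short horizon.
def forecast_cv(x, y, vx, vy, dt, steps):
    """Predict future (x, y) positions assuming constant velocity."""
    return [(x + vx * dt * k, y + vy * dt * k) for k in range(1, steps + 1)]

# pedestrian at (0, 0) walking 1.2 m/s along +x, predicted 2 s ahead at 10 Hz
traj = forecast_cv(0.0, 0.0, 1.2, 0.0, dt=0.1, steps=20)
```

Learned models earn their keep precisely where this baseline fails: turning, stopping, and interacting agents.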

Intention estimation

  • Intention estimation infers high-level goals and plans of other road users
    • Predicts lane changes, turns, and other maneuvers
    • Incorporates contextual information (turn signals, road signs, vehicle positioning)
  • Hidden Markov Models (HMMs) model intentions as latent states
    • Probabilistic transitions between intention states
    • Observations linked to intentions through emission probabilities
  • Inverse Reinforcement Learning (IRL) infers the reward functions driving observed behavior
    • Assumes other agents act to maximize some unknown reward
    • Enables more accurate long-term predictions
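The HMM machinery above can be sketched with the forward algorithm: a belief over latent intentions is updated from noisy observations. The two intentions, observation alphabet, and every probability below are illustrative, not calibrated values.

```python
# Forward-algorithm sketch for HMM-based intention estimation.
states = ["keep_lane", "change_left"]
prior = {"keep_lane": 0.9, "change_left": 0.1}
trans = {"keep_lane":   {"keep_lane": 0.95, "change_left": 0.05},
         "change_left": {"keep_lane": 0.10, "change_left": 0.90}}
emit = {"keep_lane":   {"centered": 0.9, "drifting_left": 0.1},
        "change_left": {"centered": 0.2, "drifting_left": 0.8}}

def forward(observations):
    # alpha[s] tracks P(observations so far, current intention = s),
    # normalized each step so it reads as a belief over intentions
    alpha = {s: prior[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
        z = sum(alpha.values())
        alpha = {s: v / z for s, v in alpha.items()}
    return alpha

# repeated drifting observations shift the belief toward a lane change
belief = forward(["centered", "drifting_left", "drifting_left", "drifting_left"])
```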

Interaction-aware prediction

  • Interaction-aware methods consider mutual influences between road users
    • Model how vehicles react to each other's actions
    • Capture complex scenarios (merging, negotiating intersections)
  • Game-theoretic approaches model multi-agent decision-making
    • Nash equilibria represent stable interaction outcomes
    • Stackelberg games model leader-follower dynamics
  • Social LSTM and similar architectures pool information from multiple agents
    • Learn to predict coordinated behaviors in crowded scenes
  • Attention mechanisms focus on relevant interactions
    • Dynamically weight importance of different agents and features

Decision trees

  • Decision trees provide a hierarchical approach to decision-making in autonomous vehicles
  • They offer interpretable models that can handle both continuous and categorical variables
  • Decision trees form the basis for more advanced ensemble methods used in autonomous driving

Binary vs multi-class trees

  • Binary trees split nodes based on a single feature and threshold
    • Simple and efficient but may require deep trees for complex decisions
    • Each internal node has exactly two children
  • Multi-class trees allow multiple branches at each node
    • Can make decisions based on categorical variables with more than two categories
    • Often more compact representation for certain types of decisions
  • Oblique decision trees use linear combinations of features for splits
    • Can capture more complex decision boundaries
    • Harder to interpret but potentially more powerful
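The core step in growing a binary tree is picking the single feature and threshold that most reduce impurity. A minimal Gini-based split search is sketched below on toy illustrative data (features `[gap_to_car_ahead_m, own_speed_mps]`, label 1 = brake).

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    # search every (feature, threshold) pair for the lowest weighted impurity
    best = (float("inf"), None, None)    # (weighted impurity, feature, threshold)
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[0]:
                best = (score, f, t)
    return best

X = [[5.0, 20.0], [8.0, 15.0], [30.0, 20.0],
     [40.0, 10.0], [6.0, 25.0], [35.0, 30.0]]
y = [1, 1, 0, 0, 1, 0]                   # brake when the gap is small
score, feature, threshold = best_split(X, y)
```

On this data the search finds a pure split on the gap feature (`gap <= 8` → brake), giving weighted impurity 0; a full tree builder applies this recursively to each child node.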

Pruning techniques

  • Pruning reduces tree complexity to prevent overfitting
    • Removes branches that do not significantly improve decision quality
    • Improves generalization to unseen data
  • Pre-pruning stops tree growth based on criteria during construction
    • Minimum number of samples per leaf
    • Maximum tree depth
    • Minimum improvement in splitting criterion
  • Post-pruning removes branches after full tree construction
    • Cost-complexity pruning balances tree size and accuracy
    • Reduced error pruning uses a validation set to evaluate pruning decisions
  • Pruning helps create more robust decision-making models for autonomous vehicles

Random forests

  • Random forests combine multiple decision trees into an ensemble for improved performance
    • Each tree is trained on a bootstrap sample of the data
    • Random subset of features considered at each split
  • Voting mechanism combines predictions from individual trees
    • Classification uses majority vote
    • Regression averages predictions
  • Feature importance can be derived from random forest models
    • Helps identify key factors in autonomous vehicle decision-making
  • Out-of-bag error provides built-in validation without separate test set
    • Estimates generalization performance using samples not used in tree construction

Bayesian decision theory

  • Bayesian decision theory provides a framework for making optimal decisions under uncertainty
  • It incorporates prior knowledge and new evidence to update beliefs and make informed choices
  • Bayesian methods are crucial for handling sensor noise and environmental uncertainties in autonomous vehicles

Prior and posterior probabilities

  • Prior probabilities represent initial beliefs before observing new evidence
    • Based on historical data, expert knowledge, or assumptions
    • Can be uninformative (uniform) or informative (reflecting strong prior beliefs)
  • Posterior probabilities update beliefs after observing new evidence
    • Computed using Bayes' theorem: P(A|B) = \frac{P(B|A)P(A)}{P(B)}
    • Combine prior knowledge with likelihood of observations
  • Conjugate priors simplify posterior calculations
    • Prior and posterior distributions belong to the same family
    • Useful for recursive Bayesian updating in dynamic environments
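A conjugate prior makes the prior-to-posterior update a pair of counter increments. The sketch below uses a Beta-Bernoulli pair on a hypothetical quantity (probability that a pedestrian at a crossing steps out); the pseudo-counts and observations are illustrative.

```python
# Conjugate Beta-Bernoulli update: a Beta(alpha, beta) prior over a yes/no
# event stays Beta after each observation, so updating is just counting.
alpha, beta = 2.0, 8.0          # prior pseudo-counts: crossing fairly unlikely

observations = [1, 0, 1, 1, 1, 0, 1, 1]   # 1 = pedestrian crossed
for x in observations:
    alpha += x                  # success count
    beta += 1 - x               # failure count

posterior_mean = alpha / (alpha + beta)   # shifted up from the prior mean 0.2
```

This is exactly the recursive updating mentioned above: each new observation folds into the same two numbers, making it cheap enough to run per sensor frame.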

Maximum likelihood estimation

  • Maximum Likelihood Estimation (MLE) finds parameters that maximize the likelihood of observed data
    • Assumes a probabilistic model for the data
    • Optimizes likelihood function: \hat{\theta} = \arg\max_{\theta} P(X|\theta)
  • MLE provides point estimates of model parameters
    • Does not account for uncertainty in parameter estimates
    • Can lead to overfitting with limited data
  • Expectation-Maximization (EM) algorithm applies MLE to incomplete data
    • Useful for mixture models and hidden state estimation
    • Alternates between estimating hidden variables and updating parameters

Bayesian inference

  • Bayesian inference computes full posterior distributions over parameters
    • Incorporates parameter uncertainty into decision-making
    • Posterior: P(\theta|X) \propto P(X|\theta)P(\theta)
  • Markov Chain Monte Carlo (MCMC) methods sample from complex posterior distributions
    • Metropolis-Hastings algorithm proposes and accepts/rejects samples
    • Gibbs sampling iteratively samples from conditional distributions
  • Variational inference approximates posterior with simpler distributions
    • Minimizes Kullback-Leibler divergence between true and approximate posteriors
    • Scales better to large datasets than MCMC methods
  • Bayesian model averaging combines predictions from multiple models
    • Weights models by their posterior probabilities
    • Improves robustness of autonomous vehicle decision-making

Multi-agent decision making

  • Multi-agent decision making addresses scenarios involving multiple autonomous vehicles or other road users
  • It considers interactions, cooperation, and competition between agents
  • These approaches are crucial for coordinating traffic flow and resolving conflicts in autonomous driving

Game theory concepts

  • Game theory models strategic interactions between rational decision-makers
    • Players represent autonomous vehicles or other road users
    • Strategies correspond to possible actions or decisions
    • Payoffs quantify outcomes for each player given strategy combinations
  • Nash equilibrium represents a stable state where no player can unilaterally improve
    • s_i^* = \arg\max_{s_i} u_i(s_i, s_{-i}^*) for all players i
    • May not always exist or be unique
  • Pareto optimality identifies outcomes where no player can improve without harming others
    • Important for finding socially optimal solutions in traffic scenarios
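For small games, pure-strategy Nash equilibria can be found by checking every strategy profile for profitable unilateral deviations. The two-vehicle merge game below is illustrative: both yielding wastes time, both going risks a collision, and one yielding while the other goes is stable.

```python
from itertools import product

STRATS = ("yield", "go")
# payoff[(s1, s2)] = (u1, u2); all values are illustrative
payoff = {("yield", "yield"): (-1, -1),
          ("yield", "go"):    (0,  2),
          ("go",    "yield"): (2,  0),
          ("go",    "go"):    (-5, -5)}

def is_nash(s1, s2):
    u1, u2 = payoff[(s1, s2)]
    # Nash: no unilateral deviation improves either player's payoff
    return all(payoff[(d, s2)][0] <= u1 for d in STRATS) and \
           all(payoff[(s1, d)][1] <= u2 for d in STRATS)

equilibria = [(s1, s2) for s1, s2 in product(STRATS, STRATS) if is_nash(s1, s2)]
```

The game has two pure equilibria (one vehicle yields, the other goes), illustrating the non-uniqueness noted above: which equilibrium is played is exactly what negotiation protocols must resolve.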

Cooperative vs competitive scenarios

  • Cooperative scenarios involve agents working towards common goals
    • Platooning for improved fuel efficiency
    • Coordinated lane changes to optimize traffic flow
    • Requires communication protocols and trust between agents
  • Competitive scenarios involve conflicting objectives between agents
    • Merging into limited road space
    • Negotiating right-of-way at intersections
    • May lead to suboptimal outcomes without proper coordination
  • Mixed scenarios combine elements of cooperation and competition
    • Agents may cooperate in some aspects while competing in others
    • Requires balancing individual and collective objectives

Negotiation protocols

  • Negotiation protocols enable agents to reach agreements in multi-agent scenarios
    • Define rules for communication and decision-making
    • Aim to find mutually beneficial solutions
  • Auction-based protocols allocate resources or priorities
    • Agents bid for right-of-way or preferred routes
    • Second-price auctions incentivize truthful bidding
  • Contract Net Protocol assigns tasks among autonomous vehicles
    • Vehicles announce tasks, receive bids, and award contracts
    • Useful for distributing transportation tasks in a fleet
  • Argumentation-based negotiation allows agents to exchange reasons for preferences
    • Enables more flexible and context-aware decision-making
    • Can incorporate safety constraints and ethical considerations
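The second-price auction mechanism mentioned above is simple to sketch: the highest bidder wins but pays the second-highest bid, which is what makes truthful bidding a dominant strategy. The vehicle IDs and urgency-weighted bids below are hypothetical.

```python
# Second-price (Vickrey) auction sketch for right-of-way allocation.
def second_price_auction(bids):
    """bids: dict of agent -> bid. Returns (winner, price_paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1]         # winner pays the runner-up's bid, not its own
    return winner, price

bids = {"car_A": 3.0, "car_B": 7.5, "car_C": 5.0}   # urgency-weighted bids
winner, price = second_price_auction(bids)
```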

Ethical considerations

  • Ethical considerations in autonomous vehicle decision-making address moral dilemmas and societal impacts
  • These issues involve balancing safety, individual rights, and collective welfare
  • Addressing ethical concerns is crucial for public acceptance and responsible deployment of autonomous vehicles

Trolley problem scenarios

  • Trolley problem variants present ethical dilemmas in unavoidable accident scenarios
    • Force choices between different negative outcomes
    • Highlight conflicts between utilitarian and deontological ethical frameworks
  • Unavoidable accident scenarios require prioritizing different types of harm
    • Passenger safety vs pedestrian safety
    • Minimizing total casualties vs protecting vulnerable road users
  • Ethical decision-making algorithms must balance multiple factors
    • Legal responsibilities
    • Social norms
    • Cultural values

Risk assessment

  • Risk assessment quantifies potential negative outcomes of decisions
    • Considers probability and severity of different accident types
    • Informs trade-offs between safety and efficiency
  • Value of Statistical Life (VSL) attempts to quantify the cost of fatality risk
    • Controversial but used in policy-making and cost-benefit analyses
    • Varies across countries and contexts
  • Risk perception differs between human drivers and autonomous systems
    • Autonomous vehicles may be held to higher safety standards
    • Public acceptance requires transparent risk assessment and communication

Liability and responsibility

  • Liability issues arise when autonomous vehicles are involved in accidents
    • Shift from driver responsibility to manufacturer or software provider liability
    • May require new legal frameworks and insurance models
  • Responsibility for ethical decision-making must be clearly defined
    • Vehicle manufacturers
    • Software developers
    • Regulators
    • Vehicle owners
  • Transparency in decision-making algorithms becomes crucial
    • Explainable AI techniques help understand and audit ethical choices
    • May be required for legal and regulatory compliance

Performance evaluation

  • Performance evaluation assesses the effectiveness and safety of decision-making algorithms in autonomous vehicles
  • It involves comparing different approaches and validating their behavior in various scenarios
  • Rigorous evaluation is essential for ensuring reliable and trustworthy autonomous driving systems

Metrics for decision quality

  • Safety metrics measure the ability to avoid accidents and dangerous situations
    • Time-to-collision (TTC)
    • Post-encroachment time (PET)
    • Frequency and severity of near-miss events
  • Efficiency metrics evaluate the optimization of travel time and resource use
    • Average speed
    • Fuel consumption
    • Traffic flow improvement
  • Comfort metrics assess the smoothness and predictability of vehicle motion
    • Jerk (rate of change of acceleration)
    • Frequency of sudden braking or steering events
  • Ethical performance metrics quantify adherence to moral principles
    • Fairness in interactions with different road users
    • Consistency in applying ethical rules across scenarios
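Time-to-collision, the first safety metric above, reduces to gap divided by closing speed for two vehicles in the same lane. The sketch below uses illustrative speeds and treats a non-closing gap as an infinite (undefined) TTC.

```python
# Time-to-collision (TTC): a standard surrogate safety metric.
def time_to_collision(gap_m, ego_speed, lead_speed):
    closing = ego_speed - lead_speed
    if closing <= 0.0:
        return float("inf")      # not closing: no collision on current course
    return gap_m / closing

# 30 m gap, ego at 20 m/s behind a lead vehicle at 10 m/s
ttc = time_to_collision(gap_m=30.0, ego_speed=20.0, lead_speed=10.0)
```

Evaluation pipelines typically flag frames where TTC drops below a threshold (values around 1.5 s to 3 s are commonly used) and count them as near-miss events.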

Simulation vs real-world testing

  • Simulation testing allows for extensive scenario exploration
    • Covers rare and dangerous situations safely
    • Enables rapid iteration and parameter tuning
    • May not fully capture real-world complexity and unpredictability
  • Real-world testing validates performance in actual traffic conditions
    • Captures true sensor noise and environmental variability
    • Limited in scope due to safety concerns and regulatory restrictions
    • Essential for final validation and public trust
  • Hybrid approaches combine simulation and real-world data
    • Replay real-world sensor data in simulation environments
    • Augment real-world testing with simulated obstacles or scenarios

Benchmarking against human drivers

  • Comparative studies assess autonomous vs human driver performance
    • Reaction times
    • Decision consistency
    • Adherence to traffic rules
  • Scenario-based testing evaluates specific challenging situations
    • Complex intersections
    • Adverse weather conditions
    • Unexpected obstacles
  • Long-term studies compare accident rates and severity
    • Requires large-scale deployment and data collection
    • Accounts for different driving conditions and environments
  • Public perception and acceptance metrics
    • Surveys of passenger comfort and trust
    • Analysis of interactions between autonomous vehicles and other road users