intro to algorithms unit 12 study guides

greedy algorithms: scheduling & coding

12.1

Greedy algorithm paradigm and properties

12.2

Interval scheduling problem and solutions

12.3

Huffman coding algorithm and data compression

12.4

Comparison of greedy and dynamic programming approaches

unit 12 review

Greedy algorithms are optimization techniques that make locally optimal choices at each step, aiming for a global optimum. They're used in scheduling problems to allocate resources efficiently over time, minimizing completion time or meeting deadlines. These algorithms are simple to implement but don't always yield optimal solutions. Scheduling involves assigning tasks to resources, considering factors like processing times, due dates, and machine availability. Common problems include single machine, parallel machine, and flow shop scheduling. Greedy approaches often use priority rules to rank tasks, making them fast but not always optimal for complex scenarios.

What Are Greedy Algorithms?

Optimization algorithms make the locally optimal choice at each stage with the hope of finding a global optimum
Build up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit
Make a locally optimal choice in the hope that this choice will lead to a globally optimal solution
Do not always yield optimal solutions, but may approximate the optimal solution in a reasonable time
Produce good solutions on some mathematical problems, but not on others
- Work well for problems that have "optimal substructure"
- If a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, it is said to have optimal substructure
Commonly used for optimization problems (finding the best solution from all feasible solutions)
Generally easy to implement and understand compared to other optimization algorithms like dynamic programming

Key Concepts in Scheduling

Scheduling involves allocating resources over time to perform a collection of tasks
Objective is typically to minimize the total time to complete all tasks (makespan) or meet certain deadlines
Tasks may have dependencies (precedence constraints) where some tasks can only start after others have completed
Resources can include machines, workers, or other limited assets needed to complete tasks
Preemption allows tasks to be paused and resumed later, while non-preemptive scheduling requires each task to run to completion once started
Release time is the earliest a task can begin, while the deadline is the latest a task can complete without penalty
Lateness is the amount of time a task finishes after its deadline, while tardiness only counts positive lateness values
Idle time is any period where a resource is not being used but is available

Common Scheduling Problems

Single machine scheduling minimizes total completion time with tasks that have different processing times
- Shortest processing time (SPT) first is optimal
Parallel machine scheduling assigns tasks to multiple identical machines to minimize makespan
- Longest processing time (LPT) first is a good heuristic
Flow shop scheduling has multiple stages each with one machine, and all tasks follow the same route
- No passing is allowed between machines
Job shop scheduling also has multiple stages but tasks can have different routes
- Passing is allowed, making it more complex than flow shop
Open shop scheduling allows tasks to be processed in any order on the machines
- Combines both job shop and parallel machine features
Resource constrained project scheduling (RCPSP) schedules project activities subject to precedence and resource capacity constraints
- Minimizes total project duration
Workforce scheduling assigns workers to shifts or tasks over a time period
- Considers worker availability, skills, preferences and labor regulations

Greedy Approach to Scheduling

Greedy algorithms build a schedule incrementally by making the locally optimal choice at each decision point
Often use a priority rule or dispatch policy to rank tasks by importance or urgency
- Shortest processing time first (SPT) minimizes total completion time
- Earliest due date first (EDD) minimizes maximum lateness
- Minimum slack first (MS) minimizes maximum tardiness
- Longest processing time first (LPT) minimizes makespan on parallel machines
Greedy is not always optimal but is usually fast and simple to implement
Can be used as a heuristic to seed other optimization methods like local search
Examples of greedy scheduling algorithms:
- List scheduling for parallel machines and precedence constraints
- Moore's algorithm for single machine scheduling with due dates
- Coffman-Graham algorithm for multiprocessor task scheduling with precedence constraints
- Hu's algorithm for precedence constrained scheduling on two machines

Coding Greedy Algorithms

Greedy algorithms can be implemented using simple loops and conditional statements
Often use a sorting step to rank tasks by the greedy selection criteria
Break ties arbitrarily when two choices have equal value
Data structures like priority queues can make selection of the best choice efficient

Pseudocode for a generic greedy algorithm:

function Greedy(a, n) 
    for i from 1 to n
        x = Select(a)
        if Feasible(x) then
            Solution = Solution + x
    return Solution

Python has built-in functions and libraries useful for implementing greedy algorithms
- sorted() and sort() to order data
- heapq for efficient priority queues
- queue.PriorityQueue from the standard library
May need to define a custom comparison function for sorting tasks by the greedy criteria
Important to test code on a range of problem instances to check for correctness and efficiency

Pros and Cons of Greedy Methods

Advantages:

Simple to understand and implement
Very efficient, usually run in O(n log n) time or better
Can be used on a wide variety of problems
Often give solutions that are close to optimal
Useful as heuristics or approximation algorithms when optimal methods are too slow Disadvantages:
Not always optimal, may get stuck in a local optimum
Difficult to prove that a greedy algorithm always gives the best solution
May require sorting which adds to the complexity
Greedy choices can depend on the order of the input
Less powerful than more complex methods like dynamic programming
Cannot be applied to problems without optimal substructure

Real-World Applications

Scheduling tasks or jobs on machines or processors
- Manufacturing and production scheduling
- Multiprocessor and parallel machine scheduling
Project management and resource allocation
- Scheduling construction tasks based on worker and equipment availability
Workforce scheduling and planning
- Assigning nurses to shifts based on skills and preferences
Supply chain and logistics optimization
- Scheduling shipments and deliveries by truck or plane
Interval scheduling and bandwidth allocation
- Scheduling ads in radio/tv broadcasting
- Assigning bandwidth to maximize number of users
Bioinformatics and DNA sequencing
- Fragment assembly in shotgun sequencing
Huffman coding and data compression
- Constructing optimal prefix-free codes

Practice Problems and Solutions

Single machine scheduling with due dates and minimizing maximum lateness
- Input: processing times $p_i$ and due dates $d_i$ for $n$ tasks
- Output: a schedule that minimizes $L_max = max_i (C_i - d_i)$ where $C_i$ is the completion time of task $i$
- Greedy algorithm: Earliest Due Date (EDD) first
  - Sort tasks in non-decreasing order of due dates $d_i$
  - Schedule tasks in this order
- Optimality: EDD is optimal for this problem
- Time complexity: O(n log n) for sorting
Parallel machine scheduling with the objective of minimizing makespan
- Input: processing times $p_i$ for $n$ tasks and $m$ identical parallel machines
- Output: an assignment of tasks to machines that minimizes the makespan $C_max = max_j C(M_j)$
- Greedy algorithm: Longest Processing Time (LPT) first
  - Sort tasks in non-increasing order of processing times $p_i$
  - Assign each task to the machine that will be idle first
- Performance: LPT has a worst-case ratio of 4/3 - 1/(3m) compared to optimal makespan
- Time complexity: O(n log n) for sorting, O(n log m) for assignment with a priority queue
Interval scheduling to maximize number of compatible tasks scheduled
- Input: a set of $n$ tasks where each task has a start time $s_i$ and finish time $f_i$
- Output: a maximum cardinality subset of mutually compatible tasks, i.e. no overlap in time
- Greedy algorithm: Earliest finish time first
  - Sort tasks by non-decreasing order of finish times $f_i$
  - Go through sorted list, selecting a task if compatible with previously selected tasks
- Optimality: Earliest finish time is optimal for this problem
- Time complexity: O(n log n) for sorting

intro to algorithms unit 12 study guides

unit 12 review

What Are Greedy Algorithms?

Key Concepts in Scheduling

Common Scheduling Problems

Greedy Approach to Scheduling

Coding Greedy Algorithms

Pros and Cons of Greedy Methods

Real-World Applications

Practice Problems and Solutions

history

social science

english & capstone

arts

science

math & computer science

world languages

high school exams

honors classes

college classes

hs classes

Study Content & Tools

Company

Resources