unit 12 review
Greedy algorithms are optimization techniques that make locally optimal choices at each step, aiming for a global optimum. They're used in scheduling problems to allocate resources efficiently over time, minimizing completion time or meeting deadlines. These algorithms are simple to implement but don't always yield optimal solutions.
Scheduling involves assigning tasks to resources, considering factors like processing times, due dates, and machine availability. Common problems include single machine, parallel machine, and flow shop scheduling. Greedy approaches often use priority rules to rank tasks, making them fast but not always optimal for complex scenarios.
What Are Greedy Algorithms?
- Optimization algorithms make the locally optimal choice at each stage with the hope of finding a global optimum
- Build up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit
- Make a locally optimal choice in the hope that this choice will lead to a globally optimal solution
- Do not always yield optimal solutions, but may approximate the optimal solution in a reasonable time
- Produce good solutions on some mathematical problems, but not on others
- Work well for problems that have "optimal substructure"
- If a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, it is said to have optimal substructure
- Commonly used for optimization problems (finding the best solution from all feasible solutions)
- Generally easy to implement and understand compared to other optimization algorithms like dynamic programming
Key Concepts in Scheduling
- Scheduling involves allocating resources over time to perform a collection of tasks
- Objective is typically to minimize the total time to complete all tasks (makespan) or meet certain deadlines
- Tasks may have dependencies (precedence constraints) where some tasks can only start after others have completed
- Resources can include machines, workers, or other limited assets needed to complete tasks
- Preemption allows tasks to be paused and resumed later, while non-preemptive scheduling requires each task to run to completion once started
- Release time is the earliest a task can begin, while the deadline is the latest a task can complete without penalty
- Lateness is the amount of time a task finishes after its deadline, while tardiness only counts positive lateness values
- Idle time is any period where a resource is not being used but is available
Common Scheduling Problems
- Single machine scheduling minimizes total completion time with tasks that have different processing times
- Shortest processing time (SPT) first is optimal
- Parallel machine scheduling assigns tasks to multiple identical machines to minimize makespan
- Longest processing time (LPT) first is a good heuristic
- Flow shop scheduling has multiple stages each with one machine, and all tasks follow the same route
- No passing is allowed between machines
- Job shop scheduling also has multiple stages but tasks can have different routes
- Passing is allowed, making it more complex than flow shop
- Open shop scheduling allows tasks to be processed in any order on the machines
- Combines both job shop and parallel machine features
- Resource constrained project scheduling (RCPSP) schedules project activities subject to precedence and resource capacity constraints
- Minimizes total project duration
- Workforce scheduling assigns workers to shifts or tasks over a time period
- Considers worker availability, skills, preferences and labor regulations
Greedy Approach to Scheduling
- Greedy algorithms build a schedule incrementally by making the locally optimal choice at each decision point
- Often use a priority rule or dispatch policy to rank tasks by importance or urgency
- Shortest processing time first (SPT) minimizes total completion time
- Earliest due date first (EDD) minimizes maximum lateness
- Minimum slack first (MS) minimizes maximum tardiness
- Longest processing time first (LPT) minimizes makespan on parallel machines
- Greedy is not always optimal but is usually fast and simple to implement
- Can be used as a heuristic to seed other optimization methods like local search
- Examples of greedy scheduling algorithms:
- List scheduling for parallel machines and precedence constraints
- Moore's algorithm for single machine scheduling with due dates
- Coffman-Graham algorithm for multiprocessor task scheduling with precedence constraints
- Hu's algorithm for precedence constrained scheduling on two machines
Coding Greedy Algorithms
Pros and Cons of Greedy Methods
Advantages:
- Simple to understand and implement
- Very efficient, usually run in O(n log n) time or better
- Can be used on a wide variety of problems
- Often give solutions that are close to optimal
- Useful as heuristics or approximation algorithms when optimal methods are too slow
Disadvantages:
- Not always optimal, may get stuck in a local optimum
- Difficult to prove that a greedy algorithm always gives the best solution
- May require sorting which adds to the complexity
- Greedy choices can depend on the order of the input
- Less powerful than more complex methods like dynamic programming
- Cannot be applied to problems without optimal substructure
Real-World Applications
- Scheduling tasks or jobs on machines or processors
- Manufacturing and production scheduling
- Multiprocessor and parallel machine scheduling
- Project management and resource allocation
- Scheduling construction tasks based on worker and equipment availability
- Workforce scheduling and planning
- Assigning nurses to shifts based on skills and preferences
- Supply chain and logistics optimization
- Scheduling shipments and deliveries by truck or plane
- Interval scheduling and bandwidth allocation
- Scheduling ads in radio/tv broadcasting
- Assigning bandwidth to maximize number of users
- Bioinformatics and DNA sequencing
- Fragment assembly in shotgun sequencing
- Huffman coding and data compression
- Constructing optimal prefix-free codes
Practice Problems and Solutions
- Single machine scheduling with due dates and minimizing maximum lateness
- Input: processing times $p_i$ and due dates $d_i$ for $n$ tasks
- Output: a schedule that minimizes $L_max = max_i (C_i - d_i)$ where $C_i$ is the completion time of task $i$
- Greedy algorithm: Earliest Due Date (EDD) first
- Sort tasks in non-decreasing order of due dates $d_i$
- Schedule tasks in this order
- Optimality: EDD is optimal for this problem
- Time complexity: O(n log n) for sorting
- Parallel machine scheduling with the objective of minimizing makespan
- Input: processing times $p_i$ for $n$ tasks and $m$ identical parallel machines
- Output: an assignment of tasks to machines that minimizes the makespan $C_max = max_j C(M_j)$
- Greedy algorithm: Longest Processing Time (LPT) first
- Sort tasks in non-increasing order of processing times $p_i$
- Assign each task to the machine that will be idle first
- Performance: LPT has a worst-case ratio of 4/3 - 1/(3m) compared to optimal makespan
- Time complexity: O(n log n) for sorting, O(n log m) for assignment with a priority queue
- Interval scheduling to maximize number of compatible tasks scheduled
- Input: a set of $n$ tasks where each task has a start time $s_i$ and finish time $f_i$
- Output: a maximum cardinality subset of mutually compatible tasks, i.e. no overlap in time
- Greedy algorithm: Earliest finish time first
- Sort tasks by non-decreasing order of finish times $f_i$
- Go through sorted list, selecting a task if compatible with previously selected tasks
- Optimality: Earliest finish time is optimal for this problem
- Time complexity: O(n log n) for sorting