Before getting a solution to this question, the fact that the due dates are abitrary real values seems to throw a wrench in the gears. We can think about this question clearly when the due dates are natural numbers because then we know that we can place our events exactly on integers, otherwise we could place them in any position, this greatly complicates things. To deal with this we will note the following:
Suppose that we have any optimal solution , we claim that a new solution can be produced which has the same profit. This new solution is one where the solution only schedules at integer values, less than or equal to their original start time.
We can prove this true with induction using the following claim "We can edit the original optimal solution up to index by moving events leftward and snapping them to integer values while maintaining optimality". We'll sketch the proof, in the base case we know that we can move our first scheduled event all the way to , this is because is an integer less than or equal to all start times, also moving this event backward will never collide with anything else because there are no events before it, it also maintains the same profit since is still scheduled before it's due date.
In the induction step we assume that we can do this up to event , now we note that the finish time of event is at some integer , it's not hard to see that for event it can be slid back to start at time or any integer between and it's original start time, because it will have nothing to collide with there and it maintained it's deadline.
Therefore we know that given any sequence of deadlines, we have some optimal solutions which all yield the same profit, we can then select one of them and "floor" it. Notationally, we'll denote this optimal solution by . The reason why this is good is that we can try to come up with an algorithm that schedules events at integer values and it's still possible for this algorithm to be optimal.
We will also note that for any optimal solution it will use at most jobs, and thus any optimal solution can be contracted to a solution taking up at most integers (by moving back deadlines).
Take the list of jobs and sort them by profit by highest profit first. Now we iterate through these jobs and schedule them at the latest possible time before their deadline where this time is an integer.
class ScheduleTracker: union_find = UnionFind() sets = [] int earliest_scheduled_time = inf schedule = [] def init(deadlines): latest_deadline = max(deadlines) for i in range(0, latest_deadline + 1): sets.append(union_find.make_set(i)) def set_event(time, event): union_find.union(sets[time - 1], sets[time]) earliest_scheduled_time = event.start_time schedule.append(event) def get_latest_possible_schedule_time(event): return union_find.find(event.deadline) def is_possible_to_schedule(event): return get_latest_possible_schedule_time(event) != 0 def maximize_profits(events): events = events_sorted_non_increasing_by_profit(events) schedule_tracker = ScheduleTracker() for event in events: if schedule_trakcer.is_possible_to_schedule(event): time = schedule_tracker.get_latest_possible_time(event) schedule_tracker.set_event(time, event) return schedule_tracker.get_profit()
We'll start by showing this yields an optimal solution. We'll end with time complexity. The proof of correctness of this algorithm is not trivial, we will employ some additional machinery to make our life easier.
Let be the jobs sorted by non-increasing profits, let be the schedule that greedy has come up with after iterations of the algorithm. Now observe this statement:
: " can be extended to an optimal schedule by only adding jobs from the set ".
When in the above statement it reads, can be extended to an optimal schedule only adding jobs from , which means that is optimal. Since after iterations greedy will terminate, then is what is returned by greedy, so it will return an optimal solution.
Therefore to prove that our algorithm is optimal, we can prove the above statement up to , we'll use induction for this.
We want to show that holds true, that is, we want to show that can be extended to an optimal schedule by only adding jobs from the set . Suppose is the optimal solution given these jobs, and consider the "floored" optimal solution , After zero iterations of the greedy algorithm have occurred, nothing yet has happened, therefore nothing is scheduled , then we can trivially just extend this by constructing so it is an optimal extension
Now let , and assume that holds true, we'd like to show that holds true. That is we must prove that can be extended to an optimal schedule by only using jobs from . Note that our induction hypothesis assumes that could be extended to an optimal solution only using jobs , let's denote this optimal solution by
At this point, we will now do cases based on how our algorithm operates. First of all when we consider the job , there are two possibilities, either it's not able to fit into the schedule, or it is. Suppose it cannot be scheduled, then this means because our schedule has stayed the same, but we know that can be extended to , since is the same as , then it can also be extended to .
Another option is that is scheduled by greedy. If it is scheduled it is scheduled at some time (a natural number). We must also see that either contains or it does not. If does contain , then we note that first of all, it is added in a way so that it doesn't collide with anything in because it was an extension, and since it is a solution we know that it scheduled it at a time before or equal to its deadline, let's denote this time as
At the same time, greedy also did the same thing, it took and it scheuled it in a way that extended and did not collide with anything previously, and it also scheduled it at a time before or equal to it's deadline, with the one extra assumption that it is scheduled at the latest possible time, this implies that . Quickly see that if then they've scheduled the job at the exact same position, and thus can be extended by taking 's extension and removing (which is fine because it's already scheduled in , in otherwords we have an extension of to an optimal solution using only
Now for the final case, we have that did schedule but did not schedule it. Recall that greedy scheduled this job at time , we'll now show that must have something scheduled at that time, for if it did not, then since is a solution where each event is scheuled on integer values, we know that can fit in job while not overlapping another event, this means that this new solution is still feasable, but it's profit has increased, but was optimal, this is a contradiction, so therefore must have something scheduled at time .
Suppose has a job scheduled at time , we note that because we're under the assumption that didn't schedule that job. But we know that was not scheuled yet by , this means it was part of the extension to , therefore we have , recall that our jobs were sorted in non-decreasing order which means that , let's create a new solution where we take and replace with denote this solution by , note that , note that the only way to avoid contradiction is to have that , if that's true then has the same profit as therefore it is optimal. Note that included and extends it to an optimal solution only using jobs which proves what we set out to show.
We also have to prove that ScheduleTracker
actually returns correct values as well. Since we've already have a formal proof for why the overall algorithm works, we'll sketch the proof here. Note that is_possible_to_schedule
is correct if get_latest_possible_schedule_time
is correct.
For convience let denote: latest time to scehdule a job before or equal to time . We can see that it must satisfy the following Since our algorithm returns the find of a time, then we want to show that find(t)
= in any situation. That is that the representative of any set is
We first iterate over all integer times before the last deadline and make a set in the union find with that specific time, this means that each time is it's own representative, we can see that now given some integer time before any jobs are scheduled, then which is exactly find(t)
as needed.
Assume that find(t) =
on a schedule with jobs scheduled, now we'll show it holds true on a schedule with jobs scheduled. Suppose that our algorithm has just added one new job at time , by the algorithms we only schedule at position that was not already scheduled this means , when there were jobs our algorithm unions the set is in with the one at time called this creates the new set the representative of is the representative in , denote this representative by
Note that given any time in , it's representative only changed if it was an element in . For any one of these times, used to be their representative, but now their representative is , we'll prove that , which shows the statement. Recall that before the union we used ot have , but now we have a job at time , therefore now we have that , this can be seen because slot is filled so we can only choose the latest time coming before it. But and so by the induction hypothesis find( ) = r as needed.
Therefore we've shown that at any stage, the union-find method actually returns the latest time to schedule a job before or equal to time .
The cost of this alg:
Therefore we can keep this algorithm at nearly (because of the ackerman function) instead of