
Motivation

Before working toward a solution, note that the due dates being arbitrary real values seems to throw a wrench in the gears. The problem is easy to think about when the due dates are natural numbers, because then we know we can place our events exactly on the integers; otherwise we could place them at any position whatsoever, which greatly complicates things. To deal with this we will note the following:

Suppose that we have any optimal solution $O = (o_1, o_2, \ldots, o_m)$. We claim that a new solution can be produced which has the same profit, in which every event is scheduled at an integer start time less than or equal to its original start time.

We can prove this by induction using the following claim: "we can edit the original optimal solution up to index $i$ by moving events leftward and snapping them to integer values while maintaining optimality." We'll sketch the proof. In the base case, we can move the first scheduled event all the way to time $0$: $0$ is an integer less than or equal to every start time, moving this event backward can never collide with anything because there are no events before it, and the profit is unchanged since $o_1$ is still scheduled before its due date.

In the induction step we assume the claim holds up to event $k$, and we note that event $k$ now finishes at some integer $j$. It's not hard to see that event $k+1$ can then be slid back to start at time $j$, or at any integer between $j$ and its original start time: it has nothing to collide with there, and it still meets its deadline.

Therefore, given any sequence of deadlines, we can take some optimal solution (they all yield the same profit) and "floor" it; notationally, we'll denote this floored optimal solution by $O^*$. The reason this is good is that we can now try to come up with an algorithm that only schedules events at integer times, and it is still possible for such an algorithm to be optimal.
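
To make the "flooring" transformation concrete, here is a minimal sketch of it in Python, under the assumption that the schedule is given as unit-length events represented by (start, deadline) pairs sorted by start time; the function name and input format are illustrative, not part of the problem statement:

        import math

        def snap_to_integers(schedule):
            # `schedule`: a feasible list of unit-length events as (start, deadline)
            # pairs, sorted by start time, each finishing by its deadline.
            # Returns new starts that are integers, <= the original starts,
            # non-overlapping, and still before each deadline.
            snapped = []
            prev_finish = 0
            for start, deadline in schedule:
                # floor(start) is an integer <= start; because the input events do
                # not overlap, it is also >= prev_finish, so max() is only a safety net.
                new_start = max(prev_finish, math.floor(start))
                snapped.append((new_start, deadline))
                prev_finish = new_start + 1
            return snapped

        # Example: starts 0.4, 1.7, 3.2 snap to 0, 1, 3 and still meet their deadlines.
        print(snap_to_integers([(0.4, 1.5), (1.7, 2.9), (3.2, 4.4)]))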

We will also note that any optimal solution uses at most $n$ jobs, and thus can be contracted to a schedule occupying at most the first $n$ integer slots (equivalently, deadlines larger than $n$ can be pulled back to $n$), as sketched below.
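
For example, combined with the fact that only integer slots matter, this observation amounts to a small preprocessing step like the one sketched here; `cap_deadlines` is a hypothetical helper and not something the pseudocode in the next section relies on:

        import math

        def cap_deadlines(deadlines):
            # With at most n unit-length jobs, no schedule ever needs an integer
            # slot later than n, so each deadline can be pulled back to at most n.
            n = len(deadlines)
            return [min(math.floor(d), n) for d in deadlines]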

Algorithm

Take the list of jobs and sort them by profit, highest profit first. Now iterate through these jobs in that order and schedule each one at the latest integer time that is still free and no later than its deadline, skipping the job if no such time exists.

Pseudocode


        import math

        class UnionFind:
            # Minimal disjoint-set structure. union(a, b) attaches b's root under
            # a's root, so the representative of a merged set is the representative
            # of a's set; the correctness argument below relies on this rule.
            def __init__(self):
                self.parent = {}

            def make_set(self, x):
                self.parent[x] = x

            def find(self, x):
                while self.parent[x] != x:
                    self.parent[x] = self.parent[self.parent[x]]  # path compression
                    x = self.parent[x]
                return x

            def union(self, a, b):
                self.parent[self.find(b)] = self.find(a)

        class ScheduleTracker:
            def __init__(self, deadlines):
                self.union_find = UnionFind()
                self.schedule = []
                # One singleton set per integer time 0..latest deadline;
                # slot 0 acts as a sentinel meaning "no free slot left".
                latest_deadline = math.floor(max(deadlines))
                for t in range(0, latest_deadline + 1):
                    self.union_find.make_set(t)

            def get_latest_possible_schedule_time(self, event):
                # The representative of the deadline's set is the latest free
                # integer slot at or before the deadline.
                return self.union_find.find(math.floor(event.deadline))

            def is_possible_to_schedule(self, event):
                return self.get_latest_possible_schedule_time(event) != 0

            def set_event(self, time, event):
                # Slot `time` is now taken: merge it into the set of slot `time - 1`
                # so later queries fall through to the next free slot on the left.
                self.union_find.union(time - 1, time)
                self.schedule.append(event)

            def get_profit(self):
                return sum(event.profit for event in self.schedule)

        def maximize_profits(events):
            # Consider the most profitable jobs first.
            events = sorted(events, key=lambda e: e.profit, reverse=True)

            schedule_tracker = ScheduleTracker([e.deadline for e in events])

            for event in events:
                if schedule_tracker.is_possible_to_schedule(event):
                    time = schedule_tracker.get_latest_possible_schedule_time(event)
                    schedule_tracker.set_event(time, event)

            return schedule_tracker.get_profit()
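
As a quick sanity check, here is how maximize_profits might be called; the Event namedtuple (with profit and deadline fields) is our own assumption about how jobs are represented:

        from collections import namedtuple

        Event = namedtuple("Event", ["profit", "deadline"])

        jobs = [Event(10, 2), Event(8, 1), Event(6, 2), Event(4, 3)]
        # Greedy takes the 10-profit job (slot 2), then the 8-profit job (slot 1),
        # skips the 6-profit job (no free slot at or before time 2 remains),
        # and finally takes the 4-profit job (slot 3), for a total profit of 22.
        print(maximize_profits(jobs))  # 22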
    

Correctness

We'll start by showing that this yields an optimal solution, and we'll end with the time complexity. The proof of correctness of this algorithm is not trivial, so we will employ some additional machinery to make our lives easier.

Let $J := (J_1, J_2, \ldots, J_n)$ be the jobs sorted by non-increasing profit, and let $S_i$ be the schedule that greedy has built after $i$ iterations of the algorithm. Now observe this statement:

$\alpha_i$: "$S_i$ can be extended to an optimal schedule by only adding jobs from the set $\{J_k : i+1 \le k \le n\}$."
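
Written slightly more formally (this restatement is ours, using the same notation as above):

$$\alpha_i:\quad \exists\, E \subseteq \{J_k : i+1 \le k \le n\} \ \text{and feasible integer start times for } E \ \text{such that } S_i \cup E \text{ is an optimal schedule.}$$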

When $i = n$, the statement reads: $S_n$ can be extended to an optimal schedule by only adding jobs from $\{J_k : n+1 \le k \le n\} = \emptyset$, which means that $S_n$ is optimal. Since greedy terminates after $n$ iterations, $S_n$ is what greedy returns, so it returns an optimal solution.

Therefore, to prove that our algorithm is optimal, it suffices to prove the statement $\alpha_i$ for every $i$ up to $n$; we'll use induction for this.

Base Case

We want to show that $\alpha_0$ holds, that is, that $S_0$ can be extended to an optimal schedule by only adding jobs from the set $\{J_k : 1 \le k \le n\}$. Consider the floored optimal solution $O^*$ from the motivation section. After zero iterations of the greedy algorithm nothing has happened yet, so nothing is scheduled and $S_0 = \emptyset$; we can then trivially extend it as $S_0 \cup O^* = O^*$, and since $O^*$ only uses jobs from $\{J_k : 1 \le k \le n\}$, this is an optimal extension.

Induction Step

Now let $j$ be a natural number with $0 \le j \le n-1$, and assume that $\alpha_j$ holds; we'd like to show that $\alpha_{j+1}$ holds. That is, we must prove that $S_{j+1}$ can be extended to an optimal schedule using only jobs from $\{J_k : j+2 \le k \le n\}$. Note that our induction hypothesis says that $S_j$ can be extended to an optimal solution using only jobs from $\{J_k : j+1 \le k \le n\}$; let's denote this optimal solution by $O_j$.

At this point we do cases based on how our algorithm operates. When it considers the job $J_{j+1}$ there are two possibilities: either the job cannot fit into the schedule, or it can. Suppose it cannot be scheduled. Then $S_{j+1} = S_j$ because our schedule has stayed the same, and since $S_j$ can be extended to $O_j$, so can $S_{j+1}$. Moreover, greedy could not fit $J_{j+1}$, which means every integer slot at or before its deadline is already occupied in $S_j$, and hence in $O_j$ as well, so the extension cannot contain $J_{j+1}$ either; it therefore only uses jobs from $\{J_k : j+2 \le k \le n\}$, which is exactly what $\alpha_{j+1}$ requires.

The other possibility is that $J_{j+1}$ is scheduled by greedy, at some time $t_{j+1}$ (a natural number). Now $O_j$ either contains $J_{j+1}$ or it does not. If $O_j$ does contain $J_{j+1}$, then first of all it was added as part of an extension of $S_j$, so it does not collide with anything in $S_j$, and since $O_j$ is a valid solution it schedules $J_{j+1}$ at some time at or before its deadline; let's denote this time by $t_O$.

Greedy did the same thing: it took $J_{j+1}$ and scheduled it so that it extends $S_j$ without colliding with anything placed earlier, at a time at or before its deadline, with the one extra property that it chose the latest such time; this implies $t_O \le t_{j+1}$. If $t_O = t_{j+1}$, then both schedules place $J_{j+1}$ at exactly the same position, and we can extend $S_{j+1}$ by taking $S_j$'s extension and removing $J_{j+1}$ (which is fine because it is already scheduled in $S_{j+1}$); in other words, we have an extension of $S_{j+1}$ to an optimal solution using only $\{J_k : j+2 \le k \le n\}$. If instead $t_O < t_{j+1}$, first modify $O_j$ by swapping the contents of slots $t_O$ and $t_{j+1}$: slot $t_{j+1}$ was free in $S_j$ (greedy used it), so whatever occupies it in $O_j$, if anything, comes from the extension and can be moved earlier to $t_O$ without missing its deadline, while $J_{j+1}$ moves to $t_{j+1}$, which is at or before its own deadline. The swap preserves the total profit, so the modified schedule is still optimal and places $J_{j+1}$ exactly where greedy did, reducing us to the previous case.

Now for the final case: $S_{j+1}$ schedules $J_{j+1}$ but $O_j$ does not. Recall that greedy scheduled this job at time $t_{j+1}$. We claim $O_j$ must have something scheduled at that time: if it did not, then since $O_j$ is a solution in which every event sits on an integer slot, $O_j$ could fit in the job $J_{j+1}$ at time $t_{j+1}$ without overlapping any other event. This new solution would still be feasible, but its profit has increased, contradicting the optimality of $O_j$. Therefore $O_j$ must have something scheduled at time $t_{j+1}$.

Suppose $O_j$ has a job $J_m$ scheduled at time $t_{j+1}$. Note that $m \ne j+1$ because we are in the case where $O_j$ did not schedule $J_{j+1}$. Also, $J_m$ cannot belong to $S_j$ (the slot $t_{j+1}$ was still free there), so it must be part of the extension to $O_j$, and therefore $j+1 < m \le n$. Recall that our jobs $J$ were sorted in non-increasing order of profit, which means $p(J_{j+1}) \ge p(J_m)$. Now create a new solution $O'$ by taking $O_j$ and replacing $J_m$ with $J_{j+1}$; then $p(O') = p(O_j) + p(J_{j+1}) - p(J_m) \ge p(O_j)$. Since $O_j$ is optimal, the only way to avoid a contradiction is to have $p(J_{j+1}) = p(J_m)$, in which case $O'$ has the same profit as $O_j$ and is therefore also optimal. Finally, $O'$ contains $S_{j+1}$ and extends it to an optimal solution using only jobs from $\{J_k : j+2 \le k \le n\}$, which proves what we set out to show.

Schedule Tracker

We also have to prove that ScheduleTracker actually returns correct values. Since we already have a formal proof of why the overall algorithm works, we'll only sketch this one. Note that is_possible_to_schedule is correct as long as get_latest_possible_schedule_time is correct.

For convenience, let $L(t)$ denote the latest time at which a job can be scheduled at or before time $t$. It must satisfy the following recurrence:

$$L(t) = \begin{cases} t, & \text{if no job is scheduled at time } t \\ L(t-1), & \text{if there is a job scheduled at time } t \end{cases}$$

Since our algorithm answers this query with the find of a time, we want to show that $\mathrm{find}(t) = L(t)$ in every situation, that is, that the representative of the set containing $t$ is always $L(t)$.

We first iterate over all integer times up to the latest deadline and make a singleton set in the union-find for each of them, so initially each time is its own representative. Given any integer time $t$ before any jobs have been scheduled, $L(t) = t$, which is exactly $\mathrm{find}(t)$, as needed.

Now assume that $\mathrm{find}(t) = L(t)$ holds on a schedule with $k$ jobs; we'll show it still holds once $k+1$ jobs are scheduled. Suppose our algorithm has just added one new job $J$ at time $t$. The algorithm only schedules into a slot that is not already taken, so just before this step $L(t) = t$. When the job is placed, the algorithm unions the set containing $t$, call it $S$, with the set containing time $t-1$, call it $S_1$; this creates the new set $S' := S_1 \cup S$, and the representative of $S'$ is the representative of $S_1$, which we denote by $r$.

Note that for any time $t'$, its representative only changed if $t'$ was an element of $S$. For any such time, $t$ used to be its representative, but now its representative is $r$; we'll prove that $r = L(t')$, which gives the statement. Before the union we had $L(t') = t$, but now there is a job at time $t$, so $L(t') = L(t-1)$: slot $t$ is filled, so the latest free slot at or before $t'$ is now the latest free slot at or before $t-1$. But $t-1 \in S_1$, so by the induction hypothesis $L(t-1) = \mathrm{find}(t-1) = r$, as needed.

Therefore we've shown that at every stage, the union-find query really does return the latest time at which a job can be scheduled at or before time $t$.
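
As an informal sanity check of this argument (not a replacement for the proof), the snippet below compares the union-find answer against a naive scan that computes $L(t)$ directly; it reuses the ScheduleTracker from the pseudocode above and the Event shape assumed in the earlier usage example:

        import random
        from collections import namedtuple

        Event = namedtuple("Event", ["profit", "deadline"])

        def naive_latest_free_slot(occupied, t):
            # L(t): the latest free integer slot at or before t; 0 means "none left".
            while t > 0 and t in occupied:
                t -= 1
            return t

        tracker = ScheduleTracker([10])   # slots 0..10, slot 0 is the sentinel
        occupied = set()
        for _ in range(25):
            event = Event(profit=1, deadline=random.randint(1, 10))
            slot = tracker.get_latest_possible_schedule_time(event)
            assert slot == naive_latest_free_slot(occupied, event.deadline)
            if slot != 0:
                tracker.set_event(slot, event)
                occupied.add(slot)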

Runtime

The cost of this algorithm breaks down as follows: sorting the $n$ jobs by profit costs $O(n \log n)$; building one union-find slot per integer time costs $O(n)$ once deadlines are capped at $n$ as noted earlier; and the main loop performs at most $n$ find/union operations, each of which runs in near-constant amortized time thanks to path compression (the inverse Ackermann bound $\alpha(n)$ when union by rank is also used).

Therefore we can keep this algorithm at nearly $O(n \log_2 n)$ (the union-find part is effectively linear because of the inverse Ackermann function) instead of $O(n^2)$.
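
Putting the pieces together as one informal bound (assuming deadlines have been capped at $n$):

$$T(n) \;=\; \underbrace{O(n \log n)}_{\text{sorting}} \;+\; \underbrace{O(n)}_{\text{building slots}} \;+\; \underbrace{O\!\left(n\,\alpha(n)\right)}_{\text{union-find operations}} \;=\; O(n \log n).$$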