Motivation

Before getting a solution to this question, the fact that the due dates are abitrary real values seems to throw a wrench in the gears. We can think about this question clearly when the due dates are natural numbers because then we know that we can place our events exactly on integers, otherwise we could place them in any position, this greatly complicates things. To deal with this we will note the following:

Suppose that we have any optimal solution O = ( o 1 , o 2 , o m ) , we claim that a new solution can be produced which has the same profit. This new solution is one where the solution only schedules at integer values, less than or equal to their original start time.

We can prove this true with induction using the following claim "We can edit the original optimal solution up to index i by moving events leftward and snapping them to integer values while maintaining optimality". We'll sketch the proof, in the base case we know that we can move our first scheduled event all the way to 0 , this is because 0 is an integer less than or equal to all start times, also moving this event backward will never collide with anything else because there are no events before it, it also maintains the same profit since o 1 is still scheduled before it's due date.

In the induction step we assume that we can do this up to event k , now we note that the finish time of event k is at some integer j , it's not hard to see that for event k + 1 it can be slid back to start at time j or any integer between j and it's original start time, because it will have nothing to collide with there and it maintained it's deadline.

Therefore we know that given any sequence of deadlines, we have some optimal solutions which all yield the same profit, we can then select one of them and "floor" it. Notationally, we'll denote this optimal solution by O . The reason why this is good is that we can try to come up with an algorithm that schedules events at integer values and it's still possible for this algorithm to be optimal.

We will also note that for any optimal solution it will use at most n jobs, and thus any optimal solution can be contracted to a solution taking up at most n integers (by moving back deadlines).

Algorithm

Take the list of jobs and sort them by profit by highest profit first. Now we iterate through these jobs and schedule them at the latest possible time before their deadline where this time is an integer.

Pseudocode

        class ScheduleTracker:
            union_find = UnionFind()

            sets = []

            int earliest_scheduled_time = inf
            schedule = []

            def init(deadlines):
                latest_deadline = max(deadlines)
                for i in range(0, latest_deadline + 1):
                    sets.append(union_find.make_set(i))

            def set_event(time, event):
                union_find.union(sets[time - 1], sets[time])
                earliest_scheduled_time = event.start_time
                schedule.append(event)

            def get_latest_possible_schedule_time(event):
                return union_find.find(event.deadline)

            def is_possible_to_schedule(event):
                return get_latest_possible_schedule_time(event) != 0

        def maximize_profits(events):
            events = events_sorted_non_increasing_by_profit(events)

            schedule_tracker = ScheduleTracker()

            for event in events:
                if schedule_trakcer.is_possible_to_schedule(event):
                    time = schedule_tracker.get_latest_possible_time(event)
                    schedule_tracker.set_event(time, event)

            return schedule_tracker.get_profit()
    

Correctness

We'll start by showing this yields an optimal solution. We'll end with time complexity. The proof of correctness of this algorithm is not trivial, we will employ some additional machinery to make our life easier.

Let J := ( J 1 , J 2 , , J n ) be the jobs sorted by non-increasing profits, let S i be the schedule that greedy has come up with after i iterations of the algorithm. Now observe this statement:

α i : " S i can be extended to an optimal schedule by only adding jobs from the set { J k : i + 1 k n } ".

When i = n in the above statement it reads, S n can be extended to an optimal schedule only adding jobs from , which means that S n is optimal. Since after n iterations greedy will terminate, then S n is what is returned by greedy, so it will return an optimal solution.

Therefore to prove that our algorithm is optimal, we can prove the above statement up to n , we'll use induction for this.

Base Case

We want to show that α 0 holds true, that is, we want to show that S 0 can be extended to an optimal schedule by only adding jobs from the set { J k : 1 k n } . Suppose O is the optimal solution given these jobs, and consider the "floored" optimal solution O , After zero iterations of the greedy algorithm have occurred, nothing yet has happened, therefore nothing is scheduled S 0 = , then we can trivially just extend this by constructing S 0 O = O so it is an optimal extension

Induction Step

Now let j N 1 , and assume that α j holds true, we'd like to show that α j + 1 holds true. That is we must prove that S j + 1 can be extended to an optimal schedule by only using jobs from { J k : j + 2 k n } . Note that our induction hypothesis assumes that S j could be extended to an optimal solution only using jobs { J k : j + 1 k n } , let's denote this optimal solution by O j

At this point, we will now do cases based on how our algorithm operates. First of all when we consider the job J j + 1 , there are two possibilities, either it's not able to fit into the schedule, or it is. Suppose it cannot be scheduled, then this means S j + 1 = S j because our schedule has stayed the same, but we know that S j can be extended to O j , since S j is the same as S j + 1 , then it can also be extended to O j .

Another option is that J k + 1 is scheduled by greedy. If it is scheduled it is scheduled at some time t j + 1 (a natural number). We must also see that O j either contains J k + 1 or it does not. If O j does contain J k + 1 , then we note that first of all, it is added in a way so that it doesn't collide with anything in S j because it was an extension, and since it is a solution we know that it scheduled it at a time before or equal to its deadline, let's denote this time as t O

At the same time, greedy also did the same thing, it took J j + 1 and it scheuled it in a way that extended S j and did not collide with anything previously, and it also scheduled it at a time before or equal to it's deadline, with the one extra assumption that it is scheduled at the latest possible time, this implies that t O t j + 1 . Quickly see that if t O = t j + 1 then they've scheduled the job J j + 1 at the exact same position, and thus can be extended by taking S j 's extension and removing J j + 1 (which is fine because it's already scheduled in S j + 1 , in otherwords we have an extension of S j + 1 to an optimal solution using only { J k : i + 2 k n }

Now for the final case, we have that S j + 1 did schedule J j + 1 but O j did not schedule it. Recall that greedy scheduled this job at time t j + 1 , we'll now show that O j must have something scheduled at that time, for if it did not, then since O j is a solution where each event is scheuled on integer values, we know that O j can fit in job J j + 1 while not overlapping another event, this means that this new solution is still feasable, but it's profit has increased, but O j was optimal, this is a contradiction, so therefore O j must have something scheduled at time t j + 1 .

Suppose O j has a job J m scheduled at time t j + 1 , we note that m j + 1 because we're under the assumption that O j didn't schedule that job. But we know that J j + 1 was not scheuled yet by S j , this means it was part of the extension to O j , therefore we have j + 1 < m n , recall that our jobs J were sorted in non-decreasing order which means that p ( J j + 1 ) p ( J m ) , let's create a new solution where we take O j and replace J m with J j + 1 denote this solution by O , note that p ( O ) = p ( O j ) + p ( J j + 1 ) p ( J m ) p ( O j ) , note that the only way to avoid contradiction is to have that p ( J j + 1 ) = p ( J m ) , if that's true then O has the same profit as O j therefore it is optimal. Note that O included S j + 1 and extends it to an optimal solution only using jobs { J k : j + 2 k n } which proves what we set out to show.

Schedule Tracker

We also have to prove that ScheduleTracker actually returns correct values as well. Since we've already have a formal proof for why the overall algorithm works, we'll sketch the proof here. Note that is_possible_to_schedule is correct if get_latest_possible_schedule_time is correct.

For convience let L ( t ) denote: latest time to scehdule a job before or equal to time t . We can see that it must satisfy the following L ( t ) = { t , if no job scheduled at time  t L ( t 1 ) , if there is a job scheduled at time  t Since our algorithm returns the find of a time, then we want to show that find(t) = L ( t ) in any situation. That is that the representative of any set is L ( t )

We first iterate over all integer times before the last deadline and make a set in the union find with that specific time, this means that each time is it's own representative, we can see that now given some integer time t before any jobs are scheduled, then L ( t ) = t which is exactly find(t) as needed.

Assume that find(t) = L ( t ) on a schedule with k jobs scheduled, now we'll show it holds true on a schedule with k + 1 jobs scheduled. Suppose that our algorithm has just added one new job J at time t , by the algorithms we only schedule at position that was not already scheduled this means L ( t ) = t , when there were k jobs our algorithm unions the set t is in S with the one at time t 1 called S 1 this creates the new set S := S 1 S the representative of S is the representative in S 1 , denote this representative by r

Note that given any time in t , it's representative only changed if it was an element in S . For any one of these times, t used to be their representative, but now their representative is r , we'll prove that r = L ( t ) , which shows the statement. Recall that before the union we used ot have L ( t ) = t , but now we have a job at time t , therefore now we have that L ( t ) = L ( t 1 ) , this can be seen because slot t is filled so we can only choose the latest time coming before it. But t 1 S 1 and so by the induction hypothesis L ( t 1 ) = find( t 1 ) ) = r as needed.

Therefore we've shown that at any stage, the union-find method actually returns the latest time to schedule a job before or equal to time t .

Runtime

The cost of this alg:

Therefore we can keep this algorithm at nearly O ( n log 2 n ) (because of the ackerman function) instead of O ( n 2 )