Median of Two Sorted Arrays

Question (LC.4)

Given two sorted arrays, find the median of the these two arrays.

Example

([1,3], [2]) => 2
([1,2], [3,4]) => 2.5
([1,4], [2,3]) => 2.5

Analysis

There two cases 1. the total length is even 2. the total length is odd

Brute Force

The intuitive approach is merging these two arrays. Then compute the median base on the total length. We need O(m+n) time and O(m+n) space for this very brute force approach. We can refine it by counting to the half the total length. This refined brute force will take O((m+n)/2) time and O(1) space.

Better Approach

In order do better than linear time, we need to decompose the search space. Then we need to have two things in hand: 1. define a well-ordered search space 2. create a O(1) validator to reset the interval. If you think really hard on creating a nice search space, then the validator will be easy to create. If you just take the problem space as the search space, then the the creation of the validator requires some thinking.

Binary Search on k

Consider median as a specical case of find the kth largest element. Then the problem becomes (int[] sortedArr1, int[] sortedArr2, int k) => kth largest element We can performa a binary search on k and modify the search space (halve the search target instead of the search space). Distribute the load between two arrays k/2 to arr1 and k/2 to arr2 What if an array is too small? then dump the longer one because we are not sure about the how small/how large are the elements in the smaller array.

public double findMedianSortedArrays(int[] arr1, int[] arr2) {
    int m = arr1.length, n = arr2.length;
    int totalLength = m + n;
    double median;
    if (totalLength % 2 == 0) { // even
        median = 
            (findKthLargest(arr1, arr2, totalLength / 2, 0, 0) + 
             findKthLargest(arr1, arr2, (totalLength / 2) + 1, 0, 0)) / 2.0;
    } else {
        median = findKthLargest(arr1, arr2, (totalLength / 2) + 1, 0, 0);
    }
    return median;
}

// note k is a count instead of an index
public int findKthLargest(int[] arr1, int[] arr2, int k, int p1, int p2) {
    // empty case (the shorter one is all removed)
    if (p1 >= arr1.length) {
        return arr2[p2 + k - 1];
    }
    if (p2 >= arr2.length) {
        return arr1[p1 + k - 1];
    }
    // base case
    if (k == 1) {
        return Math.min(arr1[p1], arr2[p2]);
    }
    // decrease
    int validator1 = Integer.MAX_VALUE;
    int validator2 = Integer.MAX_VALUE;
    if (p1 + k/2 - 1 < arr1.length) {
        validator1 = arr1[p1 + k/2 - 1];
    }
    if (p2 + k/2 - 1 < arr2.length) {
        validator2 = arr2[p2 + k/2 - 1];
    }
    // conquer
    if (validator1 < validator2) {
        return findKthLargest(arr1, arr2, k - k/2, p1 + k/2, p2);
    } else {
        return findKthLargest(arr1, arr2, k - k/2, p1, p2 + k/2);
    }
}

The time complexity is O(logk) = O(log((n+m)/2)) = O(log(n+m)). Since the deleting operation only involves pointer movements, the space complexity is O(1).

Median Validator

The other way is separating arr1 and arr2 into 2 sets that satisfy the definition of the median. The validator will take a long time to come up with to handle all the possible cases.

Last updated