Sunday, February 28, 2021

Top "n" using a Priority Queue

If you ever need to capture the smallest or largest "n" from a stream of data, the approach more often than not will be to use a simple data-structure called the Priority Queue. 

Priority Queues do one thing very well - once a bunch of data is added, it can return the lowest value (or the highest value) in constant time. 

How is this useful to answer a top or bottom "n" type question. Let's see. 

Consider this hypothetical stream of data:




And you have to answer the smallest 2 at any point as this stream of data comes in. The trick is to use a Priority Queue 

  • with a size limit of 2
  • which returns the largest of these 2 when asked for (sometimes referred to as a Max Priority Queue)

Two considerations as data streams in:
  • if the size of the priority queue is less than 2 then add the value to the priority queue
  • if the size of the priority queue is equal to 2, then compare the value from the stream with the largest in the queue
    • if less then remove the largest and add the new value
    • if more then do nothing

At the bottom is the state of the Priority Queue as each data in the stream is processed:



See how it always holds the bottom 2. 

Similarly for the largest 3, a Priority Queue with a max capacity of 3 which returns the smallest (referred to as a Min Priority Queue) can be used the following way:
  • if the size of the priority queue is less than 3, then add to the priority queue
  • if the size is equal to 2, then check the value from the stream with the smallest in the queue
    • if more then remove smallest add the value from stream and ignore otherwise



Implementation

Here is a simple kotlin based implementation that uses the built in PriorityQueue in Java standard library.


fun findNSmallestAndLargest(nums: List<Int>, n: Int): Pair<List<Int>, List<Int>> {
    val minFirst: Comparator<Int> = Comparator.naturalOrder<Int>()
    val maxFirst: Comparator<Int> = minFirst.reversed()
    val minPq: PriorityQueue<Int> = PriorityQueue(minFirst)
    val maxPq: PriorityQueue<Int> = PriorityQueue(maxFirst)
    for (num in nums) {
        checkAndAddIfSmallest(maxPq, n, num)
        checkAndAddIfLargest(minPq, n, num)
    }
    return maxPq.toList() to minPq.toList()
}

private fun checkAndAddIfSmallest(maxPq: PriorityQueue<Int>, n: Int, num: Int) {
    if (maxPq.size < n) {
        maxPq.add(n)
    } else if (num < maxPq.peek()) {
        maxPq.poll()
        maxPq.add(num)
    }
}

private fun checkAndAddIfLargest(minPq: PriorityQueue<Int>, n: Int, num: Int) {
    if (minPq.size < n) {
        minPq.add(n)
    } else if (num > minPq.peek()) {
        minPq.poll()
        minPq.add(num)
    }
}

The implementation is very straightforward and follows the outlined algorithm to the letter.