AdSense

Saturday, August 15, 2015

Merge K sorted linked lists/arrays revisited

K sorted arrays

I got a comment on my previous post Merge K sorted arrays, and I realized that there are much easier (and shorter) solutions to solve this problem. Here is the code snippet in both Java and Python.

Java:

public static List<Integer> merged (List<int[]> arrays){
  List<Integer> mergedArrays = new ArrayList<Integer> ();
  if (arrays == null || arrays.size() == 0){
   return mergedArrays;
  }
  for (int[] array : arrays){
   for (int i : array)
    mergedArrays.add(i);
  }
  Collections.sort(mergedArrays);
  return mergedArrays;
 }

Python:


def merged(lists):
    if lists is None or len(lists) == 0:
        return None
    mergedArray = [n for array in lists for n in array]
    return sorted(mergedArray)

Yeah, you can't disagree that the Pythonic way sometimes make things too easy to believe.

Why does this work:
Since in any case, we have to traverse the whole every element in the list of arrays, the complexity is O (mn), where m is the length of the list and n is the average length of each array.  Collections.sort() implements Timsort (see here, here and here) which on average has complexity O(nlogn). The time complexity is the same as using a priority queue, but with much shorter code.


K sorted lists

This gives me a reason to rethink the merge K sorted linked lists problem.
Unfortunately, since we need to return a linked list, we cannot use any built-in sorting methods.
I rewrite the code using Python. In Python, the built-in heapq only provides sorting based on natural order. Since the queue can sort tuples based on its first element, one solution could be wrapping the elements to be sorted to tuples (priority, element). See the code for detail[1].

import heapq
from mergeAndSort.ListNode import ListNode

class PriorityQueue(object):
    def __init__(self, initial = None, key=lambda x:x):
        self.key = key
        if initial:
            self._data = [(key(item), item) for item in initial]
            heapq.heapify(self._data)
        else:
            self._data = []

    def push(self, item):
        heapq.heappush(self._data, (self.key(item), item))

    def pop(self):
        return heapq.heappop(self._data)[1]

    def __len__(self):
        return len(self._data)

    def empty(self):
        return len(self._data) == 0

class ListNode(object):
    def __init__(self, val):
        self.val = val
        self.next = None

def mergeKLists(lists):
    if lists is None or len(lists) == 0:
            return None
    pq = PriorityQueue(key=lambda x:x.val)
    for n in lists:
        if n is not None:
            pq.push(n)
    head = ListNode(-1)
    h = head
    while not pq.empty():
        n = pq.pop()
        head.next = n
        head = head.next
        if n.next is not None:
            pq.push(n.next)
    return h.next

Comparison
Based on Leetcode's compiler, this Python implementation is not faster than the Java one. I'm sure there should be better solutions, please let me know.

Reference:
[1] http://stackoverflow.com/questions/8875706/python-heapq-with-custom-compare-predicate

1 comment:

  1. try heap sort during merge, the complexity could be reduced to mlgk.

    ReplyDelete