排序算法

简单排序

冒包排序

冒泡排序第一趟找出最小的元素,第二趟找出剩下最小的元素,依次类推。

def bubble_sort(A):
    n = len(A)
    if n < 2:
        return
    for i in range(n - 1):
        for j in range(i+1, n):
            if A[i] > A[j]:
                A[i], A[j] = A[j], A[i]
    return A

print bubble_sort([8, 94, 45, 63, 7, 7, 31])
[7, 7, 8, 31, 45, 63, 94]

插入排序

将当前元素插入到前面已经排序的合适位置,在算法设计的时候可以将前面比当前大的元素依次后移,再把当前元素插入正确的空位中。

insert-sort.png

def insert_sort(A):
    n = len(A)
    if n < 2:
        return
    for i in range(n):
        cur, j = A[i], i                # move cur to correct pos
        while j > 0:
            if cur < A[j-1]:
                A[j] = A[j-1]
            else:
                break
            j -= 1
        A[j] = cur                      # j is the correct pos
    return A
print insert_sort([8, 94, 45, 63, 7, 7, 31])
[7, 7, 8, 31, 45, 63, 94]

经典排序

归并排序

归并排序需要两步工作,第一步是划分未排序,第二步是合并已排序。

def merge_sorted(A, start, mid, end):
    a, b, i, j = A[start:mid], A[mid:end], 0, 0
    for k in range(start, end):
        if ((i == mid - start) or               # a is merged
            ((j < end - mid) and a[i] > b[j])): # merge smaller b[j]
            A[k] = b[j]
            j += 1
        else:                                   # merge smaller a[i]
            A[k] = a[i]
            i += 1

def do_merge_sort(A, start, end):
    if end - start < 2:
        return
    mid = (start + end) / 2
    do_merge_sort(A, start, mid)        # split
    do_merge_sort(A, mid, end)          # split
    merge_sorted(A, start, mid, end)    # merge

def merge_sort(A):
    do_merge_sort(A, 0, len(A))
    return A

print merge_sort([8, 94, 45, 63, 7, 7, 31])
[7, 7, 8, 31, 45, 63, 94]

堆排序

堆排序是设计非常巧妙的排序,堆是一个数组,展开为树结构之后保证上一层元素比下一层元素大。

堆排序最巧妙的函数是max_heapify(),需要注意这是一个递归函数,它将一个不合法的值插入到根节点,前提是左右子堆都是合法的堆,通过递归交换,把不合法的值传递到合法的位置,时间复杂度为O(lgn)。这个函数被两次利用,建堆的时候用一次,排序的时候调用一次。建堆时从下往上构建,排序时从上往下取出最大值。

heap-arrary.png

def left(i):
    return 2*i + 1
def right(i):
    return 2*i + 2

def max_heapify(A, i, size):            # O(lgn)
    maxi, l, r = i, left(i), right(i)
    if l < size and A[l] > A[maxi]:     # find max idx
        maxi = l
    if r < size and A[r] > A[maxi]:     # find max idx
        maxi = r
    if maxi != i:
        A[i], A[maxi] = A[maxi], A[i]   # A[i] set to be max
        max_heapify(A, maxi, size)      # heapify sub-heap

def heap_sort(A):
    size = len(A)
    if size < 2:
        return A
    for i in range(size/2, -1, -1):     # build heap
        max_heapify(A, i, size)
    for i in range(size - 1, 0, -1):
        A[i], A[0] = A[0], A[i]         # set A[i] to max value
        max_heapify(A, 0, i)            # heapify left unsorted array
    return A

print heap_sort([8, 94, 45, 63, 7, 7, 31])
[7, 7, 8, 31, 45, 63, 94]

快速排序

快速排序的思路还是比较清晰的,将序列分成两半,一半比指定值小,一半比指定值大,递归分割完成之后就排好序了。

def simple_partition(A, start, end):
    if end - start < 2:
        return start
    i, x = start, A[end-1]
    for j in range(start, end):
        if A[j] < x:                    # find a < x
            A[i], A[j] = A[j], A[i]     # move a to left
            i += 1
    A[i], A[end-1] = A[end-1], A[i]     # set mid value
    return i                            # get mid index

def hoare_partition(A, start, end):
    if end - start < 2:
        return start
    x, i, j = A[start], start, end-1
    while i < j:
        while i < j and A[j] > x:       # find a <= x from right
            j -= 1
        while i < j and A[i] <= x:      # find b > x from left
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]     # move a to the left of b
    A[start], A[j] = A[j], A[start]     # set mid value
    return j                            # get mid index

def do_quick_sort(A, start, end, pfunc):
    if end - start < 2:
        return
    mid = pfunc(A, start, end)
    do_quick_sort(A, start, mid, pfunc) # sort A[start:mid]
    do_quick_sort(A, mid+1, end, pfunc) # sort A[mid+1:end]

def quick_sort(A, pfunc):
    do_quick_sort(A, 0, len(A), pfunc)
    return A

print quick_sort([8, 94, 45, 63, 7, 7, 31], simple_partition)
print quick_sort([8, 94, 45, 63, 7, 7, 31], hoare_partition)
[7, 7, 8, 31, 45, 63, 94]
[7, 7, 8, 31, 45, 63, 94]

特殊排序

计数排序

计数排序的思路是非常简单的,如果我们知道小于或等于当前元素的个数,那么就可以直接将当前元素放到输出序列的对应位置。

这种排序算法需要限制输入序列值的范围,并且只能处理整数。

def count_sort(A):
    minx = min(A)
    A = [x-minx for x in A]
    n, k = len(A), max(A) + 1
    B, C = [0] * n, [0] * k
    for i in range(n):
        C[A[i]] = C[A[i]] + 1           # C[i] == counts(val == i)
    for i in range(1, k):
        C[i] += C[i-1]                  # C[i] == counts(val <= i)
    for i in range(n - 1, -1, -1):
        B[C[A[i]]-1] = A[i]
        C[A[i]] -= 1
    A = [x+minx for x in B]
    return A

print count_sort([8, 94, 45, 63, 7, 7, 31])
[7, 7, 8, 31, 45, 63, 94]