likes
comments
collection
share

最小堆算法介绍及应用

作者站长头像
站长
· 阅读数 20

最小堆是一种特殊的二叉堆,它满足以下性质:

  1. 每个节点的值都小于或等于其子节点的值。
  2. 根节点的值是堆中的最小值。

最小堆的优点:

  1. 快速找到最小值:由于根节点是最小值,所以可以在O(1)的时间复杂度内找到最小值。
  2. 快速插入和删除:插入和删除操作的时间复杂度为O(log n),其中n是堆中元素的数量。
  3. 可以用于实现优先队列:最小堆可以用于实现优先队列,其中元素按照优先级进行排序。

最小堆的缺点:

  1. 不支持快速找到最大值:由于最小堆的性质,它不支持快速找到最大值。如果需要找到最大值,需要遍历整个堆。
  2. 不支持快速删除任意元素:最小堆只支持删除根节点,如果需要删除其他节点,需要先找到该节点,然后再进行删除操作,时间复杂度为O(log n)。
  3. 不支持快速更新元素:如果需要更新堆中的某个元素,需要先找到该元素,然后进行更新操作,时间复杂度为O(n)。

总结: 最小堆适用于需要快速找到最小值和支持快速插入、删除操作的场景,但不适用于需要快速找到最大值、快速删除任意元素或快速更新元素的场景。

下面是一个使用图表解释最小堆算法的示例:

  1. 初始堆:

        5
      /   \
     8     10
    / \   /  \
    12  15 17  20
    
    
  2. 插入元素13:

        5
      /   \
     8     10
    / \   /  \
    12  15 17  20
    /
    13
    
    

    插入元素13后,根节点的值仍然是最小值。

  3. 删除根节点:

        8
      /   \
     12    10
    / \   /  \
    13  15 17  20
    
    

    删除根节点后,新的根节点8是堆中的最小值。

  4. 插入元素7:

        7
      /   \
     8     10
    / \   /  \
    12  15 17  20
    /
    13
    
    

    插入元素7后,根节点的值仍然是最小值。

通过不断插入和删除操作,最小堆可以保持根节点的值始终是最小值。这使得最小堆在优先队列、堆排序等算法中有着广泛的应用。

下面通过一个简单的例子看学习一下

package main

import (
    "fmt"
)

type MinHeap struct {
    arr []int
}

func NewMinHeap() *MinHeap {
    return &MinHeap{
        arr: []int{},
    }
}

func (h *MinHeap) Insert(val int) {
    h.arr = append(h.arr, val)
    h.heapifyUp(len(h.arr) - 1)
}

func (h *MinHeap) ExtractMin() (int, error) {
    if len(h.arr) == 0 {
        return 0, fmt.Errorf("Heap is empty")
    }

    min := h.arr[0]
    last := len(h.arr) - 1
    h.arr[0] = h.arr[last]
    h.arr = h.arr[:last]
    h.heapifyDown(0)

    return min, nil
}

func (h *MinHeap) heapifyUp(index int) {
    parent := (index - 1) / 2

    if index == 0 || h.arr[index] >= h.arr[parent] {
        return
    }

    h.arr[index], h.arr[parent] = h.arr[parent], h.arr[index]
    h.heapifyUp(parent)
}

func (h *MinHeap) heapifyDown(index int) {
    left := 2*index + 1
    right := 2*index + 2
    smallest := index

    if left < len(h.arr) && h.arr[left] < h.arr[smallest] {
        smallest = left
    }

    if right < len(h.arr) && h.arr[right] < h.arr[smallest] {
        smallest = right
    }

    if smallest != index {
        h.arr[index], h.arr[smallest] = h.arr[smallest], h.arr[index]
        h.heapifyDown(smallest)
    }
}

func main() {
    h := NewMinHeap()

    h.Insert(5)
    h.Insert(3)
    h.Insert(8)
    h.Insert(1)
    h.Insert(10)

    min, err := h.ExtractMin()
    if err != nil {
        fmt.Println(err)
    } else {
        fmt.Println("Extracted min:", min)
    }

    min, err = h.ExtractMin()
    if err != nil {
        fmt.Println(err)
    } else {
        fmt.Println("Extracted min:", min)
    }
}

输出

Extracted min: 1
Extracted min: 3

在实际的项目中,我们也能看到最小堆的身影,比如kubenetes client-go项目中,实现的延时队列功能时的应用


// waitForPriorityQueue implements a priority queue for waitFor items.
//
// waitForPriorityQueue implements heap.Interface. The item occurring next in
// time (i.e., the item with the smallest readyAt) is at the root (index 0).
// Peek returns this minimum item at index 0. Pop returns the minimum item after
// it has been removed from the queue and placed at index Len()-1 by
// container/heap. Push adds an item at index Len(), and container/heap
// percolates it into the correct location.
type waitForPriorityQueue []*waitFor

func (pq waitForPriorityQueue) Len() int {
	return len(pq)
}
func (pq waitForPriorityQueue) Less(i, j int) bool {
	return pq[i].readyAt.Before(pq[j].readyAt)
}
func (pq waitForPriorityQueue) Swap(i, j int) {
	pq[i], pq[j] = pq[j], pq[i]
	pq[i].index = i
	pq[j].index = j
}

// Push adds an item to the queue. Push should not be called directly; instead,
// use `heap.Push`.
func (pq *waitForPriorityQueue) Push(x interface{}) {
	n := len(*pq)
	item := x.(*waitFor)
	item.index = n
	*pq = append(*pq, item)
}

// Pop removes an item from the queue. Pop should not be called directly;
// instead, use `heap.Pop`.
func (pq *waitForPriorityQueue) Pop() interface{} {
	n := len(*pq)
	item := (*pq)[n-1]
	item.index = -1
	*pq = (*pq)[0:(n - 1)]
	return item
}

// Peek returns the item at the beginning of the queue, without removing the
// item or otherwise mutating the queue. It is safe to call directly.
func (pq waitForPriorityQueue) Peek() interface{} {
	return pq[0]
}

看到这里,可能会有点懵,这怎么就是最小堆了呢


// maxWait keeps a max bound on the wait time. It's just insurance against weird things happening.
// Checking the queue every 10 seconds isn't expensive and we know that we'll never end up with an
// expired item sitting for more than 10 seconds.
const maxWait = 10 * time.Second

// waitingLoop runs until the workqueue is shutdown and keeps a check on the list of items to be added.
func (q *delayingType) waitingLoop() {
	defer utilruntime.HandleCrash()

	// Make a placeholder channel to use when there are no items in our list
	never := make(<-chan time.Time)

	// Make a timer that expires when the item at the head of the waiting queue is ready
	var nextReadyAtTimer clock.Timer

	waitingForQueue := &waitForPriorityQueue{}
	heap.Init(waitingForQueue)

	waitingEntryByData := map[t]*waitFor{}

	for {
		if q.Interface.ShuttingDown() {
			return
		}

		now := q.clock.Now()

		// Add ready entries
		for waitingForQueue.Len() > 0 {
			entry := waitingForQueue.Peek().(*waitFor)
			if entry.readyAt.After(now) {
				break
			}

			entry = heap.Pop(waitingForQueue).(*waitFor)
			q.Add(entry.data)
			delete(waitingEntryByData, entry.data)
		}

		// Set up a wait for the first item's readyAt (if one exists)
		nextReadyAt := never
		if waitingForQueue.Len() > 0 {
			if nextReadyAtTimer != nil {
				nextReadyAtTimer.Stop()
			}
			entry := waitingForQueue.Peek().(*waitFor)
			nextReadyAtTimer = q.clock.NewTimer(entry.readyAt.Sub(now))
			nextReadyAt = nextReadyAtTimer.C()
		}

		select {
		case <-q.stopCh:
			return

		case <-q.heartbeat.C():
			// continue the loop, which will add ready items

		case <-nextReadyAt:
			// continue the loop, which will add ready items

		case waitEntry := <-q.waitingForAddCh:
			if waitEntry.readyAt.After(q.clock.Now()) {
				insert(waitingForQueue, waitingEntryByData, waitEntry)
			} else {
				q.Add(waitEntry.data)
			}

			drained := false
			for !drained {
				select {
				case waitEntry := <-q.waitingForAddCh:
					if waitEntry.readyAt.After(q.clock.Now()) {
						insert(waitingForQueue, waitingEntryByData, waitEntry)
					} else {
						q.Add(waitEntry.data)
					}
				default:
					drained = true
				}
			}
		}
	}
}

// insert adds the entry to the priority queue, or updates the readyAt if it already exists in the queue
func insert(q *waitForPriorityQueue, knownEntries map[t]*waitFor, entry *waitFor) {
	// if the entry already exists, update the time only if it would cause the item to be queued sooner
	existing, exists := knownEntries[entry.data]
	if exists {
		if existing.readyAt.After(entry.readyAt) {
			existing.readyAt = entry.readyAt
			heap.Fix(q, existing.index)
		}

		return
	}

	heap.Push(q, entry)
	knownEntries[entry.data] = entry
}

这就涉及到golang中的堆,以及如何让自定义的结构体类型转化为堆,首先: 必须实现heap的接口

// Note that Push and Pop in this interface are for package heap's
// implementation to call. To add and remove things from the heap,
// use heap.Push and heap.Pop.
type Interface interface {
	sort.Interface
	Push(x any) // add x as element Len()
	Pop() any   // remove and return element Len() - 1.
}

其次,必须调用Init方法,将自定义的结构,初始化heap,然后就可以使用heap.Push和heap.Pop方法进行操作堆了,在节点的优先级变更后,使用Fix方法进行重建堆。详细的方法如下:


// The Interface type describes the requirements
// for a type using the routines in this package.
// Any type that implements it may be used as a
// min-heap with the following invariants (established after
// Init has been called or if the data is empty or sorted):
//
//	!h.Less(j, i) for 0 <= i < h.Len() and 2*i+1 <= j <= 2*i+2 and j < h.Len()
//
// Note that Push and Pop in this interface are for package heap's
// implementation to call. To add and remove things from the heap,
// use heap.Push and heap.Pop.
type Interface interface {
	sort.Interface
	Push(x any) // add x as element Len()
	Pop() any   // remove and return element Len() - 1.
}

// Init establishes the heap invariants required by the other routines in this package.
// Init is idempotent with respect to the heap invariants
// and may be called whenever the heap invariants may have been invalidated.
// The complexity is O(n) where n = h.Len().
func Init(h Interface) {
	// heapify
	n := h.Len()
	for i := n/2 - 1; i >= 0; i-- {
		down(h, i, n)
	}
}

// Push pushes the element x onto the heap.
// The complexity is O(log n) where n = h.Len().
func Push(h Interface, x any) {
	h.Push(x)
	up(h, h.Len()-1)
}

// Pop removes and returns the minimum element (according to Less) from the heap.
// The complexity is O(log n) where n = h.Len().
// Pop is equivalent to Remove(h, 0).
func Pop(h Interface) any {
	n := h.Len() - 1
	h.Swap(0, n)
	down(h, 0, n)
	return h.Pop()
}

// Remove removes and returns the element at index i from the heap.
// The complexity is O(log n) where n = h.Len().
func Remove(h Interface, i int) any {
	n := h.Len() - 1
	if n != i {
		h.Swap(i, n)
		if !down(h, i, n) {
			up(h, i)
		}
	}
	return h.Pop()
}

// Fix re-establishes the heap ordering after the element at index i has changed its value.
// Changing the value of the element at index i and then calling Fix is equivalent to,
// but less expensive than, calling Remove(h, i) followed by a Push of the new value.
// The complexity is O(log n) where n = h.Len().
func Fix(h Interface, i int) {
	if !down(h, i, h.Len()) {
		up(h, i)
	}
}

func up(h Interface, j int) {
	for {
		i := (j - 1) / 2 // parent
		if i == j || !h.Less(j, i) {
			break
		}
		h.Swap(i, j)
		j = i
	}
}

func down(h Interface, i0, n int) bool {
	i := i0
	for {
		j1 := 2*i + 1
		if j1 >= n || j1 < 0 { // j1 < 0 after int overflow
			break
		}
		j := j1 // left child
		if j2 := j1 + 1; j2 < n && h.Less(j2, j1) {
			j = j2 // = 2*i + 2  // right child
		}
		if !h.Less(j, i) {
			break
		}
		h.Swap(i, j)
		i = j
	}
	return i > i0
}

这up和down的实现,和我们自己的实现,是不是如出一辙