最新评论 (0)


Data Structures: Heaps

大家好 我是 Gayle Laakmann McDowell 《程序员面试金典》的作者
Hi, I’m Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today we’ll talk
about a topic that a lot of candidates forget about, heaps. Heaps come in one of
堆有两种形式 最小堆和最大堆 今天我们只讲最小堆
two forms, a min heap or max heap. We’ll just focus on min heaps today because a
因为最大堆基本上就是最小堆反过来 在最小堆中 所有元素都会小于
max heap is essentially the reverse. In a min heap the elements are all smaller than
它们的子元素 所以根节点就会是最小的元素了
their children so the root node will be the very smallest element and then
从根节点沿着堆的分支往下 元素会
looking down the tree down the heap, the elements get bigger and bigger and
这就是堆的基本定义 但是我们如何创建并维护堆
So that’s the basics of what a heap is but how do we actually create and maintain
这种数据结构呢?让我们先从插入元素开始 当我们插入元素时
such a data structure? So let’s start with just insertion. So when we insert an
元素总会沿着从上往下 从左往右的顺序 插入下一个空的位置
element it always goes in the next empty spot looking top to bottom left to right.
就像这样 首先我们在这插入一个元素 然后在这里再插入一个 再插入一个
So we go, first we insert an element here, and then here and then here and then
一直沿着堆型结构插入 这就是元素插入的方式了
here and so on through the tree, through the heap. So that’s how insertion works.
当然你会问 如果元素没插入到正确的位置
But then of course what happens if that’s not really where the element
should go?
我们可以先插入元素 然后用冒泡法
What we can do is we can insert the element there and then bubble it up
找到正确的位置 也就是我们拿着要插入的元素 然后
until we get to the right spot. So we take the inserted element, we compare it
和它的父元素比较 如果顺序不对 那就交换它们的位置 然后继续沿着树
with its parent, if it’s out of order, swap them and then keep going up the
向上比较 那如果我们直接去掉最小元素会怎么样呢?
tree in this process. Now what about removing the minimum element? So
我们知道最小元素肯定会在根节点上 所以
we know the minimum element will always be the root node and so that’s easy to
找到它还是很容易的 如果移除它 那我们就会得到一个空的点
find but then if we want to remove it we might have an empty spot. So what we
所以当我们移除了最小元素后怎么做呢 我们先去掉根节点上的元素
do here is we remove the min element there, so we take out the root and then
然后将最后插入的元素的值放到根节点 当然
we swap that value at the root with the last element added. And then of course
新放进来的元素未必是最小的元素 所以我们拿着这个新根节点的元素 用冒泡法
that element might not be in the right spot, so we take the root element and bubble
和它的子元素做比较 它左边的子元素
it down to the next spot so we compare the root with its children, its left
然后它右边的子元素 然后把它和这两个子元素中较小的那个交换
child and its right child, and then swap it with the smaller of the two. And then we
keep going down the tree until the heap property is restored. So that’s how a
这就是树操作 这就是堆操作 让我们试试怎么实现
tree operates, let’s think about implement- that’s how a heap operates, let’s talk
about implementation now. So implementation is kind of interesting.
You might have assumed that we’d implement it a simple class node with a
简单节点类来实现它 当然我们确实可以用这种方法
left node and a right node, and certainly we could do it that way. But
there’s an even better way of implementing it.
要知道当我们在堆中增加元素时 元素总是会被放在
Note that when we add elements to the heap they’re always getting added in a
某个位置上 中间不会有什么分隔
very particular spot. There aren’t gonna be any gaps in the heap so we have the
所以我们这里有0号点 然后1 2 3 4号节点一直这么下去
zeroth element here and then the first second third fourth etc and so that
means that we can actually use an array instead to store these values and
这样可以让数据非常紧凑 这样一个简单的式子就可以
that makes it very compact. And a simple simple equation can map from an index to
通过索引找到它左边的或右边的子元素 或者它的父元素
its left child its right child or to its parent. And so we can still get to
这样我们仍然可以得到它的左右子元素 却不用付出一个节点类的开销
the left and right child but we don’t need to have this overhead of a node
好了 我们已经大概讲完了有关堆的基本概念 接下来
class. So now that we’ve covered the basics of what a heap is let’s turn to
让我们看看代码怎么写 我们用一个简单类来实现最小堆 这个简单类里包含了数组
the actual code for this. We’ll implement min heap with a simple class that wraps
所以这会是一个定长的数组 不过如果数组太大的话
this items array. This is going to be an array of a fixed length but if it
我们再增加容量就行 接下来在开始之前
gets too big we’ll increase the capacity. Now I’m gonna get a little bit of a head
我要稍微取巧一下 先添加一些简单的
start and cheat a little bit by just adding a whole bunch of simple
帮手方法 这是一些简单的方法 可以用来通过父元素
helper methods. So these are just simple methods that get the left and right
获取子元素 这些是用来检查它们是否存在
child of the parent index. Actually you know check if they exist, or
这些是用来获取它们的值 这里我就先提前做了这些工作
actually get the values themselves. So I’m just getting a little bit of a head
然后我要再做一些提前量 就是额外添加两个方法
start here and I’ll get another little bit of a head start by adding in two extra
methods here. One is a swap method that swaps the values of two indices and
另一个是增加数组容量的方法 这个方法要做的就是检查
another one is an insure extra capacity method. Now what this does is it checks if the
数组是否已经满了 如果满了 就新建一个两倍大小的数组
array is full and if so, it creates a new array of double that size and it copies
然后将前一个数组的元素全部复制过去 这其实就是数组列表的工作方式
all the elements over. And this by the way is the basics of how an arraylist
好的 让我们开始看代码吧 我要应用的首个方法
operates. Now let’s turn to the real code. The first method I’ll implement
就是 peek 方法
is a peek method and this
它会先检查数组是否为空 如果为空则返回异常 因为这意味着
first checks if the array is empty if so returns an exception because there’s nothing
前面没有任何元素 如果不为空 则返回数组的第一个元素
at the front. Otherwise it just returns the first element in the array which
will always be the minimum element, and essentially the root of the heap. The next method we’ll
第二个要应用 poll 方法 这个方法会提取出最小元素
do is a poll method. Now what this does is actually extract the minimum element
然后将元素从数组中移除 所以首先我们要检查
and actually so actually removes it from the array. So first we’ll check if the arrays
数组是否为空 如果为空就抛出异常
empty, if so throw an exception.
不然我就要获取它的值 所以位置就是0 那我
Otherwise I need to actually get the value so item is item of 0 then I need to
take the very last element in the array and move it into the very first element
then I need to shrink my array.
或者缩小它的大小 然后将它变成堆的结构
Or shrink essentiall the size of it. And then I need to go and actually re-heapify.
所以我要移除根元素 然后向下堆化
So I removed the root element so I need to heapify down and I’ll go fill
this in, this method in shortly.
在这里 如果我移除最小元素10 删掉它
So in this case if we remove the ten, the minimum element, it’s going to get
那么 17就要移到之前10的位置
deleted. Then the 17 is going to get moved up to where the ten is, so seeing
看看这个数组 就是这样
the array, that’s like this,
把17放在这 然后我们要调整堆里的元素
the 17 gets put in here. And then we go and adjust the heap to shift elements
down as needed.
My next method is going to actually add an element in here. So first thing I want
首先我要确保容量是够的 所以我要调用
to do is I wanna make sure there is capacity, so I’ll call my insure
这个增加容量方法 然后我要把新元素放在最后的位置
extra capacity method. Then I’m gonna add my element into the very last spot so
所以这里 item[size] = item 然后增大数组的大小
items of size equals this new item then increase my size. And then I need to
接着我就需要堆化这个新的数组 所以我还要向上整理堆元素的顺序
actually heapify up so I need to fix the heap looking upwards. Swapping
each element with its parent as necessary, so in this case if we want to
这里比如我们要添加一个新元素8 我们把它加到最后一个元素位置上
add an element say, 8, we’d add it at the very last element and then we’d go and
adjust our heap moving things up as necessary.
真正好玩的来了 这里我要定义这两个堆化方法
Now for the real fun. I need to actually implement these heapify methods so
向上堆化就是 当最后一位元素加入进来时
heapify up is going to start with the very last element added which is that
最后一位元素的位置就是数组大小减1 然后这个元素就会向上移动
size minus 1 and then it’s going to walk up as long as there’s a parent item and
只要它还有父元素而且顺序也不对的话 就会一直进行下去 只要元素的父元素
as long as essentially I’m out of order. So as long as my parent item is bigger
大于该元素 那顺序就不对 就要交换它们的顺序
than me then hey, things are out of order, so swap my parent index, swap my value
子元素和父元素的值对调 然后继续往上走
with my parent and then walk upwards.
So let’s walk through this on the 8 that was inserted. So what we do here is
我们先比较8和15 顺序不对 调换它们两个的位置
we’ll compare 8 to this 15, it’s out of order so we’ll need to swap those values.
15向下 8向上 在数组里就是这样
So 15 goes down here and 8 goes up here or on the array it will look like this. Then
然后比较8和10 还是顺序不对
we compare this 8 to the 10 and then that’s still out of order, and so we’ll go
那么我们就要把10挪下来 把8挪上去 现在我们回到
and move the 10 down and swap the 8 up there and now we’ve returned to the
根节点 看看符不符合堆的性质 8在最上面 10和20在下面
root, to the heap properties. We have 8 at the top, 10 and 20 below it then 17
然后17和15在更下面 向下堆化会更复杂一点
and 15 below that. Heapify down is a little bit more complicated but it’s
但还是可以做的 首先我们要从根元素开始
still quite manageable. First we’re going to start off with our root element which
就是索引0位置 然后就是只要还有子元素
is at index zero. And then we’re going to say well as long as I have children, keep
walking down and trying to fix up my heap.
现在我只要检查是否有左元素 因为如果没有左元素
Now I only need to check if there’s a left child because if there’s no left
那肯定就不会有右元素了 接下来我要先将
child then there’s certainly no right child. Then I’m gonna set this smaller
左元素赋值给 smallerChildIndex 这个变量
child index equal to the smaller of the left and the right child so I’m going to
也就是假设左元素是比较小的那个子元素(smallerChildIndex) 然后我就发现
take it guess with, set it equal to the left child, and then I’m gonna say hey if
there’s a right child,
结果右元素比左元素还小 那右元素
and my right child is even smaller than my left child, small child index should
就是更小的那个子元素 那接下来
equal my right child. Now what I’m going to say,
记着我可是沿着堆往下走的 那么接下来
remember I’m looking downwards on the heap, so now what I’m gonna say is hey, if items of
如果索引项 如果某个元素比两个子元素都小 那么就没问题
index, if I’m smaller than the smaller of my two children then everything’s good,
一切都是正常的 我就可以退出了
and everything’s back in order here and I can just exit.
如果不是这样 那说明堆里元素的顺序还不对
If that’s not the case then our heap is still out of order and then I need to
那我就要把这个元素和 smallerChildIndex 对调 然后继续向下
swap my value with my smaller child and then move down to my smaller child. I’ll
just actually move this out here.
Alright so that’s the basics of how heapify down works. Let’s walk through
我们来看个例子吧 这个例子我放大点 这样
this on an example. I’ll make this example slightly larger so if we do an extract
我们还是先移除10 然后用25替换
min such that 20, 10 gets removed then we remove 10, we replace it with 25. I’ll do
对于数组我也是同样做法 这样你就可以看到过程了 然后我们比较25
it on the array too so you can see what’s going on there. Then we compare 25, the
和根元素 然后用左右子元素中小的那个来替换它
root, and replace it with the smaller of its left and right child. So we swap the
我们交换25和15 25放这里 15放上来 数组这里
25 and the 15, so 25 comes down here, 15 comes up there, and we can do it down
也是这样 所以 15上来 25下去 然后比较25和17
here too. So 15 goes here 25 goes here then we compare 25 to 17. It’s still out of
顺序还是不对因为25大于17 所以把17挪上来
order since 25 is bigger than 17 so we do 17 comes up here we swap
然后25下来 然后堆就变成了15
those and 25 goes down here. So now we have a heap again that looks like 15, 17
17 20 然后25 最小堆的性质就满足了
20, and 25 and as you can see our min heap property has been restored. So now
现在你可以理解堆是什么 它是怎么运转的了
that you understand what a heap is and how it works,
你可以试试用堆解决一个新问题 祝你好运
why don’t you try out using a heap on a new problem. Good luck.