未登录,请登录后再发表信息
最新评论 (0)
播放视频

数据结构:#4 数组vs链表

Data Structures: Arrays vs Linked Lists

In our previous lesson, we introduced you to linked list data structure
上节课 我们介绍了链表这种数据结构
and we saw how linked lists solve some of the problems that we have with arrays.
也讲了链表如何解决数组存在的一些问题
So now the obvious question would be which one is better – an array or a linked list.
那么 数组和链表哪个更好呢?
Well, there is no such thing as one data structure is better than another data structure.
其实 不能说某种数据结构要比另一种更好
One data structure may be really good for one kind of requirement,
因为一种数据结构可能适用于某一种需求
while another data structure can be really good for another kind of requirement.
而另一种数据结构适用于另一种需求
So, it all depends upon factor like what is the most frequent operation that you want
这取决于一些因素 比如对数据最频繁的操作是什么
to perform with the data structure or what is the size of the data
或者待处理的数据量是多大
and there can be other factors as well.
当然也有其他一些因素
So, in this lesson, we will compare these two data structures based on some parameters,
那么 本节课 我们将根据参数和操作复杂度
based on the cost of operations that we have with these data structures.
对这两种数据结构进行比较
So, all in all we will comparatively study the advantages and disadvantages and try to
总之 我们会用对比的方式来研究两种结构的优缺点
understand in which scenario we should use an array
了解哪种情况下应该用数组
and in which scenario we should use a linked list.
哪种情况下应该用链表
So, I will draw two columns here, one for array and another for linked list
我来画一个双栏对比图 一栏数组 一栏链表
and the first parameter that we want to talk about is the cost of accessing an element.
我们要讨论的第一个参数是访问元素的复杂度
Irrespective of the size of an array
不论数组的大小如何
it takes constant time to access an element in the array.
只需要常数时间就可以访问数组中的元素
This is because an array is stored as one contiguous block of memory.
这是因为数组在内存中是连续存储的
So, if we know the starting address or the base address of this block of memory.
只要我们知道内存块的起始地址 即基地址
Let us say what we have here is an integer array and base address is 200.
假设有一个整型数组 基地址是200
The first byte in this array is at address 200.
数组第一个字节的地址就是200
Then let’s say if we want to calculate the address of element at index i,
那么如果要计算第i个元素的元素的地址
then it will be equal to 200 plus i into size of an integer in bytes.
它等于200加上i个元素的字节大小
So, size of an integer in bytes is typically 4 bytes.
一个整型数据的大小通常是4个字节
So, it will be 200 + 4*i.
那么 就是200 + 4*i
So, if 0th element is at address 200
如果第0个元素的地址是200
if we want to calculate the address for element
我们要计算第6个元素的地址
at index 6, it will be 200 plus 6 into 4 which will be equal to 224.
那就是200加6乘4等于224
So, knowing address of any element in an array is just this calculation for our application.
因此 任意元素的位置都可以通过计算得到
In terms of big-oh notation, constant time is also called O(1).
在大O表示法中 常数时间又记作O(1)
So, accessing an element in an array is O(1) in terms of time complexity.
因此 访问数组元素的时间复杂度是O(1)
If you are not aware of big-oh notation
如果不了解大O表示法
check the description of this video for a tutorial on time complexity analysis.
视频下方说明里有时间复杂度分析的教程
Now, in a linked list, data is not stored in a contiguous block of memory.
而对于链表 数据并不是储存在连续的内存块中
So, if we have a linked list something like this, let’s say we have a linked list of integers here,
如果我们有一个这样的链表 假定它是个int型链表
then we have multiple blocks of memory at different addresses.
它被储存在许多地址不同的内存块中
Each block in the linked list is called a node and each node has two fields,
每个内存块称为一个结点 每个结点包含两个数据域
one to store the data and one to store the address of the next node.
一个储存数据 一个储存下一结点的地址
So, we call the second field, the link field.
我们称后者为连接域
The only information that we keep with us about a linked list is the address of the first node
我们只需保留链表第一个节点的地址
which is also called the head node.
即头结点的地址
And this is what we keep passing to all the functions also, the address of the head node.
传递给各个函数的也是头结点的地址
To access an element in the linked list at a particular position
要访问链表中特定位置的元素
we first need to start at the head node or the first node
需要从头结点 或第一个结点开始遍历
then we go to the second node and see the address of the third node.
到第二个结点 然后得到第三个结点的地址
In the worst case, to access the last element in the list,
最坏的情况是访问链表的最后一个结点
we will be traversing all the elements in the list.
这就需要遍历整个链表的所有元素
In the average case, we will be accessing the middle element may be.
平均情况下 我们要访问链表中间的元素
So, if n is the size of the linked list, n is the number of elements in the linked list,
假设链表的大小为n 即链表有n个元素
then we will traverse n/2 elements.
那么就要遍历n/2个元素
So, the time taken in the average case
因此 平均情况时所需的时间
also is proportional to the number of elements in the linked list.
也与链表的元素数量成正比
So, we can say that the time complexity in average case is O(n).
因此 平均情况的时间复杂度是O(n)
So, on this parameter, cost of accessing an element, array scores heavily over linked list.
对于访问元素的复杂度 数组远优于链表
So if you have a requirement where you want to access elements in the list all the time,
如果你需要频繁访问列表元素
then definitely array is a better choice.
数组绝对是更好的选择
Now, the second parameter that we want to talk about is
接下来 我们要讨论的第二个参数是
memory requirement or memory usage.
内存需求 或称内存使用情况
with an array, we need to know the size of the array before creating it,
创建数组之前 我们需要知道它的大小
because arrays is created as one contiguous block of memory.
因为数组需要占用一段连续的内存块
So, array is of fixed size.
因此 数组大小是固定不变的
What we typically do is create, we create a large enough array
常用的方法是创建一个足够大的数组
and some part of the array stores our list
一部分空间用来存储数据列表
and some part of the array is vacant or empty so that we can add more elements in the list.
其他部分空着 以备存放后续数据
For example, we have an array of 7 integers here and we have only 3 integers in the list.
例如 一个大小为7的整型数组 里面只有3个整型变量
Rest 4 positions are unused.
其他4个位置没有用到
There would be some garbage value there.
这就会产生一些无用变量
With linked list, lets say we have, let’s say we have this linked list of integers,
对于链表 我们可以说
there is no unused memory.
一个整型链表不会造成内存浪费
We ask memory for one node at a time, so we do not keep any reserved space.
每次只申请一个结点的内存 无需预留
But we use extra memory for pointer variables
但需要额外的内存来储存指针变量
and this extra memory requirement for a pointer variable in a linked list can not be ignored.
链表中指针变量的内存需求不能忽视
In a typical architecture let’s say
在典型的指令集架构中
integer is stored in 4 bytes and pointer variable also takes 4 bytes.
整型数据占用4字节 指针也是4字节
So, if you see, the memory requirement for this array of 7 integers is 28 bytes.
那么储存7个整型变量的数组占用28个字节
And the memory requirement for this linked list would be 8*3,
而用链表来存放则需要8*3个字节
where 8 is the size of each node, 4 for integer and 4 bytes for the pointer variable.
其中 8是每个结点的大小 即4字节整数和4字节指针
So, this is also 24 bytes.
也就是24个字节
If we add one more element to the list in the array, we will just use one more position,
如果增加一个元素 只需占用数组一个空位
while in linked list we will create one more node, and will take another 8 bytes,
而链表中 得创建新结点 又需要8个字节
so this will be 32 bytes.
总共占用32个字节
Linked list would fetch us a lot of advantage if the data, the data part is large in size.
如果要存放大量数据 链表会占很大优势
So, in this case, we had a linked list of integers, so integer is only 4 bytes.
刚才是整型链表中数据仅占4个字节的情况
What if we had a linked list in which the data part was
那如果有一个链表 数据域更为复杂
some complex type that took 16 bytes.
占用了16个字节 又会怎样呢?
So, 4 bytes for the link and 16 bytes for the data, each node would have been 20 bytes.
4字节指针 16字节数据 每个结点共20字节
An array of 7 elements for 16 bytes of data would be
而拥有7个元素的数组
16 byte for each element would be 112 bytes.
每个元素16字节 共112字节
And linked list of 4 would be only 80 bytes.
而4个结点的链表只需80字节
So, it all depends.
因此 要视情况而定
If the data part of a list takes a lot of memory,
如果列表的数据部分占用很大内存
linked list will definitely consume lot less memory.
用链表可以减少很多内存消耗
Otherwise, it depends what strategy we are choosing to decide the size of the array.
反之 所占内存多少由数组大小决定
At any time how much array are we keep unused.
不论何时 数组都需要预留内存
Now, one more point with memory allocation
另外 关于内存分配还要补充一个知识点
because arrays are created as one contiguous block of memory,
因为数组存储在连续的内存块中
sometimes when we may want to create a really really large array, then
有时可能需要创建一个特别大的数组
maybe memory may not be available as one large block,
可能无法储存在一整块内存中
but if we are using linked list, memory may be available as multiple small blocks.
但用链表的话 数据可以储存在零散的内存块中
So, we will have this problem of fragmentation in the memory.
我们把这种情况称为内存碎片化
Sometimes, we may get many small units of memory
有时可能只是许多小的内存单元
but we may not get one large block of memory.
却不是一个大的内存块
This may be a rare phenomenon, but this is a possibility.
这种情况或许不常见 但存在这种可能性
So, this is also where linked list scores.
因此 这也是链表的优势
Because arrays have fixed size, once array gets filled and we need more memory, then
由于数组的大小不变 一旦填满 需要更多内存时
there is no other option than to create a new array of larger size
就只能创建另一个更大的数组
and copy the content from the previous array into the new array.
并且把原数组的内容复制到新数组中
So, this is also one cost which is not there with linked list.
这也是用数组代替链表的代价
So, we need to keep these constraints and these requirements in mind when we want to
因此 根据需求选用数据结构时
decide for one of these data structures for our requirement.
需要牢记这些限制和要求
Now, the third parameter that we want to talk about is
接着 我们要讨论的第三个参数是
cost of inserting an element in the list.
在列表中插入元素的复杂度
Remember when we are talking about arrays here, we are also talking about
请记住 这里我们讨论的数组包括
the possible use of array as dynamic list.
可能用到的动态分配空间的数组
So, there can be 3 scenarios in insertion.
那么有3种插入数据的形式
First scenario will be when we want to insert an element at the beginning of the list.
第一种是在列表首部插入数据
Let’s see we want to insert number 3 at the beginning of the list.
假设我们要在列表首部插入数字3
In the case of arrays,
对数组而言
we will have to shift each element by one position towards the higher index.
不得不把每个元素都依次后移一位
So, the time taken will be proportional to the size of the list.
因此 所需时间与数组大小成正比
So, this will be O(n).
也就是O(n)
Let’s say n is the size of the list.
假设列表大小是n
This will be O(n) in terms of time complexity.
时间复杂度就是O(n)
In the case of linked list, inserting a node in the beginning
而对于链表而言 要在首部插入一个结点
will mean only creating a new node and adjusting the head pointer and the link of this new node.
只需创建新的结点 调整头结点和新结点的指针
So, the time taken will not depend upon the size of the list, it will be constant.
因此 所需时间是常数 与列表大小无关
So, for linked list
所以 对链表来说
inserting an element at the beginning is O(1) in terms of time complexity.
在开头插入元素的时间复杂度为O(1)
Inserting an element at end for an array, let’s say we are talking about dynamic array,
如果在尾部插入元素 这里讨论的是动态数组
a dynamic list in which we create a new array if it gets filled.
如果填满了就新建一个空间的动态列表
If there is space in the array, we just write to the next higher index of the list.
如果数组中还有空位 只需继续写入元素
So, it will be constant time.
这样只需要常数时间
So, time complexity is O(1) if array is not full.
即数组未满时的时间复杂度为O(1)
If array is full, we will have to create a new array and copy all the previous content
若数组已满 需新建数组并复制原有数据
into new array which will take O(n) time where n is the size of the list.
若数组大小为n 时间复杂度就是O(n)
In the case of linked list, adding an element, inserting an element at the end will mean
对链表来说 在末尾添加一个元素意味着
traversing the whole list and then creating a new node and adjusting the links.
遍历整个链表 新建一个结点并调整指针
So, time taken will be proportional to n.
因此 所需时间与n成正比
I will use this color coding for linked list.
我用红色标上链表的时间复杂度
Here n is the number of elements in the list.
这里的n是列表中元素的数量
Now, the third case will be when we want to insert in the middle of the list
第三种是在列表中间插入元素
at some nth position or may be some ith position.
比如在第n个位置 或者第i个位置
Again in the case of arrays, we will have to shift elements.
对数组而言 仍需将元素依次后移
For the average case, we may want to insert at the mid position in the array.
平均情况下 需要在数组正中间插入元素
So, will have to shift n/2 where n is the number of elements in the list.
因此 数组大小为n 则需要后移n/2个元素
So, the time taken is definitely proprotional to n in average case.
平均情况下 所需时间与n成正比
So, complexity will be O(n).
因此 时间复杂度为O(n)
For linked list also we will have to traverse
对链表而言 仍然需要遍历
till that position and then only we can adjust the links.
直到找出位置 然后仅需调整指针
Even though we will not have any shifting, we will have to traverse till that point and
虽然不需要后移 但也要遍历到那个位置
in the average case, time taken will be proportional to n
平均情况下 所需时间也和n成正比
and the time complexity will be O(n).
即时间复杂度为O(n)
If you see , deleting an element will also have these 3 sceanrios
显然 删除一个元素 也分为3种形式
and the time comeplxity for deleting for these 3 sceanrios will also be the same.
其时间复杂度也和插入是相同的
And the final point, the final parameter that I want to talk about
最后一个需要讨论的参数是
is which one is easy to use and implement.
哪种结构更容易使用和实现
An array definitely is a lot easier to use.
显然 数组更容易使用
Linked list implemetation especially in C or C++ is more prone to errors
链表更容易出现错误 尤其是在C或C++中
like segmentation fault and memory leaks.
比如段错误和内存泄漏等问题
It takes good care to work with linked lists.
使用链表时 需要多加注意
So, this was arrays vs linked lists.
这就是数组和链表的对比
In our next lesson, we will implement linked list in C or C++.
下节课 我们将在C或C++中应用链表
We will get our hands dirty with some real code.
我们会在真实的编程环境中进行演练
So this is it for this lesson.
以上是本节课的全部内容
Thanks for Watching !
感谢观看!

发表评论

译制信息
视频概述

通过3个指标,对数组和链表的时间复杂度进行分析,是一篇基础的数据结构课程。

听录译者

收集自网络

翻译译者

霜影丶Frostwolf

审核员
视频来源

https://www.youtube.com/watch?v=lC-yYCOnN8Q

相关推荐