ADM-201 dump PMP dumps pdf SSCP exam materials CBAP exam sample questions

数据结构:字典树 – 译学馆
未登录,请登录后再发表信息
最新评论 (0)
播放视频

数据结构:字典树

Data Structures: Tries

大家好 我是《程序员面试金典》的作者 盖尔·拉克曼·麦克道尔
Hi, I’m Gayle Laakmann McDowell, author of Cracking the Coding Interview.
今天 我来给大家讲讲字典树
Today I’m going to talk about tries.
可能计算机专业的学生在学校没怎么学
This isn’t something CS students might have spent that much time on in school.
但是面试中字典树真的很重要
But it’s really really important for interviews.
字典树是树的一种 通常用来存储单词
A trie is a data structure that is actually type of tree, but it’s often used to store characters.
字典树中的每一个节点存储一个字母
And the way that it works is each node might store a,just a character as its data,
如果从根节点向下查找
but then if we look at the path from the root down to that node
路径上所经过的节点就是整个单词或是单词的前半部分
that node is really representing a word or a part of a word.
字典树的这个特性允许我们快速地进行一些特定的查找
And so what it allows us to do is very quick lookups of a particular kind.
例如 我们用字典树
So for example, suppose we use this trie to
来存储英语中的一小部分单词
store this very small version of the English language.
有以 C.A.R 开头的单词吗
We could use it to say, hey, are there any words that start with C.A.R
有的 car 就是 这就是一个完整的单词
Yes there are.There’s car, that’s actually a complete word.
但也有 card 其中 car 仅仅是前半部分
But there’s also card. So car is also a prefix.
这样一来可以进行快速查找
So it allows you to do very fast lookups.
下面我们简单讲讲字典树的实现
So let’s talk briefly about the implementation of it.
首先我们需要节点类
So one, you know, we’ll gona need a class,Node.
不同于二叉树中的节点
And then rather than having as we would in a binary search tree,
二叉树的节点存储有左右子节点
a pointer to the left and right node,
对于字典树我们会预先存有子节点的列表
instead we have some sort of either array of the, of all the children
或者说 记录子节点对应的字母的映射
or more likely some sort of look up table that maps from a character to that node.
因此 我们马上就能找到树里面有个节点存储了A
So we can immediately say do you have a, have a child that is an A.
我们现在就拿到这个节点
Great.Give me that immediately.
另外 除了子节点中有没有R之外
Additionally we need to understand not just is there a,
我们还需要知道查找到的字母组合
do you have an R as a child, an R node as a child,
是不是完整的单词 或是单词前缀
but is that actually a complete word or is it just a prefix.
所以每个节点除了
And so in addition to each node having any,
要存储字母到节点的映射外
a mapping of character from character to node
还要存储 判断标志位
each node also has some sort of flag called something like is complete word
标记到此的查找是否能够构成一个单词
that represents is that a complete word or not.
面试中很多问题都涉及到字典树
We see tries in a lot of problems so if you’re in interview and
如果遇到单词验证的问题
you get a question that has some sort of word validation to it so, you know,
比如 遍历表格找到所有……
check if this is you know, walk through this table and find all the, you know,
从杂乱的表格中查找单词
walk through this scrabble board and find all the words.
或者是给定字符串列表 找到所有名人的名字
Or given this list of strings, find all celebrity names.
当遇到单词验证问题的时候
So when you see something that has some word list validation
不妨试试字典树
think about whether or not a trie might be useful.
下面讲讲字典树典型的优化案例
There’s one optimization that often comes up in trie problems.
假设 有个用户正在输入单词
Suppose you had a situation where a user was typing a word
一旦单词很明显
and as soon as the word became,
拼写有误 则以下划线标注
and it clearly was going to be an invalid word, you want to say underline it.
有人敲了 C 这是可以的
Someone types C and you say yes that is in fact
因为这有可能成为一个正确的单词
it potentially going to be a valid word.
然后是 CA 没什么问题
And then someone types CA. Ok great, we are looking good.
现在输入为 CAL 有单词以CAL 开头
Now they type CAL. Yes okay there’s words that start with CAL,
现在的输入是CALP 哦 出错了
Now they type CALP, oops no.
这就是字典树的最简单用法
That’s an error. So that’s a good place to use a try.
到目前为止 我们都是从根节点开始查找
Now the way that I would explain tries so far, you would do each of these look ups from scratch.
有 C 没什么问题
You would say is there a C, is that a valid prefix, yes there is.
从根节点搜索可以发现 CA 也是单词的前缀
And then you do is C A a valid prefix and you start from scratch over at the root.
但其实这些搜索都可以根据上一次搜索进行简化
Now in a lot of cases though we actually want these calls to kind of build on each other.
所以 我们可以改进搜索方式 存储搜索的状态
So one thing we can do with our trie is we can keep state in various ways.
我们找到 C 以后 我们记下在树中的位置
so when we look up is C a valid prefix, we actually store where we are in the tree.
如果用户输入了下个字母 A
So when we, when the user types the very next character, an A,
我们就从不用从根节点查找
we just look immediately is A a child rather of C of
而是看 C 的子节点里有没有 A
that last look up rather than starting from scratch
这个操作的实现方式有很多种
There’s different ways you can actually implement that.
其中之一是在字典树本身存储查找状态
One way you can do it is to actually keep state within the trie.
另一种是将搜索到的节点返回调用函数
Another way you could do that is actually return the node back to the caller.
当我知道 C 有效的时候 不仅知道这是个有效前缀
So when I say is C a valid prefix, I get not only that’s a valid prefix,
还拿到了节点的引用
but I also get information that gives me exact reference to that node.
相比于从根节点搜索得知 CA 为有效前缀
And rather than saying starting from the root and saying is CA a valid prefix
我只需要知道 A 是 C 的子节点 然后去看 L 是不是 CA 的子节点 以此类推
I just ask is A a child of C, and then I ask is L a child of the CA etc. etc.
所以说 算法课上不会花太长时间学习字典树
So again tries are not something that you would spend a lot of time in
一部分学生也不太熟悉字典树
algorithms classes or in algorithms textbooks, and so they’re unfamiliar to some people.
但它是面试题中非常常见的数据结构
But it’s a very very common data structure for a lot of interview questions.
所以要多留心
So keep an eye out for that, you know,
当遇到了需要进行单词验证的问题的时候
construct or that frame of a problem that says something about your validation of
字典树就派上用场了
words because that’s often where we’re going to want to use a trie.
既然你理解了字典树的基本原理
So now that you understood the basics of the trie,
那么不妨试试用它解决些问题 祝你好运
why don’t you try applying this to a new problem. Good luck.

发表评论

译制信息
视频概述

本视频介绍了字典树的原理及应用

听录译者

收集自网络

翻译译者

喔喔243

审核员

审核团O

视频来源

https://www.youtube.com/watch?v=zIjfhVPRZCg

相关推荐