大家好 我是《程序员面试金典》的作者 盖尔·拉克曼·麦克道尔
Hi, I’m Gayle Laakmann McDowell, author of Cracking the Coding Interview.
Today I’m going to talk about tries.
This isn’t something CS students might have spent that much time on in school.
But it’s really really important for interviews.
A trie is a data structure that is actually type of tree, but it’s often used to store characters.
And the way that it works is each node might store a,just a character as its data,
but then if we look at the path from the root down to that node
that node is really representing a word or a part of a word.
And so what it allows us to do is very quick lookups of a particular kind.
So for example, suppose we use this trie to
store this very small version of the English language.
有以 C.A.R 开头的单词吗
We could use it to say, hey, are there any words that start with C.A.R
有的 car 就是 这就是一个完整的单词
Yes there are.There’s car, that’s actually a complete word.
但也有 card 其中 car 仅仅是前半部分
But there’s also card. So car is also a prefix.
So it allows you to do very fast lookups.
So let’s talk briefly about the implementation of it.
So one, you know, we’ll gona need a class,Node.
And then rather than having as we would in a binary search tree,
a pointer to the left and right node,
instead we have some sort of either array of the, of all the children
or more likely some sort of look up table that maps from a character to that node.
So we can immediately say do you have a, have a child that is an A.
Great.Give me that immediately.
Additionally we need to understand not just is there a,
do you have an R as a child, an R node as a child,
but is that actually a complete word or is it just a prefix.
And so in addition to each node having any,
a mapping of character from character to node
each node also has some sort of flag called something like is complete word
that represents is that a complete word or not.
We see tries in a lot of problems so if you’re in interview and
you get a question that has some sort of word validation to it so, you know,
check if this is you know, walk through this table and find all the, you know,
walk through this scrabble board and find all the words.
Or given this list of strings, find all celebrity names.
So when you see something that has some word list validation
think about whether or not a trie might be useful.
There’s one optimization that often comes up in trie problems.
Suppose you had a situation where a user was typing a word
and as soon as the word became,
and it clearly was going to be an invalid word, you want to say underline it.
有人敲了 C 这是可以的
Someone types C and you say yes that is in fact
it potentially going to be a valid word.
然后是 CA 没什么问题
And then someone types CA. Ok great, we are looking good.
现在输入为 CAL 有单词以CAL 开头
Now they type CAL. Yes okay there’s words that start with CAL,
现在的输入是CALP 哦 出错了
Now they type CALP, oops no.
That’s an error. So that’s a good place to use a try.
Now the way that I would explain tries so far, you would do each of these look ups from scratch.
有 C 没什么问题
You would say is there a C, is that a valid prefix, yes there is.
从根节点搜索可以发现 CA 也是单词的前缀
And then you do is C A a valid prefix and you start from scratch over at the root.
Now in a lot of cases though we actually want these calls to kind of build on each other.
所以 我们可以改进搜索方式 存储搜索的状态
So one thing we can do with our trie is we can keep state in various ways.
我们找到 C 以后 我们记下在树中的位置
so when we look up is C a valid prefix, we actually store where we are in the tree.
So when we, when the user types the very next character, an A,
we just look immediately is A a child rather of C of
而是看 C 的子节点里有没有 A
that last look up rather than starting from scratch
There’s different ways you can actually implement that.
One way you can do it is to actually keep state within the trie.
Another way you could do that is actually return the node back to the caller.
当我知道 C 有效的时候 不仅知道这是个有效前缀
So when I say is C a valid prefix, I get not only that’s a valid prefix,
but I also get information that gives me exact reference to that node.
相比于从根节点搜索得知 CA 为有效前缀
And rather than saying starting from the root and saying is CA a valid prefix
我只需要知道 A 是 C 的子节点 然后去看 L 是不是 CA 的子节点 以此类推
I just ask is A a child of C, and then I ask is L a child of the CA etc. etc.
So again tries are not something that you would spend a lot of time in
algorithms classes or in algorithms textbooks, and so they’re unfamiliar to some people.
But it’s a very very common data structure for a lot of interview questions.
So keep an eye out for that, you know,
construct or that frame of a problem that says something about your validation of
words because that’s often where we’re going to want to use a trie.
So now that you understood the basics of the trie,
why don’t you try applying this to a new problem. Good luck.
大家好 我是《程序员面试金典》的作者 盖尔·拉克曼·麦克道尔