

算法如何工作与学习

How Machines Learn

在网络中 算法随处可见
On the internet, the algorithms are all around you.
你正在看这个视频 是因为算法让你在诸多视频中看到它
You are watching this video because an algorithm brought it to you (among others) to click,
你点击了它 算法将你的行为记录下来
which you did, and the algorithm took note.
你打开推特时 是算法决定你会看到什么
When you open the TweetBook, the algorithm decides what you see.
你搜索图片时 是算法负责检索
When you search through your photos, the algorithm does the finding.
甚至它还可以帮你制作一部电影
Maybe even makes a little movie for you.
你购买东西时 是算法定的价
When you buy something, the algorithm sets the price
算法也在银行交易的时候监控是否有欺诈
and the algorithm is at your bank watching transactions for fraud.
股票市场里到处是算法
The stock market is full of algorithms
交易时也用到算法
trading with algorithms.
鉴于此 你可能会想知道
Given this, you might want to know
这些影响你生活的算法程序是如何工作的
how these little algorithmic bots shaping your world work,
特别是当这些算法失效的时候
especially when they don’t.
在过去
In Ye Olden Days,
人们通过下达自己可解释的指令来建立算法
humans built algorithmic bots by giving them instructions the humans could explain.
“若什么 则什么”
“If this, then that.”
但很多问题太庞大也太过复杂
But many problems are just too big and hard
人类不可能写出清晰简单的指令
for a human to write simple instructions for.
每秒有无数次金融交易发生 哪些是诈骗呢?
There’s a gazillion financial transactions a second, which ones are fraudulent?
网上有数不清的视频
There’s an octillion videos on NetMeTube.
哪8个应该被选出来放在推荐中呢?
Which eight should the user see as recommendations?
哪些又根本不应该出现在网页上?
Which shouldn’t be allowed on the site at all?
对于航班座位
For this airline seat,
某位消费者愿意支付的最高价格是多少?
what is the maximum price this user will pay right now?
算法会给出这些问题的答案
Algorithmic bots give answers to these questions.
虽然并不完美 但比人类能给的好太多了
Not perfect answers, but much better than a human could do.
但这些算法到底是怎么准确运作且运作越来越多的 并没有人知道
But how these bots work exactly, more and more, no one knows.
甚至建立这些算法的人也不清楚
Not even the humans who built them,
或者说“建立”它们的人
or “built them”,
正如我们所见
as we will see…
如今 使用这些算法的公司
Now companies that use these bots
并不想谈论算法是如何运作的
don’t want to talk about how they work
因为算法是对公司有价值的雇员
because the bots are valuable employees. Very,
而且是 价值连城的
VERY valuable.
算法是如何运作的是被重点保护的机密
And how their brains are built is a fiercely guarded trade secret.
如今这项前沿技术很可能像
Right now the cutting edge is most likely very
“但愿你对线性代数感兴趣”
‘I hope you like linear algebra’,
但在特定的网站 当前的热点是什么
but what the current hotness is on any particular site
算法怎么运作 还处于并将永远处于一个“我们不知道”的状态
and how the bots work, is a bit “I dunno”, and always will be.
所以我们来讨论一种更奇妙但好理解的创造程序的方法
So let’s talk about one of the more quaint but understandable ways bots CAN be “built”
而不用理解程序核心如何运作
without understanding how their brains work.
假设你需要一个能识别图片内容的程序
Say you want a bot that can recognize what is in a picture.
这是蜜蜂 还是数字3?
Is it a bee, or is it a three?
对人类(即使是小孩)来说这很简单
It’s easy for humans (even little humans),
但是很难用程序语言
but it’s impossible to just tell a bot
告诉程序如何去做
in bot language how to do it,
因为我们就是知道那是只蜜蜂 那是数字3
because really we just know that’s a bee and that’s a three.
我们能用语言来区分它们
We can say in words what makes them different,
但是程序不能理解语言
but bots don’t understand words.
我们的大脑结构使得我们能够理解语言
And it’s the wiring in our brains that makes it happen anyway.
不过我们或许能了解单个的神经元
While an individual neuron may be understood,
模糊的理解一组神经元的用处
and clusters of neurons’ general purpose vaguely grasped,
但无法理解大脑整体的运作 尽管如此
the whole is beyond. Nonetheless,
这行得通
it works.
所以让程序来做这件事
So to get a bot that can do this sorting,
你不用自己做
you don’t build it yourself.
设计一个能写程序的程序 一个能教程序的程序
You build a bot that builds bots, and a bot that teaches bots.
这些程序的核心比较简单 一个聪明的程序员就能搞定
These bots’ brains are simpler, something a smart human programmer can make.
让程序创造程序 尽管它可能做得不太好
The builder bot builds bots, though it’s not very good at it.
一开始 它几乎是随机把程序中心的
At first it connects the wires and modules
线路和模块连接起来
in the bot brains almost at random.
然后产生一些非常……
This leads to some very…
“特别的”学生程序 这些学生程序被送到教师程序那里去学习
“special” student bots sent to teacher bot to teach.
当然 教师程序也不能分辨蜜蜂和数字3
Of course, teacher bot can’t tell a bee from a three either;
要是人类编写出可以分辨的教师程序
if the human could build teacher bot to do that, well,
那 问题就解决了
then, problem solved.
相反 人类给教师程序大量蜜蜂和数字3的图片
Instead the human gives teacher bot a bunch of “bee” photos, and “three” photos,
还有分辨二者的关键词
and an answer key to which is what.
教师程序不能教给学生程序 但是能测试
Teacher bot can’t teach, but teacher bot can TEST.
傻里傻气的学生程序吐着舌头 努力尝试
The adorkable student bots stick out their tongues, try very hard,
但是做得很差
but they are bad at what they do. Very,
特别差
VERY, bad.
这也不是它们的错 它们就是这么被创造的
And it’s not their fault, really, they were built that way.
带着自己的成绩 学生程序羞耻的回到
Grades in hand, the student bots take a march
创造者那里
of shame back to builder bot.
做得好的被放到一边
Those that did best are put to one side,
其它的回收
the others recycled.
创造者程序依然不擅长创造程序
Builder bot still isn’t good at building bots,
但现在它复制剩下的程序 并做一些新的组合变换
but now it takes those left and makes copies with changes in new combinations.
这些新的组合程序回到学校
Back to school they go.
教师程序再次讲授 呃 测试 创造者程序再次创造
Teacher bot teaches – er, tests again, and builder bot builds again.
一次次重复
And again, and again.
创造者随机创造
Now a builder that builds at random,
教师不会教导 只会测试
and a teacher that doesn’t teach, just tests,
学生不会学习
and students who can’t learn,
它们就是它们自身 理论上行不通
they just are what they are, in theory shouldn’t work,
但实际中 可以
but in practice, it does.
部分因为在每次迭代中
Partly because in every iteration,
创造程序会留下表现好的程序 丢掉不好的
builder bot’s slaughterhouse keeps the best and discards the rest,
部分因为教师程序不像传统形式那样监督
and partly because teacher bot isn’t overseeing an old-timey,
一间教室里十几个学生
one-room schoolhouse with a dozen students,
而是一个无限大的仓库里面数以万计的学生
but an infinite warehouse with thousands of students.
测试问题也不是10个 而是一百万个
The test isn’t ten questions, but a million questions.
这种创建 测试的循环要重复多少次呢?
And how many times does the test, build, test loop repeat?
需要多少次就循环多少次
As many as necessary.
一开始学生能留存下来只是幸运
At first students that survive are just lucky,
通过组合大量的幸运程序 只留下能用的
but by combining enough lucky bots, and keeping only what works,
把它们随机混合
and randomly messing around with new copies of that
最终得到的学生程序就不是因为幸运
eventually a student bot emerges that isn’t lucky,
它也许勉强能分辨蜜蜂和数字3
that can perhaps barely tell bees from threes.
这个程序被复制 改进
As this bot is copied and changed,
测试分数慢慢提高
slowly the average test score rises,
因此 在下一轮测试中存活下来需要的分数也越来越高
and thus the grade needed to survive the next round gets higher and higher.
坚持这么做最终在仓库
Keep this up and eventually from the infinite warehouse
(屠宰站)
(slaughterhouse)
会出现一个学生程序
a student bot will emerge, who can tell
可以从它从未见过的图片中
a bee from a three in a photo it’s
很好地分辨出蜜蜂和数字3
never seen before pretty well.
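
The build, test, repeat loop described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: a "photo" is four random numbers, the answer key calls it a "bee" when they sum to more than zero, and a student bot is nothing but a list of four weights. The names make_test, teacher_score, builder_random, builder_copy_with_changes and evolve are made up for this sketch and do not come from the video.

```python
import random

def make_test(n_questions, rng):
    """Hypothetical answer key: a 'photo' is 4 numbers; label 1 ('bee') if they sum above 0, else 0 ('three')."""
    test = []
    for _ in range(n_questions):
        photo = [rng.uniform(-1, 1) for _ in range(4)]
        test.append((photo, 1 if sum(photo) > 0 else 0))
    return test

def answer(bot, photo):
    """A student bot is just a list of weights; it answers 1 when the weighted sum is positive."""
    return 1 if sum(w * x for w, x in zip(bot, photo)) > 0 else 0

def teacher_score(bot, test):
    """Teacher bot cannot teach, only test: the fraction of questions answered correctly."""
    return sum(answer(bot, photo) == label for photo, label in test) / len(test)

def builder_random(rng):
    """Builder bot wires up a brand-new student bot at random."""
    return [rng.uniform(-1, 1) for _ in range(4)]

def builder_copy_with_changes(survivors, rng):
    """Combine two survivors weight by weight, then add small random changes."""
    a, b = rng.sample(survivors, 2)
    child = [rng.choice(pair) for pair in zip(a, b)]
    return [w + rng.gauss(0, 0.1) for w in child]

def evolve(generations=30, population=50, keep=10, seed=0):
    rng = random.Random(seed)
    test = make_test(500, rng)
    bots = [builder_random(rng) for _ in range(population)]
    history = []  # best test score seen in each generation
    for _ in range(generations):
        # Grades in hand: rank every student bot by its test score.
        bots.sort(key=lambda bot: teacher_score(bot, test), reverse=True)
        survivors = bots[:keep]  # keep the best, recycle the rest
        history.append(teacher_score(survivors[0], test))
        children = [builder_copy_with_changes(survivors, rng)
                    for _ in range(population - keep)]
        bots = survivors + children  # back to school they go
    best = max(bots, key=lambda bot: teacher_score(bot, test))
    return best, history

if __name__ == "__main__":
    best, history = evolve()
    print("best score per generation:", history)
```

Because the survivors are carried over unchanged, the best score in history can never go down from one generation to the next, which is the rising survival grade the narration describes. To check a bot on photos it has never seen, one could score it against a second test built with a different random seed.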
但是它是怎么做到的
But how the student bot does this,
教师程序和创造者程序都不知道
neither the teacher bot nor the builder bot,
监督人员也不知道
nor the human overseer, can understand.
学生程序自己也不知道
Nor the student bot itself.
经过这么多有用的随机变换
After keeping so many useful random changes,
程序内的连接已相当复杂
the wiring in its head is incredibly complicated,
一行代码是可以理解的
and while an individual line of code may be understood,
一串代码的用处也可以大概的了解
and clusters of code’s general purpose vaguely grasped,
整个程序却无法理解 不管怎样 它能用
the whole is beyond, nonetheless, it works.
却令人沮丧
But this is frustrating,
尤其是学生程序仅仅非常擅长
especially as the student bot is very good at exactly
它被训练的问题
only the kinds of questions it’s been taught to.
它能很好的识别图片
It’s great with photos,
但是对视频和倒置的图片却无能为力
but useless with videos or baffled if the photos are upside down,
还会把有些明显不是蜜蜂的东西当作蜜蜂
or things that are obviously not bees, it’s confident are.
由于教师程序不能教导
Since teacher bot can’t teach,
监督人员能做的也只有给它更多的问题
all the human overseer can do is give it more questions,
让测试更长一点
to make the test even longer,
加入那些表现最好的程序也会搞错的问题
to include the kinds of questions the best bots get wrong.
理解这个问题很关键
This is important to understand.
这就是为什么一些公司迷恋于收集数据
It’s a reason why companies are obsessed with collecting data.
更多数据等于更长的测试等于更好的程序
More data equals longer tests equals better bots.
所以当你在网上做“你是人类吗”的测试时
So when you get the “Are you human?” test on a website,
不仅仅是在证明你是人类(但愿)
you are not only proving that you are human (hopefully),
同时也在帮助创造测试 让程序学习读
but you are also helping to build the test to make bots that can read,
或者计数
or count,
或分辨湖和山 马和人
or tell lakes from mountains, or horses from humans.
最近看到很多关于驾驶的问题了吗?嗯……
Seeing lots of questions about driving lately? Hmm…!
这些是为了测试什么呢?
What could that be building a test for?
现在分辨图片内容
Now figuring out what’s in a photo,
或者符号 或者过滤视频
or on a sign, or filtering videos,
需要人类做出足够正确的判断
requires humans to make correct enough tests.
但是有一种测试不需要人类的判断
But there is another kind of test that makes itself.
即对人类的测试
Tests ON the humans.
比如 假设NetMeTube想要
For example, say entirely hypothetical NetMeTube wanted
尽可能延长用户观看时间
users to keep watching as long as possible? Well,
用户停留时间是很容易测量的
how long a user stays on the site is easy to measure. So,
所以 教师程序让每个学生程序
teacher bot gives each student bot a bunch
监测一批用户
of NetMeTube users to oversee,
学生程序观察它们的用户在看什么
the student bots watch what their user watches,
查阅他们的档案
look at their files,
尽最大努力选择视频
and do their best to pick the videos
让用户留在网站
that keep the user on the site.
用户停留时间越长 它们得分越高 创造
The longer the average, the higher their test score. Build,
测试 重复
test, repeat.
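
A test that "makes itself" only needs a number that can be measured. The sketch below shows one way such a score could look, assuming a made-up viewer model in which a user watches 10 minutes of a video on their favorite topic and 1 minute of anything else. The names watch_time_score, toy_minutes_watched, always_cats and match_favorite are hypothetical and exist only for this illustration.

```python
from statistics import mean

def watch_time_score(pick_video, users, minutes_watched):
    """Teacher bot's grade: average minutes watched across the users a student bot oversees."""
    return mean(minutes_watched(user, pick_video(user)) for user in users)

def toy_minutes_watched(user, video_topic):
    """Assumed viewer model: 10 minutes for a favorite topic, 1 minute otherwise."""
    return 10.0 if video_topic == user["favorite"] else 1.0

def always_cats(user):
    """A student bot that recommends cat videos to everyone."""
    return "cats"

def match_favorite(user):
    """A student bot that looks at the user's files and picks their favorite topic."""
    return user["favorite"]

users = [{"favorite": "cats"}, {"favorite": "trains"}, {"favorite": "cats"}]

if __name__ == "__main__":
    print(watch_time_score(always_cats, users, toy_minutes_watched))
    print(watch_time_score(match_favorite, users, toy_minutes_watched))
```

No human has to label anything here: the users' own behavior is the answer key, and whichever bot scores higher on it survives the round.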
一百万次后
A million cycles later,
有一个学生程序能让顾客观看很长时间
there’s a student bot who’s pretty good at keeping the users watching,
至少是比人类能做的好很多
at least compared to what a human could build.
但是如果人们问:NetMeTube的算法是如何选择视频的?
But when people ask: “How does the NetMeTube algorithm select videos?”
再一次 最好的答案是
Once again, there isn’t a great answer other
这个程序本身
than pointing to the bot,
和它所有的用户数据
and the user data it had access to,
最重要的
and most vitally,
是监督人员让教师程序给测试打分的方式
how the human overseers direct teacher bot to score the test.
这是程序为了留存下来所要努力去做的
That’s what the bot is trying to be good at to survive.
但是算法在想什么
But what the bot is thinking,
或者它是如何思考的 不得而知
or how it thinks it, is not really knowable.
可以知道的是
All that’s knowable is this
学生程序就是算法
student bot gets to be the algorithm,
因为它在人类设计的测试中
because it’s point one percent better than the previous bot
得分比之前的程序高0.1%
at the test the humans designed.
所以在屏幕之后 互联网的各个地方
So everywhere on the internet, behind the scenes,
存在各种测试 去提高用户的交互体验
there are tests to increase user interaction,
或者设定能获得最高利润的价格
or set prices just right to maximize revenue,
或者从你所有的朋友中
or pick the posts from all your friends you’ll
选出你最可能喜欢的
like the most, or articles people will
或人们最可能分享的文章 以及其它的东西
share the most, or whatever.
能测试的东西就能教授 呃 “教授”
If it’s testable, it’s teachable. Well, “teachable”,
学生程序终会
and a student bot will graduate
从大仓库毕业 成为它所在领域的算法
from the warehouse to be the algorithm of its domain.
至少 暂时是
At least, for a little while.
我们习惯于认为我们所使用的工具
We’re used to the idea that the tools we use,
即使自己不理解 总有人理解
even if we don’t understand them, someone does,
但是由于机器学习 我们越来越
but with our machines that learn we are increasingly
习惯使用工具
in a position where we use tools,
或被工具使用 但是没有人懂
or are used by tools, that no one,
即使是它们的创造者
not even their creators, understand.
我们只能希望通过测试
We can only hope to guide them
去引导它们
with the tests we make,
而且我们必须接受这种处境
and we need to get comfortable with that,
因为我们的算法朋友无处不在 并且不会离开
as our algorithmic bot buddies are all around, and not going anywhere.
好 程序在看着你
OK. The bots are watching.
你知道接下来要怎样
You know what’s coming.
所以我需要你们去……
This is where I need to ask you…
点赞……
To like…
评论……
comment…
并订阅
…and subscribe.
私信我
And bell me.
在推特上分享
And share on the TweetBook.
算法在看着呢
The algorithm is watching.
除非你这么做了
It won’t show people the video…
否则它不会把视频推荐给别人
unless you do this.
算法 看看你都让我做了些什么
Look what you’ve reduced me to, bots.
你要怎样?想要观看时间吗?
What do you want? Do you want watch time?
这就是你想要的吗?
Is that what you want?
好吧
Fine.
(唉……)嘿 朋友们 你们知道我还有播客帐号吗?
(sigh…) Hey guys, did you know I also have podcasts you can listen to?
可以当背景音听一听
Maybe even just in the background
在你们长时间地打扫房间
while you’re tidying up your room for hours?
或者干其它什么事的时候
Or whatever?
我为你们准备了几个小时的音频
There’s hours of audio entertainment for you,
让程序有时间监督你们的行为
and watch time for the bots overseeing your actions.
去吧 点进去 好好享受
Go ahead and – and take a click. Entertain yourself.
帮帮我
Help me.
帮帮那些程序
Help the bots.


Translation credits
Video summary

How algorithms work

Transcription

Collected from the web

Translation

moon.

Reviewer

Reviewer D

Video source

https://www.youtube.com/watch?v=R9OHn5ZF4Uo
