How to Read the Genome and Build a Human Being – 译学馆

如何解读基因组并组装人类

How to read the genome and build a human being | Riccardo Sabatini

接下来的16分钟里 我要带大家踏上一段旅程
For the next 16 minutes, I’m going to take you on a journey
这大概是全人类的终极梦想——
that is probably the biggest dream of humanity:
解读生命的基因编码
to understand the code of life.
对我来说早在很多很多年前
So for me, everything started many, many years ago
当我遇到了第一台3D打印机的时候 一切就开始了
when I met the first 3D printer.
它的理念非常棒
The concept was fascinating.
一台3D打印机需要三个要素:
A 3D printer needs three elements:
一些信息一些原材料和一些能量
a bit of information, some raw material, some energy,
它能生产任何原先没有的东西
and it can produce any object that was not there before.
我那时正在研究物理现象我正在走回家
I was doing physics, I was coming back home
我突然意识到实际上我早就知道3D打印机了
and I realized that I actually always knew a 3D printer.
每个人都知道
And everyone does.
那就是我妈妈
It was my mom.
(笑声)
(Laughter)
我妈妈拿这三个要素:
My mom takes three elements:
一点信息 这里指的是我爸和我妈的基因
a bit of information, which is between my father and my mom in this case,
同一种介质提供原材料和能量——那就是食物
raw elements and energy in the same media, that is food,
历时几个月 制造出了我
and after several months, produces me.
而我以前从来没有存在过
And I was not existent before.
除了震惊的发现我妈其实是台3D打印机以外
So apart from the shock of my mom discovering that she was a 3D printer,
我还立即被另一个部分吸引了
I immediately got mesmerized by that piece,
那就是第一个要素—信息
the first one, the information.
到底需要多少信息
What amount of information does it take
才能制造和组装一个人呢?
to build and assemble a human?
是要很多?还是很少?
Is it much? Is it little?
要存满多少个U盘?
How many thumb drives can you fill?
我一开始是学物理的
Well, I was studying physics at the beginning
我想如果把人近似于看成是一个巨型的乐高玩具
and I took this approximation of a human as a gigantic Lego piece.
小的乐高模块就像是原子——
So, imagine that the building blocks are little atoms
这里有氢原子 这边有碳原子 上面这有氮原子
and there is a hydrogen here, a carbon here, a nitrogen here.
按照最初的这个设定
So in the first approximation,
如果能够列出组成人类的所有原子
if I can list the number of atoms that compose a human being,
我就能组装出一个人
I can build it.
现在你可以大致计算一下
Now, you can run some numbers
得到的结果非常惊人
and that happens to be quite an astonishing number.
所需要的原子的总数
So the number of atoms,
全部存到U盘里面——即便是组装一个小婴儿
the file that I will save in my thumb drive to assemble a little baby,
用掉的U盘就能装满整个泰坦尼克号
will actually fill an entire Titanic of thumb drives —
再乘以2000倍
multiplied 2,000 times.
这就是生命的奇迹
This is the miracle of life.
从现在开始你每看到一个孕妇
Every time you see from now on a pregnant lady,
她的身上都聚集着最大量的信息
she’s assembling the biggest amount of information
是你前所未见过的大
that you will ever encounter.
不要谈大数据 不要谈以前听说过的任何事情
Forget big data, forget anything you heard of.
这是现今存在的最大信息量
This is the biggest amount of information that exists.
(掌声)
(Applause)
好在大自然比一个年轻的物理学家要聪明多了
But nature, fortunately, is much smarter than a young physicist,
在40亿年里把这些信息打包
and in four billion years, managed to pack this information
放进一个小晶体里 我们称之为DNA
in a small crystal we call DNA.
1950年我们第一次发现了它
We met it for the first time in 1950 when Rosalind Franklin,
当时一位伟大的科学家罗莎琳富兰克林女士
an amazing scientist, a woman,
给DNA拍了张照
took a picture of it.
但是我们用了超过四十年的时间最终才拨开人的细胞
But it took us more than 40 years to finally poke inside a human cell,
从里面拿出了这个晶体
take out this crystal,
展开它 第一次解读它
unroll it, and read it for the first time.
这个遗传密码由简单的字母表组成
The code comes out to be a fairly simple alphabet,
四个字母:A T C和G (碱基)
four letters: A, T, C and G.
要组装一个人你需要30亿个这样的字母
And to build a human, you need three billion of them.
三十亿
Three billion.
三十亿是多少?
How many are three billion?
光这么说大家可能都没概念
It doesn’t really make any sense as a number, right?
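One way to get a feel for three billion letters is to ask how much storage they would need. The sketch below assumes a plain 2-bit code per letter, since four letters fit in two bits. The encoding table `BITS` and the helper `pack_bases` are illustrative assumptions and are not described in the talk.

```python
# Illustrative 2-bit code for the four DNA letters (an assumption, not from the talk).
BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack_bases(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four letters per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for letter in chunk:
            byte = (byte << 2) | BITS[letter]
        # Left-align a short final chunk so every byte holds four 2-bit slots.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)

GENOME_LETTERS = 3_000_000_000          # figure quoted in the talk
genome_bytes = GENOME_LETTERS * 2 // 8  # 2 bits per letter, 8 bits per byte

packed = pack_bases("AAGAATATA")        # nine letters, so three bytes
print(len(packed), "bytes for 9 letters")
print(genome_bytes / 1e6, "MB for the whole genome")
```

Under this assumption the raw letters come to about 750 MB, which fits on an ordinary thumb drive, in contrast to the atom-by-atom list described earlier.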
我在想怎么让自己表达得更清楚
So I was thinking how I could explain myself better
这些遗传密码的数量到底有多庞大
about how big and enormous this code is.
所以我需要点帮助
But there is — I mean, I’m going to have some help,
最合适来帮我介绍遗传密码的人
and the best person to help me introduce the code
实际上是第一个将基因排序的人 克雷格文特尔博士
is actually the first man to sequence it, Dr. Craig Venter.
让我们欢迎克雷格文特尔博士上台
So welcome onstage, Dr. Craig Venter.
(掌声)
(Applause)
不是他本人——
Not the man in the flesh,
这也是历史上的第一次
but for the first time in history,
是一个人类的基因组
this is the genome of a specific human,
一页一页一个字母一个字母的被打印出来
printed page-by-page, letter-by-letter:
总共262000页的信息量
262,000 pages of information,
四百五十千克 被美国船运到加拿大
450 kilograms, shipped from the United States to Canada
感谢布鲁诺·鲍登 Lulu.com网站 一个新兴公司做了所有这些事情
thanks to Bruno Bowden, Lulu.com, a start-up, did everything.
这真是令人赞叹的壮举
It was an amazing feat.
这些就是生命密码给人最直观的视觉感受
But this is the visual perception of what is the code of life.
现在 我是第一次做一些有趣的事情
And now, for the first time, I can do something fun.
我能戳进去从这里面挑一段来读一读
I can actually poke inside it and read.
我来找一本有意思的……比如这一本
So let me take an interesting book … like this one.
我放了书签在里面 这书太厚了
I have an annotation; it’s a fairly big book.
给你们看一下 生命的密码长什么样子
So just to let you see what is the code of life.
成千上万
Thousands and thousands and thousands
的字母
and millions of letters.
它们当然都有意义
And they apparently make sense.
让我来找一段特殊的部分
Let’s get to a specific part.
读给你们听:
Let me read it to you:
(笑声)
(Laughter)
“AAG AAT ATA”
“AAG, AAT, ATA.”
你们可能觉得像是听天书
To you it sounds like mute letters,
但是这个序列给了格雷尔眼睛的颜色
but this sequence gives the color of the eyes to Craig.
在看看书的另外一部分
I’ll show you another part of the book.
这一段稍微复杂一些
This is actually a little more complicated.
第14号染色体 书本编号132
Chromosome 14, book 132:
(笑声)
(Laughter)
可能和你们想的一样
As you might expect.
(笑声)
(Laughter)
“ATT CTT GATT”
“ATT, CTT, GATT.”
这个人很幸运
This human is lucky,
因为如果他在这个位上少了2个字母
because if you miss just two letters in this position —
30亿中的2个
two letters of our three billion —
他注定会患上一种非常可怕的疾病——
he will be condemned to a terrible disease:
囊肿性纤维症(cystic fibrosis)
cystic fibrosis.
目前没有治疗的方法 这是绝症 我们不知道如何治疗它
We have no cure for it, we don’t know how to solve it,
仅仅和我们是2个字母的区别
and it’s just two letters of difference from what we are.
这是一部鸿篇巨著
A wonderful book, a mighty book,
这本有力的书帮助我理解
a mighty book that helped me understand
一切向你们展示一些非凡的东西
and show you something quite remarkable.
你们都的每一个——组成我 我和你 你们——
Every one of you — what makes me, me and you, you —
只需要这些中的500万个
is just about five million of these,
半本书
half a book.
剩下的基因
For the rest,
我们都是完全相同的
we are all absolutely identical.
500页 涵盖了你的生命奇迹
Five hundred pages is the miracle of life that you are.
余下的 我们全都一样
The rest, we all share it.
讨论人与人差异的时候反思一下
So think about that again when we think that we are different.
我们有这么多共通的东西
This is the amount that we share.
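The figures quoted so far can be checked against each other with a little arithmetic. The sketch below uses only numbers from the talk (three billion letters, 262,000 printed pages, five million differing letters) and assumes, for illustration, that letters are spread evenly across the pages.

```python
GENOME_LETTERS = 3_000_000_000   # total letters, as quoted in the talk
PRINTED_PAGES = 262_000          # pages in the printed genome, as quoted
DIFFERING_LETTERS = 5_000_000    # letters that differ between people, as quoted

# Assumption: every page carries the same number of letters.
letters_per_page = GENOME_LETTERS / PRINTED_PAGES
differing_fraction = DIFFERING_LETTERS / GENOME_LETTERS
differing_pages = DIFFERING_LETTERS / letters_per_page

print(f"letters per page: {letters_per_page:,.0f}")
print(f"share that differs: {differing_fraction:.2%}")
print(f"pages that differ: {differing_pages:.0f}")
```

The differing share comes out well under one percent, and the page count lands in the same range as the five hundred pages mentioned above.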
现在我已经引起了你们的兴趣
So now that I have your attention,
下一个问题就是:
the next question is:
怎么去读取这些信息?
How do I read it?
怎么理解和运用它们?
How do I make sense out of it?
不管你在组装瑞典家具上有多在行
Well, for however good you can be at assembling Swedish furniture,
这么长的指令手册在你有生之年是不可能被破解的
this instruction manual is nothing you can crack in your life.
(笑声)
(Laughter)
因此在2014年的时候两位著名的TED演讲者
And so, in 2014, two famous TEDsters,
彼得迪亚芒蒂思和克雷格文特尔本人
Peter Diamandis and Craig Venter himself,
决定成立一个新公司
decided to assemble a new company.
人类长寿公司就此诞生了
Human Longevity was born,
唯一的任务——
with one mission:
尽我们所能
trying everything we can try
解读出所有我们能在这些书本里读到的东西
and learning everything we can learn from these books,
只为达到一个目的:
with one target —
让个人化医疗成为现实
making real the dream of personalized medicine,
明白怎么做才能提高人类健康水平
understanding what things should be done to have better health
了解这些书目背后的秘密
and what are the secrets in these books.
一个惊人的团队拥有四十名数据科学家和越来越多的人
An amazing team, 40 data scientists and many, many more people,
和他们一起工作十分愉快
a pleasure to work with.
实际上工作流程不很复杂
The concept is actually very simple.
我们用一种叫做机器学习的方法
We’re going to use a technology called machine learning.
一方面 我们有几千个基因组
On one side, we have genomes — thousands of them.
另一边我们建立一个超大的人类信息数据库:
On the other side, we collected the biggest database of human beings:
性状 3D扫描 核磁共振——所有你能想到的
phenotypes, 3D scan, NMR — everything you can think of.
在内部 在这两个端点之间
Inside there, on these two opposite sides,
有神秘的翻译在进行
there is the secret of translation.
我们在中间建了一个机器
And in the middle, we build a machine.
建好之后我们训练这台机器——
We build a machine and we train a machine —
实际上不只一台机器而是很多台
well, not exactly one machine, many, many machines —
试图去理解基因组并把它翻译成性状
to try to understand and translate the genome in a phenotype.
有哪些字母——它们控制什么性状——
What are those letters, and what do they do?
这是普适的方法 可以用在所有问题上
It’s an approach that can be used for everything,
但用在基因组学上异常的复杂
but using it in genomics is particularly complicated.
一点一点有了进展后我们想要尝试更有挑战性的东西
Little by little we grew and we wanted to build different challenges.
最开始我们从常见的特征下手
We started from the beginning, from common traits.
常见特征最容易因为它们太常见了
Common traits are comfortable because they are common,
每个人都有
everyone has them.
我们开始提出如下问题
So we started to ask our questions:
能预测身高吗?
Can we predict height?
能不能解读这些书本信息来预测身高?
Can we read the books and predict your height?
是的我们可以
Well, we actually can,
存在五厘米的误差
with five centimeters of precision.
身体质量指数主要跟生活习惯密切有关
BMI is fairly connected to your lifestyle,
但我们仍然能预测得差不多存在8千克上下的误差
but we still can, we get in the ballpark, eight kilograms of precision.
眼睛的颜色能不能预测?
Can we predict eye color?
是的我们可以
Yeah, we can.
80%的正确率
Eighty percent accuracy.
皮肤颜色?
Can we predict skin color?
可以 80%的正确率
Yeah we can, 80 percent accuracy.
我们可以预测年龄吗?
Can we predict age?
可以 因为很明显基因随着年龄产生变化
We can, because apparently, the code changes during your life.
DNA 会变短 缺失一些片段或者插入另外一些片段
It gets shorter, you lose pieces, it gets insertions.
我们读取这些信号然后建立模型
We read the signals, and we make a model.
现在来个有意思的挑战:
Now, an interesting challenge:
我们能不能预测人的面孔?
Can we predict a human face?
这个略有点复杂
It’s a little complicated,
因为有几百万个碱基都对人脸产生影响
because a human face is scattered among millions of these letters.
而且人脸并不是一个构造十分精准的物体
And a human face is not a very well-defined object.
所以必须要建立一整个单独的模块
So, we had to build an entire tier of it
给机器去训练和学习人脸是什么
to learn and teach a machine what a face is,
再把这个模块压缩整合进去
and embed and compress it.
如果你对机器学习有点概念的话
And if you’re comfortable with machine learning,
就能够想象这个挑战是有多大
you understand what the challenge is here.
现在15年过去了——15年前我们读取第一条序列
Now, after 15 years — 15 years after we read the first sequence —
——今年10月 我们总算有了些进展
this October, we started to see some signals.
当时还是很激动人心的
And it was a very emotional moment.
你现在看到的东西来自于我们的实验室
What you see here is a subject coming in our lab.
这是我们的一张脸
This is a face for us.
我们要对测试对象的面孔进行简化
So we take the real face of a subject, we reduce the complexity,
因为并不是所有的特征都是面孔的一部分——
because not everything is in your face —
很多特点 缺陷和不对称是生活的痕迹
lots of features and defects and asymmetries come from your life.
把面孔调整对称之后 跟我们运算的结果比较
We symmetrize the face, and we run our algorithm.
我刚才给你看的结果
The results that I show you right now,
是我们根据血液预测的
this is the prediction we have from the blood.
(掌声)
(Applause)
等一下——
Wait a second.
你们的眼睛正在左右两边交替看,
In these seconds, your eyes are watching, left and right, left and right,
大脑希望两幅图是一模一样的
and your brain wants those pictures to be identical.
坦诚来说 我其实想请大家做另一件事情
So I ask you to do another exercise, to be honest.
找找两幅图的不同点
Please search for the differences,
其实非常多
which are many.
性别提供最多的信息
The biggest amount of signal comes from gender,
接下来是年龄 BMI(体质指数)和种族
then there is age, BMI, the ethnicity component of a human.
再考虑更多因素会变得更加复杂
And scaling up over that signal is much more complicated.
但是这样的结果即便有很多不同
But what you see here, even in the differences,
表示我们在正确的范围内
lets you understand that we are in the right ballpark,
正在逐步接近
that we are getting closer.
它已经给了你一些情绪反应
And it’s already giving you some emotions.
这是另外一个测试对象
This is another subject that comes in place,
这边是预测结果
and this is a prediction.
脸小了一点 完整的颅骨结构没预测到
A little smaller face, we didn’t get the complete cranial structure,
但至少像那么回事
but still, it’s in the ballpark.
这是又一个测试对象
This is a subject that comes in our lab,
这是预测结果
and this is the prediction.
机器接受训练时 它们从未看见过这些面孔
So these people have never been seen in the training of the machine.
这就是所谓的随机测试组
These are the so-called “held-out” set.
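The idea of a held-out set can be shown on a toy problem. The sketch below is not the speaker's model: it fits a one-feature least-squares line on synthetic, made-up data and measures the error only on samples that were kept out of the fitting step. The names `fit_line` and `mean_abs_error`, the 80/20 split, and all numbers are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x

def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and true values."""
    a, b = model
    return sum(abs(a * x + b - y) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
# Synthetic toy data: a made-up score x and a made-up trait y (not real measurements).
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [170.0 + 8.0 * x + rng.gauss(0.0, 3.0) for x in xs]

# Shuffle once, then set the last 20% aside; the model never sees it while fitting.
indices = list(range(len(xs)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, held_out_idx = indices[:split], indices[split:]

model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
held_out_error = mean_abs_error(
    model, [xs[i] for i in held_out_idx], [ys[i] for i in held_out_idx]
)
print(f"held-out mean absolute error: {held_out_error:.2f}")
```

Because the held-out samples play no part in fitting, the error measured on them is a fairer estimate of how the model behaves on people it has never seen.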
并且你们不认识这些人 可能说服力不太够
But these are people that you will probably never believe.
我们在学术期刊上发表了这些结果
We’re publishing everything in a scientific publication,
你们可以去读一下
you can read it.
但既然我们在台上 克里斯向我提出挑战
But since we are onstage, Chris challenged me.
我尽我所能暴露自己 尝试着预测
I probably exposed myself and tried to predict
某个你可能认识的人
someone that you might recognize.
这里有一小管血液——你们很难想象
So, in this vial of blood — and believe me, you have no idea
我们为了一管血液我们花了多少工夫
what we had to do to have this blood now, here —
这一小管血液中蕴含了大量的生物信息
in this vial of blood is the amount of biological information
我们需要做一个完整的基因组排序
that we need to do a full genome sequence.
只需要这么多
We just need this amount.
我们已经完成了测序 下面我和你们一起做
We ran this sequence, and I’m going to do it with you.
我们综合了所有已知的信息
And we start to layer up all the understanding we have.
从这一管血液里 我们预测这是一名男性
In the vial of blood, we predicted he’s a male.
被试者正是一名男性
And the subject is a male.
我们预测他身高1米76
We predict that he’s a meter and 76 cm.
被试身高1米77
The subject is a meter and 77 cm.
预测他体重76kg 被试者是82kg
So, we predicted that he’s 76; the subject is 82.
我们还预测了他的年龄 38岁
We predict his age, 38.
被试者实际上是35岁
The subject is 35.
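The numeric predictions above can be set against the precision figures quoted earlier in the talk (five centimeters for height, eight kilograms in the BMI passage). The sketch below only does that subtraction. The trait labels are illustrative, and reading the weight in kilograms follows the Chinese subtitle, so treat the unit as an assumption.

```python
# (trait, predicted, actual, stated precision or None when the talk gives none)
results = [
    ("height_cm", 176, 177, 5),
    ("weight_kg", 76, 82, 8),     # kg is taken from the Chinese subtitle (assumption)
    ("age_years", 38, 35, None),
]

errors = {}
within_stated_precision = {}
for trait, predicted, actual, precision in results:
    error = abs(predicted - actual)
    errors[trait] = error
    if precision is not None:
        within_stated_precision[trait] = error <= precision
    print(trait, "off by", error)

print(within_stated_precision)
```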
我们预测了他眼睛的颜色
We predict his eye color.
非常深的黑色
Too dark.
我们预测他的皮肤颜色
We predict his skin color.
我们基本上是准确的
We are almost there.
这是他的面孔
That’s his face.
现在到了揭晓的时刻:
Now, the reveal moment:
被试对象是这个人
the subject is this person.
(笑声)
(Laughter)
我是有意拿自己做测试的
And I did it intentionally.
我属于一个特别又特殊的种族
I am a very particular and peculiar ethnicity.
南欧人 意大利人——从来都不符合模型预测
Southern European, Italians — they never fit in models.
而且这一种族在模型里是一个复杂的边界情况
And it’s particular — that ethnicity is a complex corner case for our model.
但还有另一个重点
But there is another point.
最常用的来辨识人的方法
So, one of the things that we use a lot to recognize people
不是由基因组编译的
will never be written in the genome.
是人们的自由意志 即我想让自己看起来怎么样
It’s our free will, it’s how I look.
虽然我的发型不是我自己决定的 但胡子是的
Not my haircut in this case, but my beard cut.
下面我们来看一下 我要进行下改变——
So I’m going to show you, I’m going to, in this case, transfer it —
单纯的用photoshop 不用建模——
and this is nothing more than Photoshop, no modeling —
把胡子加上去
the beard on the subject.
是不是立即觉得变得很相像了
And immediately, we get much, much better in the feeling.
因此 为什么我们要这样做?
So, why do we do this?
当然不是为了预测身高
We certainly don’t do it for predicting height
或者描绘出你没有胡子时的完美照片
or taking a beautiful picture out of your blood.
我们研究是因为同样的技术和手段
We do it because the same technology and the same approach,
基因组的学习机器
the machine learning of this code,
能帮助我们了解人类自身是如何工作的
is helping us to understand how we work,
你的身体是如何协调工作的
how your body works,
你的身体是如何变老的
how your body ages,
疾病在你的身体里是如何产生的
how disease generates in your body,
癌症是怎么出现和恶化的
how your cancer grows and develops,
药物是如何起作用的
how drugs work
药物是不是能够对你的身体起作用
and if they work on your body.
这是一个巨大的挑战
This is a huge challenge.
这是一个
This is a challenge that we share
我们和世界各地其他成千上万的研究者们一起面临的挑战
with thousands of other researchers around the world.
它被称为 个性化医疗
It’s called personalized medicine.
从只能借助统计学方法
It’s the ability to move from a statistical approach
每个人都只是沧海一粟
where you’re a dot in the ocean,
到能够实现有针对性的治疗
to a personalized approach,
通过解码这些基因信息
where we read all these books
我们能够彻底了解每一个人
and we get an understanding of exactly how you are.
但这是一项异常复杂的挑战
But it is a particularly complicated challenge,
因为到目前为止在这么庞大的基因组信息中
because of all these books, as of today,
我们大概只了解2%:
we just know probably two percent:
175本书里的4本
four books of more than 175.
当然这不是我今天演讲的主题
And this is not the topic of my talk,
因为我们会了解更多
because we will learn more.
有很多顶尖的人才在从事这项工作
There are the best minds in the world on this topic.
预测会越来越准确
The prediction will get better,
模型会越来越精准
the model will get more precise.
随着了解的逐渐深入
And the more we learn,
我们需要做的决定会越来越多
the more we will be confronted with decisions
而且是一些从前没有想象过的决定
that we never had to face before
关于生命
about life,
关于死亡
about death,
关于养育
about parenting.
因此 我们正在不断接近基因内部的细节以解开生命机体如何工作之谜
So, we are touching the very inner detail on how life works.
这是一项重要的革命
And it’s a revolution that cannot be confined
它不能被限制于科学技术领域
in the domain of science or technology.
这是一个全球性的会话
This must be a global conversation.
我们必须思考我们的未来 要结合起全人类的力量
We must start to think of the future we’re building as a humanity.
我们需要和创意人员 艺术家 哲学家 和政治家
We need to interact with creatives, with artists, with philosophers,
一起相互讨论和影响
with politicians.
每一个人都被包含在内
Everyone is involved,
因为这是我们人类物种的未来
because it’s the future of our species.
不要害怕 但是我们要明白
Without fear, but with the understanding
我们接下来一年中所做的决定
that the decisions that we make in the next year
都会彻底改变历史的进程
will change the course of history forever.
谢谢
Thank you.
(掌声)
(Applause)


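Translation credits
Video summary

This video covers the discovery of the human genome, the research process and current state of the field, and the outlook for the future.

Transcriber

Collected from the web

Translator

与光同尘

Reviewer

赖皮

Video source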

https://www.youtube.com/watch?v=s6rJLXq1Re0