Among other projects, and you were doing lots of stuff,
you got involved in some very heady questions
about the origins of truth on the Internet.
And this is where we’re getting to, folks,
because the work that Danny is describing now
in theory ultimately became a venture.
– Right? Metaweb. – Yep.
So that’s right. What I really thought is that
what we need to do is have a way
of representing the knowledge of the world
in a way that machines can get at it
and take advantage of it.
And that should be shared; everybody should be able to get at it.
In some sense, if human knowledge isn’t a shared resource,
then what is? I think, what has civilization been doing all these years?
I created a company that built this database
called Freebase, as in free database.
– Uh-mm… – Never coming.
And the company basically took any kind of public knowledge
that we could get, information about anything,
and put it in a machine-readable format.
We were kind of creating it
with the idea that this was gonna be useful to the world.
We didn’t really have a business model.
Uh-mm… And we started building it up
and then it became useful to lots of different people
including in particular all the search engines.
So eventually Google bought it, of course.
And then, I… uh… got Google to agree to
keep it open for three years, but…
they only kept the part that was already open
and they started building it up.
And so now Google has something
called the Knowledge Graph, which is the evolution of this.
And it probably has about a hundred billion different entities.
So everybody in this room is in that graph.
The building… This building is in that graph.
Yeah, I took a screenshot earlier of when you just Google this house
and all of these different things.
– Yeah, that’s right. This house. – Including this event.
So yeah, this event is in there, and…
And yeah, so… so…
Anything like a person, a place, an event.
Anything like that is in this huge knowledge base.
And there are all the relationships between them all, so…
When you, for instance, print out a Google Map,
that is rendered
from the knowledge graph.
So the knowledge graph knows the bus schedules, and it knows,
you know, the address, the restaurant, and the traffic.
– Drawing all this information together around the thing – Yeah.
– that the searcher cares about. – That’s right.
So the map is just in some sense a custom
rendering of a piece of the knowledge graph for your particular purpose.
Yeah, and also, by the way,
I don’t know, this doesn’t have any ads on it, but…
the other thing is that the ads also involve a lot
of knowledge graph, about what the products are about.
You know, whether…
you know, it probably has knowledge about… you specifically, and so on.
So it’s going to be way beyond the kind of public knowledge.
It also, uh-mm… probably has very particular private knowledge about people too.
Now from Google’s perspective,
it’s safe to say that this is a quantum leap
in terms of the original basis of its
sort of citation-based search,
you know, model.
All of a sudden,
it is now providing this multi-dimensional search
that is drawing in way more richness.
Yeah, so it still does the old kind of search.
So right now, when you, let’s say, put in “museums of New York”,
“museums in New York”, well, it still does the old keyword search of searching
for pages that have the word “museum”
and the phrase “New York”.
But it doesn’t… If you say, uh-mm…
“an exhibition in Manhattan” or something,
you might have something that is a museum in New York
that actually didn’t use the words “museum” or “New York” on the page.
But the knowledge graph knows that Manhattan
is in New York.
And it knows that, you know, exhibitions are in museums, or it may know something is a museum
even if it doesn’t use the word “museum” in its title.
And so it’s actually able to pick that up
even though it doesn’t have the keyword.
So that will play into the search results that come up.
It does a search that’s based on the semantics.
And of course that’s very important
because that kind of knowledge is completely language independent too.
So the same knowledge that informs your search in English
also informs somebody’s search
in Mandarin or Hindi or something like that.
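The difference between the keyword search and the semantic search described here can be sketched in a few lines of Python. This is only an illustrative toy, not how Google's system works: the entity IDs (`gallery_x`, `corner_deli`), the page texts, and the `located_in` relation are all made up for the example.

```python
# Toy knowledge graph. Every entity, page text, and relation is hypothetical.
LOCATED_IN = {"manhattan": "new_york"}  # place -> containing region

ENTITIES = {
    "gallery_x": {
        "type": "museum",
        "located_in": "manhattan",
        "page_text": "A new exhibition in Manhattan",
    },
    "corner_deli": {
        "type": "restaurant",
        "located_in": "manhattan",
        "page_text": "Best sandwiches near the museum in New York",
    },
}


def keyword_search(phrases):
    """Old-style search: keep pages whose text contains every phrase."""
    return sorted(
        eid for eid, e in ENTITIES.items()
        if all(p.lower() in e["page_text"].lower() for p in phrases)
    )


def is_within(place, region):
    """Follow located_in links upward until we hit region or run out."""
    while place is not None:
        if place == region:
            return True
        place = LOCATED_IN.get(place)
    return False


def semantic_search(entity_type, region):
    """Match on what an entity is and where it is, not on page wording."""
    return sorted(
        eid for eid, e in ENTITIES.items()
        if e["type"] == entity_type and is_within(e["located_in"], region)
    )


print(keyword_search(["museum", "New York"]))   # finds only the deli's page
print(semantic_search("museum", "new_york"))    # finds the museum itself
```

Because the second search works on entity types and relations rather than page wording, the same graph could in principle sit behind queries written in any language, which is the language-independence point made above.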
So the bad news is… so the good news is,
you know, it’s turned out to be really useful.
There are these big representations of knowledge.
But the bad news is the whole idea of this being this free open thing
that everybody was going to use
has actually become really just something
that is a competitive advantage of Google.
And now, you know,
other… other search engines and other companies will make their own.
I’m sure Apple’s working on it.
Amazon and, you know, each of the big companies, IBM, Microsoft…
You know, they’ll each work on their own database.
But I think the world could go in one of two directions.
We could either have this sort of oligarchy
of big companies that have giant,
you know, knowledge bases that they use for proprietary advantage.
Or it could flip over
and become a public resource,
where we could say, we want knowledge to be a public resource.
Uh-mm. And we want in particular knowledge that’s tied to who said
what. It doesn’t represent truth.
It represents who said what, and that becomes then a resource for doing things like
sorting out what’s fake news
or deciding what medical treatments,
what effects are in the scientific literature.
You know, things like that really don’t align very well with commercial interests.
Right, and this is where Underlay comes in… Underlay in many respects
is your attempt to kind of reclaim this technology
for the… as the public good that you initially envisioned it as.
Yeah, it’s… it’s my penance for having sold the other thing.
Well, so I’ve actually stuck on the screen here.
I thought there was a very nice paragraph
on the very simple underlay website,
which basically in written terms explains kind of
what it’s attempting to do.
And it says the Underlay aggregates statements
and reported observations along with citations of who made them
and who published them.
For example, it would not contain the bare assertion
that Sudan’s population was 39 million in 2008,
but rather that Sudan’s population was provisionally 39 million in 2008
according to the UN Statistics Division in 2011,
referencing Sudan’s national census as reported
by its Central Bureau of Statistics and as contested by the Southern People’s Liberation Movement.
Yeah, and it would do that not in those words,
but in a kind of machine-readable form.
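As a rough illustration of what a machine-readable version of that paragraph might look like, here is a minimal Python sketch. The Underlay's actual data format is not described in this conversation, so the class names (`Assertion`, `Attribution`, `Record`) and the role labels are assumptions made purely for the example; only the facts quoted from the website are filled in, and years the quote does not give are left empty.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: int
    as_of: int                       # year the value refers to
    qualifier: Optional[str] = None  # e.g. "provisional"


@dataclass(frozen=True)
class Attribution:
    source: str
    role: str                   # hypothetical label for how the source relates
    year: Optional[int] = None  # left empty when the quoted text gives no year


@dataclass
class Record:
    assertion: Assertion
    attributions: List[Attribution] = field(default_factory=list)

    def sources(self, role: str) -> List[str]:
        """Names of all sources attached to the assertion with this role."""
        return [a.source for a in self.attributions if a.role == role]


record = Record(
    Assertion("Sudan", "population", 39_000_000, as_of=2008, qualifier="provisional"),
    [
        Attribution("UN Statistics Division", "according_to", 2011),
        Attribution("Sudan national census", "referenced"),
        Attribution("Central Bureau of Statistics", "reported_by"),
        Attribution("Southern People's Liberation Movement", "contested_by"),
    ],
)

# Serialize the whole record, claim plus provenance, as JSON.
machine_readable = json.dumps(asdict(record), indent=2)
print(machine_readable)
```

The point of the structure is that the number never travels alone: whoever reads the record also gets who stated it, who reported it, and who disputes it.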
Right, so that those could be…
Ultimately, this is…
This version of what you were going at
becomes almost a kind of record of all of these observations over time.
And it can be tracked, you know.
So if we wanted to get to the heart of,
let’s say, you know, whether,
in one of these hearings we’ve just watched,
somebody said one thing or the other, we could trace it potentially
back to the first recorded instance.
Yeah, and if you take a problem like that,
I would regard that as an application of the Underlay,
just like Google Maps, drawing a map, is, but…
If you take sorting through fake news
and, you know, recognizing when rumors are getting out of control,
in order to do that,
you really need a very complex representation of who’s saying what.
So you can kind of trace: well, this person said that,
this person said this, this person said that.
Or, you know, the New York Times said that.
You know, the Drudge Report said that, you know.
There is something that needs to be built on top of the Underlay
that is essentially a network of trust for that purpose.
So, you know, somebody has to say “well okay, I trust the New York Times
but I don’t trust Fox News”, or vice versa.
Or you know, I and…
And these would be organizations or individuals with some sort of framework of analysis
that would leverage the Underlay for interpretive purposes.
And they’re gonna be for different purposes.
I mean, you know, an awful lot of the things that people argue about…
I mean, you know, is Taiwan a province of China?
You know, if you’re doing something with the Chinese government,
you’ve got to count it as one.
If you’re doing something with Taiwan,
you’re probably not gonna count it, you know.
So for some purposes, it is.
For some purposes, it isn’t.
What’s the truth of that?
Well, there isn’t exactly a truth.
It’s, you know, what’s the purpose? What’s the trust in it? And so on.
And in many of these…
so I sort of feel like the Underlay
in some sense is a piece of the plumbing that we need
to deal with the fact that the amount of information has become overwhelming.
No human can hold it all in their head.
Nobody can be sort of familiar with all the news sources
or things like that.
And then, that lets us build these things on top of it
where computers help us be smarter
and sort of navigate these networks of trust.
And… and so you’re conceiving of this challenge.
Uh… This is in the mid-to-early aughts, right? And…
What was the… you know,
what were the first inklings of an approach that technology
could provide to addressing this
and to kind of capturing the chain
of custody, if you will, of information?
So… so the idea was to build something that basically said,
let’s agree on what the things you were talking about are,
the entities that you were talking about.
Let people make statements about the relationships between them,
but then have some provenance of who made those statements.
So that instead of recording that, you know,
the glass is sitting on the table, you record:
Danny said the glass is sitting on the table on such-and-such a day.
And then, once you have all that information recorded,
that lets you, first of all, record the information
without worrying too much about whether it’s true.
– It’s true that I said that, – Right.
uh-mm, which is much easier to determine
than whether it’s true that the glass is actually on the table.
But then, it also lets you apply basically your idea of trust afterwards,
after you get more information about who I am.
Or later you find out I’m a liar,
or later you find out the glass was someplace else.
– You can weigh those previous recordings against… – Exactly, yeah.
So… so it’s sort of… the idea is that
what we really need to do
is we need to separate out two things.
We need to separate the record of what different people said
and who said it, the provenance of what was said.
And then, separately, have in some sense a network of trust,
which is going to be different for different purposes.
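The separation described here, a fixed record of who said what plus a trust network applied afterwards, can be sketched roughly as follows. This is a toy under stated assumptions: the second speaker (`alex`), the trust weights, and the simple signed-sum scoring rule are all invented for illustration and are not taken from the Underlay.

```python
from typing import Dict, List, NamedTuple


class Statement(NamedTuple):
    speaker: str
    claim: str
    holds: bool  # True if the speaker says the claim is so, False if they deny it
    day: str


# The record: only who said what, and when. It never changes after the fact.
LOG: List[Statement] = [
    Statement("danny", "glass_on_table", True, "day-1"),
    Statement("alex", "glass_on_table", False, "day-1"),  # hypothetical second witness
]


def weigh(log: List[Statement], claim: str, trust: Dict[str, float]) -> float:
    """Sum trust for the claim minus trust against it.

    Speakers missing from the trust network count for nothing.
    """
    score = 0.0
    for s in log:
        if s.claim == claim:
            weight = trust.get(s.speaker, 0.0)
            score += weight if s.holds else -weight
    return score


# Two trust networks over the same log (made-up weights).
trust_at_first = {"danny": 0.9, "alex": 0.5}
trust_after_learning_danny_lies = {"danny": 0.1, "alex": 0.5}

print(weigh(LOG, "glass_on_table", trust_at_first) > 0)                   # leans "on the table"
print(weigh(LOG, "glass_on_table", trust_after_learning_danny_lies) > 0)  # no longer does
```

Nothing in `LOG` is edited when the opinion of a speaker changes; only the trust network is swapped, which is why different purposes can read the same record differently.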
Ultimately there are lots of kinds of knowledge
that I think really are fundamentally part of the public commons,
the public good.
And I hope that those will end up in it.
And I think it’s not as complicated as copyright law,
where, you know, you’re taking the expression
of individual artists and things like that.
A fact is a fact.
It’s not copyrightable. A truth, you know…
Somebody figures out, you know,
the… you know,
the geographical location of this building.
You know, that’s… that’s just a truth. Nobody owns that.
And… and really it’s to everybody’s advantage to share that.