Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
One of the holy grail problems of machine learning research
is to achieve artificial general intelligence, or AGI for short.
Deep Blue was able to defeat the chess genius Kasparov,
but it could not tell us what time it was.
Algorithms of this type are often referred to
as weak AI, or narrow AI: techniques
that excel at one task, perhaps even
at a superhuman level,
but have little or no knowledge of anything else.
A key to extending these algorithms would be to design them
so that their knowledge
generalizes well to other problems.
This is what we call transfer learning,
and this collaboration between the Stanford AI
Lab and Caltech, which goes by the name Neural Task Programming,
tries to tackle this problem.
A solution to practically any problem we’re trying
to solve can be written as a series of tasks.
These are typically complex actions,
like cleaning a table, or performing a backflip
that are difficult to transfer to a different problem.
This technique is a bit like divide-and-conquer algorithms,
which aggressively decompose big,
difficult tasks into smaller, more manageable pieces.
The smaller and easier to understand the pieces are,
the more reusable they are and the better they generalize.
Let’s have a look at an example:
in a problem where we need to pick and place objects,
this series of tasks can be decomposed into picking and placing.
These can be further diced into a series of even smaller tasks,
such as gripping, moving, and releasing actions.
If the learning takes place like this,
we can then specify different variations of these tasks,
and the algorithm will quickly understand
how to adapt the structure of these small tasks
to efficiently solve new problems.
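The decomposition idea above can be sketched in code. This is a minimal illustration of hierarchical task decomposition, not the actual Neural Task Programming model; the `Task` class and the task names are illustrative assumptions.

```python
# A minimal sketch of hierarchical task decomposition:
# a task is either a primitive action or a sequence of sub-tasks,
# and the same primitives can be recombined for new task variants.
# All names here are illustrative, not from the paper.

class Task:
    """Either a primitive action (no sub-tasks) or a composite of sub-tasks."""
    def __init__(self, name, subtasks=None):
        self.name = name
        self.subtasks = subtasks or []

    def flatten(self):
        """Recursively decompose into the sequence of primitive actions."""
        if not self.subtasks:
            return [self.name]
        actions = []
        for sub in self.subtasks:
            actions.extend(sub.flatten())
        return actions

# Primitive actions: the small, reusable pieces.
grip, move, release = Task("grip"), Task("move"), Task("release")

# "Pick and place" decomposes into picking and placing,
# which in turn decompose into the primitives.
pick = Task("pick", [grip, move])
place = Task("place", [move, release])
pick_and_place = Task("pick_and_place", [pick, place])

print(pick_and_place.flatten())
# -> ['grip', 'move', 'move', 'release']

# The same primitives can be recombined for a new task variant,
# e.g. a hypothetical two-object stacking task:
stack = Task("stack", [pick, place, pick, place])
print(stack.flatten())
```

The reuse in the last two lines is the point: the smaller the pieces, the more task variants they can be recombined into without learning anything new.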
The new algorithm generalizes really well
for tasks with different lengths, topologies, and changing objectives.
If you take a look at the paper,
you’ll also find some more information on adversarial dynamics,
which lists some problem variants
where a really unpleasant adversary pushes things
around on the table from time to time
to mess with the program,
and there are some results
that show that the algorithm is able to recover
from these failure states quite well. Really cool.
Now, please don’t take this as a complete solution for AGI:
it is a fantastic piece of work,
but it’s definitely not that.
However, it may be a valuable puzzle
piece to build towards the final solution.
This is research.
We advance one step at a time.
Man, what an amazing time to be alive.
Thanks for watching and for your generous support,
and I’ll see you next time!