Picture yourself as an early calculus student
about to begin your first course.
The months ahead of you hold within
them a lot of hard work:
Some neat examples,
some not so neat examples,
beautiful connections to physics,
not so beautiful piles of formulas to memorise,
plenty of moments of getting stuck and
banging your head into a wall,
a few nice ‘aha’ moments sprinkled in as well,
and some genuinely lovely graphical intuition
to help guide you through it all.
But if the course ahead of you is anything
like my first introduction to calculus or any of
the first courses that I’ve seen in the years since,
there’s one topic that you will not see,
but which I believe stands to greatly accelerate your learning.
You see almost all of the visual intuitions
from that first year are based on graphs –
the derivative is the slope of a graph,
the integral is a certain area under that graph,
but as you generalize calculus
beyond functions whose inputs and outputs are simply numbers,
it’s not always possible to graph the function that you’re analyzing.
There’s all sorts of different ways that
you’d be visualizing these things
so if all your intuitions for the fundamental ideas, like derivatives,
are rooted too rigidly in graphs,
it can make for a very tall and largely unnecessary
conceptual hurdle between you and the more “advanced topics”,
like multivariable calculus, complex analysis, and differential geometry.
Now, what I want to share with you
is a way to think about derivatives
which I’ll refer to as the transformational view,
that generalizes more seamlessly into some of the more general contexts where calculus comes up.
And then we’ll use this alternate view
to analyze a certain fun puzzle about repeated fractions.
But first off, I just want to make sure
that we’re all on the same page about what the standard visual is.
If you were to graph a function,
which simply takes real numbers as inputs and outputs,
one of the first things you learn in a calculus course
is that the derivative gives you the slope of this graph.
What we mean by that is that
the derivative of the function is a new function
which for every input x returns that slope.
Now, I’d encourage you not to think of this
derivative-as-slope idea as being the definition of a derivative.
Instead, think of it as being more fundamentally
about how sensitive the function is to tiny little nudges around the input,
and the slope is just one way to think about that sensitivity,
relevant only to this particular way of viewing functions.
I have not just another video,
but a full series on this topic
if it’s something you want to learn more about.
Now the basic idea behind the alternate visual
for the derivative is to think of this function
as mapping all of the input points
on the number line to their corresponding outputs on a different number line.
In this context what the derivative gives you
is a measure of how much the input space
gets stretched or squished in various regions.
That is if you were to zoom in around a specific input
and take a look at some evenly spaced points around it,
the derivative of the function at that input
is going to tell you how spread out or contracted
those points become after the mapping.
Here a specific example helps
take the function x squared
it maps 1 to 1 and 2 to 4
3 to 9 and so on
and you could also see how it acts on all of the points in between
and if you were to zoom in on a little cluster of points around the input 1
and then see where they land around the relevant output
which for this function also happens to be 1
you’d notice that they tend to get stretched out.
It roughly looks like stretching out by a factor of 2,
and the closer you zoom in,
the more this local behavior looks just like multiplying by a factor of 2.
This is what it means for the derivative of x squared
at the input x equals 1 to be 2.
It’s what that fact looks like in the context of transformations.
If you looked at a neighborhood of points around the input 3,
they would get roughly stretched out by a factor of 6.
This is what it means for the derivative of this function at the input 3 to equal 6.
Around the input 1/4,
a small region actually tends to get contracted,
specifically by a factor of 1/2,
and that’s what it looks like for a derivative to be smaller than 1.
Now, the input 0 is interesting.
Zooming in by a factor of 10,
it doesn’t really look like a constant stretching or squishing.
For one thing, all of the outputs end up on the right, positive side of things,
and as you zoom in closer and closer, by 100x or by 1000x,
it looks more and more like a small neighborhood of points around zero
just gets collapsed into zero itself.
And this is what it looks like for the derivative to be zero,
the local behavior looks
more and more like multiplying the whole number line by zero.
It doesn’t have to completely collapse everything to a point
at a particular zoom level.
Instead it’s a matter of what the limiting behavior is
as you zoom in closer and closer.
It’s also instructive to take a look at the negative inputs here.
Things start to feel a little cramped
since they collide with where all the positive input values go,
and this is one of the downsides of thinking of functions as transformations,
but for derivatives,
we only really care about the local behavior anyway:
what happens in a small range around a given input.
Here, notice that the inputs in a little neighborhood around, say, negative two
don’t just get stretched out – they also get flipped around.
Specifically, the action on such a neighborhood
looks more and more like multiplying by negative four the closer you zoom in.
This is what it looks like for the derivative of a function to be negative,
and I think you get the point.
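All of those stretch-or-squish factors can be checked numerically. Here is a minimal Python sketch (the helper name `local_stretch` is just for illustration) that measures how much f(x) = x² spreads out a tiny neighborhood around a given input:

```python
def local_stretch(f, x0, h=1e-6):
    """Ratio of output spacing to input spacing for a tiny
    neighborhood around x0 -- i.e. a central-difference derivative."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

f = lambda x: x**2

print(local_stretch(f, 1))     # ~2: points near 1 spread out by a factor of 2
print(local_stretch(f, 3))     # ~6
print(local_stretch(f, 0.25))  # ~0.5: contraction
print(local_stretch(f, 0))     # ~0: the neighborhood collapses toward a point
print(local_stretch(f, -2))    # ~-4: stretched by 4 and flipped around
```

Each printed value matches the derivative of x² at that input, which is exactly the stretch factor the animations show.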
This is all well and good,
but let’s see how this is actually useful in solving a problem.
A friend of mine recently asked me a pretty fun
question about the infinite fraction one plus one divided
by one plus one divided by one plus one divided by one, and on and on…
Clearly you watch math videos online, so maybe you’ve seen this before,
but my friend’s question actually cuts to something
that you might not have thought about before,
relevant to the view of derivatives that we’re looking at here.
The typical way that you might evaluate an expression like this
is to set it equal to x,
and then notice that there’s a copy of the full fraction inside itself,
so you can replace that copy with another x
and then just solve for x.
That is, what you want is to find a fixed point
of the function 1 plus 1 divided by x.
But here’s the thing:
there are actually two solutions for x, two special numbers
where one plus one divided by that number
gives you back the same thing.
One is the golden ratio phi, around 1.618,
and the other is -0.618, which happens to be -1/φ.
I like to call this other number phi’s little brother
since just about any property that phi has, this number also has
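For a concrete check: the fixed-point equation x = 1 + 1/x rearranges to x² − x − 1 = 0, and the quadratic formula hands you both of these numbers. A quick Python sketch:

```python
import math

phi = (1 + math.sqrt(5)) / 2             # golden ratio, ~1.618
little_brother = (1 - math.sqrt(5)) / 2  # ~-0.618

f = lambda x: 1 + 1 / x

print(f(phi) - phi)                        # ~0: phi is a fixed point
print(f(little_brother) - little_brother)  # ~0: so is its little brother
print(little_brother + 1 / phi)            # ~0: it really does equal -1/phi
```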
And this raises the question:
Would it be valid to say that that infinite fraction we saw
is somehow also equal to phi’s little brother, -0.618?
Maybe you initially say, “Obviously not!
Everything on the left-hand side is positive,
so how could it possibly equal a negative number?”
Well first we should be clear about
what we actually mean by an expression like this.
One way that you could think about it,
and it’s not the only way, there’s freedom for choice here,
is to imagine starting with some constant like 1
and then repeatedly applying the function 1 + 1/x,
then asking what this approaches as you keep going.
I mean, certainly symbolically,
what you get looks more and more like our infinite fraction,
so maybe if you wanted it to equal a number,
you should ask what this sequence of numbers approaches.
And if that’s your view of things,
maybe you start off with a negative number,
so it’s not so crazy for the whole expression to end up negative.
After all, if you start with -1/φ,
then apply this function 1 + 1/x,
you get back the same number, -1/φ.
So no matter how many times you apply it
you’re staying fixed at this value.
But even then,
there is one reason that you should probably view phi as the favorite brother in this pair.
Here, try this:
pull up a calculator of some kind, then start with any random number
and plug it into this function 1 + 1/x,
then plug that result into 1 + 1/x,
and then again and again and again…
No matter what constant you start with, you eventually end up at 1.618.
Even if you start with a negative number,
even one that’s really, really close to phi’s little brother,
eventually it shies away from that value and jumps back over to phi.
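That calculator experiment is easy to recreate in code. A minimal sketch (the helper name `iterate` is just for illustration):

```python
def iterate(f, x, n=50):
    """Apply f to x repeatedly, n times."""
    for _ in range(n):
        x = f(x)
    return x

f = lambda x: 1 + 1 / x

print(iterate(f, 1.0))       # lands on ~1.6180339887
print(iterate(f, -10.0))     # a negative seed still ends up at phi
print(iterate(f, -0.61804))  # even a seed very near the little brother escapes to phi
```

The third seed sits within about 0.000006 of -0.618…, and the iteration still wanders away from it and settles on phi.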
So what’s going on here?
Why is one of these fixed points favored above the other one?
Maybe you can already see
how the transformational understanding of derivatives
is going to be helpful for understanding this set up,
but for the sake of having a point of contrast,
I want to show you how a problem like this
is often taught using graphs.
If you were to plug in some random input to this function,
the y-value tells you the corresponding output, right?
So to think about plugging that output back into the function,
you might first move horizontally until you hit the line y equals x
and that’s going to give you a position where the x-value
corresponds to your previous y-value, right?
So then from there you can move vertically to see what output this new x-value has
And then you repeat
you move horizontally to the line y = x, to find a point whose x-value
is the same as the output that you just got and then you move vertically
to apply the function again.
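The back-and-forth procedure just described can be written out directly. Here is a minimal sketch (the helper name `cobweb_points` is mine, not a standard term) that records the corner points of that spiderweb path:

```python
def cobweb_points(f, x0, steps=5):
    """Corner points of the cobweb diagram: vertical moves to the
    graph y = f(x) alternate with horizontal moves to the line y = x."""
    pts = [(x0, 0.0)]
    x = x0
    for _ in range(steps):
        y = f(x)
        pts.append((x, y))  # vertical: up/down to the graph
        pts.append((y, y))  # horizontal: over to the line y = x
        x = y
    return pts

f = lambda x: 1 + 1 / x
for p in cobweb_points(f, 2.0, steps=3):
    print(p)  # the x-values creep toward ~1.618
```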
Now personally, I think this is kind of an awkward way to think about repeatedly
applying a function, don’t you?
I mean, it makes sense,
but you kind of have to pause and think
about it to remember which way to draw the lines,
and you can, if you want,
think through what conditions make this spiderweb process
narrow in on a fixed point versus propagate away from it.
And in fact, go ahead, pause right now
and try to think it through as an exercise; it has to do with slopes.
Or if you want to skip the exercise for something that
I think gives a much more satisfying understanding
think about how this function acts as a transformation.
So I’m gonna go ahead and start here by drawing a whole bunch of arrows
to indicate where the various sampled input points will go,
and side note:
Don’t you think this gives a really neat emergent pattern?
I wasn’t expecting this,
but it was cool to see it pop up when animating.
I guess the action of 1 divided by x gives this nice emergent circle
and then we’re just shifting things over by 1.
Anyway, I want you to think about
what it means to repeatedly apply some function
like 1 + 1/x in this context.
Well after letting it map all of the inputs to the outputs,
you could consider those as the new inputs
and then just apply the same process again
and then again and do it however many times you want
Notice in animating this with a few dots representing the sample points,
it doesn’t take many iterations at all before all of those dots kind of clump in around 1.618.
We know that 1.618… and its little brother -0.618…
each stay fixed in place during every iteration of this process,
but zoom in on a neighborhood around phi:
during the map, points in that region get contracted around phi,
meaning that the function 1 + 1/x
has a derivative with a magnitude that’s less than 1 at this input.
In fact, this derivative works out to be around -0.38.
So what that means is that each repeated application
scrunches the neighborhood around this number smaller and smaller
like a gravitational pull towards phi.
So now tell me what you think
happens in the neighborhood of phi’s little brother.
Over there the derivative actually has a magnitude larger than one,
so points near the fixed point are repelled away from it
and when you work it out, you can
see that they get stretched by more than a factor of two in each iteration.
They also get flipped around because the derivative is negative here,
but the salient fact for the sake of stability is just the magnitude.
Mathematicians would call the value on the right a stable
fixed point, and the one on the left an unstable fixed point.
Something is considered stable if when you perturb it just a little bit,
it tends to come back towards where it started rather than going away from it.
So what we’re seeing is a very useful little fact:
that the stability of a fixed point
is determined by whether the magnitude of its derivative is bigger or smaller than one.
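In code, this stability test is one comparison: f(x) = 1 + 1/x has derivative f′(x) = −1/x², so we just check |f′| against 1 at each fixed point. A minimal sketch:

```python
import math

phi = (1 + math.sqrt(5)) / 2      # ~1.618, the stable fixed point
brother = (1 - math.sqrt(5)) / 2  # ~-0.618, the unstable one

fprime = lambda x: -1 / x**2      # derivative of 1 + 1/x

print(abs(fprime(phi)))      # ~0.382 < 1: neighborhoods contract, the "pull" toward phi
print(abs(fprime(brother)))  # ~2.618 > 1: neighborhoods expand, points are repelled
```

These are the two magnitudes mentioned above: roughly 0.38 at phi, and more than 2 at its little brother.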
And this explains why phi
always shows up in the numerical play
where you’re just hitting enter on your calculator over and over
but phi’s little brother never does.
Now as to whether or not you want to consider phi’s little brother a valid value of the infinite fraction
Well, that’s really up to you.
Everything we just showed suggests that
if you think of this expression as representing a limiting process
then because every possible seed value other
than phi’s little brother gives you a sequence converging to φ,
It does feel kind of silly to put them on equal footing with each other.
But maybe you don’t think of it as a limit
Maybe the kind of math you’re doing lends itself to treating this as a purely algebraic object
like the solutions of a polynomial, which simply has multiple values.
Anyway, that’s beside the point
and my point here is not that viewing derivatives as this change in density
is somehow better than the graphical intuition on the whole.
In fact picturing an entire function this way
can be kind of clunky and impractical as compared to graphs.
My point is that it deserves more of a mention
in most of the introductory calculus courses,
because it can help make a student’s understanding of the derivative a little bit more flexible.
Like I mentioned the real reason that I’d recommend
you carry this perspective with you as you learn new topics
is not so much for what it does with your understanding of single-variable calculus;
it’s for what comes after.
There are many topics typically taught in a college math department which…
how shall I put this lightly?
…don’t exactly have a reputation for being super accessible.
So in the next video I’m gonna show you how a
few ideas from these subjects with fancy sounding
names like holomorphic functions and the Jacobian determinant
are really just extensions of the idea shown here.
They really are some beautiful ideas,
which I think can be appreciated
from a really wide range of mathematical backgrounds
and they’re relevant to a surprising number of seemingly unrelated ideas.
So stay tuned for that.
Now for the final animation
I just want to show you a little more of that time-dependent vector field I flashed earlier,
but first let’s look at some of the principles of learning
from this video sponsor: Brilliant.org
There’s a lot of good stuff on this list,
but I want you to look at number two
effective math and science learning cultivates curiosity.
I love the word choice here.
It’s not just that you should be curious in one moment
It means creating a context where that curiosity is constantly growing.
Just look at the infinite fraction example here
It would be one thing if you were curious about
why the numbers bounce around the way that they do,
but hopefully the conclusion is not just to understand this one example
I would want you to start looking at all sorts of other infinite expressions
and wonder if there’s some fixed point phenomenon in them,
or wonder where else this view of derivatives
can be conceptually helpful
Brilliant.org is a site where you can learn math and science topics
through active problem-solving
and if you go take a look I think you’ll agree
that they really do adhere to these learning principles,
coming from this video,
you would probably enjoy their “Calculus Done Right” lessons,
and they also have many other courses in various math and science topics.
Much of it you can check out for free,
but they also have a subscription service
that gives you access to all sorts of nice guided problems.
Going to Brilliant.org/3B1B
lets them know that you came from this channel,
and it can also get you 20% off of their annual subscription.