Practical Machine Learning with Python #15: Euclidean Distance – 译学馆
Euclidean Distance - Practical Machine Learning Tutorial with Python p.15

What's going on, everybody, and welcome to part 15 of the Machine Learning with Python tutorial series.
In this tutorial we're going to build on the last couple of videos, where we talked about K nearest neighbors.
We covered the intuition: basically, which K points are closest to the new point?
Whatever class the majority of those K closest points belongs to, we say the new point is that class.
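
The majority-vote intuition can be sketched in a few lines of Python. This is only an illustration, not the implementation the series builds later: the helper name `knn_vote` and the toy data are made up here, and the standard library's `math.dist` (Python 3.8+) stands in for the distance calculation worked out in this tutorial.

```python
from collections import Counter
from math import dist  # Euclidean distance from the standard library (Python 3.8+)


def knn_vote(labeled_points, new_point, k=3):
    """Hypothetical helper: return the majority class among the k nearest points."""
    # Sort the (point, label) pairs by distance to the new point and keep the k closest.
    nearest = sorted(labeled_points, key=lambda pair: dist(pair[0], new_point))[:k]
    # Count the class labels of those k points and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up toy data: two small clusters labeled "a" and "b".
toy_data = [
    ((1, 2), "a"), ((2, 3), "a"), ((3, 1), "a"),
    ((6, 5), "b"), ((7, 7), "b"), ((8, 6), "b"),
]
print(knn_vote(toy_data, (5, 7), k=3))  # b
```

The three points closest to (5, 7) all belong to cluster "b", so the vote is unanimous here. With a mixed neighborhood, the most frequent label wins.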
In the previous tutorial, we applied K nearest neighbors to a real-world data set.
We found that it's actually fairly accurate, which was very cool.
So now we're going to break down the K nearest neighbors algorithm and rewrite it ourselves from scratch in code.
But first we have to cover what everything hinges on, right?
It hinges on distance.
So what is that distance? It is Euclidean distance.
So what is Euclidean distance?
It is, of course, named after Euclid, the famous mathematician popularly referred to as the father of geometry.
He definitely wrote the book on it, right? Euclid's Elements, which is arguably the Bible for mathematicians and scientists.
Also, a fun fact: whenever someone built a printing press, the first thing they'd start printing was the Bible, of course.
And the second thing was most likely Euclid's Elements.
So anyway, what is Euclidean distance?
First we have the sum up to n.
In this case, n represents the number of dimensions in your data.
The notation means a sum where i starts at 1 and goes up to n.
So i is the index of the current dimension, and n is the number of dimensions.
If you have just one dimension, you would do this only one time.
And it's the sum of what?
Let's put parentheses here. It is going to be (q_i – p_i)².
Then we take the square root of this entire calculation.
And this is Euclidean distance.
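
Written out in full, the formula described here looks like the following. The symbols q, p, i and n are the ones used in the video; the name d(q, p) for the result is an assumption added for readability.

```latex
d(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}
```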
So i is just the dimension, q is one point, and p is a different point, right?
In theory, if you got rid of n and i, that is, the whole sum, and left just the parentheses part...
I don't want to circle things and mess it up.
But if you left just the squared parenthesis under the square root, that would be the calculation for the distance between two points in one-dimensional Euclidean space.
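
To see how the same formula covers one dimension and more, here is a minimal sketch of a general version. The function name `distance_between` is hypothetical and does not appear in the video, which only computes the two-dimensional case directly.

```python
from math import sqrt


def distance_between(q, p):
    """Hypothetical helper: Euclidean distance between two equal-length points."""
    if len(q) != len(p):
        raise ValueError("both points must have the same number of dimensions")
    # Sum the squared difference in every dimension, then take the square root.
    return sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


# One dimension: the sum has a single term, so the result is just |q - p|.
print(distance_between([4], [1]))  # 3.0
# Three dimensions: the same function, now with three terms in the sum.
print(distance_between([0, 0, 0], [1, 2, 2]))  # 3.0
```

Using `zip` pairs up the coordinates dimension by dimension, which plays the role of the index i in the formula.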
But now let's actually break this down into simple mathematics.
I always like to do things by hand at least once, even things we won't normally do by hand.
We won't actually finish the calculation, but I'll at least show you how you would plug the numbers in.
So we'll start off and say q is equal to (1, 3).
These are the coordinates of our data point.
And the coordinates for p, x and y, so this is two dimensions, are (2, 5).
So how would we calculate the Euclidean distance?
Well, it's going to be the square root of a couple of things.
We know we have two dimensions, so we know roughly what it's going to start off as.
It'll be the square root, without that dot there, and we know we're going to have two of these parentheses, because we've got two dimensions.
Recall that it's the summation of these, so there'll be a plus here.
This will be squared and this will be squared, and then we just need to fill in the subtractions.
The first will be q1 – p1, so 1 – 2, right?
So a 1 here, and over here we'll put a 3. Then it's minus and minus, then 2 and 5.
And that would be the Euclidean distance.
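
The video stops at plugging the numbers in. Carrying the arithmetic through for q = (1, 3) and p = (2, 5) gives:

```latex
\sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2}
  = \sqrt{(1 - 2)^2 + (3 - 5)^2}
  = \sqrt{1 + 4}
  = \sqrt{5} \approx 2.236
```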
Okay, simple enough. Let's head over to Python and actually create this.
In Python, let's just recreate exactly what we did by hand.
Instead of q and p, let's say plot1 equals [1, 3] and plot2 equals [2, 5]. Okay?
Now let's go up to the top and say from math import sqrt, which just imports the square root function.
Coming back down here, calculating the Euclidean distance between these two plots goes as follows.
So euclidean_distance = sqrt, and then an opening parenthesis.
Remember, it's the square root of the sum, over each dimension, of the difference between the two plots in that dimension.
Really, you're calculating the distance between two plots.
So in this case it would be the zeroth element of plot1, the x of plot1, minus the zeroth element of plot2, the x of plot2. Okay? So that's one.
And remember, it was the sum of all of these, so it would be that, plus basically the exact same thing, only instead of the 0 it would be the 1, all right? So 0, then 1.
You can think of these as your dimensions, right? This is dimension 0 and this is dimension 1, so this is two dimensions, as indeed it is.
That would be the i in that equation, just for the record, only counted from 0 instead of 1.
So anyway, Euclidean distance. Boom! Done. Let's go ahead and...
Oh, these also need to be squared.
So that is squared, this is squared, and then we take the square root of the entire operation.
Now let's print euclidean_distance.
We get 2.2360 and so on.
And that, basically, is your Euclidean distance.
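
Put together, the code narrated above looks roughly like this. It is reconstructed from the spoken description, so the exact layout on screen is an assumption; the names `plot1`, `plot2` and `euclidean_distance` are the ones used in the video.

```python
from math import sqrt

plot1 = [1, 3]
plot2 = [2, 5]

# Square the difference in each dimension, add the results, then take the square root.
euclidean_distance = sqrt((plot1[0] - plot2[0]) ** 2 + (plot1[1] - plot2[1]) ** 2)

print(euclidean_distance)  # 2.23606797749979
```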
So now that we know how to calculate Euclidean distance, we basically have the crux of everything we need to do K nearest neighbors.
But we still have a lot of framework to create.
That's what we're going to do in the next tutorial: create the framework that will take a data set and use K nearest neighbors to classify points.
If you have any questions or comments up to this point, feel free to leave them below.
Otherwise, as always, thanks for watching, thanks for all the support and subscriptions, and until next time.

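Translation credits
Video overview

This lesson explains how to compute Euclidean distance in Python code. It mainly reproduces the content of the previous lesson in code.

Transcription

[B]刀子

Translation

[B]刀子

Reviewer

审核员1024

Video source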

https://www.youtube.com/watch?v=hl3bQySs8sM
