Practical Machine Learning with Python #15: Euclidean Distance – 译学馆

#### Practical Machine Learning with Python #15: Euclidean Distance

Euclidean Distance - Practical Machine Learning Tutorial with Python p.15

What is going on everybody and welcome to part 15 of the machine learning with Python tutorial series.

In this tutorial we’re going to be building on the last couple,

in which we were talking about K nearest neighbors.

We talked about the intuition. It’s basically:

which K points are closest to this new point?

And whatever the majority class of those K closest points is,

we say this new point is that class.
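
As a rough illustration of that intuition (not the version the series goes on to build), here is a minimal sketch. The function name `knn_predict`, the toy data, and the use of `math.dist` (Python 3.8+) are all assumptions made for this example.

```python
from collections import Counter
from math import dist  # built-in Euclidean distance, Python 3.8+


def knn_predict(labeled_points, new_point, k=3):
    """Return the majority class among the k labeled points closest to new_point.

    Hypothetical helper for illustration only.
    """
    # Sort the labeled points by their distance to the new point.
    by_distance = sorted(labeled_points, key=lambda item: dist(item[0], new_point))
    # Keep the class labels of the k closest points and take a majority vote.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (coordinates, class label)
training = [
    ((1, 2), "a"), ((2, 3), "a"), ((3, 1), "a"),
    ((6, 5), "b"), ((7, 7), "b"), ((8, 6), "b"),
]

print(knn_predict(training, (2, 2)))  # a
```

Everything here depends on how "closest" is measured, which is why the distance calculation comes first.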

And in the previous tutorial what we did was

apply K nearest neighbors to a real-world data set.

We found that it’s actually fairly accurate which was very cool.

So now we’re going to break down the K nearest neighbors algorithm

and rewrite it ourselves from scratch in code.

But first we have to cover what everything hinges on, right?

It hinges on this distance.

So what is that distance? It is Euclidean distance.

So what is Euclidean distance?

It’s of course named after Euclid, the famous mathematician,

popularly referred to as the father of geometry.

He definitely wrote the book on it, right? Euclid’s Elements,

which is arguably the Bible for mathematicians and scientists.

Also, a fun fact is, you know,

whenever someone would create a printing press,

the first thing they’d start popping out was the Bible, of course.

And then the second thing was most likely Euclid’s Elements.

So anyways what is Euclidean distance?

First, what we have is the sum to n.

And in this case ‘n’ represents the number of dimensions in your data.

So just think of n, in this case, as the dimensions,

but really this means sum to n, where i

starts off as being equal to 1.

OK. So it really just means i starts at 1 and goes up to n, where n is the number of dimensions.

So if you just have one dimension, you would just do this one time.

And it’s the sum of what?

Let’s do parentheses here. It is going to be
(qi – pi)²

And then we take the square root of this entire calculation.

And this is Euclidean distance.

So i just indexes the dimensions. q is one point. p is a different point, right?
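
Transcribed into standard notation, the formula being drawn on screen is the following, with q and p the two points and n the number of dimensions:

$$
d(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}
$$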

So in theory, if you got rid of n and i,

if you got rid of the whole sum and just left the parentheses part...

I hate to circle it and mess it up,

but if you just left the whole parenthesis squared (with the square root still applied),

that would be the calculation for
the one-dimensional distance between two points in Euclidean space, anyways.
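
A small sketch of the one-dimensional case, using made-up values for `q` and `p`. With the square root kept, the single squared term reduces to the absolute difference between the two numbers; without the square root it would be the squared distance.

```python
from math import sqrt

# Hypothetical one-dimensional points, chosen only for illustration.
q = 1
p = 4

# One dimension means a single squared difference under the square root.
distance_1d = sqrt((q - p) ** 2)

# In one dimension this is the same as the absolute difference.
assert distance_1d == abs(q - p)

print(distance_1d)  # 3.0
```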

But now let’s actually break this down into simple mathematics.

I always like to do it by hand, though for some things we won’t always do it by hand.

We won’t actually do the calculation but I’ll show you how you would plug it in at least.

So we’ll start off and say q is equal to (1, 3).

So these are the coordinates for our data point.

And then the coordinates for p, x and y (so this is two dimensions),

are (2, 5). Those are the coordinates.

So then how would we calculate the Euclidean distance?

Well, it’s going to be the square root of

basically a couple of things. So we know we have two dimensions.

So we know that what it’s going to start off as will be something like...

it’ll be, you know, the square root, without that dot there.

A square root, and we know we’re going to have two of these parenthesized terms.

Right? Because we’ve got two dimensions.

Here we recall it’s the summation of these, so it’ll be a plus here.

And then this will be squared and this will be squared. And then we just need to fill in

the subtractions.

So the first term is q1 – p1, so it would be 1 – 2, right?

So a 1 here, and then over here we’ll just put a 3. And then it’s minus, minus, 2 and 5.

And that would be the Euclidean distance.
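
The video stops at plugging in the values. For reference, carrying the arithmetic through for q = (1, 3) and p = (2, 5) gives:

$$
d(q, p) = \sqrt{(1 - 2)^2 + (3 - 5)^2} = \sqrt{(-1)^2 + (-2)^2} = \sqrt{1 + 4} = \sqrt{5} \approx 2.236
$$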

Okay. So simple enough. Let’s head over to Python and actually create this.

So in Python here,

let’s just recreate exactly what we just did by hand.

So instead of q and p let’s say plot1

equals [1,3].
And plot2 equals [2,5]. Okay?

Now let’s go up to the top and say from math import sqrt,

which just imports the square root function.

So coming back down here.

Converting this to Euclidean distance or basically

calculating the Euclidean distance between these two plots

is the following.
So euclidean_distance = sqrt

So remember it’s the square root of the sum,

over each of the dimensions, of one plot’s value minus the other plot’s value in that same dimension.

Really, you’re going to calculate the distance between two plots.

So in this case it would be, for example,
plot1’s zeroth element, so the x of plot1, minus the x of plot2, right?

So minus plot2’s zeroth element, its x. Okay? So that’s one.

And remember it was the sum of all of these. So it would be that, plus

basically the exact same thing, only instead of the 0 it would be the 1, all right? So 0, 1.

So you can think of these as your dimensions, right? So this is dimension 0

and this is dimension 1. So this is two dimensions as indeed it is.

So that would be the i in that equation. Just for the record.
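
To make the role of i explicit, here is a sketch that loops over the dimensions instead of writing each index out by hand. The function name `euclidean_distance` and the three-dimensional sample points are assumptions for illustration, not code from the video. Each difference is squared before summing, as in the formula.

```python
from math import sqrt


def euclidean_distance(q, p):
    """Euclidean distance between two points with the same number of dimensions.

    Hypothetical helper for illustration only.
    """
    if len(q) != len(p):
        raise ValueError("both points must have the same number of dimensions")
    # zip pairs up q[i] and p[i] for every dimension i.
    return sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


print(euclidean_distance([1, 3], [2, 5]))        # the two-dimensional example
print(euclidean_distance([1, 3, 0], [2, 5, 2]))  # 3.0, a made-up 3D example
```

Python 3.8 and later also ship `math.dist`, which performs the same calculation.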

So anyways Euclidean distance. Boom! Done. Let’s go ahead and…

Oh, these also need to be squared.

So that squared and this squared.

Right? So that is squared, this is squared, and then for the entire operation

we grab the square root of that. So now let’s print
the euclidean_distance.

So we get 2.2360 and so on.

But basically that is your Euclidean distance.
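
Pieced together from the steps described above, the script would look roughly like this. The names `plot1`, `plot2` and `euclidean_distance` come from the video; the exact layout is a reconstruction.

```python
from math import sqrt

plot1 = [1, 3]
plot2 = [2, 5]

# Square the difference in each dimension (index 0 is x, index 1 is y),
# add the two squares, then take the square root of the sum.
euclidean_distance = sqrt((plot1[0] - plot2[0]) ** 2 + (plot1[1] - plot2[1]) ** 2)

print(euclidean_distance)  # 2.2360... (the square root of 5)
```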

So now that we know how to calculate Euclidean distance,

we basically have the crux of everything we need

to do K nearest neighbors.

But we still have quite a lot of framework to create regardless.

So that’s what we are going to do in the next tutorial:

creating the framework that will take a data set and use K nearest neighbors to classify points.

So if you have any questions or comments up to this point,

feel free to leave them below. Otherwise, as always, thanks for watching, thanks for all the support and subscriptions, and until next time.

[B]刀子
