"Machine Learning with Python in Practice" #11: Programming R-Squared – 译学馆


#### Programming R Squared - Practical Machine Learning Tutorial with Python p.11

Hello everybody and welcome to part 11 of our machine learning tutorial series.

In this video we're going to be building on the last one, where we learned how to calculate the R-squared value, or the coefficient of determination. This value measures how good a fit our best-fit line is.

Okay. So that is the equation. Now, how do we actually go about calculating it?

So a big part of this equation is actually squared error.

So we’re going to create a new function that calculates squared error.

So we're just going to add it down here, and this will be def squared_error(ys_orig, ys_line). The squared error works on the ys_orig, the original y values, and the ys_line, the y values on the line. Recall that squared error is the distance, along y, between whatever line is in question and the actual points. That's the error, and then we square that error. So we need the actual points and the corresponding spots on the line. Then what we need to do is return the sum of (ys_line - ys_orig), squared. So that would be your squared error for the entire line.

This equation is relatively simple, but I want to give it its own squared_error function just because I think it's a little easier to call upon; feel free to do what you want. Of course, you can get the original data points just from ys_orig, and you can get the line because we already have m and b here, so we would just compute mx + b to get the line's y value at any original point's x.

So anyway.

In our case that's all we are plotting, since we created our regression line using only the x's, for x in xs.

Anyways. So that’s our squared error.
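The function being dictated here can be sketched as follows; this is a minimal standalone version using plain Python lists and zip (an assumption for self-containedness, since the series' own code may pass numpy arrays):

```python
def squared_error(ys_orig, ys_line):
    # Sum of squared vertical (y) distances between the actual data
    # points and the line in question
    return sum((y_line - y_orig) ** 2
               for y_line, y_orig in zip(ys_line, ys_orig))
```

If ys_orig and ys_line are numpy arrays, the same thing can be written directly as sum((ys_line - ys_orig) ** 2), which is the expression dictated above.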

Now we need to calculate the coefficient of determination, which again is just 1 minus the squared error of the ŷ (regression) line divided by the squared error of the mean of the y's.
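Written out as a formula, what's being described is:

$$ r^2 = 1 - \frac{SE_{\hat{y}}}{SE_{\bar{y}}} $$

where \(SE_{\hat{y}}\) is the squared error of the regression line and \(SE_{\bar{y}}\) is the squared error of a flat line at the mean of the y's.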

So we can calculate that: def coefficient_of_determination(ys_orig, ys_line), with a comma between the parameters. We calculate that by saying y_mean_line equals, and here we need brackets because it's a list comprehension: [mean(ys_orig) for y in ys_orig]. So that will make our y_mean_line. It's just a list of a single repeated value: each entry is the mean of the y's, one for every y that we have in the original data. So that's our y mean line.
Then squared_error_of_regression_line equals, and I'm just going to copy and paste rather than typing this out: squared_error(ys_orig, ys_line). And then we're going to say squared_error_y_mean equals squared_error(ys_orig, and then, instead of ys_line, it's the y_mean_line). And then finally we just return 1 minus (squared_error_of_regression_line / squared_error_y_mean), right? One minus the squared error of the regression line divided by the squared error of the mean line.

Great. So there’s our coefficient of determination.
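Putting the pieces just described together, here is a minimal sketch (squared_error is repeated so the snippet runs on its own; mean here comes from Python's statistics module, one reasonable choice for this sketch):

```python
from statistics import mean

def squared_error(ys_orig, ys_line):
    # Sum of squared y-distances between the data and the given line
    return sum((yl - yo) ** 2 for yl, yo in zip(ys_line, ys_orig))

def coefficient_of_determination(ys_orig, ys_line):
    # A flat line sitting at the mean of the original y's,
    # one entry per data point
    y_mean_line = [mean(ys_orig) for y in ys_orig]
    squared_error_of_regression_line = squared_error(ys_orig, ys_line)
    squared_error_y_mean = squared_error(ys_orig, y_mean_line)
    return 1 - (squared_error_of_regression_line / squared_error_y_mean)
```

A perfect fit gives 1, and a line no better than the mean gives 0, which matches the discussion below.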

So now all we would have to do is come down here and say r_squared, or you could call it coefficient_of_determination, equals coefficient_of_determination(ys, regression_line). So that's the y's original and the line we're curious about, and we want to know the R-squared value of that regression line. So we could print(r_squared).

Save and run it.

And the value we get here is 0.58.
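An end-to-end sketch of the run being described, assuming the sample data and the best_fit_slope_and_intercept function from the earlier parts of this series (both are assumptions here, reproduced so the snippet stands alone); with this data the printed value comes out at roughly 0.58, the figure quoted above:

```python
from statistics import mean

def best_fit_slope_and_intercept(xs, ys):
    # Least-squares slope and intercept, as derived earlier in the series
    m = ((mean(xs) * mean(ys) - mean([x * y for x, y in zip(xs, ys)])) /
         (mean(xs) ** 2 - mean([x * x for x in xs])))
    b = mean(ys) - m * mean(xs)
    return m, b

def squared_error(ys_orig, ys_line):
    return sum((yl - yo) ** 2 for yl, yo in zip(ys_line, ys_orig))

def coefficient_of_determination(ys_orig, ys_line):
    y_mean_line = [mean(ys_orig) for y in ys_orig]
    return 1 - (squared_error(ys_orig, ys_line) /
                squared_error(ys_orig, y_mean_line))

xs = [1, 2, 3, 4, 5, 6]
ys = [5, 4, 6, 5, 6, 7]

m, b = best_fit_slope_and_intercept(xs, ys)
regression_line = [m * x + b for x in xs]

r_squared = coefficient_of_determination(ys, regression_line)
print(r_squared)  # roughly 0.584 with this sample data
```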

So just as a note: if the regression line were only as good as the y mean line, then our value here would be 0, right? It would be 1 minus basically 1.

You can't just make a blanket rule like "anything above 50% is accurate enough," but anything above 0 means the regression line was more accurate than the mean line. You kind of have to make your own determination of what coefficient of determination you're looking for.

In this case we get 0.58, which is significantly more accurate, because to get 0.58 the ratio in the equation would have to be 0.42, basically 42 out of 100. So the regression line's squared error is much less than the mean line's, right?

So anyway, squared error and the coefficient of determination are not the only measures of how accurate the best-fit line is, but they are *a* measure of how good a fit the best-fit line is.

So in the next tutorial what we're going to do is build some sample data and test everything we've got so far, our whole algorithm and all that. There's a lot of math involved here; it's basic math, but it's a lot of it, and we need some way to figure out whether everything is right. A value like 0.58 could be totally wrong and we wouldn't really have any way to figure out how it's wrong, other than maybe doing it by hand or something like that. So in the next tutorial we're going to be talking about testing all of our assumptions, with sample data and things like that.

If you have questions or comments, leave them below. Otherwise, until next time.
