Paul E. Johnson
2015-02-18
Draw this
\[ y = 2 + 1 x \]
Set the range of x from 0 to 10
Draw this
\[ y = 2 + 1 x \]
\[ y = 4 + 1 x \]
What is the difference between the two lines?
Can you tell me a story about some relationship that might “look like that”?
Draw
\[ y = 2 + 1 x \]
\[ y = 2 + 0 x \]
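One way to see how these lines differ is to compute them over the suggested range; here is a minimal Python sketch (NumPy only, with the actual plotting left as an exercise):

```python
import numpy as np

# x ranges from 0 to 10, as in the exercise
x = np.linspace(0, 10, 101)

# the three lines from the exercises above
y1 = 2 + 1 * x   # y = 2 + 1x
y2 = 4 + 1 * x   # y = 4 + 1x: same slope, intercept shifted up by 2
y3 = 2 + 0 * x   # y = 2 + 0x: slope of zero, a flat line at y = 2

# the two parallel lines differ by a constant 2 everywhere
print(np.allclose(y2 - y1, 2))  # True
```

The point of the exercise shows up numerically: changing the intercept shifts the whole line by a constant, and a zero slope means y does not respond to x at all.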
Personal Anxiety Alert!
Some people think \(R^2\) is a summary of the goodness of a regression (a “model fit index”)
Clearly the regression model \(y_i=\beta_0 + \beta_{1} x_i + e_i\) is linear, but
that’s not what statisticians mean when they say OLS is a “linear estimator.”
\[ \hat{\beta}_{1}^{OLS} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]
\[ \hat{\beta}_{1}^{OLS} =\sum_{i=1}^{N}\left(\frac{x_{i}-\bar{x}}{\sum(x_{i}-\bar{x})^{2}}\times y_{i}\right) \]
\[ SS_x = \sum (x_i - \bar{x})^2 \]
The terms \( h_i = \frac{x_{i}-\bar{x}}{SS_x} \) are weights, applied one-by-one in the sum.
Hence, \(\hat{\beta}_{1}\) is a weighted sum of the observed scores: \[ \hat{\beta}_{1}^{OLS} = h_1 \times y_1 + h_2 \times y_2 + \cdots + h_N \times y_N \]
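The claim that the ratio form and the weighted-sum form give the same slope can be checked directly. A small Python sketch, using made-up illustrative data (the sample size, true coefficients, and noise level are all assumptions for the demonstration):

```python
import numpy as np

# illustrative data: y = 2 + 1x + noise, the line from the earlier exercise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2 + 1 * x + rng.normal(0, 1, size=50)

ss_x = np.sum((x - x.mean()) ** 2)   # SS_x
h = (x - x.mean()) / ss_x            # the weights h_i

# ratio form of the OLS slope estimator
b1_ratio = np.sum((x - x.mean()) * (y - y.mean())) / ss_x

# weighted-sum form: sum of h_i * y_i
b1_weighted = np.sum(h * y)

print(b1_ratio, b1_weighted)  # identical up to floating point
```

Note also that the weights sum to zero, which is why subtracting \(\bar{y}\) from each \(y_i\) (as in the first formula) changes nothing.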
In the paradigm that we are using, \(x_i\) is “fixed”: it is a non-random quantity. The only observed random variable is \(y_i\).
Each weight \(h_i\) is calculated from the predictors alone; we could cycle through the data, row by row, and calculate that number.
If \(x\) is centered, so that \(\bar{x}=0\), the estimator simplifies:
\[ \hat{\beta}_{1}^{OLS}=\frac{\sum x_{i}y_{i}}{\sum x_{i}^{2}} \]
\[ \hat{\beta}_{1}^{OLS} = \sum\left(\frac{x_{i}}{\sum x_{i}^{2}}\times y_{i}\right) \]
\[ \hat{\beta}_{1}^{OLS} = \sum\left(\frac{x_{i}}{SS_x}\times y_{i}\right) \]
\[ \hat{\beta}_{1}^{OLS} = \sum\left(h_i \times y_{i}\right) \]
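A quick Python check that centering \(x\) leaves the slope unchanged, so the simplified formula agrees with the full one (again with illustrative data chosen only for the demonstration):

```python
import numpy as np

# illustrative data, as before
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2 + 1 * x + rng.normal(0, 1, size=50)

xc = x - x.mean()   # centered predictor, so the mean of xc is 0

# full formula, with deviations of both x and y
b1_full = np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)

# simplified formula for centered x: sum(x_i y_i) / sum(x_i^2)
b1_simple = np.sum(xc * y) / np.sum(xc ** 2)

print(np.isclose(b1_full, b1_simple))  # True
```

Since centering \(x\) only shifts the line horizontally, the slope, and hence \(\hat{\beta}_{1}\), is unaffected.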
Comments
I’m not certain we should teach “correlation” at all, by itself
Correlation speaks to a certain “prejudice” about relationships that I don’t hold