Least Squares Regression Line

The predicted LSRL for one variable is defined as $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$. The actual regression line for the population is defined as $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where $\epsilon_i$ represents the natural variation in the population.
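One way to see the difference between the population line and the predicted line is a small simulation. The sketch below is only illustrative: the population values `BETA_0`, `BETA_1`, and `SIGMA` are made up, and the x values are drawn uniformly, which is an assumption and not part of these notes. It generates data from the population model and then estimates the line from the sample.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Assumed "true" population parameters, chosen only for illustration.
BETA_0, BETA_1, SIGMA = 2.0, 0.5, 1.0

# Population model: y_i = beta_0 + beta_1 * x_i + epsilon_i,
# with epsilon_i drawn from a Normal(0, SIGMA) distribution.
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [BETA_0 + BETA_1 * x + random.gauss(0, SIGMA) for x in xs]

# Least-squares estimates computed from the sample only.
x_bar, y_bar = mean(xs), mean(ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
beta_1_hat = s_xy / s_xx
beta_0_hat = y_bar - beta_1_hat * x_bar

print(f"population line: y = {BETA_0} + {BETA_1}x + error")
print(f"predicted line:  y-hat = {beta_0_hat:.3f} + {beta_1_hat:.3f}x")
```

The estimates should land near the population values but not exactly on them, because each sample carries its own random error.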

r² is the coefficient of determination, which describes how well the line fits the data

  • Can be used to interpret the data: “_% of the variance in y can be explained by x”

r represents the direction and the strength of the linear relationship (weak/moderate/strong, negative/positive)

  • “There is a (strength) (negative/positive) linear relationship between the IV and the DV. As the IV increases, the DV (increases/decreases) on average.”
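As a rough sketch of how these two numbers are obtained and read, the snippet below computes r and r² by hand. The data (`hours`, `score`) and the helper name `pearson_r` are made up for illustration. No strength cutoffs are coded, since the weak/moderate/strong boundaries vary between textbooks.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Sample correlation coefficient r between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up illustrative data (not from the notes).
hours = [1, 2, 3, 4, 5]   # IV
score = [2, 4, 5, 4, 5]   # DV

r = pearson_r(hours, score)
r_squared = r ** 2  # coefficient of determination for a one-variable LSRL

direction = "positive" if r > 0 else "negative"
print(f"r = {r:.3f} ({direction}), r^2 = {r_squared:.3f}")
print(f"{r_squared:.0%} of the variance in y can be explained by x")
```

For this data r is about 0.775 and r² is 0.60, so the interpretation sentence would read "60% of the variance in y can be explained by x".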

Neither r nor r² is resistant to outliers: a single extreme point can change both substantially

Formulas for the LSRL:

[image: formulas for the LSRL]
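The formula image did not survive, so the following uses the standard textbook forms, which may differ in notation from the original: slope $\hat{\beta}_1 = r \cdot s_y / s_x = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and intercept $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. A minimal sketch, with made-up data and a hypothetical helper name `lsrl`, that computes both forms of the slope and checks that they agree:

```python
from math import sqrt
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return intercept, slope

# Made-up illustrative data (not from the notes).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = lsrl(xs, ys)

# Equivalent slope form: r * s_y / s_x
x_bar, y_bar = mean(xs), mean(ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
r = s_xy / sqrt(sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys))
slope_alt = r * stdev(ys) / stdev(xs)

print(f"y-hat = {b0:.2f} + {b1:.2f}x")
print(f"slope via r * s_y / s_x = {slope_alt:.2f}")
```

For this data both routes give a slope of 0.60 and the intercept is 2.20.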

Three assumptions must hold for the model to be valid:

  1. The experimental units are independent of each other
  2. The errors follow a Normal distribution
  3. The residual plot should look random (no pattern, with roughly constant spread)
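The residuals behind a residual plot can be sketched as follows. The data is made up, and the fitted coefficients are computed inline so the example stands alone. Listing the residuals against x is only a stand-in for the plot: judging randomness and Normality still calls for looking at an actual plot, and independence comes from how the data was collected, not from the numbers.

```python
from statistics import mean

# Made-up illustrative data (not from the notes).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least squares line.
x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y-hat.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# List (x, residual) pairs; in a residual plot these should scatter
# around zero with no curve or funnel shape.
for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.2f}")

# Least-squares residuals always sum to (numerically) zero.
print(f"sum of residuals = {sum(residuals):.2e}")
```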

Distributions: [image] [image]