
Linear Regression

A comprehensive guide to Linear Regression: exploring how it models relationships between variables to make accurate continuous predictions.

Syllabus Map


Overview


Mathematical Formulation

$$\hat{y}_i = w x_i + b$$

Cost Function

$$C = \frac{1}{2m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2$$

$$C = \frac{1}{2m} \sum_{i=1}^{m} (w x_i + b - y_i)^2$$
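As a minimal sketch, the cost above can be evaluated directly with NumPy. The function names `predict` and `cost` and the toy data are assumptions made for illustration; they do not come from the original notes.

```python
import numpy as np


def predict(x, w, b):
    """Return predictions w * x + b for a 1-D array of inputs."""
    return w * x + b


def cost(x, y, w, b):
    """Half mean squared error: (1 / (2m)) * sum((y_hat - y)^2)."""
    m = x.shape[0]
    residuals = predict(x, w, b) - y
    return float(np.sum(residuals ** 2) / (2 * m))


# Toy data (made up for illustration) lying exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

perfect_cost = cost(x, y, w=2.0, b=1.0)  # every residual is zero
zero_cost = cost(x, y, w=0.0, b=0.0)     # all predictions are 0

print(perfect_cost, zero_cost)
```

With the true parameters the cost is zero, and any other choice of `w` and `b` gives a positive value, which is what gradient descent then tries to reduce.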

Gradient Descent Optimisation

$$w = w - \alpha \frac{\partial C}{\partial w}$$

$$b = b - \alpha \frac{\partial C}{\partial b}$$

Derivation of Gradients

  1. Derivative of the cost w.r.t. the prediction:

    $$\frac{\partial C}{\partial \hat{y}_i} = \frac{1}{m} (\hat{y}_i - y_i)$$
  2. Derivative of the prediction w.r.t. the parameters:

    $$\frac{\partial \hat{y}_i}{\partial w} = x_i, \quad \frac{\partial \hat{y}_i}{\partial b} = 1$$
  3. Applying the chain rule:

    • Gradient w.r.t. $w$: $\frac{\partial C}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i) x_i$
    • Gradient w.r.t. $b$: $\frac{\partial C}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)$
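One way to sanity-check the derived gradients is to compare them against central finite differences of the cost. The sketch below does that on made-up data; the helper names `gradients` and `numerical_gradients`, the step size `eps`, and the test point `(w, b) = (0.5, -1.0)` are all assumptions chosen for illustration.

```python
import numpy as np


def cost(x, y, w, b):
    """Half mean squared error for the model y_hat = w * x + b."""
    m = x.shape[0]
    return float(np.sum((w * x + b - y) ** 2) / (2 * m))


def gradients(x, y, w, b):
    """Analytic gradients from the chain-rule derivation."""
    m = x.shape[0]
    error = w * x + b - y             # (y_hat_i - y_i) for every sample
    dw = float(np.sum(error * x) / m)  # dC/dw
    db = float(np.sum(error) / m)      # dC/db
    return dw, db


def numerical_gradients(x, y, w, b, eps=1e-6):
    """Central finite-difference approximation of the same gradients."""
    dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dw, db


# Toy data (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

analytic = gradients(x, y, w=0.5, b=-1.0)
numeric = numerical_gradients(x, y, w=0.5, b=-1.0)

print(analytic, numeric)
```

If the two pairs agree to several decimal places, the analytic expressions are very likely implemented correctly.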

Final Update Rules

Weight update:

$$w = w - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i) x_i$$

Bias update:

$$b = b - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)$$
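Put together, the two update rules give a short batch gradient descent loop. The following is a sketch under stated assumptions rather than a reference implementation: the function name `fit_linear_regression`, the zero initialisation, the learning rate `alpha=0.1`, the iteration count, and the toy data are all illustrative choices.

```python
import numpy as np


def fit_linear_regression(x, y, alpha=0.1, n_iters=2000):
    """Fit y_hat = w * x + b with batch gradient descent.

    alpha and n_iters are illustrative defaults, not tuned values.
    """
    m = x.shape[0]
    w, b = 0.0, 0.0  # assumed starting point
    for _ in range(n_iters):
        error = w * x + b - y        # (y_hat_i - y_i) for every sample
        dw = np.sum(error * x) / m   # gradient w.r.t. w
        db = np.sum(error) / m       # gradient w.r.t. b
        # Both gradients use the old w and b, so the update is simultaneous.
        w -= alpha * dw
        b -= alpha * db
    return float(w), float(b)


# Toy data (made up for illustration) lying exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = fit_linear_regression(x, y)
print(w, b)
```

On this noise-free data the loop should recover a slope close to 2 and an intercept close to 1. A learning rate that is too large for the scale of `x` makes the updates diverge, so `alpha` usually needs adjusting per dataset.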

Linear Regression In Practice

When to Use Linear Regression

When Not to Use Linear Regression

Practical Notes

Preprocessing and Diagnostics

Regularization and Features
