IOAI ML Notes Classical Machine Learning

Bias-Variance Decomposition

Notes on bias-variance decomposition for regression and classification based on mlxtend.

· updated 6 February 2026

Syllabus Map


Overview


Definitions (Expectation Over Training Sets)


Squared Loss Decomposition (Regression)

$$E[(y - \hat{y})^2] = \big(y - E[\hat{y}]\big)^2 + E\big[(\hat{y} - E[\hat{y}])^2\big]$$
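The identity above can be checked numerically at a single test point. The sketch below is only an illustration, not mlxtend's implementation: the function name `decompose_squared_loss` and the simulated predictions are hypothetical, standing in for the outputs of models trained on different training sets. Expectations are replaced by sample averages (dividing by n), so the two sides agree up to floating-point error.

```python
import random

def decompose_squared_loss(y, predictions):
    """Split the average squared loss at one test point into bias^2 and variance.

    `predictions` holds one prediction per training set; `y` is the fixed target.
    """
    n = len(predictions)
    mean_pred = sum(predictions) / n                       # estimate of E[y_hat]
    expected_loss = sum((y - p) ** 2 for p in predictions) / n
    bias_sq = (y - mean_pred) ** 2                         # (y - E[y_hat])^2
    variance = sum((p - mean_pred) ** 2 for p in predictions) / n
    return expected_loss, bias_sq, variance

rng = random.Random(0)
y_true = 3.0
# Hypothetical predictions: systematically too low (bias) and scattered (variance).
preds = [2.5 + rng.gauss(0.0, 0.4) for _ in range(1000)]

loss, bias_sq, var = decompose_squared_loss(y_true, preds)
print(f"expected loss   = {loss:.4f}")
print(f"bias^2 + var    = {bias_sq + var:.4f}")
```

The two printed values should match, because the cross term vanishes exactly when the mean is computed over the same sample.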

Total Error and Irreducible Error

Why Total Error = Bias^2 + Variance + Irreducible Error

Notation:

Assume the data-generating process:

$$y = f(x) + \varepsilon, \quad E[\varepsilon] = 0, \quad Var(\varepsilon) = \sigma^2$$

Then:

$$E[(y - \hat{y})^2] = E[(f(x) + \varepsilon - \hat{y})^2]$$

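Expand the square. The cross term $2E[\varepsilon (f(x) - \hat{y})]$ vanishes because $\varepsilon$ has zero mean and is independent of $\hat{y}$, which leaves: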

$$E[(y - \hat{y})^2] = E[(f(x) - \hat{y})^2] + E[\varepsilon^2]$$

And decompose the first term:

$$E[(f(x) - \hat{y})^2] = (f(x) - E[\hat{y}])^2 + E[(\hat{y} - E[\hat{y}])^2]$$

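So, substituting this back and using $E[\varepsilon^2] = \sigma^2$ (since $E[\varepsilon] = 0$):

$$E[(y - \hat{y})^2] = \underbrace{(f(x) - E[\hat{y}])^2}_{\text{Bias}^2} + \underbrace{E[(\hat{y} - E[\hat{y}])^2]}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Irreducible Error}}$$

Therefore: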

$$E[(y - \hat{y})^2] = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error}$$
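A small Monte Carlo simulation can make this three-term decomposition concrete. Everything in the sketch below is an assumption chosen for illustration: the true function f(x) = x^2, the noise level, the test point, and the choice of a straight-line model fitted by least squares. It draws many training sets, refits the model each time, and compares the measured total error at one test point with the sum of the estimated bias^2, variance, and sigma^2. Because the estimates are Monte Carlo averages, the two sides agree only approximately.

```python
import random

def f(x):
    """Assumed true function (illustrative choice)."""
    return x * x

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(42)
sigma, x0, n_train, n_trials = 0.3, 0.9, 20, 2000

preds, sq_errors = [], []
for _ in range(n_trials):
    # Draw a fresh training set from y = f(x) + eps.
    xs = [rng.random() for _ in range(n_train)]
    ys = [f(x) + rng.gauss(0.0, sigma) for x in xs]
    intercept, slope = fit_line(xs, ys)
    y_hat = intercept + slope * x0
    # Draw an independent noisy target at the test point.
    y0 = f(x0) + rng.gauss(0.0, sigma)
    preds.append(y_hat)
    sq_errors.append((y0 - y_hat) ** 2)

mean_pred = sum(preds) / n_trials
total_error = sum(sq_errors) / n_trials
bias_sq = (f(x0) - mean_pred) ** 2
variance = sum((p - mean_pred) ** 2 for p in preds) / n_trials
noise = sigma ** 2

print(f"total error               = {total_error:.4f}")
print(f"bias^2 + variance + noise = {bias_sq + variance + noise:.4f}")
```

The straight line cannot represent x^2 exactly, so the bias term stays positive no matter how many training sets are drawn, while the variance term shrinks as `n_train` grows.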

0-1 Loss Decomposition (Classification)
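As a hedged illustration of the 0-1 loss case, the sketch below follows the definitions used in the mlxtend documentation (after Domingos): the main prediction is the most frequent predicted label, bias is 1 if the main prediction differs from the true label and 0 otherwise, and variance is the fraction of predictions that differ from the main prediction. The function name `decompose_zero_one_loss` and the example labels are hypothetical.

```python
from collections import Counter

def decompose_zero_one_loss(y, predictions):
    """Return (expected 0-1 loss, bias, variance) at one test point."""
    n = len(predictions)
    # Main prediction: the mode of the predicted labels across training sets.
    main_pred = Counter(predictions).most_common(1)[0][0]
    expected_loss = sum(p != y for p in predictions) / n
    bias = int(main_pred != y)
    variance = sum(p != main_pred for p in predictions) / n
    return expected_loss, bias, variance

# Hypothetical labels predicted by 8 models trained on different training sets.
preds = [1, 1, 0, 1, 2, 1, 1, 0]

# Unbiased case: the main prediction (1) equals the true label.
loss_a, bias_a, var_a = decompose_zero_one_loss(1, preds)
# Biased case: the main prediction (1) differs from the true label (0).
loss_b, bias_b, var_b = decompose_zero_one_loss(0, preds)

print(loss_a, bias_a, var_a)
print(loss_b, bias_b, var_b)
```

In the unbiased case the expected loss equals the variance. In the biased case it does not equal bias plus variance, which shows that the simple additive form of the squared-loss decomposition does not carry over directly to 0-1 loss.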


Practical Notes

Variance Reduction

0-1 Loss Caveat
