
t-SNE & UMAP

A concise guide to t-SNE and UMAP for nonlinear dimensionality reduction.

Syllabus Map


Overview


t-SNE (t-distributed Stochastic Neighbour Embedding)

Core Idea

How It Works (Step-by-Step)

Step 1: Compute High-Dimensional Similarities

Step 2: Symmetrise the Probabilities
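
A minimal sketch of Steps 1 and 2, assuming the standard t-SNE formulation: conditional probabilities p(j|i) come from a Gaussian kernel centred on each point, and the joint probabilities are p_ij = (p(j|i) + p(i|j)) / (2n). As a simplification, one fixed bandwidth `sigma` is used for every point; t-SNE normally tunes a separate bandwidth per point so that each row matches the chosen perplexity. The function names and toy data are illustrative only.

```python
import math


def conditional_probabilities(points, sigma=1.0):
    """Return P where P[i][j] = p(j|i) under a Gaussian kernel.

    Assumption: one shared sigma, not a per-point bandwidth
    fitted to a target perplexity.
    """
    n = len(points)
    P = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a point is never its own neighbour
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        total = sum(weights)
        P.append([w / total for w in weights])  # each row sums to 1
    return P


def symmetrise(P):
    """Joint probabilities p_ij = (p(j|i) + p(i|j)) / (2n)."""
    n = len(P)
    return [[(P[i][j] + P[j][i]) / (2.0 * n) for j in range(n)] for i in range(n)]


# Toy data: two tight pairs that sit far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
P_cond = conditional_probabilities(points)
P_joint = symmetrise(P_cond)
```

Dividing by 2n makes the whole matrix sum to 1, so `P_joint` can be treated as a single probability distribution over pairs of points.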

Step 3: Define Low-Dimensional Similarities

Step 4: Optimise the Embedding
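
Steps 3 and 4 can be sketched as follows, again assuming the standard formulation: low-dimensional similarities q_ij use a Student-t kernel with one degree of freedom, and the embedding is moved along the negative gradient of KL(P || Q). This sketch uses plain gradient descent on a hand-written joint matrix `P`; practical implementations usually add momentum, early exaggeration, and approximate neighbour search. All names, the learning rate, and the toy values are assumptions for illustration.

```python
import math


def student_t_similarities(Y):
    """Return (Q, W): W[i][j] = 1 / (1 + ||y_i - y_j||^2), Q = W / sum(W)."""
    n = len(Y)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
                W[i][j] = 1.0 / (1.0 + d2)
    Z = sum(sum(row) for row in W)
    Q = [[w / Z for w in row] for row in W]
    return Q, W


def kl_divergence(P, Q):
    """KL(P || Q) summed over all ordered pairs i != j."""
    n = len(P)
    return sum(
        P[i][j] * math.log(P[i][j] / Q[i][j])
        for i in range(n)
        for j in range(n)
        if i != j and P[i][j] > 0.0
    )


def gradient_step(P, Y, learning_rate=0.1):
    """One descent step using dC/dy_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)."""
    Q, W = student_t_similarities(Y)
    n, dim = len(Y), len(Y[0])
    new_Y = []
    for i in range(n):
        grad = [0.0] * dim
        for j in range(n):
            if i != j:
                coeff = 4.0 * (P[i][j] - Q[i][j]) * W[i][j]
                for k in range(dim):
                    grad[k] += coeff * (Y[i][k] - Y[j][k])
        new_Y.append([Y[i][k] - learning_rate * grad[k] for k in range(dim)])
    return new_Y


# Hypothetical joint probabilities: points 0-1 and 2-3 are close neighbours.
P = [
    [0.0, 0.24, 0.005, 0.005],
    [0.24, 0.0, 0.005, 0.005],
    [0.005, 0.005, 0.0, 0.24],
    [0.005, 0.005, 0.24, 0.0],
]
Y = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # arbitrary starting layout

kl_before = kl_divergence(P, student_t_similarities(Y)[0])
for _ in range(50):
    Y = gradient_step(P, Y)
kl_after = kl_divergence(P, student_t_similarities(Y)[0])
```

The heavy tail of the Student-t kernel is what lets moderately distant points sit far apart in the embedding without a large penalty, which eases the crowding that a Gaussian kernel would cause in two dimensions.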

Practical Notes

Hyperparameter Sensitivity

Primary Use Case

Limitations


UMAP (Uniform Manifold Approximation and Projection)

Core Idea

How It Works (Step-by-Step)

Step 1: Build the k-NN Graph

Step 2: Compute Fuzzy Membership Strengths
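
A rough sketch of Steps 1 and 2, assuming the formulation from the UMAP paper: for each point, rho is the distance to its nearest neighbour, sigma is found by binary search so that the directed weights exp(-max(0, d - rho) / sigma) sum to log2(k), and the directed graph is symmetrised with the fuzzy union a + b - a*b. It uses a brute-force neighbour search and hypothetical helper names; library implementations differ in details such as approximate neighbour search and how the point itself is counted.

```python
import math


def knn(points, k):
    """Brute-force k nearest neighbours: one sorted list of (distance, index) per point."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def membership_strengths(neighbours, k, iterations=64):
    """Directed weights w(i -> j) = exp(-max(0, d_ij - rho_i) / sigma_i)."""
    target = math.log2(k)
    directed = {}
    for i, neigh in enumerate(neighbours):
        rho = neigh[0][0]  # distance to the nearest neighbour

        def total(sigma):
            return sum(math.exp(-max(0.0, d - rho) / sigma) for d, _ in neigh)

        lo, hi = 0.0, 1.0
        while total(hi) < target:  # widen the bracket until it contains the target
            hi *= 2.0
        for _ in range(iterations):  # binary search: total() increases with sigma
            mid = (lo + hi) / 2.0
            if total(mid) < target:
                lo = mid
            else:
                hi = mid
        for d, j in neigh:
            directed[(i, j)] = math.exp(-max(0.0, d - rho) / hi)
    return directed


def fuzzy_union(directed):
    """Symmetrise with the probabilistic t-conorm: a + b - a * b."""
    sym = {}
    for (i, j), a in directed.items():
        b = directed.get((j, i), 0.0)
        w = a + b - a * b
        sym[(i, j)] = w
        sym[(j, i)] = w
    return sym


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
k = 3
neighbours = knn(points, k)
directed = membership_strengths(neighbours, k)
weights = fuzzy_union(directed)
```

Subtracting rho guarantees that every point is connected to its nearest neighbour with weight 1, so no point is isolated in the graph regardless of how sparse its region is.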

Step 3: Construct the Low-Dimensional Fuzzy Graph

Step 4: Optimise by Matching Graphs
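
As a worked form of Steps 3 and 4, and assuming the formulation given in the UMAP paper, the low-dimensional membership strength and the objective can be written as below. Here w_ij denotes the symmetrised high-dimensional weight, y_i the embedded coordinates, and a, b are curve parameters fitted from the `min_dist` setting; the symbols are chosen for this sketch.

```latex
% Low-dimensional similarity between embedded points y_i and y_j
v_{ij} = \left(1 + a \,\lVert y_i - y_j \rVert^{2b}\right)^{-1}

% Fuzzy-set cross-entropy between the high- and low-dimensional graphs
C = \sum_{i \neq j} \left[
      w_{ij} \log \frac{w_{ij}}{v_{ij}}
      + (1 - w_{ij}) \log \frac{1 - w_{ij}}{1 - v_{ij}}
    \right]
```

The first term pulls connected points together, and the second pushes unconnected points apart. In practice the repulsive term is usually approximated by negative sampling rather than summed over all pairs.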

Intuition

Key Hyperparameters

Practical Notes

Runtime on Large Datasets

Global Structure Preservation

Supervised and Semi-Supervised UMAP

Fit/Transform Workflow

Axis Interpretability


PCA vs t-SNE vs UMAP

Quick Comparison

| Method | Type | What It Preserves Best | Speed | Typical Use |
|--------|------|------------------------|-------|-------------|
| PCA | Linear projection | Global variance structure | Fastest | Compression, preprocessing, baseline visualisation |
| t-SNE | Nonlinear embedding | Local neighbourhoods | Slowest | Visual cluster inspection |
| UMAP | Nonlinear manifold/graph embedding | Local structure + some global geometry | Medium | Visualisation and sometimes downstream features |

Key Takeaways
