IOAI ML Notes | Neural Network | Deep Learning

Pooling, Batch Norm, and Layer Norm

Fundamentals of pooling, batch normalisation, and layer normalisation.

Syllabus Map


Overview


Pooling

Core idea

Common types

How it works (2D)

H_{out} = \left\lfloor \frac{H + 2P - K}{S} \right\rfloor + 1,\quad W_{out} = \left\lfloor \frac{W + 2P - K}{S} \right\rfloor + 1
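
The output-size formula can be checked with a few lines of plain Python. This is a minimal sketch: the helper name `pool_output_size` is hypothetical, and it assumes a square kernel, equal stride and padding on both axes, no dilation, and floor rounding, as in the formula above.

```python
def pool_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Return floor((size + 2*padding - kernel) / stride) + 1 for one spatial axis."""
    # Integer floor division implements the floor brackets in the formula.
    return (size + 2 * padding - kernel) // stride + 1


# 32x32 input with a 2x2 window and stride 2: each side is halved.
h_out = pool_output_size(32, kernel=2, stride=2)

# Odd input size: the last row/column that does not fill a window is dropped.
odd_out = pool_output_size(7, kernel=2, stride=2)

# Padding of 1 with a 3x3 window and stride 2.
padded_out = pool_output_size(7, kernel=3, stride=2, padding=1)

print(h_out, odd_out, padded_out)
```

The same function applies to height and width independently, so a non-square input only needs two calls.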

Gradient flow
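
As an illustration of how gradients pass through pooling, the sketch below backpropagates through a single 2x2 window. It assumes a toy 1x1x2x2 input chosen for this example; max pooling should route the whole gradient to the position of the maximum, while average pooling should spread it evenly over the window.

```python
import torch
import torch.nn.functional as F

# One sample, one channel, one 2x2 window; the maximum sits at the bottom-right.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]], requires_grad=True)

# Max pooling: only the arg-max element receives gradient.
F.max_pool2d(x, kernel_size=2).sum().backward()
max_grad = x.grad.clone()

# Reset the accumulated gradient before the second backward pass.
x.grad.zero_()

# Average pooling: every element of the window receives 1 / (K*K).
F.avg_pool2d(x, kernel_size=2).sum().backward()
avg_grad = x.grad.clone()

print(max_grad)
print(avg_grad)
```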

Design knobs

Practical Notes

Use pooling cautiously for localization-heavy tasks

Consider strided convolutions as learnable alternatives
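
To make the comparison concrete, here is a small sketch showing that a stride-2 convolution can produce the same output shape as a 2x2 max pool. The channel count (16) and input size (32x32) are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 32, 32)

# Fixed downsampling: no parameters to learn.
pool = nn.MaxPool2d(kernel_size=2, stride=2)

# Learnable downsampling: a 3x3 convolution with stride 2 and padding 1.
strided_conv = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)

pooled = pool(x)
strided = strided_conv(x)

pool_params = sum(p.numel() for p in pool.parameters())
conv_params = sum(p.numel() for p in strided_conv.parameters())

print(pooled.shape, strided.shape)
print(pool_params, conv_params)
```

The trade-off is that the convolution adds weights and compute, whereas pooling adds neither.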

Prefer global average pooling before classifiers
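
One way to see why global average pooling suits classifier heads is that it makes the head independent of the spatial size of the feature map. The sketch below assumes a hypothetical 64-channel backbone output and 10 classes.

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.AdaptiveAvgPool2d((1, 1)),  # (N, C, H, W) -> (N, C, 1, 1)
    nn.Flatten(),                  # (N, C, 1, 1) -> (N, C)
    nn.Linear(64, 10),             # (N, C) -> (N, num_classes)
)

# The same head accepts feature maps of different spatial sizes.
out_small = head(torch.randn(2, 64, 7, 7))
out_large = head(torch.randn(2, 64, 13, 13))

print(out_small.shape, out_large.shape)
```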

Avoid aggressive early downsampling for small objects

PyTorch examples

import torch.nn as nn

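# 2x2 windows with stride 2 halve H and W; the channel count is unchanged
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)
# Adaptive pooling fixes the output size instead: (N, C, H, W) -> (N, C, 1, 1)
global_avg = nn.AdaptiveAvgPool2d((1, 1))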

Batch Normalisation

Core idea

How it works
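
The training-mode computation can be reproduced by hand and compared against `nn.BatchNorm1d`. This is a minimal sketch on a random (8, 4) batch chosen for illustration: each feature is normalised with the mean and biased variance taken across the batch dimension, then scaled and shifted by the learnable `weight` (gamma) and `bias` (beta).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)                 # (batch, features)
bn = nn.BatchNorm1d(num_features=4)   # modules start in training mode

# Per-feature statistics over the batch dimension (dim=0).
mean = x.mean(dim=0, keepdim=True)
var = x.var(dim=0, unbiased=False, keepdim=True)  # biased variance is used to normalise

# Normalise, then apply the learnable scale and shift.
manual = (x - mean) / torch.sqrt(var + bn.eps) * bn.weight + bn.bias

matches = torch.allclose(bn(x), manual, atol=1e-5)
print(matches)
```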

Practical Notes

Handle train/eval mode correctly
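
The difference between the two modes shows up directly in the outputs. In this sketch (random data with an arbitrary shift and scale), training mode normalises with the current batch statistics and updates the running estimates, while eval mode reuses the running estimates, so the same input gives different results.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)
x = torch.randn(8, 4) * 3 + 5   # data far from zero mean / unit variance

bn.train()
y_train = bn(x)                 # uses batch statistics, updates running_mean / running_var

bn.eval()
with torch.no_grad():
    y_eval = bn(x)              # uses the running statistics gathered so far

# With the default momentum of 0.1, a single step moves running_mean
# only 10% of the way from its initial value (0) towards the batch mean.
print(bn.running_mean)
print(torch.allclose(y_train, y_eval))
```

Forgetting `model.eval()` at inference time means predictions depend on whatever else is in the batch; forgetting `model.train()` when resuming training freezes the running statistics.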

Mitigate small-batch instability

Use standard layer ordering

Watch BN momentum and inference behavior

PyTorch examples

import torch.nn as nn

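# Choose the variant by input rank; num_features is always the channel dimension C
bn1 = nn.BatchNorm1d(num_features=128)  # (N, C) or (N, C, L)
bn2 = nn.BatchNorm2d(num_features=64)   # (N, C, H, W)
bn3 = nn.BatchNorm3d(num_features=32)   # (N, C, D, H, W)

# Typical conv block: Conv -> BN -> ReLU
block = nn.Sequential(
    # bias=False: BatchNorm's own shift (beta) makes a conv bias redundant
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)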

Layer Normalisation

Core idea

How it works
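
Layer normalisation can likewise be reproduced by hand. This sketch assumes a (batch, sequence, features) input with arbitrary sizes: statistics are computed over the feature dimension of each position separately, so no information is shared across the batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 16)       # (batch, sequence, features)
ln = nn.LayerNorm(16)           # normalises over the last dimension

# Per-position statistics over the feature dimension (dim=-1).
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)

# Normalise, then apply the learnable per-feature scale and shift.
manual = (x - mean) / torch.sqrt(var + ln.eps) * ln.weight + ln.bias

matches = torch.allclose(ln(x), manual, atol=1e-5)
print(matches)
```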

Practical Notes

Prefer for small-batch or sequence-heavy workloads

Leverage consistent train/eval behavior

Use proven placement patterns

PyTorch examples

import torch.nn as nn

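# Normalises over the last dimension (512 features) of each sample
ln = nn.LayerNorm(normalized_shape=512)

# Linear -> LayerNorm -> ReLU; LayerNorm's learnable bias stands in for the Linear bias
block = nn.Sequential(
    nn.Linear(512, 512, bias=False),
    nn.LayerNorm(512),
    nn.ReLU(inplace=True),
)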

Layer Norm vs Batch Norm

Key differences
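
A practical consequence of the different normalisation axes is batch dependence. In this sketch (random data, sizes chosen for illustration), the same sample is placed in two different batches: the LayerNorm output for that sample should be unchanged, while the training-mode BatchNorm output should shift with the other samples in the batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
sample = torch.randn(1, 8)

# The same first row, surrounded by different companions.
batch_a = torch.cat([sample, torch.randn(3, 8)])
batch_b = torch.cat([sample, torch.randn(3, 8) * 10])

ln = nn.LayerNorm(8)
bn = nn.BatchNorm1d(8)   # training mode: normalises with batch statistics

# LayerNorm uses per-sample statistics, so row 0 does not depend on the batch.
ln_same = torch.allclose(ln(batch_a)[0], ln(batch_b)[0], atol=1e-6)

# BatchNorm uses per-feature statistics across the batch, so row 0 changes.
bn_same = torch.allclose(bn(batch_a)[0], bn(batch_b)[0], atol=1e-6)

print(ln_same, bn_same)
```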

Rule of thumb
