
Machine Translation

Machine translation with encoder-decoder models, training, decoding, and evaluation.



Overview

$$P(y_{1:T'} \mid x_{1:T}) = \prod_{t=1}^{T'} P(y_t \mid y_{<t}, x_{1:T})$$

Encoder-Decoder MT

Core Idea

$$\mathcal{L}_{\text{MT}} = -\sum_{t=1}^{T'} \log P_\theta(y_t \mid y_{<t}, x)$$
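As a rough illustration of this loss, the sketch below sums the negative log-probabilities that a model assigns to each reference token. The per-step distributions in `step_probs` are made-up numbers, and `mt_nll` is a hypothetical helper name, not part of any library; a real system would take these probabilities from the decoder's softmax output under teacher forcing.

```python
import math

def mt_nll(step_probs, target):
    """Negative log-likelihood of the reference tokens.

    step_probs[t] is a dict mapping each candidate token to
    P(y_t | y_<t, x); target[t] is the reference token at step t.
    """
    return -sum(math.log(probs[token]) for probs, token in zip(step_probs, target))

# Made-up distributions for a three-token target (illustration only).
step_probs = [
    {"le": 0.7, "chat": 0.2, "</s>": 0.1},
    {"le": 0.1, "chat": 0.8, "</s>": 0.1},
    {"le": 0.05, "chat": 0.05, "</s>": 0.9},
]
target = ["le", "chat", "</s>"]

loss = mt_nll(step_probs, target)
# exp(-loss) recovers the product of the per-step probabilities,
# i.e. the sequence probability from the chain-rule factorization.
sequence_prob = math.exp(-loss)
```

Minimizing this sum is the same as maximizing the sequence probability, since the logarithm turns the product over time steps into a sum.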

Practical Notes

Requires parallel data

Uses shared subword vocabularies

Sentence alignment quality matters


Linguistic Divergences

Core Idea

Practical Notes

Word-order typology

Lexical divergence and WSD

Morphology and reference


Evaluation Metrics

Adequacy and Fluency

Automatic Metrics
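Many automatic MT metrics are built on n-gram overlap between a hypothesis and a reference. The sketch below shows one common building block, clipped n-gram precision, under the assumption of whitespace tokenization and a single reference. It is only one component, not a full metric: it omits brevity penalties, multiple references, and averaging across n-gram orders. The function names are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(hyp, ref, n):
    """Fraction of hypothesis n-grams that also occur in the reference,
    counting each n-gram at most as often as it appears in the reference."""
    hyp_counts = Counter(ngrams(hyp, n))
    ref_counts = Counter(ngrams(ref, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total

hyp = "the the the cat".split()
ref = "the cat is here".split()

p1 = clipped_precision(hyp, ref, 1)  # "the" is clipped to one match
p2 = clipped_precision(hyp, ref, 2)  # only ("the", "cat") matches
```

Clipping stops a hypothesis from scoring well simply by repeating a frequent reference word.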


Decoding Strategies (MT)

Greedy Decoding
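Greedy decoding picks the single most probable token at each step and feeds it back as context for the next step. The sketch below assumes a hypothetical `next_token_probs(source, prefix)` callable that returns a token-to-probability dict; `toy_model` is a stand-in used only to make the example runnable, not a real translation model.

```python
def greedy_decode(next_token_probs, source, max_len=20, eos="</s>"):
    """Repeatedly append the argmax token until EOS or max_len."""
    output = []
    for _ in range(max_len):
        probs = next_token_probs(source, output)
        token = max(probs, key=probs.get)  # most probable next token
        if token == eos:
            break
        output.append(token)
    return output

def toy_model(source, prefix):
    """Hypothetical stand-in: 'translates' by upper-casing each source token."""
    i = len(prefix)
    if i < len(source):
        return {source[i].upper(): 0.9, "</s>": 0.1}
    return {"</s>": 1.0}

result = greedy_decode(toy_model, ["a", "b"])
```

Because each choice is locally optimal and never revisited, greedy decoding is fast but can miss a higher-probability full sequence.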

Minimum Bayes Risk (MBR)
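MBR decoding selects, from a set of candidate translations, the one with the highest expected utility when the other candidates are treated as pseudo-references. The sketch below assumes the candidates are already sampled and equally weighted, and uses a simple unigram F1 as the utility. Both choices are illustrative assumptions; practical systems typically use a stronger MT metric as the utility.

```python
from collections import Counter

def unigram_f1(hyp, ref):
    """Toy utility: F1 over whitespace-token overlap."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def mbr_select(candidates, utility=unigram_f1):
    """Return the candidate with the highest average utility
    against all other candidates (uniform weights assumed)."""
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]

candidates = ["the cat sat down", "the cat sat", "the cat slept"]
choice = mbr_select(candidates)
```

Here the selected candidate is the one that agrees most with the rest of the pool, which need not be the one the model scored as most probable.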


Training Pipeline (Practical)

Step 1: Build and align bitext

Step 2: Tokenize and batch

Step 3: Train encoder-decoder

Step 4: Expand data coverage


Practical Notes

Data quality dominates

Low-resource gains come from augmentation

Evaluate with multiple lenses

CAT workflows remain important
