IOAI ML Notes Natural Language Processing

Pre-trained Text Encoders

Transformer-based text encoders and their use cases.

Syllabus Map


Overview


Core Objectives

Masked Language Modeling (MLM)

$$\mathcal{L}_{\text{MLM}} = -\sum_{i \in \mathcal{M}} \log p_\theta\left(x_i \mid x_{\backslash \mathcal{M}}\right)$$
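As a rough illustration of the MLM objective, the sketch below sums the negative log-probability of the original token over the masked positions only. It assumes the model's per-position probabilities are already available as plain lists; the names `mlm_loss`, `probs`, `targets`, and `masked_positions` are illustrative and not taken from any library.

```python
import math

def mlm_loss(probs, targets, masked_positions):
    """Negative log-likelihood summed over the masked positions only.

    probs[i][v]      : model probability of vocabulary id v at position i
    targets[i]       : original (uncorrupted) token id at position i
    masked_positions : the index set M of positions that were masked
    """
    return -sum(math.log(probs[i][targets[i]]) for i in masked_positions)

# Toy input: 3 positions, vocabulary of 4 ids; positions 0 and 2 were masked.
probs = [
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.05, 0.05, 0.80, 0.10],
]
targets = [0, 3, 2]
loss = mlm_loss(probs, targets, masked_positions=[0, 2])
print(round(loss, 4))
```

Position 1 does not contribute to the loss because it was not masked, which is the main practical difference from a left-to-right language modeling loss that scores every position.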

Contrastive Sentence Objectives
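The notes do not say which contrastive loss is meant, so the following is only one plausible reading: an InfoNCE-style loss with in-batch negatives, where each anchor sentence should score highest against its own paired sentence. The function names and the `temperature` value are assumptions made for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.05):
    """Mean cross-entropy where positives[i] is the match for anchors[i]
    and every other positives[j] in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy sentence embeddings: pair 0 points along x, pair 1 along y.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
aligned_loss = info_nce(anchors, positives)
shuffled_loss = info_nce(anchors, positives[::-1])
print(aligned_loss < shuffled_loss)
```

The loss is small when paired sentences are close and unpaired ones are far apart, and it grows when the pairing is broken, as the shuffled case shows.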


Example Models


Common Uses

Feature Extraction
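Feature extraction can be sketched as follows: the encoder stays frozen, embeddings are computed once, and only a small classifier is trained on top. To keep the example dependency-free, `frozen_encode` is a hypothetical stand-in that counts a few sentiment words; a real setup would return the pre-trained encoder's pooled output instead.

```python
import math

def frozen_encode(text):
    """Hypothetical stand-in for a frozen pre-trained encoder.
    It is never updated during training."""
    words = text.lower().split()
    positive = sum(w in {"good", "great", "love"} for w in words)
    negative = sum(w in {"bad", "awful", "hate"} for w in words)
    return [float(positive), float(negative)]

def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Logistic regression trained with plain SGD on fixed features."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # derivative of the log loss with respect to z
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    return int(sum(w * xi for w, xi in zip(weights, x)) + bias > 0)

texts = ["great movie love it", "awful plot bad acting", "good fun", "hate it"]
labels = [1, 0, 1, 0]
features = [frozen_encode(t) for t in texts]  # computed once, then reused
weights, bias = train_linear_probe(features, labels)
print([predict(weights, bias, f) for f in features])
```

Because the features are cached, each training epoch only touches the small classifier, which is why this route is much cheaper than updating the encoder itself.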

Fine-Tuning


Pooling Strategies
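The notes do not list specific pooling strategies, so the sketch below shows three that are commonly used to turn per-token vectors into one sentence vector: first-token ([CLS]) pooling, masked mean pooling, and masked max pooling. Inputs are plain lists, and the function names are illustrative only.

```python
def cls_pool(hidden, mask):
    """Use the first token's vector (the [CLS] position in BERT-style encoders)."""
    return list(hidden[0])

def mean_pool(hidden, mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(hidden[0])
    count = sum(mask)
    return [sum(h[d] for h, m in zip(hidden, mask) if m) / count for d in range(dim)]

def max_pool(hidden, mask):
    """Take the per-dimension maximum over non-padding positions."""
    dim = len(hidden[0])
    return [max(h[d] for h, m in zip(hidden, mask) if m) for d in range(dim)]

# Three token vectors of size 2; the last row is padding and must be ignored.
hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

print(cls_pool(hidden, mask))
print(mean_pool(hidden, mask))
print(max_pool(hidden, mask))
```

Applying the attention mask matters: without it, the padding row would pull the mean and the max toward values that carry no information about the sentence.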


Step-by-Step Usage

Step 1: Choose encoder size

Step 2: Tokenize and truncate

Step 3: Start with linear probe

Step 4: Fine-tune if needed

Step 5: Evaluate robustness
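Step 2 above can be sketched in a few lines. This is a simplified illustration, assuming a whitespace tokenizer and a tiny hand-written `vocab`; real encoders use subword tokenizers, but the truncation and padding logic has the same shape. All names here are hypothetical.

```python
CLS, SEP, PAD, UNK = "[CLS]", "[SEP]", "[PAD]", "[UNK]"

def tokenize_and_truncate(text, vocab, max_length):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    tokens = text.lower().split()         # stand-in for a subword tokenizer
    tokens = tokens[: max_length - 2]     # reserve room for [CLS] and [SEP]
    tokens = [CLS] + tokens + [SEP]
    attention_mask = [1] * len(tokens)
    padding = max_length - len(tokens)
    tokens += [PAD] * padding
    attention_mask += [0] * padding       # padding is masked out
    input_ids = [vocab.get(t, vocab[UNK]) for t in tokens]
    return input_ids, attention_mask

vocab = {PAD: 0, UNK: 1, CLS: 2, SEP: 3,
         "encoders": 4, "map": 5, "text": 6, "to": 7, "vectors": 8}

# Too long for max_length=6: the final word is dropped, [SEP] is kept.
long_ids, long_mask = tokenize_and_truncate("Encoders map text to vectors", vocab, 6)
# Short input: padded up to max_length.
short_ids, short_mask = tokenize_and_truncate("text vectors", vocab, 6)
print(long_ids, long_mask)
print(short_ids, short_mask)
```

Truncating before adding the special tokens keeps [SEP] at the end of every sequence, which a naive slice of the final id list would cut off.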


Practical Notes

Use domain-adaptive pretraining when domain mismatch is large

Normalize embeddings for retrieval with cosine similarity

Reduce serving cost with quantization and distillation
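The note on normalizing embeddings for cosine-similarity retrieval can be illustrated with a small sketch. Once every vector has unit length, a plain dot product equals cosine similarity, so vector length no longer influences the ranking. The toy vectors and the `search` helper are made up for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def search(query, index, top_k=2):
    """Rank pre-normalized document vectors by cosine similarity to the query."""
    q = l2_normalize(query)
    scores = [(dot(q, d), i) for i, d in enumerate(index)]
    return sorted(scores, reverse=True)[:top_k]

docs = [[10.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
index = [l2_normalize(d) for d in docs]   # normalize once, at indexing time
query = [2.0, 1.0]

results = search(query, index)
raw_best = max(range(len(docs)), key=lambda i: dot(query, docs[i]))
print([i for _, i in results], raw_best)
```

With raw dot products the long vector `docs[0]` wins purely because of its magnitude, whereas after normalization `docs[1]`, which points in a direction closer to the query, ranks first.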
