RNNs, LSTMs and GRUs

An overview of recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and gated recurrent units (GRUs).

Updated 4 February 2026

Syllabus Map


Outline

RNNs

Core idea (high level)

Notation

How it works (specific)

$$f_\theta: (x_t, a_{t-1}) \mapsto (y_t, a_t)$$

1. Set an initial hidden state $a_0$

2. Calculate the hidden state $a_t$

$$a_t = \phi(W_a a_{t-1} + W_x x_t + b_a)$$

3. Calculate the output for this timestep $y_t$

$$y_t = W_y a_t + b_y$$

4. Pass the current hidden state $a_t$ to the next timestep $t+1$
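
The four steps above can be sketched as a short NumPy loop. This is a minimal illustration rather than a reference implementation: the function name `rnn_forward`, the choice of $\tanh$ for $\phi$, the zero initial state, and the small random weights are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(xs, a0, W_a, W_x, b_a, W_y, b_y, phi=np.tanh):
    """Run a vanilla RNN over xs of shape (T, input_dim).

    Returns the outputs y_1..y_T and the hidden states a_1..a_T.
    phi defaults to tanh (an assumption; the notes leave phi generic).
    """
    a = a0                                   # step 1: initial hidden state a_0
    outputs, hidden_states = [], []
    for x_t in xs:
        a = phi(W_a @ a + W_x @ x_t + b_a)   # step 2: new hidden state a_t
        y_t = W_y @ a + b_y                  # step 3: output y_t
        outputs.append(y_t)
        hidden_states.append(a)              # step 4: a is carried into t+1
    return np.stack(outputs), np.stack(hidden_states)

# Illustrative sizes and random parameters (hypothetical values).
rng = np.random.default_rng(0)
T, input_dim, hidden_dim, output_dim = 5, 3, 4, 2
xs = rng.normal(size=(T, input_dim))
a0 = np.zeros(hidden_dim)
W_a = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
b_a = np.zeros(hidden_dim)
W_y = rng.normal(scale=0.1, size=(output_dim, hidden_dim))
b_y = np.zeros(output_dim)

ys, hs = rnn_forward(xs, a0, W_a, W_x, b_a, W_y, b_y)
print(ys.shape, hs.shape)  # (5, 2) (5, 4)
```

The same weights are reused at every timestep; only the hidden state `a` changes as the loop moves through the sequence.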

Practical usage

LSTMs

Core idea (high level)

Notation (gates and memory)

How it works (specific)

1. Set the initial states $h_0$ and $C_0$

2. Compute the forget gate $f_t$

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$

3. Compute the input gate $i_t$ and candidate memory $\tilde{C}_t$

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$

$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$$

4. Update the cell state $C_t$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

5. Compute the output gate $o_t$

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$

6. Compute the hidden state $h_t$

$$h_t = o_t \odot \tanh(C_t)$$
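
A single LSTM timestep following the steps above might look like the sketch below. The helper names (`sigmoid`, `lstm_step`, `init_lstm_params`), the dictionary layout of the parameters, and the random initialisation are assumptions for illustration; $[h_{t-1}, x_t]$ is taken to mean vector concatenation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM timestep; returns (h_t, C_t)."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])      # forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])      # input gate
    C_tilde = np.tanh(params["W_C"] @ concat + params["b_C"])  # candidate memory
    C_t = f_t * C_prev + i_t * C_tilde                         # cell state update
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])      # output gate
    h_t = o_t * np.tanh(C_t)                                   # hidden state
    return h_t, C_t

def init_lstm_params(input_dim, hidden_dim, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(
            scale=0.1, size=(hidden_dim, hidden_dim + input_dim))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 5
params = init_lstm_params(input_dim, hidden_dim, rng)

h, C = np.zeros(hidden_dim), np.zeros(hidden_dim)  # step 1: h_0 and C_0
for x_t in rng.normal(size=(T, input_dim)):
    h, C = lstm_step(x_t, h, C, params)
print(h.shape, C.shape)  # (4,) (4,)
```

Note that the cell state `C` is updated only through elementwise scaling and addition, while the hidden state `h` is a gated, squashed view of it.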

Practical usage

GRUs

Core idea (high level)

Notation (gates and memory)

How it works (specific)

1. Set the initial hidden state $h_0$

2. Compute the update gate $z_t$

$$z_t = \sigma(W_z [h_{t-1}, x_t] + b_z)$$

3. Compute the reset gate $r_t$

$$r_t = \sigma(W_r [h_{t-1}, x_t] + b_r)$$

4. Compute the candidate hidden state $\tilde{h}_t$

$$\tilde{h}_t = \tanh(W_h [r_t \odot h_{t-1}, x_t] + b_h)$$

5. Update the hidden state $h_t$

$$h_t = (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1}$$
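
The GRU steps above can be sketched the same way. As before, this is an illustrative NumPy version under stated assumptions: `gru_step`, the parameter dictionary, and the random weights are hypothetical, and $[\cdot, \cdot]$ is read as concatenation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, params):
    """One GRU timestep; returns h_t."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    z_t = sigmoid(params["W_z"] @ concat + params["b_z"])      # update gate
    r_t = sigmoid(params["W_r"] @ concat + params["b_r"])      # reset gate
    concat_reset = np.concatenate([r_t * h_prev, x_t])         # [r_t * h_{t-1}, x_t]
    h_tilde = np.tanh(params["W_h"] @ concat_reset + params["b_h"])  # candidate
    return (1.0 - z_t) * h_tilde + z_t * h_prev                # blend new and old

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 5
params = {}
for gate in ("z", "r", "h"):
    # Small random weights and zero biases (hypothetical initialisation).
    params[f"W_{gate}"] = rng.normal(
        scale=0.1, size=(hidden_dim, hidden_dim + input_dim))
    params[f"b_{gate}"] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                                       # step 1: h_0
for x_t in rng.normal(size=(T, input_dim)):
    h = gru_step(x_t, h, params)
print(h.shape)  # (4,)
```

With this form of the update, $z_t$ close to 1 keeps the previous hidden state almost unchanged, and $z_t$ close to 0 replaces it with the candidate $\tilde{h}_t$.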

Practical usage
