RNN Explorer
Step-by-step animations of how Recurrent Neural Networks process sequences
An RNN processes a sequence one element at a time. At each time step the cell receives two things: the current input vector $x_t$ (e.g. a word embedding) and the hidden state $h_{t-1}$ from the previous step. It produces a new hidden state $h_t$ and, optionally, an output $y_t$; a vanilla cell computes $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$. The same cell (same weights) is reused at every step; we just "unroll" it to visualize the full sequence. Watch the animation: each sub-phase is highlighted so you can follow exactly what moves where.
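As a minimal sketch of the loop the unrolled diagram depicts (the names `rnn_step` and `unroll` and the toy dimensions are illustrative, not part of the explorer):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One application of the shared cell: combine the current input
    with the previous hidden state through a tanh non-linearity."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def unroll(xs, W_xh, W_hh, b_h):
    """Run the same cell (same weights) over every element,
    threading the hidden state from step to step."""
    h = np.zeros(W_hh.shape[0])          # h_0 starts at zero
    hidden_states = []
    for x_t in xs:                       # one iteration per time step
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        hidden_states.append(h)
    return hidden_states

# Toy sizes: 4-dim inputs, 8-dim hidden state, sequence of length 5.
rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 4, 8, 5
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
xs = rng.normal(size=(seq_len, input_dim))

hs = unroll(xs, W_xh, W_hh, b_h)
print(len(hs), hs[-1].shape)  # 5 hidden states, each of shape (8,)
```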
The same RNN cell can be wired differently depending on the task. Select a variant below to see how inputs and outputs are connected (a code sketch of each wiring follows the list):
- Many-to-One: full sequence in, single output. The final hidden state feeds a fully-connected network with non-linear activations (e.g. ReLU) that acts as the classifier, for tasks such as sentiment classification.
- Many-to-Many (same length): one output per time step (e.g. POS tagging).
- Many-to-Many (Encoder-Decoder): encode the input sequence, then decode into a new sequence (e.g. translation).
- One-to-Many: single input, generate a sequence (e.g. image captioning).
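A rough sketch of these four wirings, continuing the NumPy example above (all function names and the output projection $W_{hy}$ are illustrative; real decoders typically feed each output back in as the next input, which is omitted here for brevity):

```python
import numpy as np

def step(x_t, h, W_xh, W_hh, b_h):
    """The shared vanilla cell: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    return np.tanh(W_xh @ x_t + W_hh @ h + b_h)

def many_to_one(xs, W_xh, W_hh, b_h, W_hy):
    h = np.zeros(W_hh.shape[0])
    for x_t in xs:                        # consume the whole sequence...
        h = step(x_t, h, W_xh, W_hh, b_h)
    return W_hy @ h                       # ...then one output from the final state

def many_to_many(xs, W_xh, W_hh, b_h, W_hy):
    h, ys = np.zeros(W_hh.shape[0]), []
    for x_t in xs:
        h = step(x_t, h, W_xh, W_hh, b_h)
        ys.append(W_hy @ h)               # one output per time step (tagging)
    return ys

def one_to_many(x, W_xh, W_hh, b_h, W_hy, n_steps):
    h = step(x, np.zeros(W_hh.shape[0]), W_xh, W_hh, b_h)  # single input seeds the state
    ys = [W_hy @ h]
    for _ in range(n_steps - 1):
        h = np.tanh(W_hh @ h + b_h)       # no new input; the state drives generation
        ys.append(W_hy @ h)
    return ys

def encoder_decoder(xs, W_xh, W_hh, b_h, W_hh_dec, b_dec, W_hy, n_steps):
    h = np.zeros(W_hh.shape[0])
    for x_t in xs:                        # encoder: compress the input sequence
        h = step(x_t, h, W_xh, W_hh, b_h)
    ys = []
    for _ in range(n_steps):              # decoder: its own weights, new sequence
        h = np.tanh(W_hh_dec @ h + b_dec)
        ys.append(W_hy @ h)
    return ys
```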
During training, the gradient flows backwards through the unrolled cells. At each step the gradient is multiplied by the recurrent weight matrix $W_{hh}$ (more precisely, by the Jacobian $\partial h_t / \partial h_{t-1} = \mathrm{diag}(\tanh'(\cdot))\, W_{hh}$). If the largest singular value of $W_{hh}$ stays below 1, the gradient shrinks at every step and eventually vanishes; if it exceeds 1, the gradient can grow and explode. Watch the red bar at each cell: it shows the relative gradient magnitude arriving at that step.
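To see the geometric effect the red bars visualize in isolation, here is a minimal NumPy sketch (the name `gradient_norms` and the toy matrix are illustrative, and the $\tanh'$ factor of the true Jacobian, which is at most 1 and only shrinks gradients further, is dropped):

```python
import numpy as np

def gradient_norms(W_hh, steps=20):
    """Track the gradient's norm as it is backpropagated step by step."""
    g = np.ones(W_hh.shape[0])            # gradient arriving at the last cell
    norms = []
    for _ in range(steps):
        g = W_hh.T @ g                    # one backprop step through the recurrence
        norms.append(np.linalg.norm(g))
    return norms

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
A = (A + A.T) / 2                         # symmetric: spectral norm == spectral radius
A /= np.linalg.norm(A, ord=2)             # rescale largest singular value to 1

for scale in (0.9, 1.1):                  # spectral norm below vs. above 1
    norms = gradient_norms(scale * A)
    print(f"spectral norm {scale}: step 1 = {norms[0]:.3f}, step 20 = {norms[-1]:.3e}")
```

Running it shows the norm decaying roughly like $0.9^t$ in the first case and growing roughly like $1.1^t$ in the second, which is exactly the vanishing/exploding behavior described above.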