
Multi-Layer Perceptrons

ELEC320 Lectures

Backpropagation

To update the weights of a multi-layer perceptron, we need to complete the following steps:

  1. Feed Forward
  2. Backwards Pass

The general formula to update the weights on a multi-layer perceptron is:

\[\mathbf w_{(n+1)}=\mathbf w_n-\eta\nabla_\mathbf wE\]

where:

  • $\nabla_\mathbf wE$ is the gradient of the error $E$ with respect to the weights $\mathbf w$.
  • $\eta$ is the learning rate.
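
A minimal sketch of this update rule in NumPy (the array values and the name `gradient_step` are illustrative, not from the lecture):

```python
import numpy as np

def gradient_step(w: np.ndarray, grad_E: np.ndarray, eta: float = 0.1) -> np.ndarray:
    """One gradient-descent update: w_(n+1) = w_n - eta * grad_E."""
    return w - eta * grad_E

# Illustrative values only.
w = np.array([0.5, -0.2, 0.1])
grad_E = np.array([0.3, -0.1, 0.05])
print(gradient_step(w, grad_E))  # [ 0.47  -0.19   0.095]
```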

We also have the following functions available:

  • Output error for the $j^{\text{th}}$ neuron:

    \[e_j(n)=d_j(n)-y_j(n)\]
  • Output network error:

    \[E(n)=\frac12\sum_{j\in\text{output}}e^2_j(n)\]
  • Average network error:

    \[E_\text{avg}=\frac1N\sum^N_{n=1}E(n)\]
  • Induced local field (ILF), the weighted sum of the inputs to neuron $j$:

    \[v_j(n)=\sum_i w_{ji}(n)y_i(n)\]
  • Neuron output:

    \[y_j(n)=\phi(v_j(n))\]
  • Local gradient (for one node):

    \[\delta_j(n)=e_j(n)\times\underbrace{\phi'(v_j(n))}_\text{activation function gradient}\]
  • Error gradient with respect to the weight $w_{ji}$:

    \[\frac{\partial E(n)}{\partial w_{ji}(n)}=-\delta_j(n)y_i(n)\]
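
The sketch below ties these definitions together for a single output neuron, assuming a sigmoid activation for $\phi$ (the notes do not fix a particular activation) and made-up inputs, weights, and desired response:

```python
import numpy as np

def phi(v):
    """Sigmoid activation (an assumed choice for phi)."""
    return 1.0 / (1.0 + np.exp(-v))

def phi_prime(v):
    """Derivative of the sigmoid."""
    s = phi(v)
    return s * (1.0 - s)

y_i = np.array([0.2, -0.5, 0.8])   # outputs y_i(n) of the previous layer
w_ji = np.array([0.4, 0.1, -0.3])  # weights w_ji(n) into neuron j
d_j = 1.0                          # desired response d_j(n)

v_j = w_ji @ y_i                   # induced local field v_j(n)
y_j = phi(v_j)                     # neuron output y_j(n) = phi(v_j(n))
e_j = d_j - y_j                    # output error e_j(n)
E = 0.5 * e_j**2                   # this neuron's contribution to E(n)
delta_j = e_j * phi_prime(v_j)     # local gradient delta_j(n)
grad_w_ji = -delta_j * y_i         # dE(n)/dw_ji(n) = -delta_j(n) y_i(n)

print(delta_j, grad_w_ji)
```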

Backwards Pass

Consider the following key equations:

A (output neuron):

\[\delta_j(n)=e_j(n)\phi'(v_j(n))\]

B (hidden neuron):

\[\delta_j(n)=\phi'_j(v_j(n))\sum_k\delta_k(n)w_{kj}(n)\]

where the sum runs over the neurons $k$ in the following layer, to which neuron $j$ connects via the weights $w_{kj}$.

1 (weight difference):

\[\begin{aligned} \Delta w_{ji}(n)&=-\eta\frac{\partial E(n)}{\partial w_{ji}(n)}\\ &=\eta\delta_j(n)y_i(n) \end{aligned}\]

and follow the steps:

  1. Calculate the local gradients by using A and B for the output and hidden layers.
  2. Use equation 1 to update the weights $w_{ji}$ connecting the $i^\text{th}$ and $j^\text{th}$ neurons.
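
A minimal end-to-end sketch of one such step for a network with a single hidden layer, again assuming sigmoid activations, omitting bias terms, and using illustrative shapes and variable names:

```python
import numpy as np

def phi(v):
    return 1.0 / (1.0 + np.exp(-v))

def phi_prime(v):
    s = phi(v)
    return s * (1.0 - s)

def backprop_step(x, d, W_hidden, W_output, eta=0.5):
    """One feed-forward and backwards pass.

    W_hidden[j, i] connects input i to hidden neuron j;
    W_output[k, j] connects hidden neuron j to output neuron k.
    """
    # Feed forward: induced local fields and outputs, layer by layer.
    v_h = W_hidden @ x
    y_h = phi(v_h)
    v_o = W_output @ y_h
    y_o = phi(v_o)

    # Backwards pass.
    e = d - y_o                                        # e_k(n)
    delta_o = e * phi_prime(v_o)                       # equation A (output neurons)
    delta_h = phi_prime(v_h) * (W_output.T @ delta_o)  # equation B (hidden neurons)

    # Equation 1: Delta w_ji(n) = eta * delta_j(n) * y_i(n).
    W_output = W_output + eta * np.outer(delta_o, y_h)
    W_hidden = W_hidden + eta * np.outer(delta_h, x)
    return W_hidden, W_output

# Tiny example: 2 inputs, 2 hidden neurons, 1 output neuron.
rng = np.random.default_rng(0)
W_h = rng.normal(size=(2, 2))
W_o = rng.normal(size=(1, 2))
W_h, W_o = backprop_step(np.array([1.0, 0.0]), np.array([1.0]), W_h, W_o)
```

Note how the hidden-layer gradients reuse the output-layer $\delta_k(n)$ values through the weights $w_{kj}(n)$, which is why the pass must run backwards from the output layer.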