
Multi-Layer Perceptrons

ELEC320 Lectures

Backpropagation

To update the weights of a multi-layer perceptron, we need to complete the following steps:

  1. Feed Forward
  2. Backwards Pass

The general formula to update the weights on a multi-layer perceptron is:

\[\mathbf w_{(n+1)}=\mathbf w_n-\eta\nabla_\mathbf wE\]

where:

  • $\nabla_\mathbf wE$ is the gradient of the error $E$ with respect to the weights $\mathbf w$.
  • $\eta$ is the learning rate.
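
A minimal sketch of this update rule in NumPy (the array values and the name `gradient_step` are illustrative, not from the lecture):

```python
import numpy as np

def gradient_step(w: np.ndarray, grad_E: np.ndarray, eta: float = 0.1) -> np.ndarray:
    """One gradient-descent update: w_(n+1) = w_n - eta * grad_E."""
    return w - eta * grad_E

# Illustrative values only.
w = np.array([0.5, -0.2, 0.1])
grad_E = np.array([0.3, -0.1, 0.05])
print(gradient_step(w, grad_E))  # [ 0.47  -0.19   0.095]
```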

We also have the following functions available:

  • Output error for the $j^{\text{th}}$ neuron:

    \[e_j(n)=d_j(n)-y_j(n)\]
  • Output network error:

    \[E(n)=\frac12\sum_{j\in\text{output}}e^2_j(n)\]
  • Average network error:

    \[E_\text{avg}=\frac1N\sum^N_{n=1}E(n)\]
  • Induced local field (ILF), the weighted sum of the inputs to neuron $j$:

    \[v_j(n)=\sum_i w_{ji}(n)y_i(n)\]
  • Neuron output:

    \[y_j(n)=\phi(v_j(n))\]
  • Local gradient (for one node):

    \[\delta_j(n)=e_j(n)\times\underbrace{\phi'(v_j(n))}_\text{activation function gradient}\]
  • Error gradient with respect to the weight $w_{ji}$:

    \[\frac{\partial E(n)}{\partial w_{ji}(n)}=-\delta_j(n)y_i(n)\]
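
The sketch below ties these definitions together for a single output neuron, assuming a sigmoid activation for $\phi$ (the notes do not fix a particular activation) and made-up inputs, weights, and desired response:

```python
import numpy as np

def phi(v):
    """Sigmoid activation (an assumed choice for phi)."""
    return 1.0 / (1.0 + np.exp(-v))

def phi_prime(v):
    """Derivative of the sigmoid."""
    s = phi(v)
    return s * (1.0 - s)

y_i = np.array([0.2, -0.5, 0.8])   # outputs y_i(n) of the previous layer
w_ji = np.array([0.4, 0.1, -0.3])  # weights w_ji(n) into neuron j
d_j = 1.0                          # desired response d_j(n)

v_j = w_ji @ y_i                   # induced local field v_j(n)
y_j = phi(v_j)                     # neuron output y_j(n) = phi(v_j(n))
e_j = d_j - y_j                    # output error e_j(n)
E = 0.5 * e_j**2                   # this neuron's contribution to E(n)
delta_j = e_j * phi_prime(v_j)     # local gradient delta_j(n)
grad_w_ji = -delta_j * y_i         # dE(n)/dw_ji(n) = -delta_j(n) y_i(n)

print(delta_j, grad_w_ji)
```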

Backwards Pass

Consider the following key equations:

A (output neuron):

\[\delta_j(n)=e_j(n)\phi'(v_j(n))\]

B (hidden neuron):

\[\delta_j(n)=\phi'_j(v_j(n))\sum_k\delta_k(n)w_{kj}(n)\]

where the sum runs over the neurons $k$ in the following layer, to which neuron $j$ connects via the weights $w_{kj}$.

1 (weight difference):

\[\begin{aligned} \Delta w_{ji}(n)&=-\eta\frac{\partial E(n)}{\partial w_{ji}(n)}\\ &=\eta\delta_j(n)y_i(n) \end{aligned}\]

and follow the steps:

  1. Calculate the local gradients by using A and B for the output and hidden layers.
  2. Use equation 1 to update the weights $w_{ji}$ connecting the $i^\text{th}$ and $j^\text{th}$ neurons.
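
A minimal end-to-end sketch of one such step for a network with a single hidden layer, again assuming sigmoid activations, omitting bias terms, and using illustrative shapes and variable names:

```python
import numpy as np

def phi(v):
    return 1.0 / (1.0 + np.exp(-v))

def phi_prime(v):
    s = phi(v)
    return s * (1.0 - s)

def backprop_step(x, d, W_hidden, W_output, eta=0.5):
    """One feed-forward and backwards pass.

    W_hidden[j, i] connects input i to hidden neuron j;
    W_output[k, j] connects hidden neuron j to output neuron k.
    """
    # Feed forward: induced local fields and outputs, layer by layer.
    v_h = W_hidden @ x
    y_h = phi(v_h)
    v_o = W_output @ y_h
    y_o = phi(v_o)

    # Backwards pass.
    e = d - y_o                                        # e_k(n)
    delta_o = e * phi_prime(v_o)                       # equation A (output neurons)
    delta_h = phi_prime(v_h) * (W_output.T @ delta_o)  # equation B (hidden neurons)

    # Equation 1: Delta w_ji(n) = eta * delta_j(n) * y_i(n).
    W_output = W_output + eta * np.outer(delta_o, y_h)
    W_hidden = W_hidden + eta * np.outer(delta_h, x)
    return W_hidden, W_output

# Tiny example: 2 inputs, 2 hidden neurons, 1 output neuron.
rng = np.random.default_rng(0)
W_h = rng.normal(size=(2, 2))
W_o = rng.normal(size=(1, 2))
W_h, W_o = backprop_step(np.array([1.0, 0.0]), np.array([1.0]), W_h, W_o)
```

Note how the hidden-layer gradients reuse the output-layer $\delta_k(n)$ values through the weights $w_{kj}(n)$, which is why the pass must run backwards from the output layer.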