Mathematics of Back Propagation for Deep Learning, Simplified
Back Propagation Algorithm: A Short History

Although without reference to neural networks, the back propagation method was perhaps first described in a 1970 master's thesis by Seppo Linnainmaa. In 1973, Stuart E. Dreyfus published a simpler derivative-based method that minimises a cost function by adapting the parameters of controllers in proportion to error gradients. The first neural-network-specific application of the BPA was proposed by Paul Werbos in 1982. This was later refined by Rumelhart, Hinton and Williams in 1986 and made a radical change in how supervised learning problems are solved. Their research paper experimentally demonstrated that the BPA enables a Multi-Layered Feed-Forward Neural Network (MLFFNN) to learn useful internal representations in its hidden layers, with every synaptic weight easily addressable. This paper contributed significantly to the popularisation of the BPA.

Description of BPA

BP uses ordered derivatives (as termed by Werbos - The Roots of Backpropagation: From Ordered Derivatives...
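To make the idea of ordered derivatives concrete, here is a minimal NumPy sketch of one backward pass through a tiny network. Everything in it (the 2-2-1 network shape, the sigmoid hidden layer, the toy input x, target y, and learning rate lr) is an illustrative assumption, not taken from the sources above; it only shows the general pattern of applying the chain rule in the reverse of the forward computation order.

import numpy as np

# Illustrative sketch (assumed shapes and values, not from the article):
# backpropagation on a tiny 2-2-1 network. "Ordered derivatives" are the
# total derivatives of the loss, propagated backwards through the same
# order in which the forward quantities were computed.

rng = np.random.default_rng(0)

# Toy data: one input vector x and a scalar target y (hypothetical values).
x = np.array([0.5, -1.2])
y = 1.0

# Parameters of the assumed 2-2-1 network with a sigmoid hidden layer.
W1 = rng.normal(size=(2, 2))   # input -> hidden weights
b1 = np.zeros(2)
W2 = rng.normal(size=(1, 2))   # hidden -> output weights
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass, stored in the order the quantities are computed.
z1 = W1 @ x + b1                 # hidden pre-activation
h = sigmoid(z1)                  # hidden activation
z2 = W2 @ h + b2                 # linear output
loss = 0.5 * (z2[0] - y) ** 2    # squared-error cost

# Backward pass: chain rule applied in the reverse of the forward order.
dz2 = np.array([z2[0] - y])      # dL/dz2
dW2 = np.outer(dz2, h)           # dL/dW2
db2 = dz2
dh = W2.T @ dz2                  # dL/dh
dz1 = dh * h * (1.0 - h)         # dL/dz1 (sigmoid derivative)
dW1 = np.outer(dz1, x)           # dL/dW1
db1 = dz1

# One gradient-descent step, in the spirit of Dreyfus's adaptation of
# parameters in proportion to error gradients (lr is an assumed value).
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2

print(f"loss before step: {loss:.4f}")

Running the forward and backward passes repeatedly would drive the loss down; the key point of the sketch is simply that each gradient on the backward pass is built from gradients already computed later in the forward order, which is what makes the derivatives "ordered".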