ECOLE overview
Our model, ECOLE, is a deep neural network that uses a variant of the transformer architecture [20] at its core. The transformer is a parallelizable encoder-decoder model that receives an input and applies alternating multi-headed self-attention, multi-layer perceptron (MLP), and layer-normalization layers to it. The transformer architecture has achieved state-of-the-art results in signal processing, outperforming recurrent neural networks in the natural language processing domain [20] and, more recently, convolution-based models in the computer vision domain [23].
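The alternating structure described above can be sketched as a single encoder block. This is a minimal illustrative NumPy implementation, not ECOLE's actual code: the dimensions, weight initialization, pre-norm residual layout, and ReLU nonlinearity are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean and unit variance.
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo, n_heads):
    # Split the model dimension across heads, attend, then re-merge.
    T, d = x.shape
    dh = d // n_heads
    q = (x @ Wq).reshape(T, n_heads, dh).transpose(1, 0, 2)  # (heads, T, dh)
    k = (x @ Wk).reshape(T, n_heads, dh).transpose(1, 0, 2)
    v = (x @ Wv).reshape(T, n_heads, dh).transpose(1, 0, 2)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))    # (heads, T, T)
    out = (att @ v).transpose(1, 0, 2).reshape(T, d)
    return out @ Wo

def mlp(x, W1, b1, W2, b2):
    # Two-layer perceptron (ReLU stands in for the usual GELU).
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def encoder_block(x, params, n_heads=4):
    # Pre-norm residual layout: x + Attn(LN(x)), then x + MLP(LN(x)).
    x = x + multi_head_self_attention(layer_norm(x), *params["attn"], n_heads)
    x = x + mlp(layer_norm(x), *params["mlp"])
    return x

d, T, hidden = 32, 10, 64  # toy sizes for illustration
params = {
    "attn": [rng.normal(0, 0.02, (d, d)) for _ in range(4)],
    "mlp": [rng.normal(0, 0.02, (d, hidden)), np.zeros(hidden),
            rng.normal(0, 0.02, (hidden, d)), np.zeros(d)],
}
tokens = rng.normal(size=(T, d))
out = encoder_block(tokens, params)
print(out.shape)  # (10, 32): one output vector per input token
```

Stacking several such blocks, plus input embeddings and positional information, gives the encoder side of the architecture the paragraph describes.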
News Source: www.nature.com