Extraction Propagation
Abstract
Running backpropagation end to end on large neural networks is fraught with difficulties like vanishing gradients and degradation. In this paper we present an alternative architecture composed of many small neural networks that interact with one another. Instead of propagating gradients back through the architecture we propagate vector-valued messages computed via forward passes, which are then used to update the parameters. Currently the performance is conjectured as we are yet to implement the architecture. However, we do back it up with some theory. A previous version of this paper was entitled "Fusion encoder networks" and detailed a slightly different architecture.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.