We present InvertibleNetworks.jl, an open-source package for invertible neural networks and normalizing flows using memory-efficient backpropagation. InvertibleNetworks.jl uses manually implement gradients to take advantage of the invertibility of building blocks, which allows for scaling to large-scale problem sizes. We present the architecture and features of the library and demonstrate its application to a variety of problems ranging from loop unrolling to uncertainty quantification.
Invertible neural networks (INNs) are designed around bijective building blocks that allow the evaluation of (deep) INNs in both directions, which means that inputs into the network (and all internal states) can be uniquely re-computed from the output. INNs were popularized in the context of normalizing flows as an alternative approach to generative adversarial networks (GANs) and variational auto-encoders (VAEs), but their property of invertibility is also appealing for discriminative models, as INNs allow memory-efficient backpropagation during training. As hidden states can be recomputed for INNs from the output, it is in principle not required to save the state during forward evaluation, thus leading to a significantly lower memory imprint than conventional neural networks. However, existing backpropagation libraries that are used in TensorFlow or PyTorch do not support the concept of invertibility and therefore require work arounds to benefit from them. For this reason, current frameworks for INNs such as FrEIA or MemCNN use layer-wise AD, in which backpropagation is performed by first re-computing the hidden state of the current layer and then using PyTorch's AD tool (Autograd) to compute the gradients for the respective layer. This approach is computationally not efficient, as it performs an additional forward pass during backpropagation.
With InvertibleNetworks.jl, we present an open-source Julia framework (MIT license) with manually implemented gradients, in which we take advantage of the invertibility of building blocks. For each invertible layer, we provide a backpropagation layer that (re-)computes the hidden state and weight updates all at once, thus not requiring an extra (layer-wise) forward evaluation. In addition to gradients, InvertibleNetworks.jl also provides Jacobians for each layer (i.e. forward differentiation), or more precisely, matrix-free implementations of Jacobian-vector products, as well as log-determinants for normalizing flows. While backpropagation and Jacobians are implemented manually, InvertibleNetworks.jl integrates seamlessly with ChainRules.jl, so users do not need to manually define backward passes for implemented networks. Additionally, InvertibleNetworks.jl is compatible with Flux.jl, so that users can create networks that consist of a mix of invertible and non-invertible Flux layers. In this talk, we present the architecture and features of InvertibleNetworks.jl, which includes implementations of common invertible layers from the literature, and show its application to a range of scenarios including loop-unrolled imaging, uncertainty quantification with normalizing flows and large-scale image segmentation.