As we progress through the layers of the network, the analysis and synthesis dictionaries look increasingly Gabor-like and more closely resemble the final dictionary. Interestingly, the very first few layers of the network also show Gabor-like structures, in contrast to the more "noisy" filters in the intermediate layers.
Further, we do not observe a significant difference between the dictionaries of CDLNet and CDLNet-B, suggesting that the generalization capability of CDLNet is solely a result of the noise-adaptive thresholds and not the learned intermediate representations.
How do the learned thresholds of CDLNet change over layers and subbands? For the adaptive model, the thresholds are an affine function of the input noise-level $\sigma$ (i.e., $\tau^{(k)} = \tau^{(k)}_0 + \tau^{(k)}_1 \sigma$ at layer $k$). For visualization purposes, we look at the thresholds for a fixed input noise-level. We also show the thresholds of an equivalent model trained without adaptive thresholds (CDLNet-B).
Note that the colorbars are not matched between the above two figures. For both adaptive and non-adaptive models, we see a general trend of thresholds increasing towards the final layers.
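For concreteness, here is a minimal sketch of how such an affine, noise-adaptive threshold parameterization could be formed and visualized. This is not the CDLNet reference code: the array shapes, the `tau0`/`tau1` values, and the noise level below are illustrative placeholders, whereas in practice the trained model supplies these parameters.

```python
import numpy as np
import matplotlib.pyplot as plt

num_layers, num_subbands = 20, 32                                # hypothetical network size
rng = np.random.default_rng(0)
tau0 = rng.uniform(0.0, 0.05, size=(num_layers, num_subbands))   # noise-independent offsets
tau1 = rng.uniform(0.0, 1.0,  size=(num_layers, num_subbands))   # slopes w.r.t. the noise level

def thresholds(sigma):
    """Affine noise-adaptive thresholds: tau(sigma) = tau0 + tau1 * sigma."""
    return tau0 + tau1 * sigma

sigma = 25.0 / 255.0              # illustrative input noise level
tau = thresholds(sigma)           # shape: (num_layers, num_subbands)

plt.imshow(tau, aspect="auto", cmap="viridis")
plt.xlabel("subband")
plt.ylabel("layer")
plt.colorbar(label="threshold")
plt.title(f"Thresholds at noise level sigma = {sigma:.3f}")
plt.show()
```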
How do the sparse codes of an input image vary over layers? Below we show the magnitude of the sparse codes (in layer $k$) for the cameraman test image, at a fixed input noise-level.
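As a hedged sketch, per-layer magnitude maps of this kind could be produced as follows. Here `codes` stands in for the intermediate sparse codes, one array of shape `(num_subbands, H, W)` per layer (e.g., collected from the network with forward hooks); the toy random data below is only a placeholder, not output from a trained model.

```python
import numpy as np
import matplotlib.pyplot as plt

num_layers, num_subbands, H, W = 6, 32, 64, 64      # toy sizes for illustration
rng = np.random.default_rng(0)
# Stand-in for the real intermediate codes: decaying random tensors.
codes = [rng.standard_normal((num_subbands, H, W)) * (0.5 ** k)
         for k in range(num_layers)]

fig, axes = plt.subplots(1, num_layers, figsize=(3 * num_layers, 3))
for k, (ax, z) in enumerate(zip(axes, codes)):
    mag = np.abs(z).sum(axis=0)                     # magnitude summed over subbands
    ax.imshow(mag, cmap="gray")
    ax.set_title(f"layer {k}")
    ax.axis("off")
plt.show()
```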
We observe that the representation becomes sparse in the final layers of the network, consistent with the higher learned thresholds there. Note that sparsity is not explicitly enforced during training (there is no sparsity penalty in the loss function); rather, it is built into the architecture through the shrinkage-thresholding non-linearity (derived from the basis-pursuit denoising formulation of the network).
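The toy example below illustrates this point with the soft-thresholding (shrinkage) operator, the proximal operator of the L1 norm that appears in basis-pursuit denoising. The coefficient values and thresholds are arbitrary; the takeaway is that larger thresholds zero out more coefficients, so sparsity emerges from the non-linearity itself rather than from any penalty in the training loss.

```python
import numpy as np

def soft_threshold(x, tau):
    """S_tau(x) = sign(x) * max(|x| - tau, 0): proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(0)
z = rng.standard_normal(10_000)        # stand-in for pre-activation coefficients

for tau in (0.1, 0.5, 1.0, 2.0):       # mimic thresholds growing over layers
    frac_zero = np.mean(soft_threshold(z, tau) == 0.0)
    print(f"tau = {tau:.1f}: {100 * frac_zero:.1f}% of coefficients are exactly zero")
```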