Transcript

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Miika Aittala (NVIDIA), Janne Hellsten (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University), Timo Aila (NVIDIA)

Analyzing and Improving the Image Quality of StyleGAN

Index

  • Aim
  • GAN
  • StyleGAN
  • Droplet Artifact
  • Image quality and generator smoothness
  • Phase Artifact
  • Resolution Usage
  • Projection of Images to Latent Space
  • Conclusion
  • References

Aim

"Fixing characteristic artifacts in StyleGAN and improving the quality further."


01

Generative Adversarial Networks

  • Data-driven unconditional generative modelling
  • Unconventional generator architecture
  • Incorporates stochastic variation
  • Generates high-resolution images

02

StyleGAN (2019)

Visual Anomalies

Distortion

Blurriness

"But the state of the art (SOTA) has visible characteristic artifacts"

03

The discriminator should have been able to detect the "droplet effect", yet it does not: the anomaly appears consistently in the generator's intermediate feature maps even when it is not obvious in the final image.


04

Droplet Artifact

Issue - 1


Why?

Problem: the AdaIN operation

AdaIN normalises the mean and variance of each feature map separately, destroying any information carried in the relative magnitudes of the features.

Hypothesis: removing the normalisation will remove the droplet artifact.
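The AdaIN operation described above can be sketched as follows. This is a minimal NumPy illustration of per-channel instance normalisation followed by a style-controlled affine transform; names and shapes are illustrative, not NVIDIA's implementation.

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalisation for one sample.

    x: feature maps of shape (C, H, W); style_scale, style_bias: shape (C,).
    """
    # Instance normalisation: each channel is normalised to zero mean and
    # unit variance independently. This is the step that destroys any
    # information carried in the relative magnitudes of the feature maps.
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    # The style then re-introduces a per-channel scale and bias.
    return style_scale[:, None, None] * x_norm + style_bias[:, None, None]
```

Because the output statistics are fully dictated by the style, the only way for the generator to pass signal-strength information through this operation is to hide it somewhere the statistics do not capture, such as a localised spike.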

05

Problem Description

The generator sneaks signal-strength information past instance normalisation by creating a strong, localised spike that dominates the statistics, which it can then scale at will elsewhere. However, simply removing the normalisation sacrifices the scale-specific style controls.

06

Solution

The aim of instance normalisation is to remove the effect of the style scale s from the statistics of the feature maps. Achieving the same effect through weight demodulation, based on statistical assumptions about the signal rather than explicit normalisation, removes the droplet artifact while retaining full controllability.


07

Droplet Artifact

Issue - 1


Weight demodulation
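Weight demodulation can be sketched as below: the style modulates the convolution weights, and demodulation then rescales each output channel so its expected standard deviation is one, assuming unit-variance, uncorrelated inputs. This is a NumPy sketch with illustrative names and shapes, not the official implementation.

```python
import numpy as np

def demodulated_weights(w, s, eps=1e-8):
    """Modulate and demodulate convolution weights for one sample.

    w: weights of shape (out_ch, in_ch, kh, kw);
    s: per-input-channel style scales of shape (in_ch,).
    """
    # Modulation: scaling the weights is equivalent to scaling the
    # corresponding input feature maps by the style.
    w_mod = w * s[None, :, None, None]
    # Demodulation: normalise each output channel by the L2 norm of its
    # modulated weights, so the expected output standard deviation is 1.
    # This baked-in rescaling replaces explicit instance normalisation,
    # which removes the droplet artifact.
    norm = np.sqrt((w_mod ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w_mod / norm
```

Unlike instance normalisation, nothing here depends on the actual contents of the feature maps, so the generator has no statistics left to game with a spike.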

11

StyleGAN2 Improvement

Image quality and generator smoothness


PPL is inversely related to image quality: lower perceptual path length corresponds to higher perceived quality.

  1. The generator appears to stretch the regions of latent space that yield good images and squeeze those that yield broken ones, because the discriminator penalises broken images.
  2. We cannot simply push PPL towards zero, as a generator with minimal path length would collapse to a small set of images and score zero recall.

What conclusions can we draw from the relationship?


12

Issue - 2

Path Length Regularisation

Penalises the deviation of the Jacobian-vector-product norm ||J_w^T y||_2, for random image-space directions y, from a running average a. This encourages a smooth, well-conditioned mapping from latents to images, and is implemented as a "lazy" regulariser to keep its cost manageable.
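A toy sketch of the penalty follows, assuming a generator g that maps latents to flattened images. The Jacobian-vector product J_w^T y is approximated here by finite differences rather than backpropagation, and all names are illustrative.

```python
import numpy as np

def path_length_penalty(g, w, a, rng, eps=1e-4):
    """Path length penalty (||J_w^T y||_2 - a)^2 for a single latent w.

    g: generator mapping a latent vector to a flat image vector;
    a: running-average target for the norm; rng: NumPy Generator.
    """
    # Random image-space direction with normally distributed pixel values.
    y = rng.standard_normal(g(w).shape)
    # J_w^T y is the gradient of the scalar <g(w), y> with respect to w;
    # approximate each coordinate with central finite differences.
    def f(w_):
        return float(np.dot(g(w_), y))
    grad = np.zeros_like(w)
    for i in range(w.size):
        dw = np.zeros_like(w)
        dw[i] = eps
        grad[i] = (f(w + dw) - f(w - dw)) / (2.0 * eps)
    # Penalise deviation of the norm from the target a.
    return (np.linalg.norm(grad) - a) ** 2
```

In the paper, the target a is a running exponential average of the observed norms, so the penalty pulls all path lengths towards a common value rather than towards zero.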

Lazy Regularisation

Regularisation terms are computed less frequently than the main loss function (e.g. once every 16 minibatches), which lowers computational cost without measurably hurting results.
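The schedule can be sketched as below. The interval of 16 minibatches follows the paper's setup, but the helper itself is illustrative; scaling the regulariser's weight by the interval keeps its expected gradient contribution unchanged.

```python
def lazy_regularisation_schedule(step, reg_interval=16):
    """Return whether to run the regulariser this step, and its weight.

    The regulariser runs only every reg_interval minibatches; when it
    runs, its weight is multiplied by reg_interval so the average
    contribution over time matches running it on every step.
    """
    run_reg = (step % reg_interval == 0)
    reg_weight = float(reg_interval) if run_reg else 0.0
    return run_reg, reg_weight
```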

13

The Fix


14

Phase Artifact

Issue - 3


Raises the need for a change in architecture!

  1. The resolution of generated images is gradually increased during training.
  2. At each step, high-frequency details are generated for the current resolution, which helps capture fine-grained detail.
  3. This leaves the trained network with an excessive preference for high frequencies.
  4. The network becomes sensitive to small changes, and details such as teeth or eyes appear glued to fixed pixel coordinates.
  5. This compromises shift invariance: the network's ability to handle details that shift smoothly in position.

15

Progressive Growing

What is the potential cause?

16

Using a "skip" generator together with a "residual" discriminator achieves the benefits of progressive growing without its drawbacks.

Progressive Growing

Comparison of Architectures
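The "skip" generator side of the comparison can be sketched as follows: each resolution block contributes an RGB image, and the final output is the sum of progressively upsampled contributions. Measuring the magnitude of each contribution is what lets the authors quantify how much each resolution is actually used. This is a NumPy sketch; the function names and the nearest-neighbour upsampling are illustrative simplifications.

```python
import numpy as np

def upsample2x(img):
    # Nearest-neighbour 2x upsampling of a single-channel (H, W) image.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def skip_generator_output(rgb_contributions):
    """Sum per-resolution RGB contributions, lowest resolution first.

    Each entry doubles the resolution of the previous one; the running
    sum is upsampled before the next contribution is added, so every
    resolution has a direct path to the final image.
    """
    out = rgb_contributions[0]
    for rgb in rgb_contributions[1:]:
        out = upsample2x(out) + rgb
    return out
```

If a resolution's contribution has near-zero magnitude, that block is effectively unused, which is exactly the symptom analysed on the next slides.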

17

Without Progressive Growing

StyleGAN2 Improvement

Works in line with the key aspect of progressive growing: the generator first focuses on low-resolution features and only gradually shifts its attention to finer details, as hypothesised.

The cause of the problem: a capacity problem

At the start of training, learning proceeds much as in progressive growing, but towards the end the highest resolutions fail to dominate the output. The skip/residual architectures discussed make this measurable from each resolution's contribution to the final image.

Solution: doubling the number of feature maps in the highest-resolution layers, after which the network makes full use of its top resolutions.

18

Resolution Usage

Being able to attribute a generated image back to its source network is important, e.g. for detecting manipulated media.

  • Use ramped-down noise added to the latent during optimisation to explore the latent space thoroughly.
  • Optimise the stochastic noise inputs of the StyleGAN generator alongside the latent code.
  • Regularise the stochastic noise inputs so they do not carry coherent signal.
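The ramped-down noise in the first bullet can be sketched as a schedule like the one below, modelled on the decay shape used in the official projector; the exact constants here are illustrative assumptions.

```python
def noise_strength(step, total_steps, initial=0.05, ramp_fraction=0.75):
    """Strength of the noise added to the latent at a given step.

    Starts at `initial` so the optimisation explores the latent space
    broadly, decays quadratically, and reaches zero after
    `ramp_fraction` of the steps so the final refinement is noise-free.
    """
    t = step / total_steps
    return initial * max(0.0, 1.0 - t / ramp_fraction) ** 2
```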

19

Projection of Images to Latent Space

  • StyleGAN2 fixes the characteristic artifacts of StyleGAN and improves image quality across several datasets, with notably smoother motion in latent-space interpolations.
  • Generated images are easier to attribute back to their source network with StyleGAN2 than with the original StyleGAN.
  • Training performance was also improved, with faster training and energy usage comparable to the original StyleGAN.
  • The project nonetheless consumed a significant amount of electricity and computational resources.
  • Future work could explore improvements to path length regularisation, and finding ways to reduce training data requirements is important for the practical deployment of GANs, especially on datasets with large intrinsic variation.

Conclusions

  1. Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 8110-8119).
  2. Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Supplemental material: Analyzing and improving the image quality of StyleGAN. NVIDIA.
  3. https://github.com/NVlabs/stylegan2


21

References

"Thank you!"