Sonali Nandagopalan
Created on April 30, 2023
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Miika Aittala (NVIDIA), Janne Hellsten (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University), Timo Aila (NVIDIA)
Analyzing and Improving the Image Quality of StyleGAN
Index
Aim
GAN
StyleGAN
Droplet Artifact
Image quality and generator smoothness
Phase Artifact
Resolution Usage
Projection of Images to Latent Space
Conclusion
References
Aim
"Fixing characteristic artifacts in StyleGAN and improving the quality further."
Generative Adversarial Networks
- Data-driven unconditional generative modeling
- Unconventional generator architecture
- Incorporates stochastic variation
- Generates high-resolution images
StyleGAN (2019)
Visual Anomalies
Distortion
Blurriness
"But the SOTA has visible characteristic artifacts"
Curiously, the discriminator should have been able to detect the "droplet effect", yet it does not.
Droplet Artifact
Issue - 1
Why?
Problem: the AdaIN operation normalises the mean and variance of each feature map separately, destroying information carried in the relative magnitudes of the features.
Hypothesis: removing the normalisation will remove the droplet artifact.
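The normalisation step of AdaIN can be pictured in a few lines. The NumPy toy below (not the paper's code) shows how two activations differing only in overall magnitude become indistinguishable after the operation, which is exactly the information loss hypothesised above.

```python
import numpy as np

def instance_norm(x, eps=1e-8):
    # Normalise each feature map of each sample to zero mean, unit variance.
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = x.std(axis=(2, 3), keepdims=True)
    return (x - mean) / (std + eps)

def adain(x, style_scale, style_bias):
    # AdaIN: instance-normalise, then apply per-channel style statistics.
    return style_scale * instance_norm(x) + style_bias

x = np.random.randn(1, 4, 8, 8)       # (batch, channels, H, W)
scale = np.ones((1, 4, 1, 1))
bias = np.zeros((1, 4, 1, 1))
out1 = adain(x, scale, bias)
out2 = adain(100.0 * x, scale, bias)  # 100x stronger input signal...
print(np.allclose(out1, out2))        # ...identical output: magnitude info destroyed
```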
Problem Description
- The generator sneaks signal-strength information past the normalisation by creating a strong, localised spike that dominates the statistics (the droplet).
- Naively removing the normalisation, however, impacts scale-specific style controls.
Solution
Remove the effect of the style scale s from the output statistics, which was the aim of instance normalisation, but bake it into the convolution weights instead.
This removes the droplet artifact and retains full controllability.
Weight demodulation
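A minimal NumPy sketch of the demodulation idea (an illustration, not the official implementation): the style s first modulates the convolution weights, then each output feature map's weights are rescaled to unit expected standard deviation, replacing explicit instance normalisation.

```python
import numpy as np

def modulate_demodulate(weight, style, eps=1e-8):
    """StyleGAN2-style weight demodulation, sketched in NumPy.

    weight: conv weights of shape (out_ch, in_ch, k, k)
    style:  per-input-channel scales s of shape (in_ch,)
    """
    # Modulation: scale each input channel of the weights by the style.
    w = weight * style[None, :, None, None]
    # Demodulation: normalise so each output feature map has unit expected std.
    sigma = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w / sigma

w = np.random.randn(8, 4, 3, 3)
s = np.random.rand(4) + 0.5
w_prime = modulate_demodulate(w, s)
# After demodulation, each output channel's weights have unit L2 norm:
norms = np.sqrt((w_prime ** 2).sum(axis=(1, 2, 3)))
print(np.allclose(norms, 1.0))  # → True
```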
StyleGAN2 Improvement
Image quality and generator smoothness
PPL (Perceptual Path Length) is inversely related to image quality: lower PPL correlates with higher-quality images.
- As the discriminator penalises broken images, the generator stretches the regions of latent space that yield good images, improving average image quality.
- We cannot simply encourage a minimal PPL, as that would drive the generator toward a degenerate solution with zero recall.
What conclusions can we draw from the relationship?
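One way to picture the metric behind this relationship is a toy PPL estimator. The sketch below is a heavy simplification: `generate` is a hypothetical latent-to-image function, and plain squared L2 distance stands in for the LPIPS perceptual metric used in the paper.

```python
import numpy as np

def perceptual_path_length(generate, z_samples, epsilon=1e-4):
    # For each latent, compare the image with the image of a slightly
    # perturbed latent; scale by 1/epsilon^2 as in the PPL definition.
    dists = []
    for z in z_samples:
        img0 = generate(z)
        img1 = generate(z + epsilon * np.random.randn(*z.shape))
        dists.append(np.sum((img1 - img0) ** 2) / epsilon ** 2)
    return np.mean(dists)

# A smooth toy "generator" (a linear map) yields a finite, positive PPL.
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 4))
zs = [rng.standard_normal(4) for _ in range(8)]
ppl = perceptual_path_length(lambda z: A @ z, zs)
print(ppl > 0)  # → True
```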
Issue - 2
Path Length Regularisation
Adopts the concept of a "lazy regulariser" and uses the Jacobian matrix and the Euclidean norm to encourage a smoother, more effective generator mapping.
Lazy Regularisation
Regularisation terms are computed less frequently than the main loss function, lowering computational cost.
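Path length regularisation penalises the deviation of the Jacobian-vector norm ||J_w^T y|| (y a random image-space direction) from a running average a. The sketch below uses the fixed Jacobian of a toy linear generator; the variable names and the moving-average rate are illustrative assumptions, not the paper's values.

```python
import numpy as np

def path_length_penalty(jacobian, y, a):
    # Regulariser for one latent: (||J^T y||_2 - a)^2, where a is the
    # running average of the observed path lengths.
    norm = np.linalg.norm(jacobian.T @ y)
    return (norm - a) ** 2, norm

rng = np.random.default_rng(1)
J = rng.standard_normal((16, 4))           # Jacobian of a toy linear generator
y = rng.standard_normal(16) / np.sqrt(16)  # normalised random image direction
a = 0.0
for _ in range(100):
    penalty, norm = path_length_penalty(J, y, a)
    a += 0.01 * (norm - a)                 # exponential moving average of norms
print(penalty >= 0)  # → True
```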
The Fix
Phase Artifact
Issue - 3
Raises the need for a change in architecture!
- The resolution of the images is gradually increased at each step.
- At each step, high-frequency details are generated for the current resolution, which helps capture fine-grained detail.
- This leads to the trained network carrying excessive high-frequency detail in its intermediate layers.
- The network becomes overly sensitive to small changes.
- This compromises the network's shift invariance, i.e. its ability to respond consistently when content shifts in location or position.
Progressive Growing
What is the potential cause?
Using a "skip" generator and a "residual" discriminator achieves the same goal without progressive growing.
Progressive Growing
Comparison of Architectures
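The "skip" generator's output path can be sketched in a few lines, assuming a hypothetical tRGB layer has already produced one RGB image per resolution: the final image is a running upsample-and-sum across scales, so every resolution contributes directly to the output.

```python
import numpy as np

def upsample2x(img):
    # Nearest-neighbour 2x upsampling of an (H, W, 3) image.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def skip_generator_output(rgb_per_resolution):
    # Each resolution block emits an RGB image; the final image is the
    # upsampled running sum of all of them.
    out = rgb_per_resolution[0]          # lowest resolution, e.g. 4x4
    for rgb in rgb_per_resolution[1:]:
        out = upsample2x(out) + rgb      # each scale adds its contribution
    return out

rgbs = [np.random.randn(4 * 2**i, 4 * 2**i, 3) for i in range(3)]  # 4, 8, 16
img = skip_generator_output(rgbs)
print(img.shape)  # → (16, 16, 3)
```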
Without Progressive Growing
StyleGAN2 Improvement
Works in line with progressive growing and the hypothesised behaviour: the generator should focus on low-resolution features first and only slowly shift its attention to finer details (the key aspect of progressive growing). The architectures discussed allow this.
At the start of training the learning is indeed similar to progressive growing, but at the end of training the highest resolutions fail to dominate.
The cause of the problem: a capacity problem.
Solution: doubling the number of feature maps in the highest-resolution layers.
Resolution Usage
Being able to attribute a generated image back to its source network is important.
- Use ramped-down noise during optimisation to explore the latent space thoroughly.
- Optimise the stochastic noise inputs of the StyleGAN generator as well.
- Regularise the stochastic noise inputs so that they do not carry coherent signal.
Projection of Images to Latent Space
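The projection procedure above can be sketched as an optimisation loop. The version below is a heavily simplified stand-in: simple random search instead of gradient descent, plain L2 error instead of LPIPS, and a toy linear map in place of the generator; only the ramped-down noise schedule mirrors the bullets above.

```python
import numpy as np

def project(target, generate, w_init, steps=200, lr=0.1, noise0=0.05, seed=0):
    # Search latent space for a w whose image matches the target,
    # ramping exploration noise down over the run.
    rng = np.random.default_rng(seed)
    w = w_init.copy()
    best = np.sum((generate(w) - target) ** 2)
    for t in range(steps):
        noise = noise0 * max(0.0, 1.0 - t / (0.75 * steps))  # ramp noise down
        cand = (w + lr * rng.standard_normal(w.shape)
                  + noise * rng.standard_normal(w.shape))
        err = np.sum((generate(cand) - target) ** 2)
        if err < best:                   # keep only improving candidates
            best, w = err, cand
    return w, best

# Toy linear "generator": projection recovers a latent close to the target.
rng = np.random.default_rng(2)
A = rng.standard_normal((16, 4))
w_true = rng.standard_normal(4)
w_hat, err = project(A @ w_true, lambda w: A @ w, np.zeros(4))
print(err < np.sum((A @ w_true) ** 2))
```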
- StyleGAN2 fixes the characteristic image quality issues of StyleGAN and improves image quality across various datasets, especially in motion.
- It is easier to attribute a generated image to its source using StyleGAN2.
- Training performance was also improved, with faster training times and energy usage comparable to the original StyleGAN.
- The project consumed a significant amount of electricity and computational resources.
- Future improvements to path length regularisation could be explored, and finding ways to reduce training data requirements is important for the practical deployment of GANs, especially on datasets with intrinsic variation.
Conclusions
- Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2019), Analyzing and improving the image quality of StyleGAN, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8110-8119).
- Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Supplemental material: Analyzing and improving the image quality of StyleGAN. NVIDIA.
- https://github.com/NVlabs/stylegan2
References
Thank you!