GAN-Leaks
Adriana Watson
Created on October 8, 2024
GAN Leaks: A Taxonomy of Membership Inference Attacks against Generative Models
Presented by: Adriana Watson
Mario Fritz
CISPA Helmholtz Center for Information Security
Ning Yu
University of Maryland, College Park; Max Planck Institute for Informatics
Yang Zhang
CISPA Helmholtz Center for Information Security
Dingfan Chen
CISPA Helmholtz Center for Information Security
CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
Published on arXiv: Sep 2019
Published in ACM: Nov 2020
Introduction
GANs and VAEs
01
To address GAN mode collapse and the low precision (blurriness) of VAEs, hybrid models such as VAEGAN have become popular.
GANs and VAEs
Generative Adversarial Network
Variational Autoencoder
The Privacy Issue
MIA: Membership Inference Attacks
Can we confirm that an image was used in the training data? If a model memorizes its training data, an attacker can exploit this by testing how closely the model recreates a known input.
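A toy decision rule makes this concrete (the function and threshold are illustrative, not the paper's full method; the attacker would pick the threshold from scores on known non-members):

```python
# Toy membership-inference decision rule: if the generator reconstructs the
# query unusually well, guess "member". `threshold` is chosen by the attacker.
def infer_membership(reconstruction_error: float, threshold: float) -> bool:
    return reconstruction_error < threshold  # low error => likely training member
```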
Research Objectives
The Objectives
Challenge: While MIAs are well researched for discriminative models, little attention has been given to generative models.
Goal: Create a taxonomy that categorizes the MIAs applicable to generative models and explore generic attack models that work across settings.
Results: The authors develop both a taxonomy of attacks and a defense strategy to mitigate these privacy breaches.
Taxonomy of Attacks
The Progression of Attacks
02
Attack Modes
Full Black-Box Generator
Only generated outputs are visible; new inputs cannot be provided. The attacker compares outputs to the target.
Partial Black-Box Generator
The attacker has access to the latent code of generated samples and can adjust inputs to refine the guess.
White-Box Generator
The generator is published so that new samples can be created. Attackers can provide inputs and analyze the impact.
Accessible Discriminator (Full Model)
The full model with source code is published for fine-tuning. The attacker can directly analyze the model to see the data used.
Attack Model
Generic Model and Attack Calibration
03
Attack Models
Full Black-Box Generator
The attacker collects a set of k generated samples and scores the query by the distance to its nearest neighbor (see the sketch below).
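A minimal sketch of this full black-box scoring, assuming the attacker has already drawn k generator outputs into a NumPy array (names are illustrative, not from the paper's code):

```python
# Full black-box attack sketch: score the query by the L2 distance to its
# nearest neighbor among k generated samples.
import numpy as np

def full_black_box_error(query, generated_samples):
    """query: (d,) array; generated_samples: (k, d) array of generator outputs."""
    dists = np.linalg.norm(generated_samples - query, axis=1)  # distance to every sample
    return dists.min()  # reconstruction error = nearest-neighbor distance
```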
Partial Black-Box Generator
The attacker optimizes the latent code z to minimize the distance between the generated sample and the query (see the sketch below).
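A sketch of that latent-code search: with only query access to the generator G and no gradients, a gradient-free optimizer is used (Powell here as an illustrative choice; restart count and seed are assumptions):

```python
# Partial black-box attack sketch: optimize the latent code z to minimize
# ||G(z) - query|| without gradients, restarting to avoid bad local minima.
import numpy as np
from scipy.optimize import minimize

def partial_black_box_error(query, G, z_dim, n_restarts=5, seed=0):
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_restarts):
        z0 = rng.standard_normal(z_dim)  # random initial latent code
        res = minimize(lambda z: np.linalg.norm(G(z) - query), z0, method="Powell")
        best = min(best, res.fun)        # keep the best reconstruction error found
    return best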
White-Box Generator
Same approach as the partial black-box attack, but access to gradients allows first-order optimization methods to be used (see the sketch below).
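With white-box access, the same objective can be minimized by backpropagating through the published generator; a PyTorch sketch with Adam as an illustrative first-order optimizer (step count and learning rate are assumptions):

```python
# White-box attack sketch: backpropagate through the published generator G
# to find the latent code that best reconstructs the query.
import torch

def white_box_error(query, G, z_dim, steps=500, lr=1e-2):
    z = torch.randn(z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.norm(G(z) - query)  # reconstruction error, differentiable in z
        loss.backward()                  # gradients flow through the generator
        opt.step()
    with torch.no_grad():
        return torch.norm(G(z) - query).item()
```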
Generic Attack Model
Membership probability is assumed to be proportional to the probability that the generative model can generate the query sample.
Attack Calibration
The Problem
Some data points are inherently difficult to reconstruct, making it hard to infer membership correctly. The authors introduce a reference model trained on unrelated data to compare reconstruction difficulty.
Theorem 5.1. Given the victim model with parameter θ_v and a query dataset S, the membership probability of a query sample x_i is well approximated by the sigmoid of the negative calibrated reconstruction error.
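In code, the calibrated score is just the sigmoid of the negated error gap between the victim and reference models; a minimal sketch (variable names are mine, not the paper's):

```python
# Calibrated membership score (Theorem 5.1 sketch): subtract the reference
# model's reconstruction error to cancel per-sample "hardness", then map
# the negated result through a sigmoid.
import numpy as np

def calibrated_membership_probability(err_victim, err_reference):
    calibrated_error = err_victim - err_reference
    return 1.0 / (1.0 + np.exp(calibrated_error))  # sigmoid(-calibrated_error)
```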
Experimental Results
Experimental setup, analysis, and evaluation
04
Experimental Setup
Datasets
- CelebA: 200k RGB images (20k used for training) of celebrity faces
- MIMIC-III: 56,520 public Electronic Health Records (41,307 samples used) of ICU patient data formatted as 1071-dimensional binary feature vectors
- Instagram New-York: 34,336 samples of Instagram location check-ins, 4048-dimensional vector with time, longitude, and latitude data
Victim GAN Models
- PGGAN
- WGANGP
- DCGAN
- MEDGAN
- VAEGAN
Black and White Box Attack Performance
Comparison of Attack Modes
Performance Gain from Calibration
Comparison to Baseline Attacks
Defense Strategies
Existing Defenses Against MIAs
- Differential Privacy (DP): introduces random noise into the training process so that no single sample is memorized, preventing the overfitting that MIAs exploit.
- Defensive Distillation: trains a simpler model on the outputs of a larger model (knowledge distillation) to strip away fine details.
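As a rough illustration of the DP direction (this is the generic DP-SGD recipe, not the paper's specific defense; clip_norm and noise_std are illustrative values):

```python
# DP-SGD-style noisy gradient: clip each per-sample gradient, then add
# Gaussian noise before averaging, limiting any single sample's influence.
import torch

def dp_noisy_gradient(per_sample_grads, clip_norm=1.0, noise_std=1.0):
    """per_sample_grads: tensor of shape (batch, n_params)."""
    norms = per_sample_grads.norm(dim=1, keepdim=True)
    clipped = per_sample_grads * (clip_norm / norms).clamp(max=1.0)
    noise = noise_std * clip_norm * torch.randn(clipped.shape[1])
    return (clipped.sum(dim=0) + noise) / per_sample_grads.shape[0]
```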
Discussion
Strengths and Weaknesses
05
Influence
- ~402 citations
- Many citations come from research on ML applications in medicine
Comparison to other studies
- Lu, Y., Shen, M., Wang, H., Wang, X., van Rechem, C., & Wei, W. (2023). Machine learning for synthetic data generation: A review. arXiv preprint arXiv:2302.04062.
- [1] Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P. S., & Zhang, X. (2022). Membership inference attacks on machine learning: A survey. ACM Computing Surveys (CSUR), 54(11s), 1-37.
Strengths & Weaknesses
Strengths
- Comprehensive attack taxonomy
- Flexible generic attack model
- Presentation of attack calibration is novel
- Dataset choice is unique and relevant
Weaknesses
- Paper doesn't address ongoing defense challenges extensively
- Limited performance for white-box models
- Some unrealistic assumptions (e.g., full white-box access) are made
- Only AUC-ROC is used as an evaluation metric, which has been found to be "sensitive to the probability rank of members" [1]