GAN-Leaks

Adriana Watson

Created on October 8, 2024


GAN Leaks: A Taxonomy of Membership Inference Attacks against Generative Models

Presented by: Adriana Watson

Mario Fritz

CISPA Helmholtz Center for Information Security

Ning Yu

University of Maryland, College Park & Max Planck Institute for Informatics

Yang Zhang

CISPA Helmholtz Center for Information Security

Dingfan Chen

CISPA Helmholtz Center for Information Security

CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security

Published on arXiv: Sep 2019
Published in ACM: Nov 2020

Introduction

GANs and VAEs

01

To address GAN mode collapse and the low precision of VAE outputs, hybrid models such as VAEGAN have become popular.

GANs and VAEs

Generative Adversarial Network

Variational Autoencoder

The Privacy Issue

MIA: Membership Inference Attacks

Can we confirm that an image was used in the training data? If a model memorizes its training data, an attacker can exploit this by testing how closely the model recreates a known input.
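As a toy illustration of this intuition (the reconstruction_error helper and threshold below are hypothetical, not the paper's formulation):

```python
def is_training_member(query, reconstruction_error, threshold=0.05):
    """Toy membership test: if the model reproduces the query unusually well
    (small reconstruction error), guess that it was a training sample."""
    return reconstruction_error(query) < threshold
```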

Research Objectives

The Objectives

Challenge: While MIAs are well researched for discriminative models, little attention has been given to generative models.
Goal: Create a taxonomy to categorize different types of MIAs that apply to generative models and explore generic attack models applicable to various settings.
Results: The authors develop both a taxonomy of attacks and a defense strategy to mitigate these privacy breaches.

Taxonomy of Attacks

The Progression of Attacks

02

Attack Modes

Full Black-Box Generator: only generated outputs are visible, and new inputs cannot be provided. The attacker compares the outputs to the target.

Partial Black-Box Generator: the attacker has access to the latent code of generated samples and can adjust inputs to refine the guess.

White-Box Generator: the generator is published, allowing new samples to be created. Attackers can provide inputs and analyze their impact.

Accessible Discriminator (Full Model): the full model, with source code, is published for fine-tuning. The attacker can directly analyze the model to see the data used.

Attack Model

Generic Model and Attack Calibration

03

Attack Models

Full Black-Box Generator: the attacker collects a set of k generated samples and selects the nearest neighbor to the query.
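A minimal sketch of this full black-box reconstruction error, assuming a hypothetical generator.sample(k) that returns k generated samples as rows of a NumPy array:

```python
import numpy as np

def full_blackbox_error(query, generator, k=10000):
    """Full black-box attack: draw k samples from the generator and use the
    distance to the nearest one as the query's reconstruction error."""
    samples = generator.sample(k)                    # shape (k, d); hypothetical API
    dists = np.linalg.norm(samples - query, axis=1)  # L2 distance to each sample
    return dists.min()                               # nearest-neighbor distance
```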

Partial Black-Box Generator: the attacker optimizes the latent code (z) to minimize the distance between the generated sample and the query.
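A sketch of this partial black-box optimization using a derivative-free optimizer; the generator(z) call and latent dimension are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def partial_blackbox_error(query, generator, latent_dim=128):
    """Partial black-box attack: search over the latent code z so that the
    generated sample matches the query, without any generator gradients."""
    objective = lambda z: np.linalg.norm(generator(z) - query)
    result = minimize(objective, x0=np.zeros(latent_dim), method="Powell")
    return result.fun  # best reconstruction error found
```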

White-Box Generator: same approach as the partial black-box attack; however, access to gradients allows first-order optimization methods to be used.
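A corresponding white-box sketch, assuming a differentiable PyTorch generator so the latent code can be optimized with gradient information:

```python
import torch

def whitebox_error(query, generator, latent_dim=128, steps=50):
    """White-box attack: gradient-based optimization of the latent code,
    exploiting direct access to the generator's gradients."""
    z = torch.zeros(latent_dim, requires_grad=True)
    opt = torch.optim.LBFGS([z], max_iter=steps)

    def closure():
        opt.zero_grad()
        loss = torch.norm(generator(z) - query)  # reconstruction distance
        loss.backward()
        return loss

    opt.step(closure)
    return torch.norm(generator(z) - query).item()
```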

Generic Attack Model: membership probability is assumed to be proportional to the probability that the generative model can generate the query sample.

Attack Calibration

The Problem

Some data points are inherently difficult to reconstruct, making it hard to infer membership correctly. The authors introduce a reference model trained on unrelated data to compare reconstruction difficulty.

Theorem 5.1. Given the victim model with parameters θ_v and a query dataset S, the membership probability of a query sample x_i is well approximated by the sigmoid of the negative calibrated reconstruction error.
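A minimal sketch of this calibrated membership score, where err_victim and err_reference stand for the reconstruction errors of a query under the victim model and the reference model (names are illustrative):

```python
import math

def membership_probability(err_victim, err_reference):
    """Calibrated attack score: subtract the reference model's reconstruction
    error to discount samples that are inherently hard to reconstruct, then
    pass the negated calibrated error through a sigmoid."""
    calibrated_error = err_victim - err_reference
    return 1.0 / (1.0 + math.exp(calibrated_error))  # sigmoid(-calibrated_error)
```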

Experimental Results

Experimental setup, analysis, and evaluation

04

Experimental Setup

Datasets

  • CelebA: 200k RGB images (20k used for training) of celebrity faces
  • MIMIC-III: 56,520 public Electronic Health Records (41,307 samples used) of ICU patient data formatted as 1071-dimensional binary feature vectors
  • Instagram New-York: 34,336 samples of Instagram location check-ins, 4048-dimensional vector with time, longitude, and latitude data

Victim GAN Models

  • PGGAN
  • WGANGP
  • DC-GAN
  • MEDGAN
  • VAEGAN

Black and White Box Attack Performance

Comparison of Attack Modes

Performance Gain from Calibration

Comparison to Baseline Attacks

Defense Strategies

Existing Defenses Against MIAs

  • Differential Privacy (DP): Introduces random noise into the training process (e.g., into the gradients, as in DP-SGD) to prevent overfitting; see the sketch after this list.
  • Defensive Distillation: Trains a simpler model on the outputs of a larger model (knowledge distillation) to strip away fine-grained, memorized details.
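To illustrate the differential privacy idea (not the paper's exact defense configuration), a DP-SGD-style update clips each per-example gradient and adds Gaussian noise before aggregation; all parameter values here are illustrative:

```python
import numpy as np

def dp_gradient_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1):
    """DP-SGD-style aggregation: clip each per-example gradient to bound its
    influence, then add calibrated Gaussian noise before averaging."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=clipped[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_example_grads)
```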

Discussion

Strengths and Weaknesses

05

Influence

  • ~402 citations
  • Many citing works relate to ML applications in medicine

Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P. S., & Zhang, X. (2022). Membership inference attacks on machine learning: A survey. ACM Computing Surveys (CSUR), 54(11s), 1-37.

Comparison to other studies

Lu, Y., Shen, M., Wang, H., Wang, X., van Rechem, C., & Wei, W. (2023). Machine learning for synthetic data generation: a review. arXiv preprint arXiv:2302.04062.

[1] Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P. S., & Zhang, X. (2022). Membership inference attacks on machine learning: A survey. ACM Computing Surveys (CSUR), 54(11s), 1-37.

Strengths & Weaknesses

Strengths

  • Comprehensive attack taxonomy
  • Flexible generic attack model
  • Presentation of attack calibration is novel
  • Dataset choice is unique and relevant

Weaknesses

  • Paper doesn't address ongoing defense challenges extensively
  • Limited performance for white-box models
  • Some unrealistic assumptions (e.g., full white-box access) are made
  • Only AUC-ROC is used as an evaluation metric, which has been found to be "sensitive to the probability rank of members" [1]