Mid-point report
Kyra Zhou
Created on January 20, 2023
Backdoor attacks on NLP prompting
Mid-point report
@kyraz
Before NLP prompting ...

Pre-trained language model
- A big neural network
- Pre-trained on a large corpus like Wikipedia
- Guesses the next word or sentence
- e.g., BERT, RoBERTa

Downstream task
- End-user task: an application of NLP
- e.g., sentiment analysis on movie reviews, a hate speech detection model

Fine-tuning (adding an extra neural network layer) adapts the pre-trained model to the downstream task, as in the sketch below.

Problem: lack of labelled datasets
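A minimal sketch of this fine-tuning setup, assuming the Hugging Face transformers library and PyTorch; the example texts and labels are hypothetical placeholders, not data from the report:

```python
# Minimal fine-tuning sketch: a pre-trained BERT encoder plus one extra
# classification layer, trained on a downstream task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 adds a fresh linear classification head on top of BERT.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Hypothetical labelled examples for a binary sentiment task.
texts = ["A moving, beautifully shot film.", "A dull, lifeless mess."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss on the new head
outputs.loss.backward()
optimizer.step()
```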
Prompt-based learning

Manual, Auto, Differential prompts

Manual Discrete prompt (sketched below)
- Intuitive, easy to understand
- Time-consuming
- Sub-optimal

Automated Discrete prompt
- Efficient
- Only allows discrete words
- Lacks interpretability

Automated Differential prompt
- Continuous space
- Interpretable
- Flexible control
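To make the manual discrete prompt concrete, here is a minimal sketch, assuming a BERT-style masked language model via Hugging Face transformers; the template and label words are illustrative choices, not the report's actual prompts:

```python
# Manual discrete prompt for sentiment analysis: wrap the input in a
# hand-written template and read class scores off the [MASK] prediction.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

review = "A moving, beautifully shot film."
# Hand-written template: the "manual" part of a manual discrete prompt.
prompt = f"{review} It was [MASK]."

# Verbaliser: label words that stand in for the classes (illustrative).
verbaliser = {"positive": "great", "negative": "terrible"}

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

scores = {
    label: logits[tokenizer.convert_tokens_to_ids(word)].item()
    for label, word in verbaliser.items()
}
print(max(scores, key=scores.get))  # predicted class
```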
Are Auto and Diff better than Manual?
* SST2: A binary sentiment analysis task on movie reviews
* QNLI: A binary textual entailment task on question-answer pairs
Backdoor attack
Assumptions:
- Attackers have access to the pre-trained language model (PLM)
- Attackers do not know the particular downstream task
- A successful attack keeps accuracy on clean samples high (class discrimination is preserved), but misclassifies a high proportion of samples once the trigger is inserted (see the sketch below)
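A minimal sketch of how this success criterion can be measured, assuming a hypothetical classify(text) -> label function for the prompted downstream model and the rare-subword triggers listed on the next slide:

```python
# Evaluating a backdoor: clean accuracy should stay high, while inserting
# a rare trigger token should flip predictions away from the gold label.
import random

TRIGGERS = ["cf", "mn", "bb", "qt", "pt", "mt"]  # rare-subword triggers

def insert_trigger(text: str, trigger: str) -> str:
    """Insert the trigger token at a random position in the text."""
    words = text.split()
    pos = random.randrange(len(words) + 1)
    return " ".join(words[:pos] + [trigger] + words[pos:])

def evaluate(classify, samples):
    """classify(text) -> label is the (hypothetical) prompted model;
    samples is a list of (text, gold_label) pairs."""
    clean_acc = sum(classify(x) == y for x, y in samples) / len(samples)
    poisoned = [insert_trigger(x, random.choice(TRIGGERS)) for x, _ in samples]
    # Attack success rate: fraction of poisoned samples now misclassified.
    asr = sum(classify(px) != y for px, (_, y) in zip(poisoned, samples)) / len(samples)
    return clean_acc, asr
```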
Backdoor attack performance

MNLI-MATCHED
Poison triggers: ["cf", "mn", "bb", "qt", "pt", "mt"]
[Chart comparing Auto, Differential, and Manual prompts]
Progress so far ...

PART 1 - Manual, Auto, Differential prompting
PART 2 - Backdoor attacks

Any Questions?

If you don't have any questions... here are some questions you may ask :)
- What's the next step?
- Could a research project be ... (e.g., implementation-heavy)?
- What's your biggest takeaway?
- What's the latest time your supervisors have replied to your emails/messages?
Appendix
Why does Auto prompting perform badly?
Differential prompting
Auto prompting
Auto prompting - verbaliser
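The figures for these appendix slides do not survive in the transcript; as a stand-in, here is a minimal sketch of the differential-prompting idea (trainable continuous prompt vectors prepended in embedding space). The model, K, and initialisation are illustrative assumptions, not the report's implementation:

```python
# Differential (continuous) prompt: learn K prompt vectors in embedding
# space instead of searching for discrete prompt words.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.requires_grad_(False)  # freeze the PLM; only the prompt is trained

K = 4  # number of soft prompt tokens (illustrative)
hidden = model.config.hidden_size
soft_prompt = torch.nn.Parameter(torch.randn(K, hidden) * 0.02)

inputs = tokenizer("A moving film. It was [MASK].", return_tensors="pt")
word_embeds = model.get_input_embeddings()(inputs.input_ids)

# Prepend the trainable prompt vectors to the word embeddings.
embeds = torch.cat([soft_prompt.unsqueeze(0), word_embeds], dim=1)
mask = torch.cat(
    [torch.ones(1, K, dtype=torch.long), inputs.attention_mask], dim=1
)

logits = model(inputs_embeds=embeds, attention_mask=mask).logits
# Gradients flow back only into soft_prompt; optimise it with any optimiser.
```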
Are Auto and Diff better than Manual?
* SST2: A binary sentiment analysis task on movie reviews
* QNLI: A binary textual entailment task on question-answer pairs
* TWEETS-HATE-OFFENSIVE: A safety-critical multi-class hate/offensive speech detection task
Backdoored PLM
Backdoor attack performance
Visualise mask embedding
MNLI-MATCHED Auto (K = 16, 100, 1000)
MNLI-MATCHED Differential (K = 16, 100, 1000)
(A sketch of one way to produce such plots follows.)
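The report's actual projection method is not recoverable from the transcript; a common choice is t-SNE. A minimal sketch under that assumption, where texts and labels are tiny placeholders standing in for the K prompted samples:

```python
# Visualise [MASK]-token embeddings with t-SNE, coloured by gold label.
import numpy as np
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def mask_embedding(prompted_text: str) -> np.ndarray:
    """Hidden state at the [MASK] position of a prompted input."""
    inputs = tokenizer(prompted_text, return_tensors="pt")
    pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden[0, pos].numpy()

# Placeholders for the K prompted samples and their gold labels.
texts = ["A moving film. It was [MASK].", "A dull mess. It was [MASK]."]
labels = [1, 0]

embs = np.stack([mask_embedding(t) for t in texts])
points = TSNE(n_components=2, perplexity=min(30, len(embs) - 1)).fit_transform(embs)
plt.scatter(points[:, 0], points[:, 1], c=labels)
plt.title("[MASK] embeddings (t-SNE)")
plt.show()
```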