No, Of Course I Can! Refusal Mechanisms Can Be Exploited Using Harmless Data
Summary
Workshop version: refusal mechanisms can be exploited through harmless fine-tuning data.
