(2022) Preserving privacy with PATE for heterogeneous data.
Abstract
Differential privacy has become the standard framework for providing privacy guarantees for user data in machine learning models. One popular technique for ensuring privacy is the Private Aggregation of Teacher Ensembles (PATE) framework. PATE trains an ensemble of teacher models on private data and transfers their knowledge to a student model, with rigorous privacy guarantees derived using differential privacy. So far, PATE has been shown to work under the assumption that the public and private data are distributed homogeneously. We show that when these distributions are highly mismatched (non-IID), the teachers suffer from high variance in their individual training updates, causing them to converge to vastly different optima. This lowers their consensus and accuracy when labelling data. To address this, we propose a modification to the teacher training process in PATE that incorporates teacher averaging and update correction, reducing the variance in teacher updates. Our technique improves the prediction accuracy of the teacher aggregation mechanism, especially for highly heterogeneous data. Furthermore, our evaluation shows that our technique is necessary to sustain student model performance, enabling considerable gains over the original PATE on the utility-privacy metric.
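The teacher aggregation step that the abstract refers to can be sketched as the standard PATE noisy-max mechanism: each teacher votes for a label, and Laplace noise is added to the vote histogram before taking the argmax. The sketch below is a minimal illustration under assumed conventions (the function name and the inverse noise-scale parameter `gamma` are ours, not from the paper, and this does not include the authors' teacher-averaging or update-correction modifications):

```python
import numpy as np

def pate_noisy_aggregation(teacher_votes, num_classes, gamma=0.1, rng=None):
    """Aggregate teacher label votes with the PATE noisy-max mechanism.

    teacher_votes : 1-D integer array, one predicted label per teacher.
    gamma         : inverse Laplace noise scale; smaller gamma means more
                    noise and a stronger privacy guarantee per query.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Histogram of votes over the label classes.
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    # Perturb each count with independent Laplace noise of scale 1/gamma.
    counts += rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
    # The noisy plurality label is released to the student.
    return int(np.argmax(counts))
```

With heterogeneous (non-IID) teacher data, the vote histogram is flatter (lower consensus), so the noise flips the argmax more often — which is the failure mode the proposed variance-reduction technique targets.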
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Divisions: | Sebastian Stich (SS) |
| Conference: | NeurIPS-W (Workshop at Neural Information Processing Systems) |
| Depositing User: | Sebastian Stich |
| Date Deposited: | 05 May 2023 11:18 |
| Last Modified: | 05 May 2023 11:18 |
| Primary Research Area: | NRA1: Trustworthy Information Processing |
| URI: | https://publications.cispa.saarland/id/eprint/3938 |