Differential privacy (DP) and selective classification (SC) play pivotal roles in machine learning on sensitive data. DP adds calibrated random noise so that the contribution of any individual record cannot be inferred from a model's output, while retaining the overall utility of the data; SC allows a model to abstain from predicting on inputs it is uncertain about, improving the reliability of the predictions it does make. Both techniques are essential in privacy-sensitive applications such as healthcare and finance, yet preserving model reliability and accuracy under privacy constraints remains challenging.
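To make both ideas concrete, here is a minimal sketch, not the paper's method: a Laplace mechanism for a simple counting query, and confidence-threshold selective classification. The noise scale, the 0.9 threshold, and the use of softmax confidence are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Differential privacy: release a query result with calibrated noise. ---
# A Laplace mechanism for a counting query with sensitivity 1:
# noise drawn at scale 1/epsilon yields an epsilon-DP count.
def dp_count(values, epsilon=1.0):
    true_count = float(np.sum(values))
    return true_count + rng.laplace(scale=1.0 / epsilon)

# --- Selective classification: abstain when the model is unsure. ---
# Keep a prediction only if the top softmax probability clears a threshold;
# -1 marks abstention. The 0.9 threshold is an illustrative choice.
def selective_predict(probs, threshold=0.9):
    preds = probs.argmax(axis=1)
    preds[probs.max(axis=1) < threshold] = -1
    return preds

print(dp_count(np.ones(100)))           # noisy count near 100
probs = np.array([[0.95, 0.03, 0.02],
                  [0.40, 0.35, 0.25]])
print(selective_predict(probs))         # -> [ 0 -1]: abstains on the second sample
```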
A recent paper presented at NeurIPS proposes novel solutions to these challenges. It addresses the drop in predictive performance that DP training imposes on machine learning models, and through a detailed empirical evaluation the researchers identify where existing selective classification approaches fall short under DP.
The paper's new technique, Selective Classification via Training Dynamics Ensembles (SCTD), departs from traditional ensemble methods. Standard ensembles train multiple models, each consuming part of the privacy budget, so their guarantees compose at a high privacy cost; SCTD instead forms its ensemble from the intermediate predictions a single model produces during training. By examining disagreement among these intermediate predictions, the method pinpoints and rejects anomalous data points while mitigating privacy leakage. This enhances the reliability of selective classifiers while directly addressing the challenges posed by DP.
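The following is a simplified sketch of the checkpoint-ensemble idea. The disagreement score used here, the fraction of intermediate checkpoints that contradict the final model, and all names are illustrative assumptions; the paper's actual scoring rule may differ.

```python
import numpy as np

def sctd_abstain(checkpoint_preds, final_preds, max_disagreement=0.2):
    """Abstain on points where intermediate training checkpoints disagree
    with the final model too often.

    checkpoint_preds: (n_checkpoints, n_samples) class predictions recorded
                      at intermediate training steps.
    final_preds:      (n_samples,) predictions of the final model.
    Returns final predictions with -1 marking abstentions.
    """
    # Fraction of checkpoints whose prediction differs from the final model's.
    disagreement = (checkpoint_preds != final_preds).mean(axis=0)
    out = final_preds.copy()
    out[disagreement > max_disagreement] = -1  # treat unstable points as anomalous
    return out

# Toy run: three checkpoints, four samples; the last sample's label keeps
# flipping during training, so the ensemble abstains on it.
ckpt = np.array([[0, 1, 2, 0],
                 [0, 1, 2, 1],
                 [0, 1, 2, 2]])
final = np.array([0, 1, 2, 0])
print(sctd_abstain(ckpt, final))  # -> [ 0  1  2 -1]
```

Intuitively, this is why the ensemble comes cheap under DP: all checkpoints are byproducts of one private training run, and by the post-processing property of differential privacy, reusing them at inference time consumes no additional privacy budget, avoiding the composition overhead of conventional ensembles.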
Additionally, the paper introduces an accuracy-normalized selective classification metric that enables an equitable comparison of selective classification methods across varying degrees of privacy. Since stronger privacy typically lowers a model's baseline accuracy, comparing raw selective accuracy across privacy levels conflates the quality of the abstention mechanism with that of the underlying model; normalizing for accuracy overcomes this drawback of existing evaluation methods.
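The paper's exact definition is not reproduced here; as a hypothetical sketch, one natural form of accuracy normalization divides selective accuracy at each coverage level by full-coverage accuracy, so a score of 1.0 means abstention adds nothing over the base model. All function names below are assumptions for illustration.

```python
import numpy as np

def accuracy_coverage_curve(confidence, correct, coverages):
    """Selective accuracy at each target coverage: keep only the most
    confident fraction of samples and measure accuracy on the kept set."""
    order = np.argsort(-confidence)      # most confident samples first
    correct_sorted = correct[order]
    accs = []
    for c in coverages:
        k = max(1, int(round(c * len(correct))))
        accs.append(correct_sorted[:k].mean())
    return np.array(accs)

def normalized_selective_score(confidence, correct, coverages):
    """Hypothetical accuracy-normalized score: divide the selective-accuracy
    curve by full-coverage accuracy so models with different baselines
    (e.g., trained at different privacy budgets) become comparable."""
    curve = accuracy_coverage_curve(confidence, correct, coverages)
    return curve / correct.mean()

# Toy data where higher confidence correlates with being correct.
rng = np.random.default_rng(0)
conf = rng.uniform(size=1000)
corr = (rng.uniform(size=1000) < conf).astype(float)
print(normalized_selective_score(conf, corr, [0.5, 0.8, 1.0]))
```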
The researchers experimentally evaluated SCTD against other selective classification methods across various datasets and privacy levels, and found that it achieves promising trade-offs between selective classification accuracy and privacy budget. They also identify open directions: further theoretical analysis of the method and strategies for balancing privacy and fairness at the subgroup level.