National University of Singapore Researchers Create a Groundbreaking RMIA Method for Improving Privacy Risk Assessment in Machine Learning

Privacy in machine learning models has become a critical concern due to the emergence of Membership Inference Attacks (MIA). MIA attempts to reveal whether specific data points were part of a given model's training data. Understanding this type of attack is pivotal because it quantifies the unintentional exposure of information when models are trained on diverse datasets. MIA covers a wide range of settings, from statistical models to federated and privacy-preserving machine learning. Initially rooted in summary statistics, MIA methods have evolved to incorporate various hypothesis-testing strategies and approximations, especially for deep learning algorithms.

Previous MIA approaches have faced numerous challenges. Despite improvements in attack effectiveness, their computational demands have made many privacy audits unfeasible. Some of the most advanced methods, particularly against well-generalized models, perform close to random guessing when restricted by computational resources. Furthermore, the lack of clear, interpretable means of comparing different attacks has meant that no single attack dominates: each outperforms the others in some scenarios and loses in others. This complexity calls for stronger yet more efficient attacks to evaluate privacy risks accurately. The computational expense of existing attacks has limited their practicality, highlighting the need for novel strategies that achieve high performance within constrained computation budgets.

In this context, a new paper proposes an innovative attack within the realm of Membership Inference Attacks (MIA). MIA, which aims to uncover whether a particular data point was used during the training of a given machine learning model θ, is framed as an indistinguishability game between a challenger (the training algorithm) and an adversary (the privacy auditor). The challenger trains a model θ either with or without the data point x. The adversary's task is to infer, from x, the trained model θ, and their knowledge of the data distribution, which of these two worlds they are in.
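As a rough illustration, the game can be sketched in a few lines of Python; train_model, population, and adversary_guess below are hypothetical placeholders standing in for the challenger's training procedure, the data distribution, and the attack, not components from the paper.

```python
import random

# A minimal sketch of the membership inference game described above, assuming
# hypothetical placeholders train_model(data) and adversary_guess(x, theta).
def membership_game(train_model, population, adversary_guess, n_train=1000):
    x = random.choice(population)                # the challenge point x
    rest = random.sample(population, k=n_train)  # the rest of the training data
    b = random.randint(0, 1)                     # secret bit: 1 = member, 0 = non-member
    theta = train_model(rest + [x]) if b == 1 else train_model(rest)
    guess = adversary_guess(x, theta)            # adversary sees x and the trained model
    return guess == b                            # the attack succeeds when guess equals b
```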

The new Membership Inference Attack (MIA) methodology introduces a sophisticated way of constructing the two worlds in which x is either a member or a non-member of the training set. Unlike prior methods that simplify these constructions, the attack composes the null hypothesis by substituting x with random data points drawn from the population. This design yields numerous pairwise likelihood ratio tests that assess x's membership relative to other data points z. The attack aims to accumulate substantial evidence favoring x's presence in the training set over a random z, offering a more fine-grained analysis of leakage. Concretely, for each pair (x, z) it computes a likelihood ratio that distinguishes the world in which x is a member from the one in which it is not.
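In rough notation (a paraphrase of the paper's test, not its verbatim formulation), the pairwise likelihood ratio compares how much better the observed model θ explains x than a population point z:

\[
\mathrm{LR}_{\theta}(x, z) \;=\; \frac{\Pr(x \mid \theta)\,/\,\Pr(x)}{\Pr(z \mid \theta)\,/\,\Pr(z)},
\]

where Pr(x) and Pr(z) are the marginal probabilities of the points, in practice estimated with reference models. The attack then measures how often, over many z drawn from the population, this ratio exceeds a threshold γ; a large fraction of such "wins" is taken as evidence that x was a member.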

Named Relative Membership Inference Attack (RMIA), this methodology leverages population data and reference models to increase attack power and robustness to variations in the adversary's background knowledge. It introduces a refined likelihood ratio test that measures the distinguishability between x and any z based on how their probabilities shift when conditioned on θ. Unlike existing attacks, it is carefully calibrated: it neither relies on uncalibrated signal magnitudes nor overlooks calibration against population data. Through this pairwise likelihood ratio computation and a Bayesian treatment, RMIA emerges as a robust, high-power, cost-effective attack that outperforms prior state-of-the-art methods across a range of scenarios.
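A minimal numerical sketch of this score is given below, assuming the probabilities of the challenge point and of population samples under the target and reference models have already been computed; the variable names and the simple averaging used to approximate the marginals are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# A hedged sketch of an RMIA-style membership score. Assumed inputs:
#   pr_x_target : Pr(x | theta) under the target model (scalar)
#   pr_x_refs   : Pr(x | theta_k) under each of K reference models, shape (K,)
#   pr_z_target : Pr(z | theta) for M population samples z, shape (M,)
#   pr_z_refs   : Pr(z | theta_k) for the same samples, shape (K, M)
def rmia_score(pr_x_target, pr_x_refs, pr_z_target, pr_z_refs, gamma=1.0):
    pr_x = np.mean(pr_x_refs)             # approximate marginal Pr(x) via reference models
    pr_z = np.mean(pr_z_refs, axis=0)     # approximate marginal Pr(z), one value per sample
    lr = (pr_x_target / pr_x) / (pr_z_target / pr_z)  # pairwise LR(x, z) for every z
    return np.mean(lr >= gamma)           # fraction of population points that x "beats"
```

A point x would then be flagged as a training-set member when rmia_score exceeds a chosen decision threshold β; sweeping β traces out the attack's TPR/FPR trade-off.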

The authors compared RMIA against other membership inference attacks on datasets such as CIFAR-10, CIFAR-100, CINIC-10, and Purchase-100. RMIA consistently outperformed the other attacks, especially when only a limited number of reference models was available or in offline scenarios. Even with few reference models, RMIA achieved results comparable to those obtained with many more. With abundant reference models, RMIA maintained a slight edge in AUC and a notably higher true positive rate (TPR) at zero false positive rate (FPR) compared to LiRA. Its performance improved further with more queries, showcasing its effectiveness across a variety of scenarios and datasets.

To sum up, the article presents RMIA, a Relative Membership Inference Attack method, highlighting its superiority over existing attacks in identifying membership within machine learning models. RMIA excels in scenarios with limited reference models, demonstrating robust performance across various datasets and model architectures. In addition, its efficiency makes RMIA a practical and viable choice for privacy risk analysis, especially in scenarios where resource constraints are a concern. Its flexibility, scalability, and the balanced trade-off between accuracy and false positives make RMIA a reliable and adaptable method for membership inference attacks, offering promising applications in privacy risk analysis tasks for machine learning models. We are truly excited to see the immense potential of this groundbreaking RMIA technique for enhanced privacy risk analysis in machine learning!
