Researchers from the Massachusetts Institute of Technology, the University of Toronto, and the Vector Institute for Artificial Intelligence have developed a new method called IF-COMP for improving uncertainty estimation in machine learning, particularly in deep neural networks. These fields place importance not only on accurately predicting outcomes but also on quantifying the uncertainty of those predictions, especially in high-stakes scenarios where decisions based on them could have serious consequences.
A key challenge in quantifying this uncertainty is ensuring that the model remains reliable and well-calibrated under distribution shifts. Traditionally, this has been handled with Bayesian methods. However, these methods face difficulties such as specifying appropriate priors and scaling to large models, which makes them practically infeasible for modern large-scale deep learning.
At present, uncertainty estimation is typically achieved through Bayesian methods or the Minimum Description Length (MDL) principle. Despite their theoretical soundness, Bayesian methods demand heavy computational resources and struggle to define suitable priors for complex models. The MDL principle offers a workaround by minimizing the combined codelength of the model and the observed data, which eliminates the need for explicit priors.
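As a rough illustration of the MDL idea (a standard formulation, not a formula taken from the paper), the two-part codelength criterion prefers the model that compresses the data best:

```latex
% Two-part MDL criterion (illustrative notation):
% L(M)        -- bits needed to describe the model M
% L(D \mid M) -- bits needed to describe the data D given M
\hat{M} = \arg\min_{M \in \mathcal{M}} \bigl[\, L(M) + L(D \mid M) \,\bigr]
```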
The proposed IF-COMP method offers a way forward by approximating the predictive normalized maximum likelihood (pNML) distribution without prohibitive computational cost. It uses a Boltzmann influence function to linearize the model, yielding well-calibrated predictions and complexity measurements in both labeled and unlabeled settings. The method also penalizes movement in both function space and weight space, which allows the model to better accommodate low-probability labels.
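For context, the pNML distribution that IF-COMP approximates is commonly defined in prior MDL work by refitting the model with each candidate label and normalizing over the candidates; the notation below is a sketch of that standard definition, not the paper's exact formulation:

```latex
% pNML predictive distribution for a test input x given training data D.
% \hat{\theta}_{y} denotes the model refit on D together with the candidate pair (x, y).
p_{\mathrm{pNML}}(y \mid x) =
  \frac{p_{\hat{\theta}_{y}}(y \mid x)}
       {\sum_{y'} p_{\hat{\theta}_{y'}}(y' \mid x)},
\qquad
\hat{\theta}_{y} = \arg\max_{\theta}\, p_{\theta}\bigl(D \cup \{(x, y)\}\bigr).
```

The log of the normalizer serves as a complexity (regret) score; refitting the model for every candidate label is what makes exact pNML expensive, and it is this refitting that the influence-function linearization is designed to sidestep.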
The team validated the effectiveness of IF-COMP through a series of experiments spanning uncertainty calibration, mislabel detection, and out-of-distribution (OOD) detection. The results suggest that IF-COMP matched or outperformed existing approaches while running 7-15 times faster than ACNML.
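As an illustration of how such detection tasks are typically scored (this is not the authors' code), one can treat a per-example score as a detector and compute AUROC with scikit-learn; the placeholder scores below stand in for whatever complexity measure IF-COMP assigns:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def detection_auroc(scores_in: np.ndarray, scores_out: np.ndarray) -> float:
    """AUROC for separating in-distribution (label 0) from OOD or mislabeled
    (label 1) examples, where a higher score means 'more anomalous'."""
    scores = np.concatenate([scores_in, scores_out])
    labels = np.concatenate([np.zeros_like(scores_in), np.ones_like(scores_out)])
    return roc_auc_score(labels, scores)

# Hypothetical scores for illustration only; real scores would come from the model.
rng = np.random.default_rng(0)
auroc = detection_auroc(rng.normal(0.0, 1.0, 1000), rng.normal(1.5, 1.0, 1000))
print(f"AUROC: {auroc:.4f}")
```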
As for performance, the method achieved a lower expected calibration error across various corruption levels in uncertainty calibration on the CIFAR-10 and CIFAR-100 datasets. In mislabel detection, IF-COMP outperformed methods such as TracIn, EL2N, and GraNd, reaching an area under the receiver operating characteristic curve (AUROC) of 96.86 for human noise on CIFAR-10 and 95.21 for asymmetric noise on CIFAR-100.
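Expected calibration error (ECE), the calibration metric referenced above, is typically computed by binning predictions by confidence and comparing per-bin accuracy with mean confidence. The sketch below shows the standard binned estimator, not the paper's exact evaluation protocol:

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """Standard binned ECE: weighted average gap between confidence and accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight the gap by the bin's share of examples
    return ece
```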
The findings show that IF-COMP enables deep neural networks to estimate uncertainty more reliably. By addressing the computational challenges common to existing Bayesian and MDL-based methods, it also provides a highly scalable alternative. With strong performance demonstrated across these tasks, IF-COMP stands to improve the safety of machine learning models and the applications built on them.