Sampling from complex, high-dimensional target distributions, such as the Boltzmann distribution, is critical across many areas of science. These distributions frequently arise in Combinatorial Optimization (CO) problems, which seek the best solution from a vast space of possibilities. Sampling in such settings is intricate because unbiased samples are inherently hard to obtain. Much of the difficulty stems from the fact that CO problems involve discrete target distributions, which are typically approximated by products of categorical distributions; these factorized products, however, lack expressivity.
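To make the expressivity limitation concrete, here is a minimal sketch (illustrative only, not code from the paper) of such a mean-field product over binary decision variables. Because every variable is sampled independently, the model cannot represent couplings between variables, such as the constraint that two adjacent nodes are never selected together:

```python
import torch

# Mean-field approximation of a discrete distribution over N binary
# decision variables: q(x) = prod_i Bernoulli(x_i | p_i).
# Names and values here are illustrative, not from the DiffUCO codebase.
N = 4
logits = torch.zeros(N)  # per-variable logits, p_i = sigmoid(logit_i) = 0.5

def sample_mean_field(logits, num_samples):
    probs = torch.sigmoid(logits)  # independent inclusion probabilities
    return torch.bernoulli(probs.expand(num_samples, -1))

samples = sample_mean_field(logits, num_samples=10_000)
# Because the variables are independent, Cov(x_i, x_j) ~ 0 for i != j:
# a product distribution cannot encode couplings such as "x_i and x_j are
# never both 1" for an edge (i, j) in an independent-set problem. This is
# the expressivity gap that latent variable models aim to close.
print(torch.cov(samples.T).round(decimals=2))
```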
The study discusses numerous existing methods, including Variational Autoencoders and Diffusion Models. It also covers neural optimization, a technique that employs neural networks to pinpoint the best solution to a given problem, and analyses Approximate Likelihood Models in Neural Probabilistic Optimization as well as Neural Combinatorial Optimization methods.
A team of researchers from Johannes Kepler University, Austria, the ELLIS Unit Linz, and NXAI GmbH presents a novel method called Diffusion for Unsupervised Combinatorial Optimization (DiffUCO). This method makes latent variable models such as diffusion models usable for data-free approximation of discrete distributions. DiffUCO uses an upper bound on the reverse Kullback-Leibler divergence as its loss function, and this bound tightens as the number of diffusion steps used during training increases.
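Conceptually, the objective can be read as the reverse KL divergence D_KL(q_θ ‖ p) = E_{x∼q_θ}[log q_θ(x) − log p(x)], where p(x) ∝ exp(−β·E(x)) is the Boltzmann distribution of the problem's energy. DiffUCO upper-bounds this quantity over the joint of all diffusion steps; the sketch below is a hedged simplification that estimates the reverse KL (up to the constant log-partition term) with a mean-field Bernoulli model standing in for the diffusion model. The names `reverse_kl_loss` and `energy_fn` are assumptions for illustration, not the paper's API:

```python
import torch
from torch.distributions import Bernoulli

def reverse_kl_loss(logits, energy_fn, beta=1.0, num_samples=512):
    """Monte Carlo estimate of D_KL(q_theta || p_target), up to the constant
    log-partition term, where p_target(x) is proportional to
    exp(-beta * energy_fn(x)). A mean-field Bernoulli stands in for the
    diffusion model here; in practice, gradients of this sampled objective
    would need e.g. a score-function estimator."""
    q = Bernoulli(logits=logits)
    x = q.sample((num_samples,))           # x ~ q_theta, shape (S, N)
    log_q = q.log_prob(x).sum(dim=-1)      # log q_theta(x), shape (S,)
    # E_q[log q(x) + beta * E(x)]: the entropy term keeps q spread out,
    # while the energy term pulls probability mass toward low-energy
    # (high-quality) CO solutions.
    return (log_q + beta * energy_fn(x)).mean()
```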
DiffUCO tackles the challenges of CO effectively, delivering top performance across many benchmarks. Combined with Conditional Expectation (CE), an advanced variant of a commonly used sampling technique, it produces high-quality solutions to CO problems efficiently. Together, the two form an efficient and more general way of using latent variable models to approximate discrete distributions without data. Owing to the discrete nature of UCO, two types of discrete noise distributions are applied: the Categorical Noise Distribution and the Annealed Noise Distribution.
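The CE step can be understood as a derandomization pass: given the per-variable probabilities produced by the model, each variable is fixed in turn to the value that does not worsen the expected objective, conditioned on the variables already fixed. Below is a hedged sketch of that idea; `conditional_expectation_decode`, `probs`, and `expected_energy` are assumed names for illustration, not the paper's interface:

```python
import torch

def conditional_expectation_decode(probs, expected_energy):
    """Greedily derandomize soft probabilities into a binary solution.
    `probs`: tensor of shape (N,) with marginals P(x_i = 1).
    `expected_energy`: callable mapping a soft assignment in [0,1]^N to the
    expected objective value (an assumed interface)."""
    x = probs.clone()
    for i in range(x.shape[0]):
        # Evaluate the expected objective with variable i fixed to 0 or 1,
        # keeping all not-yet-visited variables soft.
        x0, x1 = x.clone(), x.clone()
        x0[i], x1[i] = 0.0, 1.0
        # Fix x_i to the value with the lower conditional expectation; by
        # linearity of expectation this never increases the objective
        # relative to sampling x_i from its marginal.
        x[i] = 0.0 if expected_energy(x0) <= expected_energy(x1) else 1.0
    return x

# Usage sketch: decode MIS probabilities on a toy 2-node graph with one
# edge, where energy(x) = -sum(x) + penalty * sum_{(u,v) in E} x_u * x_v.
edges = [(0, 1)]
def expected_energy(x, penalty=2.0):
    violations = sum(x[u] * x[v] for u, v in edges)
    return -x.sum() + penalty * violations

solution = conditional_expectation_decode(torch.tensor([0.6, 0.55]),
                                          expected_energy)
print(solution)  # tensor([0., 1.]): only one endpoint of the edge selected
```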
When tested on benchmark problems such as Maximum Independent Set (MIS) and Minimum Dominating Set (MDS), DiffUCO and its variants outperformed competing models. However, it is memory- and time-intensive when trained on larger, densely connected datasets, indicating that future improvements should target these areas.
To sum it all up, the DiffUCO method introduces a novel way of handling discrete distributions with latent variable models such as diffusion models. It delivers superior performance compared to recent models across a broad range of benchmarks, and solution quality improves further when variational annealing and additional diffusion steps are incorporated during inference. The model's main limitation is its memory and time consumption on large datasets, pointing future research toward making it more efficient.