Are you looking for an efficient way to assess visual content? Meet Q-ALIGN, a new methodology developed by researchers from Nanyang Technological University, Shanghai Jiao Tong University, and SenseTime Research. The approach marks a paradigm shift in visual content assessment: instead of regressing directly from mean opinion scores (MOS), it teaches Large Multi-Modality Models (LMMs) to understand and use text-defined rating levels for visual scoring, much as human raters learn to judge in subjective studies.
Q-ALIGN is designed to overcome a key shortcoming of traditional methods: their poor generalization to novel content types and diverse scoring scenarios. It achieves state-of-the-art performance on image quality assessment (IQA), image aesthetic assessment (IAA), and video quality assessment (VQA) tasks, and is particularly strong in out-of-distribution settings, indicating that it can accurately assess a wide range of visual content.
The Q-ALIGN pipeline is carefully designed. During training, existing score labels are converted into discrete text-defined rating levels, analogous to how human raters learn and judge in subjective studies. At inference, Q-ALIGN extracts the log probabilities of the different rating-level tokens and applies softmax pooling to obtain closed-set probabilities for each level. The final score is then a weighted average of these probabilities, mirroring how individual human ratings are converted into MOS in subjective visual assessments.
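The two conversions above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the five level names follow the paper's human-rating scale, but the equal-interval MOS binning and the placeholder log-probabilities are assumptions for demonstration.

```python
import math

# Text-defined rating levels and the numeric values they stand for
# (the five-level scale used in subjective studies).
LEVELS = {"excellent": 5, "good": 4, "fair": 3, "poor": 2, "bad": 1}

def mos_to_level(mos, lo=1.0, hi=5.0):
    """Training phase: map a continuous MOS onto a discrete text level.
    Equal-interval binning is an assumption here for illustration."""
    names = ["bad", "poor", "fair", "good", "excellent"]
    idx = min(int((mos - lo) / (hi - lo) * 5), 4)
    return names[idx]

def q_align_score(level_logprobs):
    """Inference phase: turn per-level log probabilities into a score.
    Softmax pooling over the closed set of levels, then a weighted
    average of level values -- the MOS-style read-out."""
    # Numerically stable softmax over the closed set of rating levels.
    max_lp = max(level_logprobs.values())
    exps = {k: math.exp(v - max_lp) for k, v in level_logprobs.items()}
    total = sum(exps.values())
    probs = {k: e / total for k, e in exps.items()}
    # Weighted average of level values by their probabilities.
    return sum(LEVELS[k] * p for k, p in probs.items())

# Hypothetical log probabilities an LMM might assign to the level
# tokens for one image; the resulting score lies in [1, 5].
logprobs = {"excellent": -0.4, "good": -1.2, "fair": -2.5,
            "poor": -4.0, "bad": -5.0}
score = q_align_score(logprobs)
```

With these placeholder values the "excellent" token dominates, so the pooled score lands above 4, while equal log probabilities for all five levels would yield exactly 3.0, the midpoint of the scale.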
Q-ALIGN’s ability to generalize to new types of content underlines its potential for broad application. The methodology offers a robust, accurate, and more intuitive tool for scoring diverse visual content, addressing the limitations of existing methods and opening up possibilities for future advances in the field. Check out the Paper and Github for more information.