
This AI article from the Netherlands presents an AutoML framework engineered for the efficient construction of end-to-end multimodal machine learning (ML) pipelines.

Automated Machine Learning (AutoML) has become crucial for data-driven decision-making, enabling domain experts to apply machine learning without extensive statistical knowledge. However, a key challenge for current AutoML systems is handling multimodal data efficiently and correctly, which can consume significant computational resources.

Addressing this issue, researchers from the Eindhoven University of Technology have proposed a novel method that uses pre-trained Transformer models to improve AutoML's ability to handle complex data modalities. They designed a flexible search space for multimodal data, strategically integrated pre-trained models into pipeline topologies, and warm-started the SMAC optimizer (Sequential Model-based Algorithm Configuration) with metadata from previous evaluations.
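As a concrete illustration, the sketch below shows how such a flexible, modality-aware search space with conditional hyperparameters could be expressed with the ConfigSpace library that SMAC builds on. The specific encoders, fusion strategies, and hyperparameter ranges are illustrative assumptions, not the paper's actual search space.

```python
# Minimal sketch of a conditional, multimodal search space using ConfigSpace
# (the configuration-space library underlying SMAC). All choices and ranges
# below are illustrative assumptions.
from ConfigSpace import ConfigurationSpace, Categorical, Float, Integer
from ConfigSpace.conditions import EqualsCondition

cs = ConfigurationSpace(seed=0)

# Hypothetical pre-trained encoders for each modality.
text_encoder = Categorical("text_encoder", ["bert-base", "roberta-base"])
vision_encoder = Categorical("vision_encoder", ["vit-base", "resnet50"])

# How modality embeddings are fused and how the task head is tuned.
fusion = Categorical("fusion", ["concat", "cross_attention"])
head_lr = Float("head_lr", (1e-5, 1e-2), log=True)
head_layers = Integer("head_layers", (1, 3))

# A hyperparameter that is only active when cross-attention fusion is chosen.
num_cross_heads = Integer("num_cross_heads", (2, 8))

cs.add_hyperparameters(
    [text_encoder, vision_encoder, fusion, head_lr, head_layers, num_cross_heads]
)
cs.add_condition(EqualsCondition(num_cross_heads, fusion, "cross_attention"))

print(cs.sample_configuration())
```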

The pre-trained Transformer models enable the AutoML system to process both unimodal and multimodal data. To overcome the challenges of multimodal data processing, the researchers formulated the task as a Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem, which is crucial for achieving optimal AutoML performance. By solving this problem, they aimed to make the AutoML system both efficient and adaptable across different data modalities.
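For reference, the standard CASH objective (as formulated in earlier AutoML work such as Auto-WEKA and Auto-sklearn) jointly selects an algorithm and its hyperparameters by minimizing the average validation loss; this is the kind of problem the framework solves over multimodal pipelines:

```latex
% CASH: jointly pick an algorithm A^(j) from the set \mathcal{A} and
% hyperparameters \lambda from its space \Lambda^(j), minimizing the
% average loss over k train/validation splits.
A^{\ast}, \lambda^{\ast} \in \operatorname*{arg\,min}_{A^{(j)} \in \mathcal{A},\ \lambda \in \Lambda^{(j)}}
  \frac{1}{k} \sum_{i=1}^{k}
  \mathcal{L}\!\left(A^{(j)}_{\lambda},\, D_{\mathrm{train}}^{(i)},\, D_{\mathrm{valid}}^{(i)}\right)
```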

The researchers tested multimodal pipeline designs on tasks such as Visual Question Answering (VQA), Image Text Matching (ITM), regression, and classification using datasets from the text-vision and tabular-text-vision modalities. In addition, they built a meta-dataset by recording performances for each pipeline variation across a set of tasks.
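A minimal sketch of how such a meta-dataset of pipeline evaluations could be recorded is shown below; the schema and the example values are illustrative assumptions, not the paper's actual format. Records like these are what can later warm-start the optimizer with promising configurations.

```python
# Sketch of logging pipeline evaluations into a meta-dataset (CSV).
# Field names and example values are illustrative dummies, not real results.
import csv
import json
import os

FIELDS = ["task", "modality", "pipeline_config", "metric", "score", "runtime_s"]

def log_pipeline_result(path, task, modality, config, metric, score, runtime_s):
    """Append one pipeline evaluation to the meta-dataset file."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "task": task,
            "modality": modality,
            "pipeline_config": json.dumps(config),
            "metric": metric,
            "score": score,
            "runtime_s": runtime_s,
        })

# Dummy example entry (values are placeholders for illustration only).
log_pipeline_result(
    "meta_dataset.csv", task="vqa", modality="text-vision",
    config={"text_encoder": "bert-base", "fusion": "concat"},
    metric="accuracy", score=0.61, runtime_s=540.0,
)
```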

Findings showed that the framework converges quickly to optimal configurations across different modalities, consistently produces high-quality multimodal pipeline designs, and stays within computational limits. Additionally, comparisons with classical Neural Architecture Search (NAS) methods showed that the new framework is more efficient.

A limitation of the study is that the researchers predominantly used pre-trained models with frozen weights together with a warm-start technique. The warm start seeds the optimization process with informed configurations drawn from prior knowledge, meaning that performance changes during optimization stem mainly from hyperparameter tweaks rather than from changes to the pre-trained models' weights.
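As a rough PyTorch-style sketch of that frozen-backbone setup (assuming a Hugging Face Transformer as the pre-trained encoder; the model name and task head are illustrative):

```python
# Sketch of the frozen-backbone setup: the pre-trained Transformer's weights
# stay fixed, and only a small task head is trained/tuned. Model name and
# head architecture are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

backbone = AutoModel.from_pretrained("bert-base-uncased")

# Freeze all pre-trained parameters: performance differences during the
# AutoML search then come from pipeline/hyperparameter choices, not fine-tuning.
for param in backbone.parameters():
    param.requires_grad = False

# Only this head (and the optimizer hyperparameters around it) is searched over.
head = nn.Sequential(
    nn.Linear(backbone.config.hidden_size, 256),
    nn.ReLU(),
    nn.Linear(256, 2),
)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
```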

In future work, the team plans to enhance the framework's capabilities and extend it to additional scenarios, such as parameter-space sampling. The researchers will also continue studying the effect of hyperparameters on the performance of a frozen, pre-trained model.
