
GraphStorm 0.3: User-friendly APIs and scalable multi-task learning on graphs

GraphStorm is a low-code enterprise graph machine learning (GML) framework for building, training, and deploying GML solutions quickly on complex, large-scale graphs. Its latest release, GraphStorm 0.3, adds native support for multi-task learning on graphs, letting users define multiple training targets on different nodes and edges within a single training loop.

GraphStorm 0.3 supports six common tasks: node classification, node regression, edge classification, edge regression, link prediction, and node feature reconstruction. The tasks are specified in a YAML configuration file. This capability suits a range of applications, from fraud detection to citation recommendation for scientific publications.
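To make the idea of several training targets in one loop concrete, here is a minimal Python sketch of one common way to combine per-task losses: a weighted sum. It is an illustration only, not GraphStorm's API or its YAML schema. The names `TaskSpec`, `combine_losses`, the task labels, the weights, and the loss values are all hypothetical, and the article does not say how GraphStorm itself combines task losses.

```python
from dataclasses import dataclass

# The six task types named in the announcement. The string spellings
# are hypothetical, not GraphStorm's configuration keys.
SUPPORTED_TASKS = {
    "node_classification",
    "node_regression",
    "edge_classification",
    "edge_regression",
    "link_prediction",
    "node_feature_reconstruction",
}


@dataclass
class TaskSpec:
    """One training target in a multi-task setup (hypothetical structure)."""
    name: str        # free-form label for the target
    task_type: str   # one of SUPPORTED_TASKS
    weight: float    # relative contribution to the combined loss


def combine_losses(tasks, losses):
    """Return the weighted sum of per-task losses for one training step."""
    total = 0.0
    for task in tasks:
        if task.task_type not in SUPPORTED_TASKS:
            raise ValueError(f"unsupported task type: {task.task_type}")
        total += task.weight * losses[task.name]
    return total


# Example values are made up for illustration.
tasks = [
    TaskSpec("paper_topic", "node_classification", 1.0),
    TaskSpec("cites", "link_prediction", 0.5),
]
step_losses = {"paper_topic": 0.8, "cites": 1.2}
total_loss = combine_losses(tasks, step_losses)
```

In a real training loop the per-task losses would come from the model's forward pass on each target, and the combined value would drive a single backward pass. The weights let one task dominate or merely regularize the other.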

GraphStorm 0.3 also introduces new APIs that simplify customizing training and inference pipelines. With these APIs, a custom node classification training pipeline takes only 12 lines of code. To help users get started, GraphStorm has published two Jupyter notebook examples, one for node classification and one for link prediction.

The release also includes a benchmark study on the Microsoft Academic Graph dataset that demonstrates GraphStorm's performance and scalability on text-rich graphs. The study focuses on co-training language models (LMs) and graph neural networks (GNNs).

Using this large graph dataset, the study shows how to train LMs and GNN models together efficiently on massive text-rich graphs, comparing two methods: pre-trained BERT+GNN and fine-tuned BERT+GNN. The benchmark results show that the fine-tuned BERT+GNN method delivers up to 40% better performance than the pre-trained version.

GraphStorm 0.3 is released under the Apache-2.0 license and is aimed at users tackling large-scale graph ML challenges. It provides native support for multi-task learning and new APIs for customizing GraphStorm pipelines and components. Further information and support are available in the GraphStorm GitHub repository and its documentation.
