Enhancing large-scale neural network training on CPUs with ThirdAI and AWS Graviton.

Artificial intelligence (AI) startup ThirdAI has examined how AWS Graviton3 processors can speed up neural network training on BOLT, its sparse deep learning engine designed to train and deploy models on standard CPU hardware rather than costly GPUs.

The company tested the Graviton3 processor against an Intel Ice Lake processor and an NVIDIA T4G GPU on three benchmarks: extreme multi-label classification on the Amazon-670K product recommendation task; sentiment analysis on the Yelp polarity benchmark using the Universal Deep Transformers (UDT) model; and multi-class text classification on the DBPedia benchmark.
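
As an illustration, a text-classification run like the Yelp polarity test could look like the sketch below. It assumes ThirdAI's publicly distributed thirdai Python package; the exact class names, argument names, and file paths here are assumptions for illustration and are not taken from this article.

from thirdai import bolt

# Hypothetical UDT setup (argument names are assumed): one free-text
# input column and a two-class categorical target, matching the Yelp
# polarity task described above.
model = bolt.UniversalDeepTransformer(
    data_types={
        "text": bolt.types.text(),
        "label": bolt.types.categorical(),
    },
    target="label",
    n_target_classes=2,
)

# Train directly from a CSV file; the whole pipeline runs on the CPU.
model.train("yelp_polarity_train.csv", epochs=5, learning_rate=0.001)

# Single-sample inference, the setting in which ThirdAI reports
# sub-millisecond latency.
scores = model.predict({"text": "The food was wonderful and the staff friendly."})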

Across all of these evaluations, AWS Graviton3 accelerated UDT by roughly 40%, while UDT reached accuracy comparable to a DistilBERT transformer model fine-tuned on a GPU and delivered sub-millisecond inference latency.

ThirdAI also found that its models perform well on search, recommendation, and natural language processing tasks, which typically combine large, high-dimensional output spaces with a need for extremely low inference latency. The company is extending its methods to further areas, including computer vision.

BOLT required no customization to run on AWS Graviton3, and the software is compatible with all major CPU architectures. ThirdAI plans to pass the gains achieved with AWS Graviton3 on to its customers, offering faster training and improved performance on low-cost CPUs and, with that, potential cost savings.

Founded in 2021, ThirdAI is committed to democratizing AI by changing the economics of deep learning through algorithmic and software innovations, with a focus on systems for resource-efficient deep learning. Its proprietary dynamic sparse algorithms activate only a small subset of neurons for a given input, eliminating the need for dense computations, and use locality-sensitive hashing (LSH) to select those neurons dynamically.
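
To make the neuron-selection idea concrete, the following minimal NumPy sketch shows one generic form of LSH-based sparsity using SimHash (random hyperplane) hashing. It illustrates the general technique only, not ThirdAI's proprietary implementation, and all sizes and names are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_neurons, dim, n_bits = 10_000, 256, 8   # 8 bits -> 256 hash buckets

W = rng.standard_normal((n_neurons, dim))     # one weight row per neuron
planes = rng.standard_normal((n_bits, dim))   # random hyperplanes for SimHash

def simhash(v):
    # Hash a vector to an n_bits-bit code from the signs of its projections.
    code = 0
    for bit in (planes @ v) > 0:
        code = (code << 1) | int(bit)
    return code

# Index every neuron by the hash of its weight vector. SimHash puts
# vectors with high cosine similarity in the same bucket, so an input's
# bucket tends to contain the neurons with the largest pre-activations.
table = {}
for i in range(n_neurons):
    table.setdefault(simhash(W[i]), []).append(i)

def sparse_forward(x):
    # Compute activations only for neurons that collide with the input;
    # the dense work scales with the bucket size, not with n_neurons.
    # (Production systems typically union several hash tables.)
    active = table.get(simhash(x), [])
    return active, W[active] @ x

x = rng.standard_normal(dim)
active, acts = sparse_forward(x)
print(f"activated {len(active)} of {n_neurons} neurons")

Because SimHash groups vectors by cosine similarity, the bucket an input falls into tends to contain exactly the neurons whose weights align with it, which is why computing only those activations sacrifices little accuracy.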
