A newly released open-source dataset could revolutionize the detection and prediction of tornadoes by enabling the use of machine learning. Called TorNet and released by the Massachusetts Institute of Technology, the dataset is composed of radar returns from thousands of tornadoes over the last 10 years. Alongside the dataset, the researchers also shared models trained on it, which demonstrate the capacity of machine learning to identify a tornado.
Every year, roughly 1,200 tornadoes occur in the US, causing substantial economic damage and loss of life. Predicting them is challenging, however, because their formation is not fully understood. The basic ingredients for a tornado are known: thunderstorms, rapidly rising warm air, and wind shear that causes rotation. It is still unclear why some storms produce tornadoes while others with similar conditions do not.
Adding to the challenge, weather radars, the primary tools used to monitor these conditions, often cannot detect tornadoes directly because tornadoes occur too close to the ground for the radar to see. Together, these limitations lead to a false-alarm rate of more than 70% when forecasters decide whether to issue a tornado warning.
The TorNet dataset could help address this issue. It includes more than 200,000 radar images, 13,587 of which depict tornadoes. The rest are non-tornadic and come from two kinds of storms: those that led to false alarms and randomly selected severe storms. Each storm or tornado sample in the dataset consists of two sets of six radar images. The two sets correspond to different radar sweep angles, and the six images in each set portray different radar data products.
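To put those figures in perspective, the short Python sketch below works out how many images make up one sample and how rare tornadic images are in the dataset. It uses only the numbers quoted above. Because the total is given as "more than 200,000", the computed share is an upper bound, and the constant and function names are illustrative rather than part of TorNet itself.

```python
# Figures quoted in the article. The total is a lower bound ("more than 200,000"),
# so the tornadic share computed here is an upper bound, not an exact value.
TOTAL_IMAGES_AT_LEAST = 200_000
TORNADIC_IMAGES = 13_587

# Each sample is described as two sets of six radar images.
SETS_PER_SAMPLE = 2
IMAGES_PER_SET = 6


def images_per_sample(sets: int, images_per_set: int) -> int:
    """Number of radar images that make up one storm or tornado sample."""
    return sets * images_per_set


def tornadic_share_upper_bound(tornadic: int, total_at_least: int) -> float:
    """Largest possible fraction of tornadic images given a lower bound on the total."""
    return tornadic / total_at_least


per_sample = images_per_sample(SETS_PER_SAMPLE, IMAGES_PER_SET)
share = tornadic_share_upper_bound(TORNADIC_IMAGES, TOTAL_IMAGES_AT_LEAST)

print(f"Images per sample: {per_sample}")
print(f"Tornadic images: at most about {share:.1%} of the dataset")
```

The small tornadic share means that a model predicting "no tornado" for every image would be right most of the time, which is why overall accuracy alone says little about a tornado detector.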
Using this dataset, the researchers developed AI models and were particularly keen to apply deep learning, a form of machine learning well suited to processing visual data. The deep learning model performed as well as or better than all known tornado-detecting algorithms. It correctly classified 50% of weaker tornadoes and more than 85% of more severe ones.
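Percentages like these are usually read alongside the false-alarm figure for warnings. The Python sketch below shows how two common forecast-verification scores are computed from simple counts. It assumes that the share of tornadoes correctly classified corresponds to the probability of detection and that the false-alarm rate corresponds to the false alarm ratio. The counts are invented for illustration and are not TorNet results.

```python
def probability_of_detection(hits: int, misses: int) -> float:
    """Fraction of actual tornadoes that were correctly flagged."""
    return hits / (hits + misses)


def false_alarm_ratio(hits: int, false_alarms: int) -> float:
    """Fraction of issued warnings for which no tornado occurred."""
    return false_alarms / (hits + false_alarms)


# Hypothetical counts, chosen only to mirror the percentages in the text.
severe_hits, severe_misses = 85, 15        # 85 of 100 severe tornadoes detected
warning_hits, warning_false_alarms = 30, 70  # 70 of 100 warnings were false alarms

pod = probability_of_detection(severe_hits, severe_misses)
far = false_alarm_ratio(warning_hits, warning_false_alarms)

print(f"Probability of detection: {pod:.0%}")
print(f"False alarm ratio: {far:.0%}")
```

The two scores pull against each other. Flagging more storms raises the probability of detection but usually raises the false alarm ratio as well, so a useful detector has to improve one without sacrificing the other.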
The researchers hope the community will improve on the provided models and that users will find innovative applications for the dataset. The dataset could also help unravel the science of why tornadoes form, particularly with the use of explainable AI, which presents a model's reasoning for its decision in a format that humans can understand. As the technology advances, the insights it distills could guide forecasters in complex situations, for example by providing a visual warning for areas predicted to have tornadic activity.
While the journey to a fully operational algorithm is long, especially in safety-critical situations, public benchmark datasets like TorNet represent a crucial first step. They allow for a trust-building process between forecasters and machine learning, as researchers across the world are encouraged to develop their own algorithms. In turn, these algorithms will be tested, eventually shown to forecasters, and introduced into operations. The researchers believe that reducing the false-alarm rate could improve public perception of tornado warnings and encourage people to take lifesaving action with more confidence.