Sign language research aims to improve the technology used to understand and interpret the sign languages of Deaf and hard-of-hearing communities worldwide. This involves building extensive datasets, developing innovative machine-learning models, and refining tools for translation and identification across numerous applications. However, because sign languages have no standardized written form, data is scarce, particularly for lesser-studied sign languages, which hinders the development of effective translation tools and machine-learning models.
Existing resources for sign language processing include specialized datasets such as YouTube-ASL for American Sign Language (ASL) and BOBSL for British Sign Language (BSL). While valuable, these datasets typically cover a single language and rely on labor-intensive manual annotation. More scalable methods are needed to cover the vast diversity of the world's sign languages.
To address this gap, researchers from Google and Google DeepMind have introduced YouTube-SL-25, a large-scale multilingual corpus of sign language videos. Comprising over 3,000 hours of video from more than 3,000 unique signers across more than 25 sign languages, it is the largest and most diverse resource of its kind. To build the dataset, automatic classifiers first identified candidate sign language videos on YouTube; an auditing step then triaged the candidates by content quality and caption alignment.
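To make the two-stage mining idea concrete, here is a minimal sketch of classifier-then-audit triage. Everything in it, the `Candidate` fields, the thresholds, and the bucket names, is an illustrative assumption, not the authors' actual pipeline.

```python
# Illustrative sketch of a two-stage corpus-mining pipeline:
# (1) score candidate videos with a sign-language classifier,
# (2) triage survivors into audit buckets by quality/alignment.
# All names and thresholds are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    classifier_score: float   # P(video contains signing), from stage 1
    caption_overlap: float    # fraction of video covered by captions

def triage(candidates, score_threshold=0.5):
    """Bucket classifier-flagged videos for human auditing."""
    buckets = {"high_quality": [], "needs_review": [], "rejected": []}
    for c in candidates:
        if c.classifier_score < score_threshold:
            buckets["rejected"].append(c.video_id)
        elif c.caption_overlap >= 0.8:
            buckets["high_quality"].append(c.video_id)  # well-aligned captions
        else:
            buckets["needs_review"].append(c.video_id)  # audit manually
    return buckets

candidates = [
    Candidate("vid_a", 0.92, 0.95),
    Candidate("vid_b", 0.71, 0.40),
    Candidate("vid_c", 0.12, 0.90),
]
print(triage(candidates))
# {'high_quality': ['vid_a'], 'needs_review': ['vid_b'], 'rejected': ['vid_c']}
```

The key design point is that cheap automatic scoring narrows millions of videos down to a set small enough for human auditors to verify.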
The dataset's utility was demonstrated through benchmarks with a unified multilingual multitask model based on T5. The baseline model was extended to support multiple sign languages, handling both sign language translation and sign language identification. The results showed clear benefits from multilingual pretraining for both high-resource and low-resource sign languages.
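The sketch below shows what such a unified multitask interface can look like: video-derived features are projected into a sequence-to-sequence model, and the decoder emits either a caption (translation) or a language tag (identification). The architecture, dimensions, and tagging scheme here are assumptions for illustration; this is not the authors' actual model.

```python
# Minimal sketch of a multitask sign-to-text model, assuming per-frame
# features (e.g., pose landmarks) as input. Dimensions are illustrative.

import torch
import torch.nn as nn

class SignMultitaskModel(nn.Module):
    def __init__(self, feat_dim=255, d_model=512, vocab_size=32128):
        super().__init__()
        # Project per-frame sign features into the model's embedding space.
        self.input_proj = nn.Linear(feat_dim, d_model)
        self.seq2seq = nn.Transformer(
            d_model=d_model, num_encoder_layers=4,
            num_decoder_layers=4, batch_first=True)
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, frame_feats, target_tokens):
        # frame_feats: (batch, frames, feat_dim); target_tokens: (batch, len)
        src = self.input_proj(frame_feats)
        tgt = self.token_emb(target_tokens)
        hidden = self.seq2seq(src, tgt)
        return self.lm_head(hidden)  # logits over the text vocabulary

model = SignMultitaskModel()
feats = torch.randn(2, 128, 255)           # two clips, 128 frames each
# Translation targets are caption tokens; identification could reuse the
# same interface with a single language-tag token (e.g., "<asl>") as target.
targets = torch.randint(0, 32128, (2, 16))
logits = model(feats, targets)
print(logits.shape)  # torch.Size([2, 16, 32128])
```

Sharing one decoder across both tasks is what lets multilingual pretraining transfer to lower-resource languages.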
YouTube-SL-25 comprises 3,207 hours of video across more than 25 sign languages, with each of the 25 primary languages represented by at least 15 hours of content, a threshold that ensures meaningful coverage even for low-resource languages. The videos are drawn from over 3,000 unique channels, underscoring the diversity of signers and languages in the dataset.
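As a toy illustration of how such per-language statistics might be tallied, the snippet below aggregates video durations by language and applies the 15-hour floor. The metadata records and field layout are invented for this example.

```python
# Toy tally of per-language hours with a minimum-hours threshold.
# The records below are invented; real metadata would list thousands
# of videos per language.

from collections import defaultdict

videos = [  # (video_id, language, duration in seconds)
    ("vid_a", "asl", 5400),
    ("vid_b", "bsl", 3600),
    ("vid_c", "asl", 7200),
]

hours_per_language = defaultdict(float)
for _, lang, seconds in videos:
    hours_per_language[lang] += seconds / 3600

MIN_HOURS = 15  # per-language floor described above
primary = {l: h for l, h in hours_per_language.items() if h >= MIN_HOURS}
print(dict(hours_per_language))  # {'asl': 3.5, 'bsl': 1.0}
print(primary)                   # {} -- toy data falls below the floor
```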
Overall, YouTube-SL-25 is a significant step forward for sign language research, addressing data scarcity and facilitating the development of more effective translation and interpretation tools. By enabling better pretraining for sign-to-text translation models and improving sign language identification, it promises a positive impact for Deaf and hard-of-hearing communities worldwide and advances technology toward broader understanding and accessibility.