Harvard researchers have launched ReXrank, an open-source leaderboard for evaluating artificial intelligence (AI)-powered radiology report generation, a development that could reshape healthcare AI for chest X-ray interpretation. ReXrank provides a comprehensive, objective evaluation framework for these models, encouraging competition and collaboration among researchers, clinicians, and AI enthusiasts and accelerating progress in this critical field.
ReXrank draws on diverse datasets, including MIMIC-CXR, IU X-ray, and CheXpert Plus, to offer a robust benchmarking system that evolves with clinical needs and technological progress. The leaderboard showcases top-performing models that could improve patient care and streamline medical workflows, and it invites the development and submission of new models to push the boundaries of medical imaging and report generation.
The leaderboard is designed around clear, transparent evaluation criteria. Researchers can download the evaluation script and a sample prediction file from the ReXrank GitHub repository, run their own assessments against the provided datasets, and rely on every submission being scored consistently and fairly.
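The sample file in the repository defines the authoritative format, but the basic shape of the task is simple: each study identifier maps to one generated report, and the scorer pairs predictions with their references by that identifier. The sketch below illustrates only that pairing step; the file names and the `study_id`/`report` column names are hypothetical placeholders, not ReXrank's confirmed schema.

```python
import csv

def read_reports(path: str) -> dict[str, str]:
    """Load a CSV of (study_id, report) rows into a dict keyed by study ID."""
    with open(path, newline="") as f:
        return {row["study_id"]: row["report"] for row in csv.DictReader(f)}

# Hypothetical file names and column names ("study_id", "report");
# the sample prediction file in the ReXrank repo is the real reference.
predictions = read_reports("predictions.csv")
references = read_reports("references.csv")

# A real evaluation harness would compute the leaderboard metrics over these
# pairs; here we only verify that every reference study has a prediction.
missing = sorted(set(references) - set(predictions))
print(f"{len(predictions)} predictions loaded; {len(missing)} studies missing")
```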
Key among the datasets is the MIMIC-CXR collection, with over 377,000 images from more than 227,000 radiographic studies carried out at Beth Israel Deaconess Medical Center in Boston. Models are ranked using metrics such as FineRadScore, RadCliQ, BLEU, BERTScore, SembScore, and RadGraph; top performers include MedVersa, CheXpertPlus-mimic, and RaDialog.
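Most of these metrics compare a generated report against the radiologist-written reference. As a rough illustration, the snippet below scores one report pair with two of the general-purpose text metrics using off-the-shelf libraries (NLTK for BLEU, the bert-score package for BERTScore). It is a minimal sketch, not ReXrank's official evaluation script, and the clinically oriented metrics such as RadGraph and RadCliQ require their own tooling.

```python
# pip install nltk bert-score
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from bert_score import score

reference = "Heart size is normal. Lungs are clear. No pleural effusion."
candidate = "The heart is normal in size. The lungs appear clear."

# BLEU: n-gram overlap between candidate and reference tokens.
bleu = sentence_bleu(
    [reference.lower().split()],
    candidate.lower().split(),
    smoothing_function=SmoothingFunction().method1,
)

# BERTScore: semantic similarity from contextual embeddings
# (downloads a pretrained model on first use).
P, R, F1 = score([candidate], [reference], lang="en")

print(f"BLEU: {bleu:.3f}  BERTScore F1: {F1.mean().item():.3f}")
```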
The IU X-ray dataset, which includes 7,470 pairs of radiology reports and chest X-rays from Indiana University, is a second benchmark. Its leaderboard ranks models on performance across multiple metrics; leading models include MedVersa, RGRG, and RadFM.
CheXpert Plus, a dataset of 223,228 unique pairs of radiology reports and chest X-rays from more than 64,000 patients, rounds out ReXrank. Its leaderboard ranks models by performance on the validation set, with MedVersa, RaDialog, and CheXpertPlus-mimic recognized for outstanding results.
Researchers looking to participate in ReXrank are encouraged to develop their models, run the evaluation script, and submit their predictions for official scoring. A tutorial on ReXrank’s GitHub repository simplifies the submission process.
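Concretely, a submission boils down to running a trained model over every study in the chosen test split and saving the outputs in the expected format. The sketch below shows that loop under stated assumptions: `load_test_studies` and `DummyModel` are placeholders for a participant's own data loader and model, and the CSV schema mirrors the hypothetical one above.

```python
import csv

def load_test_studies():
    """Placeholder loader: yield (study_id, image_path) pairs for the test split."""
    yield "s50414267", "images/s50414267.jpg"

class DummyModel:
    """Stand-in for a participant's report-generation model."""
    def generate_report(self, image_path: str) -> str:
        return "Heart size is normal. Lungs are clear. No pleural effusion."

model = DummyModel()
with open("predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["study_id", "report"])  # hypothetical schema; see repo sample
    for study_id, image_path in load_test_studies():
        writer.writerow([study_id, model.generate_report(image_path)])
```

From there, participants can run the official evaluation script on the resulting file to sanity-check their scores before submitting for official ranking.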
In conclusion, Harvard's ReXrank provides a transparent, objective, and comprehensive evaluation tool designed to inspire innovation and collaboration. Everyone from researchers and clinicians to AI enthusiasts is invited to participate, helping to advance medical imaging and report-generation technology.