Researchers from the University of Oxford and University College London have developed Craftax, a reinforcement learning (RL) benchmark that combines efficient parallelization, compilation, and the elimination of CPU-to-GPU transfer in RL experiments. The work addresses the limitations researchers face with environments such as MiniHack and Crafter, whose long runtimes make experiments slow. While these environments have driven advances in RL algorithms, they are impractical for current methods, which need very large numbers of environment interactions, unless large-scale computing resources are available.
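To make the three ingredients concrete, here is a minimal JAX sketch on a toy grid world. It is not Craftax code: the environment, `GRID_SIZE`, `reset`, `step`, and `rollout` are all hypothetical names invented for illustration. `jax.vmap` runs many environments in parallel, `jax.jit` compiles the whole rollout, and `jax.lax.scan` keeps the step loop on the accelerator so no data moves between CPU and GPU during the rollout.

```python
from functools import partial

import jax
import jax.numpy as jnp

GRID_SIZE = 8  # hypothetical toy world, not a Craftax parameter


def reset(key):
    """Sample a random (row, col) start position for one environment."""
    return jax.random.randint(key, (2,), 0, GRID_SIZE)


def step(pos, action):
    """Move one cell (right/left/down/up); reward 1.0 on the far corner."""
    moves = jnp.array([[0, 1], [0, -1], [1, 0], [-1, 0]])
    new_pos = jnp.clip(pos + moves[action], 0, GRID_SIZE - 1)
    reward = jnp.all(new_pos == GRID_SIZE - 1).astype(jnp.float32)
    return new_pos, reward


# Compilation: the entire rollout is traced once and compiled by XLA.
@partial(jax.jit, static_argnames=("num_envs", "num_steps"))
def rollout(key, num_envs, num_steps):
    reset_key, action_key = jax.random.split(key)
    # Parallelization: vmap turns the single-env functions into batched ones.
    positions = jax.vmap(reset)(jax.random.split(reset_key, num_envs))

    def body(positions, step_key):
        # Random actions stand in for a policy network.
        actions = jax.random.randint(step_key, (num_envs,), 0, 4)
        positions, rewards = jax.vmap(step)(positions, actions)
        return positions, rewards

    # No host round-trips: the loop itself runs on the device via scan.
    step_keys = jax.random.split(action_key, num_steps)
    positions, rewards = jax.lax.scan(body, positions, step_keys)
    return positions, rewards  # rewards has shape (num_steps, num_envs)


final_positions, rewards = rollout(
    jax.random.PRNGKey(0), num_envs=1024, num_steps=100
)
```

In a real training loop the policy and its update would sit inside the same compiled function, which is what removes the per-step transfer between a CPU-side environment and a GPU-side agent.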
Craftax runs significantly faster than comparable benchmarks while still exhibiting complex, open-ended dynamics. Craftax-Classic, a reimplementation of Crafter in JAX, runs up to 250 times faster than the original Python version. Because so many more timesteps are available in the same wall-clock time, a basic PPO agent can solve Craftax-Classic in 51 minutes.
The researchers also provide a more challenging environment, the main Craftax benchmark, which adds game mechanics from NetHack and the wider roguelike genre. With pixel observations, the agent must also solve a representation learning problem on top of the new mechanics. To keep runtime low, the researchers offer symbolic observations as well. Experiments show that current approaches perform poorly on Craftax, which leaves room for future RL research.
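The difference between the two observation types can be sketched on a toy tile map. This is an illustration only, not Craftax's actual encoding: the tile set, `PALETTE`, `TILE_PX`, and both functions are hypothetical. The symbolic view is a compact one-hot vector, while the pixel view must be rendered and is far larger, which is one reason symbolic observations are cheaper to produce and to learn from.

```python
import jax
import jax.numpy as jnp

NUM_TILE_TYPES = 4  # hypothetical tile set, e.g. grass, stone, water, wood
TILE_PX = 8         # hypothetical rendered size of one tile, in pixels

# One RGB color per tile type (made-up values for illustration).
PALETTE = jnp.array([
    [0.1, 0.6, 0.1],
    [0.5, 0.5, 0.5],
    [0.1, 0.3, 0.8],
    [0.4, 0.25, 0.1],
])


def symbolic_obs(grid):
    """Flat one-hot encoding of the tile IDs: H * W * NUM_TILE_TYPES values."""
    return jax.nn.one_hot(grid, NUM_TILE_TYPES).reshape(-1)


def pixel_obs(grid):
    """Render each tile as a TILE_PX x TILE_PX block of its palette color."""
    colors = PALETTE[grid]  # shape (H, W, 3)
    return jnp.repeat(jnp.repeat(colors, TILE_PX, axis=0), TILE_PX, axis=1)


# A random 9 x 11 map of tile IDs stands in for the agent's local view.
grid = jax.random.randint(jax.random.PRNGKey(0), (9, 11), 0, NUM_TILE_TYPES)
sym = symbolic_obs(grid)  # shape (396,)
pix = pixel_obs(grid)     # shape (72, 88, 3)
```

For this toy map the symbolic observation holds 396 values and the pixel observation 19,008, and an agent reading pixels must first learn to recover the tile identities that the symbolic view provides directly.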
The research team expects Craftax-Classic to serve as an accessible introduction to the Craftax benchmark for those already familiar with the standard Crafter. The project arrives amid a boom in JAX-based RL environments, as the framework's advantages for RL have become widely recognized.
The researchers believe Craftax will make it possible to experiment with limited computational resources while still posing a significant challenge for future RL research. The benchmark's speed comes from combining parallelization, compilation, and the removal of CPU-to-GPU transfer in a single implementation.
The paper, GitHub repository, and project page are available online.