Large Language Models (LLMs) have become essential tools in various industries due to their superior ability to understand and generate human language. However, training LLMs is notably resource-intensive, demanding sizeable memory allocations to manage the multitude of parameters. For instance, the training of the LLaMA 7B model from scratch calls for approximately 58 GB of…