As artificial intelligence continues to develop, researchers are facing challenges with fine-tuning large language models (LLMs). This process, which improves task performance and aligns AI behavior with instructions, is costly because it requires significant GPU memory. This is especially problematic for large models like LLaMA 65B and GPT-3 175B.
To overcome these challenges, researchers from the Institute for Artificial Intelligence, Peking University, School of Intelligence Science and Technology, Peking University, and the National Key Laboratory of General Artificial Intelligence have developed a new method known as Principal Singular values and Singular vectors Adaptation (PiSSA).
This method is a type of parameter-efficient fine-tuning (PEFT) that reduces the number of trainable parameters and the memory needed for fine-tuning without adding inference latency. PiSSA applies Singular Value Decomposition (SVD) to the model’s weight matrices: each matrix is split into two small trainable matrices, initialized from the principal singular values and vectors that capture the model’s essential capabilities, plus a frozen residual matrix that holds the remaining components and corrects for the approximation error. PiSSA shares the same architecture as another PEFT method, low-rank adaptation (LoRA).
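To make the decomposition concrete, here is a minimal PyTorch sketch of this style of initialization. The function name `pissa_init` and the rank value are illustrative assumptions, not the authors’ API; the intent is only to show the split into trainable principal factors and a frozen residual.

```python
import torch

def pissa_init(weight: torch.Tensor, rank: int):
    """Split a pretrained weight into trainable principal factors A, B
    plus a frozen residual, using the top-`rank` singular triplets."""
    # Full SVD of the pretrained weight: W = U diag(S) V^T
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    # Principal singular values/vectors initialize the trainable adapters.
    A = U[:, :rank] * S[:rank].sqrt()                 # (out_features, rank)
    B = S[:rank].sqrt().unsqueeze(1) * Vh[:rank, :]   # (rank, in_features)
    # The residual keeps the remaining directions and stays frozen.
    residual = weight - A @ B
    return A, B, residual

# Usage: adapt one linear layer of a (stand-in) pretrained model.
W = torch.randn(4096, 4096)                  # stand-in for a pretrained weight
A, B, W_res = pissa_init(W, rank=16)
A.requires_grad_(True); B.requires_grad_(True)   # trainable adapters
W_res.requires_grad_(False)                      # frozen residual
x = torch.randn(2, 4096)
y = x @ (W_res + A @ B).T                    # forward pass shape is unchanged
```

Because the residual plus the adapter product reconstructs the original weight at initialization, the model starts from its pretrained behavior and only the low-rank factors are updated during training.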
In their research, the team found that PiSSA outperformed LoRA: fine-tuning the principal components of the model directly led to superior results, with faster convergence and a closer fit to the training data. Furthermore, the use of a technique called Fast SVD let PiSSA balance initialization speed against performance.
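For intuition on the Fast SVD point, a randomized low-rank SVD only approximates the top singular triplets, which is far cheaper than a full decomposition on large weight matrices. The sketch below uses PyTorch’s built-in `torch.svd_lowrank` as an assumed stand-in; the paper’s Fast SVD implementation may differ, and the `niter` knob here simply trades initialization speed for approximation quality.

```python
import torch

def pissa_init_fast(weight: torch.Tensor, rank: int, niter: int = 4):
    # Randomized SVD approximates only the top-`rank` singular triplets.
    U, S, V = torch.svd_lowrank(weight, q=rank, niter=niter)
    A = U * S.sqrt()                  # (out_features, rank)
    B = S.sqrt().unsqueeze(1) * V.T   # (rank, in_features)
    residual = weight - A @ B         # frozen remainder corrects the error
    return A, B, residual
```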
PiSSA shares the practical benefits of LoRA, including a reduction in the number of trainable parameters, easy deployment, and the ability to quantize the frozen residual model. However, unlike LoRA, which starts its adapters from scratch, PiSSA initializes them with the model’s principal components, so fine-tuning concentrates on the weights’ most essential directions while the frozen residual preserves the rest of the model’s capabilities.
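As an illustration of why quantizing the residual is attractive, the frozen residual can be stored in low precision while the small trainable adapters remain in full precision. The toy per-tensor int8 round-trip below is purely illustrative; practical systems use more sophisticated low-bit schemes.

```python
import torch

def quantize_int8(t: torch.Tensor):
    # Per-tensor symmetric int8 quantization: one scale plus int8 storage.
    scale = t.abs().max() / 127.0
    q = torch.round(t / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

residual = torch.randn(4096, 4096)               # stand-in for the frozen residual
A, B = torch.randn(4096, 16), torch.randn(16, 4096)  # full-precision adapters
q_res, scale = quantize_int8(residual)           # stored as int8 plus one scale
x = torch.randn(2, 4096)
y = x @ (dequantize(q_res, scale) + A @ B).T     # adapters stay trainable
```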
In conclusion, the research presents PiSSA as an efficient and effective fine-tuning technique for LLMs. By utilizing Singular Value Decomposition to initialize adapters with principal components, PiSSA achieves superior fine-tuning performance compared to the LoRA method, making it a promising approach to parameter-efficient fine-tuning.
You can find more details about the research in the full published paper and the code on GitHub.