We are excited to share research from a team at Fudan University and Hikvision Inc., which has developed LoRAMoE, a new architecture that helps Large Language Models (LLMs) align with human instructions while preserving the world knowledge they already hold. The work is a step forward for Artificial Intelligence and Machine Learning.
LoRAMoE builds on the “Mixture of Experts” (MoE) concept, in which a model contains several experts and data with different properties is routed to the appropriate experts for specialized processing. LoRAMoE changes the model’s architecture by adding several parallel plugins, which act as experts, to every feed-forward layer and connecting them to routers. A localized balancing constraint prevents a router from placing too much weight on only a few experts in the same expert group: it balances the importance of all experts within that group, so that multiple experts can work together and improve the model’s ability to handle downstream tasks.
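To make the routing idea concrete, here is a minimal pure-Python sketch of one such layer. It is an illustration under stated assumptions, not the team's implementation: it assumes, as the name LoRAMoE suggests, that each plugin is a low-rank (LoRA-style) adapter, that the router is a softmax over a linear projection of the input, and that the balancing constraint can be approximated by a squared coefficient of variation computed within each expert group. The names `LoRAMoELayer` and `group_balance_penalty`, the zero initialization of `B`, and the `alpha / rank` scaling are all hypothetical choices borrowed from common LoRA and MoE practice.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAMoELayer:
    """Frozen linear map plus router-weighted low-rank experts (illustrative)."""

    def __init__(self, d_in, d_out, n_experts, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.W0 = rand_matrix(d_out, d_in, rng)          # stands in for the frozen pretrained weight
        self.router = rand_matrix(n_experts, d_in, rng)  # one score row per expert
        self.A = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
        # Assumption: B starts at zero so the experts initially leave the output unchanged.
        self.B = [[[0.0] * rank for _ in range(d_out)] for _ in range(n_experts)]
        self.scale = alpha / rank                        # assumed LoRA-style scaling

    def gate(self, x):
        """Router weights for one input: a softmax over per-expert scores."""
        return softmax(matvec(self.router, x))

    def forward(self, x):
        out = matvec(self.W0, x)
        weights = self.gate(x)
        for g, A, B in zip(weights, self.A, self.B):
            delta = matvec(B, matvec(A, x))              # low-rank update B(Ax)
            out = [o + self.scale * g * d for o, d in zip(out, delta)]
        return out, weights


def group_balance_penalty(batch_weights, groups):
    """Sum, over expert groups, of the squared coefficient of variation
    of each expert's total router weight across the batch (assumed form)."""
    penalty = 0.0
    for members in groups:
        importance = [sum(w[i] for w in batch_weights) for i in members]
        mean = sum(importance) / len(importance)
        var = sum((v - mean) ** 2 for v in importance) / len(importance)
        penalty += var / (mean ** 2 + 1e-12)
    return penalty


layer = LoRAMoELayer(d_in=4, d_out=3, n_experts=4, rank=2)
x = [0.5, -1.0, 0.25, 2.0]
y, w = layer.forward(x)

# Two hypothetical expert groups of two experts each.
groups = [[0, 1], [2, 3]]
penalty = group_balance_penalty([w], groups)
```

In this sketch the penalty is zero when every expert in a group receives the same total weight and grows as a router concentrates on one expert within the group, which is the behavior the localized balancing constraint is described as discouraging.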
The experimental results show that LoRAMoE can keep large-scale fine-tuning from disrupting the world knowledge stored in language models. By visualizing the expert weights for different tasks, the team also validated LoRAMoE’s capability localization at an interpretable level. For world knowledge benchmarks, the router prioritizes the output of experts that specialize in those benchmarks; for other downstream tasks, it concentrates on experts from a different group. The results further indicate that the approach improves learning on various downstream tasks, which suggests potential for multi-task learning.
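One simple way to inspect this kind of behavior is to average, for each task, how much router weight goes to each expert group. The sketch below shows that aggregation only. The function name `mean_group_share`, the task labels, the grouping, and every number in `records` are made up for illustration and are not results from the research.

```python
from collections import defaultdict


def mean_group_share(records, groups):
    """Average, per task, the share of router weight assigned to each expert group.

    records: iterable of (task_name, router_weights) pairs.
    groups:  list of lists of expert indices, one list per group.
    """
    totals = defaultdict(lambda: [0.0] * len(groups))
    counts = defaultdict(int)
    for task, weights in records:
        for g, members in enumerate(groups):
            totals[task][g] += sum(weights[i] for i in members)
        counts[task] += 1
    return {task: [t / counts[task] for t in sums] for task, sums in totals.items()}


# Hypothetical router weights for four experts; illustrative values only.
records = [
    ("world_knowledge", [0.4, 0.4, 0.1, 0.1]),
    ("world_knowledge", [0.3, 0.3, 0.2, 0.2]),
    ("downstream", [0.1, 0.1, 0.5, 0.3]),
]
groups = [[0, 1], [2, 3]]  # assumed split into two expert groups
shares = mean_group_share(records, groups)
```

Plotting the resulting per-task shares as a bar chart or heat map would give a picture of which expert group a router favors for each kind of task.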
We are so thrilled to see the amazing results of this research, and we can’t wait to see what else LoRAMoE and similar architectures can do for AI and ML.