K2 is a large language model (LLM) from LLM360, developed in partnership with MBZUAI and Petuum. The model, K2-65B, has 65 billion parameters and is fully reproducible: the code, data, model checkpoints, and intermediate results are all open-source and available to anyone. The main aim of this transparency is to clarify the training recipe behind models of similar scale, such as Llama 2 70B, and so give a better understanding of the development process and performance metrics.
K2 was a joint effort by MBZUAI, Petuum, and LLM360. The collaboration drew on each institution’s expertise and resources to build a language model that is both performant and transparent. The model is released under the Apache 2.0 license, which permits broad use and further development by the community.
LLM360 has published a comprehensive set of evaluations for K2, covering both general and domain-specific benchmarks. These evaluations span medical, mathematical, and coding knowledge, so the model’s performance can be examined across a diverse range of tasks and domains. The LLM360 Performance and Evaluation Collection, along with the K2 Weights and Biases project, provides an in-depth view of K2’s behavior.
K2 was trained on a variety of datasets to achieve results that could rival those of the Llama 2 70B model. Training proceeded in two stages and drew on datasets such as dm-math, PubMed-abstracts, and uspto, among others, for a total of 1.3 trillion tokens. This data mix was intended to give K2 broad knowledge and ability across many topics and languages.
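To put the scale of this training run in perspective, the short Python sketch below estimates total training compute from the two figures given above (65 billion parameters, 1.3 trillion tokens). It uses the common rule of thumb that training a dense transformer costs roughly 6 floating-point operations per parameter per token. That approximation, and the function and constant names, are assumptions introduced here for illustration and are not taken from LLM360's published material, so the result should be read as an order-of-magnitude estimate only.

```python
def approx_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute using the ~6 * N * D rule of thumb.

    This is a widely used approximation for dense transformers,
    not a figure reported by LLM360.
    """
    return 6.0 * n_params * n_tokens


# Figures quoted in the text above.
K2_PARAMS = 65e9    # 65 billion parameters
K2_TOKENS = 1.3e12  # 1.3 trillion training tokens

flops = approx_training_flops(K2_PARAMS, K2_TOKENS)
print(f"Approximate training compute: {flops:.2e} FLOPs")
```

Under these assumptions the estimate comes to about 5 × 10^23 FLOPs. The same function can be applied to other parameter and token counts for a rough comparison, bearing in mind that it ignores architecture details and hardware efficiency.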
LLM360 has made K2’s intermediate checkpoints available, allowing researchers and developers to follow how the model develops over the course of training. This is part of what makes K2 fully reproducible, and it supports transparency and further research. Tutorials for replicating the pretraining and finetuning process are also provided, which benefits academic and industry researchers alike.
LLM360 is an open research lab promoting community-owned artificial general intelligence (AGI) through open-source large model research and development. Its goal is to create an open ecosystem with equitable computational resources, high-quality data, and a steadily advancing body of technical knowledge, so that AGI development is ethical and accessible to all. The overarching ambition of LLM360 is to support innovators by advancing the capabilities of large language models while encouraging a collaborative environment for research and development.
In conclusion, K2 by LLM360 offers transparency, strong performance, and a robust development framework. By promoting open-source collaboration and broad evaluation, K2 aims to set a new benchmark for LLM development, one that encourages ethical practices and broad accessibility for future innovations in artificial intelligence.