Artificial intelligence (AI) continues to advance with the development of Viking, a family of language models designed for Nordic languages alongside English and a range of programming languages. Viking was developed by Silo AI, Europe’s largest private AI lab, in partnership with the TurkuNLP research group at the University of Turku and HPLT. The release of Viking’s first models marks a significant step towards making European language models more accessible.
The model is an improved version of Silo AI’s previous large language model (LLM), known as “Poro”. Viking’s architecture has been updated and modernized, and it covers a broad range of languages, including Danish, English, Finnish, Norwegian, Icelandic, and Swedish, along with programming languages. The model comes in three sizes (Viking 7B, 13B, and 33B), reflecting Silo AI’s commitment to strengthening European digital infrastructure, promoting linguistic diversity, and fostering a technology environment that brings people together rather than creating cultural divides.
The model serves as a technological and cultural bridge, embodying Europe’s wider strategy to fortify its digital sovereignty without compromising on linguistic diversity. Emphasizing linguistic inclusivity and cultural sensitivity, Viking is engineered to excel in languages that are typically underrepresented in the global AI landscape. This creates a platform for innovation and dialogue within and beyond the Nordic region.
Viking performs particularly well on under-resourced languages. After training on 1,000 billion (1 trillion) tokens, it has been observed to outperform comparable open models in lesser-used languages while maintaining its performance in English and programming languages.
The architecture of Viking takes inspiration from successful models like Llama 2 and includes features such as flash attention, rotary embeddings, and grouped query attention. It supports a sequence length of up to 4,000 tokens and is backed by a dataset of 2 trillion tokens. The data spans an array of Nordic and programming languages, marking a crucial stride towards creating a multilingual and multifunctional LLM.
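For readers curious about what rotary embeddings do, the idea can be sketched in a few lines of plain Python. This is an illustrative toy, not Viking’s implementation: the function names, the pairing of adjacent dimensions, and the frequency base of 10000 are assumptions taken from the commonly described formulation of the technique. Each pair of vector components is rotated by an angle proportional to the token’s position, so the similarity score between a query and a key ends up depending only on how far apart the two tokens are.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; `base` is an assumed default.
    """
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Each pair has its own frequency; earlier pairs rotate faster.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]   # toy query vector
k = [0.2, -1.0, 0.7, 0.4]   # toy key vector

# Same relative offset (3 tokens) at two different absolute positions.
score_near = dot(rotary_embed(q, 5), rotary_embed(k, 2))
score_far = dot(rotary_embed(q, 105), rotary_embed(k, 102))

print(abs(score_near - score_far) < 1e-9)
```

The two scores agree up to floating-point rounding because rotating both vectors by angles that differ by a fixed amount leaves their dot product unchanged. That property is what lets a model reason about relative rather than absolute token positions.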
Viking is released under the Apache 2.0 License, making it freely available for commercial and research use. However, Silo AI emphasizes the need for additional fine-tuning and training prior to Viking’s deployment in production environments.
In essence, Viking represents an important shift towards promoting linguistic diversity in AI development. It exhibits exceptional performance in multilingual content understanding and generation. The open-access model, courtesy of the Apache 2.0 License, fosters innovation across sectors by providing a top-quality, multilingual LLM. The model’s development also underlines the importance of creating technologies that respect and incorporate local cultures and values. The introduction of Viking aligns with Europe’s strategy to boost its digital infrastructure and autonomy, positioning it as a global leader in the AI space.