Meta has announced the release of its upgraded Llama 3.1 models in 8B, 70B, and 405B variants. The improvements include support for eight languages and an expanded context length of 128K tokens. The 405B model, billed as the largest and most capable openly available foundation model, stands out in particular. Its capabilities stem from training on over 15 trillion tokens using 16,000 NVIDIA H100 GPUs, at a compute cost running into the hundreds of millions of dollars.
Despite speculation that the 405B model would be Meta's first paid model because of its high computational costs, Meta has made the Llama 3.1 models freely available for download and modification. The models ship alongside supporting services from Amazon, Databricks, and NVIDIA, and are also offered through cloud providers including AWS, Azure, Google, and Oracle. Benchmark results across more than 150 datasets place Llama 3.1 405B in tight competition with other leading models such as GPT-4o and Claude 3.5 Sonnet.
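For developers who want to try the openly available weights, the sketch below shows one common route: loading an instruction-tuned Llama 3.1 checkpoint through the Hugging Face transformers library. The repository ID, hardware settings, and prompt are illustrative assumptions rather than details from Meta's announcement.

```python
# Minimal sketch: running a Llama 3.1 instruct model via Hugging Face transformers.
# Assumes the `transformers`, `torch`, and `accelerate` packages are installed and
# that access to the gated meta-llama repository has been granted on Hugging Face.
import torch
from transformers import pipeline

# The repository ID below is an assumption for illustration; check the model card
# on Hugging Face for the exact name and license terms before downloading.
model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 8B model on one GPU
    device_map="auto",           # let accelerate place layers on available devices
)

# Chat-style input; the pipeline applies the model's chat template automatically.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize what changed in Llama 3.1 in one sentence."},
]

output = generator(messages, max_new_tokens=128)
# The pipeline returns the full conversation; the last message is the model's reply.
print(output[0]["generated_text"][-1]["content"])
```

The larger 70B and 405B checkpoints follow the same pattern but require multi-GPU or hosted deployments, which is where the cloud offerings mentioned above come in.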
Meta’s CEO, Mark Zuckerberg, reiterated the company’s commitment to open-source AI, presenting it as the future industry standard. He pushed back on criticism of open source, arguing that an open approach could help catch harmful emergent behaviors that closed models might overlook. He also dismissed concerns that adversaries could improve their own AI using Meta’s open-source models, suggesting that efforts to prevent this would not work anyway. While these models represent a significant advance, rumored releases such as GPT-5 and Claude 3.5 Opus may soon eclipse these benchmark results.