Mistral AI has announced the release of Codestral Mamba 7B, a cutting-edge large language model (LLM) specializing in code generation and named in tribute to Cleopatra. Released under the Apache 2.0 license, Codestral Mamba 7B is freely available for use, modification, and distribution, a move intended to stimulate further developments in AI architecture research. This advanced model builds on the success of the company's earlier model families and is part of its ongoing commitment to pioneering advancements in AI architectures.
Codestral Mamba 7B sets itself apart from traditional Transformer models through its linear-time inference and its theoretical ability to model sequences of infinite length. Users can interact with the model extensively and receive quick responses regardless of input length. This efficiency is especially valuable for coding applications, positioning Codestral Mamba 7B as a powerful tool for boosting code productivity.
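The contrast with attention can be sketched with a toy linear state-space recurrence: each new token updates a fixed-size hidden state at constant cost, whereas a Transformer's key-value cache grows with every token. This is a minimal illustrative sketch only, not Codestral Mamba's actual architecture (Mamba adds input-dependent, selective parameters and hardware-aware scans); the dimensions and parameter names below are made up for illustration.

```python
import numpy as np

def ssm_generate(x, A, B, C):
    """Toy linear state-space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    Per-token cost and state size stay constant, unlike an attention KV cache
    that grows linearly with sequence length."""
    d_state = A.shape[0]
    h = np.zeros(d_state)          # fixed-size hidden state
    ys = []
    for x_t in x:                  # one constant-cost update per token
        h = A @ h + B * x_t        # state update
        ys.append(C @ h)           # scalar readout
    return np.array(ys), h

rng = np.random.default_rng(0)
d = 4
A = 0.9 * np.eye(d)                # stable decay dynamics
B = rng.standard_normal(d)
C = rng.standard_normal(d)

ys, h = ssm_generate(rng.standard_normal(1000), A, B, C)
# After 1000 tokens the state is still a single length-4 vector:
# memory did not grow with the input.
```

Real Mamba layers make A, B, and C functions of the input (the "selective" mechanism), but the constant-memory recurrence above is what enables the linear-time, unbounded-length inference described here.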
Engineered for advanced code and reasoning tasks, Codestral Mamba 7B performs on par with state-of-the-art (SOTA) Transformer-based models and has been tested on in-context retrieval tasks with inputs of up to 256k tokens, making it an excellent local code assistant. It has 7,285,403,648 parameters, a configuration that supports high performance and reliability across a range of coding and AI tasks.
For deployment, developers have several options. The model can be deployed using the mistral-inference software development kit (SDK), through TensorRT-LLM, and soon through local inference support in llama.cpp. Its raw weights are available for download from Hugging Face, ensuring widespread accessibility.
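As a rough sketch of what a local setup with the mistral-inference SDK might look like: the package names, the extra Mamba dependencies, and the Hugging Face repository ID below are assumptions based on Mistral's published instructions at release time and may have changed since.

```shell
# Assumed setup (verify against Mistral's current docs before use).
# Mamba-based models typically need the mamba-ssm and causal-conv1d kernels.
pip install mistral-inference mamba-ssm causal-conv1d

# Download the raw weights from Hugging Face (repository ID assumed).
python -c "from huggingface_hub import snapshot_download; \
snapshot_download(repo_id='mistralai/Mamba-Codestral-7B-v0.1', \
                  local_dir='codestral-mamba')"
```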
Furthermore, Codestral Mamba 7B is also available on “la Plateforme” alongside its counterpart, Codestral 22B, under both commercial and community licenses. This dual availability ensures that both individual developers and larger enterprises can take advantage of these advanced models.
Codestral Mamba 7B is an instruction-tuned model, designed to follow complex instructions and deliver accurate outputs, making it an invaluable asset for developers. The release underscores Mistral AI's dedication to advancing AI technology and providing accessible, high-performance tools to the developer community. By offering Codestral Mamba 7B under an open-source license, Mistral AI is fostering innovation and collaboration in AI research and development, contributing to the broader effort to build intelligent coding assistants.
Mistral AI celebrates the achievement of this innovative model and encourages others to explore its benefits. The company thanks its researchers for their hard work and invites readers to follow it on Twitter, LinkedIn, and its Telegram channel, to subscribe to its newsletter, and to join its ML SubReddit. Codestral Mamba 7B stands as a testament to Mistral AI's commitment to AI technologies that push the boundaries of what is possible in code generation.