New Releases

Unveiling MMS Zero-shot: An Innovative AI Model That Can Transcribe Speech in Nearly Any Language Using Only a Small Amount of Unlabeled Text in the New Language

Speech recognition, a rapidly evolving area of machine learning, allows computers to understand and transcribe human speech. The technology is pivotal for services such as virtual assistants, automated transcription, and language translation tools. Despite recent advances, developing universal speech recognition systems that cater to all languages, particularly those that are less common and understudied, remains…

Google AI Presents ShieldGemma: A Comprehensive Suite of LLM-Based Models for Safe Content Moderation, Built on Gemma2

Large Language Models (LLMs) have gained significant traction in a wide range of applications, but they need robust safety measures for responsible user interactions. Current moderation solutions often lack detailed harm-type predictions or customizable harm filtering. Researchers from Google have now introduced ShieldGemma, a suite of content moderation models ranging from 2 billion to 27 billion parameters,…

Mistral-Large-Instruct-2407 Released: A Multilingual AI Model with a 128K Context Window and Proficiency in Over 80 Programming Languages, Scoring 84.0% on MMLU (Massive Multitask Language Understanding), 92% on HumanEval, and 93% on GSM8K

Mistral AI has launched Mistral Large 2, its latest flagship model. The new iteration improves significantly on its predecessor in code generation, mathematics, reasoning, and multilingual support. Mistral Large 2 also offers enhanced function-calling capabilities and is designed to be cost-efficient, fast, and high-performing. Users can…

Athene-Llama3-70B Unveiled: An Open-Weight LLM Trained with RLHF and Based on Llama-3-70B-Instruct

Nexusflow has recently launched Athene-Llama3-70B, a high-performance open-weight chat model fine-tuned from Meta AI's Llama-3-70B-Instruct. The new model achieves an Arena-Hard-Auto score of 77.8%, rivaling models like GPT-4o and Claude-3.5-Sonnet. This is a substantial improvement over Llama-3-70B-Instruct, the predecessor which…

Mistral AI Unveils Codestral Mamba 7B: An Innovative Code LLM Scoring 75% on HumanEval for Python Programming

Mistral AI has announced the release of Codestral Mamba 7B, a large language model (LLM) that specializes in code generation and is named in tribute to Cleopatra. Released under the Apache 2.0 license, Codestral Mamba 7B is freely available for use, modification, and distribution, a move intended to stimulate further AI architecture research. This…

NuminaMath 7B TIR Unveiled: Tackling Math Problems with Tool-Integrated Reasoning and a Python REPL for Competition-Level Precision

Numina has released a new language model optimized for solving mathematical problems: NuminaMath 7B TIR. With 6.91 billion parameters, the model handles intricate mathematical queries through a specialized tool-integrated reasoning (TIR) system. The process comprises a sequence of steps: creating a reasoning pathway for the problem, translating it into Python code, running the code in…

Tsinghua University Unveils Open-Source CodeGeeX4-ALL-9B: An Innovative Multilingual Code Generation Model Surpassing Key Rivals and Enhancing Code Assistance

The Knowledge Engineering Group (KEG) and Data Mining team at Tsinghua University have revealed their latest breakthrough in code generation technology, named CodeGeeX4-ALL-9B. This advanced model, a new addition in the acclaimed CodeGeeX series, is a ground-breaking achievement in multilingual code generation, raising the bar for automated code generation efficiency and performance. A product of extensive…

InternLM2.5-7B-Chat: An Open-Source Large Language Model That Excels in Logical Reasoning, Long-Context Handling, and Advanced Tool Use

InternLM has introduced its newest open large language model, InternLM2.5-7B-Chat, which is available in GGUF format. The model is compatible with llama.cpp, an open-source framework for LLM inference, and can be run both locally and in the cloud on a variety of hardware platforms. The GGUF format provides half-precision and low-bit…

Jina AI Unveils Its Latest Version of Jina Reranker: A High-Performing, Multilingual Model for RAG and Retrieval with Improved Efficiency

Jina AI has launched Jina Reranker v2, a new model aimed at improving the performance of information retrieval systems. This transformer-based model is designed specifically for text reranking, ordering documents by their relevance to a given query. It uses a cross-encoder architecture, taking a pair of query…
