
BRAG Unveiled: Small Language Models (SLMs) Optimized for RAG Tasks, Trained for Under $25 Each

The BRAG series is a set of high-performance Retrieval Augmented Generation (RAG) models developed by Maximalists AI Researcher. They are small language models designed as a low-cost alternative for AI-driven language processing, and their appeal rests precisely on that affordability. They were created to meet the need for language models that outperform their foundation-model baselines without the cost and computational resources generally available only to larger-scale players such as Nvidia and OpenAI.

The series comprises four models: BRAG-Qwen2-7b-v0.1, BRAG-Llama-3.1-8b-v0.1, BRAG-Llama-3-8b-v0.1, and BRAG-Qwen2-1.5b-v0.1, selected for their performance on open benchmarks and their balance of efficiency and capability. They were first trained on general instruction datasets, following Nvidia's ChatQA methodology, and then on RAG-specific datasets.

The models are notable for combining a comparatively small size with broad functionality. The 1.5B model offers a balance between performance and efficiency, while the 7B and 8B models handle more complex tasks such as interpreting tabular data, long-context understanding, and mathematical reasoning.
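To make the intended usage concrete, here is a minimal inference sketch in the style these models target: retrieved context is placed in the prompt and the model answers from it. The Hugging Face repo ID and the prompt format below are assumptions for illustration; the actual model cards should be consulted for the exact conventions.

```python
# Minimal RAG-style inference sketch. The model ID is an assumption based on
# the announced model names; verify the actual Hugging Face repo before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "maximalists/BRAG-Qwen2-1.5b-v0.1"  # assumed repo path

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# In a real pipeline the context comes from a retriever; here it is inlined.
context = "BRAG is a series of small RAG models trained with LoRA and QLoRA."
question = "How were the BRAG models trained?"

messages = [
    {"role": "system", "content": f"Answer using only this context:\n{context}"},
    {"role": "user", "content": question},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```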

The BRAG models were trained with LoRA (Low-Rank Adaptation) and QLoRA (quantized LoRA), techniques that reduce computational demands and memory footprints. They were evaluated on ChatRAG-Bench, a benchmark developed to assess conversational question answering and RAG capabilities across different document types and question formats. Performance was measured with metrics such as F1 score and exact-match accuracy, which capture the precision and contextual relevance of the models' answers.
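As a rough illustration of why this setup keeps costs low, below is a minimal QLoRA sketch using the peft and bitsandbytes libraries: the base model is loaded in 4-bit precision and only small low-rank adapter matrices are trained. The base-model choice, rank, and target modules are illustrative assumptions, not the authors' published hyperparameters.

```python
# Hedged QLoRA sketch with transformers + peft + bitsandbytes. Hyperparameters
# here are illustrative assumptions, not BRAG's published configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct",             # one of the announced base families
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # low-rank update dimension (assumed)
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters receive gradients
```

Because only the adapter weights are updated while the quantized base stays frozen, a 1.5B- or 7B-scale model can be fine-tuned on a single commodity GPU, which is what makes sub-$25 training runs plausible.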

Obstacles encountered during training included handling long documents and domain-specific queries. These were addressed by experimenting with different data combinations; including robust datasets such as DROP, Quoref, and SQuAD improved the models' capacity to process complex and diverse data types, as the sketch below illustrates.
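A minimal sketch of that kind of data blending, using the Hugging Face datasets library, might look like the following. The hub IDs, field mappings, and mixing ratios are assumptions, since the exact BRAG recipe is not spelled out here; Quoref is omitted for brevity.

```python
# Hedged sketch of blending reading-comprehension corpora into one training mix.
# Hub IDs and the normalization below are assumptions about current dataset
# schemas, not BRAG's actual data pipeline.
from datasets import load_dataset, interleave_datasets

squad = load_dataset("squad", split="train")
drop = load_dataset("drop", split="train")

def squad_to_qa(ex):
    # SQuAD stores answers as {"text": [...], "answer_start": [...]}.
    answers = ex["answers"]["text"]
    return {"context": ex["context"], "question": ex["question"],
            "answer": answers[0] if answers else ""}

def drop_to_qa(ex):
    # DROP stores answers as spans over the passage.
    spans = ex["answers_spans"]["spans"]
    return {"context": ex["passage"], "question": ex["question"],
            "answer": spans[0] if spans else ""}

qa_mix = interleave_datasets(
    [squad.map(squad_to_qa, remove_columns=squad.column_names),
     drop.map(drop_to_qa, remove_columns=drop.column_names)],
    probabilities=[0.5, 0.5],               # mixing ratio is an assumption
    seed=42,
)
print(qa_mix[0])
```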

Looking forward, Maximalists plans to refine query rewriting, improve RAG performance and tabular-data handling, and introduce citation generation for better interpretability. The development of BRAG was made possible by credits from Modal Labs, which enabled cost-effective experimentation. In sum, BRAG shows that top-tier performance can be achieved with minimal resource expenditure, setting the stage for more accessible AI solutions.
