Machine learning

Speed up NLP inference using ONNX Runtime on AWS Graviton processors

ONNX is an open-source machine learning framework that offers interoperability across platforms. It is paired with ONNX Runtime, the runtime engine for model inference and training. AWS Graviton3 processors are optimized for machine learning workloads and support a set of instructions that improve performance. The ONNX Runtime 1.17.0 release integrates some of these instructions, improving…

An innovative method enables AI chatbots to engage in conversations all day without experiencing errors or shutdowns

A team of researchers from MIT, Meta AI, Carnegie Mellon University, and NVIDIA has found a solution to the performance degradation that AI chatbots suffer during extended human-AI conversations. They identified a problem with the chatbot's conversation memory, known as the key-value cache, where data is bumped out when the cache exceeds its…

Improving the dependability of language models by leveraging concepts from game theory

Researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) have designed a new type of game to enhance how artificial intelligence (AI) comprehends and produces text. This "consensus game" involves two parts of an AI system: the part that generates sentences and the part that evaluates them. This model significantly improved the…

Vidur: An Extensive Simulation Platform Transforming LLM Deployment by Reducing Expenses and Enhancing Efficiency

Large Language Models (LLMs) such as GPT-4 and LLaMA2-70B enable various applications in natural language processing. However, their deployment is challenged by high costs and the need to fine-tune many system settings to achieve optimal performance. Deploying these models involves a complex selection process among various system configurations and traditionally requires expensive and time-consuming experimentation.…

MISATO: A Dataset of Protein-Ligand Complexes for Structure-Based Drug Discovery Using Machine Learning

Researchers from multiple institutions, including the Institute of Structural Biology, the Technical University of Munich, and others, have developed MISATO, a dataset of protein-ligand complexes intended to support machine learning in drug discovery. It is designed to enhance the process of drug design, a critical aspect within the broader field of computational chemistry and structural biology.…

How ‘Chain of Thought’ Enhances the Intelligence of Transformers

Large Language Models (LLMs), such as GPT-3 and ChatGPT, have been shown to exhibit advanced capabilities in complex reasoning tasks, outpacing standard, supervised machine learning techniques. The key to unlocking these enhanced abilities is the incorporation of a 'chain of thought' (CoT), a method that replicates human-like step-by-step reasoning processes. Importantly, the use of CoT…
