Large Language Models (LLMs), such as GPT-3 and ChatGPT, have been shown to exhibit advanced capabilities in complex reasoning tasks, outpacing standard, supervised machine learning techniques. The key to unlocking these enhanced abilities is the incorporation of a 'chain of thought' (CoT), a method that replicates human-like step-by-step reasoning processes. Importantly, the use of CoT…
Language models, a subset of artificial intelligence, are utilized in a myriad of applications including chatbots, predictive text, and language translation services. A significant challenge faced by researchers in Artificial Intelligence (AI) is making these models more efficient while also enhancing their ability to comprehend and process large amounts of data.
Imperative to the field of…
Artificial intelligence models, in particular large language models (LLMs), have made significant strides in generating coherent and contextually appropriate language. However, they sometimes create content that seems correct but is actually inaccurate or irrelevant, a problem often referred to as "hallucination". This can pose a considerable issue in areas where high factual accuracy is critical,…
Maritime transport has a key role in worldwide trade and travel, but the unpredictability of global waters presents various difficulties. However, the inception of autonomous ships could revolutionise maritime navigation. These ships, also known as Maritime Autonomous Surface Ships (MASS), combine advanced sensors and Artificial Intelligence (AI) to improve situational awareness and ensure safer navigation.…
The rapidly evolving field of research addressing hallucinations in vision-language models (VLVMs), or artificially intelligent (AI) systems that generate coherent but factually incorrect responses, is increasingly gaining attention. Especially important when applied in crucial domains like medical diagnostics or autonomous driving, the accuracy of the outputs from VLVMs, which combine text and visual inputs, is…
Artificial Intelligence (AI) systems, such as Vision-Language Models (VLVMs), are becoming increasingly advanced, integrating text and visual inputs to generate responses. These models are being used in critical contexts, such as medical diagnostics and autonomous driving, where accuracy is paramount. However, researchers have identified a significant issue in these models, which they refer to as…
Mixture-of-experts (MoE) architectures, designed for better scaling of model sizes and more efficient inference and training, present a challenge to optimize due to their non-differentiable, discrete nature. Traditional MoEs use a router network which directs input data to expert modules, a process that is complex and can lead to inefficiencies and under-specialization of expert modules.…
Large Language Models (LLMs) play a crucial role in computational linguistics. However, their enormous size and the massive computational demands they require make deploying them very challenging. To faciliate simpler computations and boost model performance, a process of "quantization" is used, which simplifies the data involved. Traditional quantization techniques convert high-precision numbers into lower-precision integers,…
The recent development of large language models (LLMs), which can generate high-quality content across various domains, has revolutionized the field of natural language creation. These models are fundamentally of two types: those with open-source model weights and data sources, and those for which all model-related information, including training data, data sampling ratios, logs, checkpoints, and…
Generative Pre-trained Transformers (GPT) have significantly transformed the gaming industry, from game development to gameplay experiences. This is according to a comprehensive review that draws from 55 research articles published between 2020 and 2023, as well as other papers.
GPT's application in Procedural Content Generation (PCG) allows for increased creativity and efficiency in game development. For…
In the world of medical technology, the use of large language models (LLMs) is becoming instrumental, largely due to their ability to analyse and discern copious amounts of medical text, providing insight that would typically require extensive human expertise. The evolution of such technology could lead to substantial reductions in healthcare costs and broaden access…
Robotics traditionally operates within two dominant architectures: modular hierarchical policies and end-to-end policies. The former uses rigid layers like symbolic planning, trajectory generation, and tracking, whereas the latter uses high-capacity neural networks to directly connect sensory input to actions. Large language models (LLMs) have rejuvenated the interest in hierarchical control architectures, with researchers using LLMs…