University of Waterloo researchers have introduced GenAI-Arena, a user-centric evaluation platform for generative AI models, filling a critical gap in fair and efficient automatic assessment methods. Traditional metrics such as FID, CLIP, and FVD provide insight into visual content generation but may not adequately capture user satisfaction or the aesthetic quality of generated outputs. GenAI-Arena allows users not…
Large Language Models (LLMs) are powerful artificial intelligence tools for natural language processing. However, adapting these large models to specific tasks requires extensive fine-tuning, a process that usually involves adjusting a considerable number of parameters and therefore consumes significant computational resources and memory. This means the fine-tuning of LLMs is…
The field of document understanding, which involves transforming documents into meaningful information, has gained significance with the advent of large language models and the increasing use of document images across industries. The primary challenge for researchers in this field, however, is the effective extraction of information from documents that contain a mix of text and visual…
Google’s mobile keyboard app, Gboard, uses statistical decoding to counteract the inherent inaccuracies of touch input on small screens, often referred to as the ‘fat finger’ problem. To assist users, Gboard offers several features, including word completion, next-word prediction, active auto-correction, and active key correction. However, these models struggle with more complex errors which…
Retrieval Augmented Generation (RAG) is a method that helps Large Language Models (LLMs) produce more accurate and relevant output by incorporating a document retrieval system. Current RAG solutions struggle with multi-aspect queries that require diverse content from multiple documents. Standard techniques like RAPTOR, Self-RAG, and Chain-of-Note focus on data relevance but are not efficient in…
Human-computer interaction (HCI) is the study of how humans interact with computers, with a specific focus on designing innovative interfaces and technologies. One aspect of HCI that has gained prominence is the integration of large language models (LLMs) like OpenAI's GPT models into educational frameworks, specifically undergraduate programming courses. These AI tools have the potential…
Natural Language Processing (NLP) aims to enable computers to understand and generate human language, facilitating human-computer interaction. Despite advancements in NLP, large language models (LLMs) often fall short when it comes to complex planning tasks, such as decision-making and organizing actions. These abilities are crucial in a diverse array of applications, from daily tasks to strategic…