Stanford University researchers have developed a new method called Demonstration ITerated Task Optimization (DITTO), designed to align language model outputs directly with users’ demonstrated behaviors. The technique addresses several challenges language models (LMs) face, including the need for large training datasets, a tendency toward generic responses, and mismatches between a universal default style and application-specific preferences.
DITTO borrows ideas from online imitation learning to generate online comparison data inexpensively: users’ demonstrations are treated as preferred over outputs sampled from the LM and its intermediate training checkpoints. The method outperformed techniques such as supervised fine-tuning, few-shot prompting, and self-play by an average of 19 percentage points in win rate, offering a new and effective way to customize LMs.
Applicable across verticals such as news, emails, and blog posts, DITTO follows a three-step iterative process, sketched in code below. First, limited supervised fine-tuning is performed on the expert demonstrations; next, a comparison dataset is generated by sampling completions for each demonstration prompt, with the demonstrations ranked above the sampled outputs; finally, a reinforcement learning from human feedback (RLHF)-style preference update is applied to the policy.
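The loop can be pictured with a short structural sketch. The helper names below (sft_update, sample_completion, dpo_update) and the toy model object are illustrative placeholders rather than the authors’ implementation; the point is how each demonstration is paired against completions sampled from the current policy and earlier checkpoints before every preference update.

```python
# Structural sketch of a DITTO-style loop (illustrative only).
# sft_update, sample_completion, and dpo_update are hypothetical placeholders.
import random
from typing import List, Tuple

def sft_update(model: dict, demos: List[Tuple[str, str]]) -> dict:
    """Placeholder: fine-tune the model on (prompt, demonstration) pairs."""
    return {**model, "step": model["step"] + 1}

def sample_completion(model: dict, prompt: str) -> str:
    """Placeholder: draw a completion from the given model/checkpoint."""
    return f"completion@step{model['step']}:{prompt[:20]}"

def dpo_update(model: dict, pairs: List[Tuple[str, str, str]]) -> dict:
    """Placeholder: one preference-optimization step on (prompt, chosen, rejected)."""
    return {**model, "step": model["step"] + 1}

def ditto_like_loop(demos: List[Tuple[str, str]], iterations: int = 3) -> dict:
    model = {"step": 0}
    checkpoints = []
    # Step 1: limited supervised fine-tuning on the expert demonstrations.
    model = sft_update(model, demos)
    for _ in range(iterations):
        checkpoints.append(dict(model))
        # Step 2: build comparison data by sampling completions from the
        # current policy and earlier checkpoints; the user's demonstration
        # is always the preferred ("chosen") response.
        pairs = []
        for prompt, demo in demos:
            for ckpt in checkpoints + [model]:
                rejected = sample_completion(ckpt, prompt)
                pairs.append((prompt, demo, rejected))
        # Step 3: update the policy with an RLHF-style preference objective.
        model = dpo_update(model, random.sample(pairs, min(8, len(pairs))))
    return model

if __name__ == "__main__":
    demos = [("Write a short product update email.", "Hi team, quick update: ...")]
    final_model = ditto_like_loop(demos)
    print("finished at step", final_model["step"])
```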
The technique was evaluated with GPT-4 as an automated judge, achieving an average win rate of 77.09% across CMCC (71.67%) and CCAT50 (82.50%). This represents an average win-rate improvement of 11.7 points over the other methods. User studies likewise showed DITTO to be more effective than the alternative approaches.
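For readers unfamiliar with judge-based evaluation, here is a hedged sketch of how a GPT-4 head-to-head win rate might be computed; the judging prompt, helper names, and tie handling are assumptions made for illustration, not the paper’s exact protocol. It also notes that the overall figure is simply the mean of the two per-benchmark win rates: (71.67 + 82.50) / 2 = 77.085 ≈ 77.09.

```python
# Hedged sketch of a GPT-4-as-judge win-rate computation; prompt wording and
# helpers are illustrative assumptions, not the paper's evaluation protocol.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_prefers_a(prompt: str, output_a: str, output_b: str) -> bool:
    """Ask GPT-4 which completion better matches the demonstrated style."""
    question = (
        "Task prompt:\n" + prompt +
        "\n\nResponse A:\n" + output_a +
        "\n\nResponse B:\n" + output_b +
        "\n\nWhich response better matches the author's style? Answer 'A' or 'B'."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content.strip().upper().startswith("A")

def win_rate(examples: list) -> float:
    """examples: list of (prompt, ditto_output, baseline_output) triples."""
    wins = sum(judge_prefers_a(p, a, b) for p, a, b in examples)
    return 100.0 * wins / len(examples)

# Averaging the per-benchmark win rates reproduces the reported overall figure:
# (71.67 + 82.50) / 2 = 77.085, rounded to 77.09.
```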
While the Stanford researchers recognized the value of demonstrations as feedback, they did not evaluate larger model sizes due to computational costs and called for further analysis of the types of preference data needed, leaving room for further advances in the field.