Artificial Intelligence (AI) and deep learning have made significant advances, particularly in generative modeling, a subfield of machine learning in which models are trained to produce new samples that resemble the training data. Generative AI systems have shown remarkable capabilities, such as creating images from written descriptions and solving complex reasoning problems. Autoregressive modeling is central to many of these deep generative models, particularly in Natural Language Processing (NLP). The technique factorizes the probability of a sequence into a product of conditional probabilities, predicting each token from the tokens that precede it. However, autoregressive transformers have drawbacks: their outputs are hard to control, and text must be produced one token at a time from left to right, which makes generation slow.
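As a concrete illustration, here is a minimal sketch of that chain-rule factorization in Python. The vocabulary and the hand-written bigram table are invented for the example; a real autoregressive language model conditions on the full prefix, but the chain-rule bookkeeping is the same.

```python
import numpy as np

# Hypothetical bigram model: P[i, j] is the probability of token j
# following token i. (Values are made up for illustration.)
vocab = ["<s>", "the", "cat", "sat"]
P = np.array([
    [0.0, 0.8, 0.1, 0.1],  # after <s>
    [0.0, 0.0, 0.7, 0.3],  # after "the"
    [0.1, 0.1, 0.1, 0.7],  # after "cat"
    [0.5, 0.3, 0.1, 0.1],  # after "sat"
])

sequence = [0, 1, 2, 3]  # "<s> the cat sat"

# Chain rule: log p(x_1..x_T) = sum_t log p(x_t | x_{<t}).
log_likelihood = sum(
    np.log(P[prev, nxt]) for prev, nxt in zip(sequence, sequence[1:])
)
print(f"log-likelihood: {log_likelihood:.3f}")  # log(0.8 * 0.7 * 0.7)
```

Sampling from such a model is inherently sequential, one token per model evaluation, which is the latency bottleneck the paragraph above refers to.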
To overcome these limitations, researchers have turned to alternative text generation models, including diffusion models, which were first successful in image generation. These models gradually convert random noise into structured data, but on text they have struggled to match autoregressive models in speed, efficiency, and quality. A team of researchers has introduced Score Entropy Discrete Diffusion (SEDD) models as a solution. SEDD parameterizes a reverse discrete diffusion process in terms of ratios of the data distribution, learned with a novel loss function called score entropy. The approach extends score matching, the technique underlying continuous diffusion models, to discrete data such as text.
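To make the objective concrete, below is a self-contained PyTorch sketch of a simplified score entropy loss on a toy categorical distribution. This is our illustrative reading of the objective, not the paper's implementation: in SEDD the ratios p(y)/p(x) are unknown and estimated by a neural network across a diffusion process, whereas here we use a small known distribution just to show that minimizing the loss recovers those ratios.

```python
import torch

# Toy data distribution over a 4-symbol vocabulary (values invented).
p = torch.tensor([0.1, 0.2, 0.3, 0.4])
V = len(p)

# Learnable log-ratios: log_s[x, y] estimates log(p(y) / p(x)).
log_s = torch.zeros(V, V, requires_grad=True)
opt = torch.optim.Adam([log_s], lr=0.1)

true_ratio = p.unsqueeze(0) / p.unsqueeze(1)  # ratio[x, y] = p(y) / p(x)
off_diag = ~torch.eye(V, dtype=torch.bool)    # score entropy sums over y != x

for step in range(2000):
    s = log_s.exp()
    a = true_ratio
    # Pointwise score entropy: s - a*log(s) + a*(log(a) - 1).
    # Each term is convex in log(s) and minimized exactly at s = a.
    loss_xy = s - a * torch.log(s) + a * (torch.log(a) - 1.0)
    # Expectation over x ~ p, summed over the off-diagonal pairs.
    loss = (p.unsqueeze(1) * loss_xy)[off_diag].sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# The learned ratios should match the true data-distribution ratios.
print(torch.allclose(log_s.exp()[off_diag], true_ratio[off_diag], atol=1e-2))
```

Because the optimum of each term sits exactly at the true ratio, a model trained this way learns the quantities needed to run the reverse diffusion process over discrete tokens.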
SEDD matches or exceeds existing language diffusion models on standard language modeling benchmarks and is competitive with autoregressive models of similar size. It outperformed models such as GPT-2 on zero-shot perplexity evaluations. The researchers also found that SEDD produces high-quality text samples and lets users trade compute for quality at sampling time: with fewer sampling steps, it achieves results comparable to GPT-2 while using fewer network evaluations. Additionally, the explicit parameterization of probability ratios gives more control over the generation process. SEDD performs well in both standard and infilling text generation compared with diffusion and autoregressive baselines, and it can generate or complete text from arbitrary positions in a sequence without specialized training.
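For reference, this is roughly how perplexity is computed for the publicly available GPT-2 baseline using the Hugging Face transformers library. The input string here is a stand-in; the paper's zero-shot evaluations use standard benchmark corpora.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Load the public GPT-2 checkpoint, the baseline SEDD is compared against.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."  # placeholder text
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # cross-entropy over next-token predictions.
    out = model(**enc, labels=enc["input_ids"])

# Perplexity is the exponential of the average per-token cross-entropy;
# lower is better.
perplexity = torch.exp(out.loss)
print(f"zero-shot perplexity: {perplexity.item():.2f}")
```

"Zero-shot" here means the model is scored on a corpus it was not fine-tuned on, so the comparison measures how well each model's learned distribution transfers.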
In conclusion, SEDD challenges the dominance of autoregressive models and marks a step forward in generative modeling for Natural Language Processing. It produces high-quality text quickly and with greater control, opening new doors for AI. Further details are available in the project's paper, GitHub repository, and blog post.