AI News

Improving Industrial Anomaly Identification with RealNet: A Comprehensive AI Framework for Realistic Anomaly Creation and Effective Feature Reconstruction

Researchers from Capital Normal University and the School of Artificial Intelligence at Beijing University of Posts and Telecommunications have developed RealNet, a new feature reconstruction framework for industrial image anomaly detection. This approach addresses ongoing issues with generating diverse, realistic anomalies that align with natural distributions, as well as challenges around feature redundancy and pre-training…

Tyler Perry pauses $800 million expansion of his studio due to OpenAI’s Sora.

Tyler Perry, an acclaimed film producer, recently revealed that he has indefinitely postponed the $800 million expansion of his Atlanta studio. The decision comes in the wake of OpenAI's latest technological innovation, a text-to-video model called Sora. Unveiled on February 15, 2024, Sora allows users to convert text prompts into video. This…

Cognition introduces Devin, billed as the world’s first fully autonomous AI software engineer.

US-based startup Cognition introduced Devin, billed as the world's first fully autonomous AI software engineer, on March 17, 2024. Devin can resolve engineering tasks independently using its built-in shell, code editor, and web browser. One of its key features is its proficiency in fixing bugs on GitHub autonomously. Cognition has demonstrated…

Presenting the Future of AI Perception: KAIST Scientists Develop Breakthrough MoAI Model, Using External Computer Vision Learning to Establish a Connection between Visual Perception and Comprehension.

A research team from the Korea Advanced Institute of Science and Technology (KAIST) has developed MoAI, a model that combines AI’s language understanding with visual perception. The model uses auxiliary visual information from specialized computer vision (CV) models, which provides a more nuanced understanding of…

Researchers from Google DeepMind enhance visual-language models using synthetic captions and image embeddings, a method called ‘Synth2’.

Visual Language Models (VLMs), which are powerful tools for processing visual and textual data, can face difficulties due to limited data availability. Recent research developments have shown that pre-training these models on larger image-text datasets can enhance their performance in downstream tasks. However, creating these datasets can be challenging because of paired data scarcity, high…

Apple has purchased the AI company DarwinAI, a move set to enhance Tim Cook’s AI ambitions.

In a move that aligns with its AI focus, Apple has acquired DarwinAI, a Canadian startup. The purchase, which Apple has yet to officially announce, reportedly occurred earlier this year. DarwinAI's strength lies in its development of AI systems for visual inspection of components during manufacturing processes. The company's…

Surpassing Pixels: Amplifying Digital Artistry through Subject-Driven Image Generation

Image generation from textual descriptions has revolutionized the way technology intersects with creativity. A domain that has garnered interest is subject-driven image generation. Its potential lies in creating personalized images of specific subjects from a minimal set of examples. Yet, the inability to fully capture and depict detailed attributes of a given subject within its…

Teenagers from Miami apprehended for producing nude pictures of their peers using artificial intelligence.

On March 14th, 2024, two teenage students from Miami, Florida, aged 13 and 14, were arrested for allegedly creating and sharing explicit images of their classmates using artificial intelligence (AI). The juveniles, who were students at Pinecrest Cove Academy, reportedly used an unnamed AI application to generate and circulate the non-consensual pictures of their peers,…

Stability AI and Tripo AI present TripoSR, a model for fast feed-forward 3D reconstruction from a single image.

In the field of 3D generative AI, it has become possible to reconstruct 3D objects from a limited number of views. Propelled by large-scale 3D datasets and advances in generative model architectures, research has increasingly turned to using 2D diffusion models to create 3D objects from input text or photos. This is primarily to address the…

Visualizing and Listening: Integrating Visual and Auditory Realms through Artificial Intelligence

The creation of lifelike images, videos, and sounds using artificial intelligence (AI) has significantly progressed recently. However, most of these developments have been focused on single modalities, ignoring the inherent multimodal nature of our world. In addressing this, researchers have introduced a novel optimization-based framework designed to seamlessly integrate visual and audio content creation. By…

Jeff Bezos's investment in an AI search engine is predicted to double in the coming months.

Perplexity AI, a startup launched in August 2022, aims to compete with Google in the search engine sector. The company's technology merges the capabilities of a chatbot and a traditional search engine, and it is attracting investment from figures including Amazon founder Jeff Bezos. In the first few months of 2024, Perplexity AI…

Researchers from UNC-Chapel Hill have developed Contrastive Region Guidance (CRG), a training-free AI method that gives open-source vision-language models (VLMs) the ability to respond to visual cues.

Modern vision-language models (VLMs) have made significant progress on multimodal tasks by merging the reasoning abilities of large language models (LLMs) with visual encoders like ViT. Nevertheless, despite their impressive performance in tasks involving entire images, these models often struggle with fine-grained region grounding, inter-object spatial relations, and compositional reasoning. They…
