Chinese tech firm Kuaishou Technology has launched a text-to-video (T2V) generator named ‘Kling,’ which may compete with OpenAI’s unreleased ‘Sora.’ Based in Beijing, Kuaishou builds content-sharing platforms designed to speed up the production, distribution, and consumption of content. Its short video platform, also named ‘Kuaishou,’ is second only to TikTok in terms of daily active users.
A desire to reduce the company’s reliance on human-generated content for its platforms may have been the catalyst for developing the T2V tool. Kling turns text prompts into high-quality videos that are temporally and spatially coherent. According to Kuaishou, Kling can generate videos of up to two minutes in length at 1080p resolution and 30 frames per second. That is a minute longer than the maximum length OpenAI has stated for Sora. Significantly, Kling has been released to the public, while Sora is still not available.
Like Sora, Kling is built on a diffusion transformer architecture. It also includes technology that reconstructs 3D faces and bodies, which allows it to take a full-body image as a prompt and create a video with smooth limb movements. The early demo videos showcase impressive motion, but the longer clips tend to be more scenic, with fewer dynamic elements, which may point to some limitations of the platform. Even so, Kling excels at generating natural-looking gestures, such as finger and mouth movements, an area where AI video generators typically struggle.
Notably, the release of Kling speaks to a broader divide in global approaches to AI development. As Western companies like OpenAI grapple with questions of AI safety, privacy, and disinformation, China is moving ahead with technological innovation despite US sanctions intended to restrict its advancements. While OpenAI may be focused on ensuring that Sora aligns with safety or ‘political correctness’ requirements, Chinese AI technologies could fill the gap for reliable T2V tools if no comparable tool is released in the US soon.