Chinese AI company SenseTime announced a major upgrade to its flagship product, SenseNova 5.5, at the 2024 World Artificial Intelligence Conference & High-Level Meeting on Global AI Governance. The update incorporates SenseNova 5o, which SenseTime describes as the first real-time multimodal model in China, and reflects the company's commitment to delivering innovative, practical applications across industries.
SenseNova 5o introduces a new AI interaction framework that processes and responds to multimodal data spanning audio, text, image, and video. With streaming interaction capabilities comparable to those of GPT-4o, it offers a user experience akin to human conversation. This makes it well suited to real-time conversation and speech recognition applications, where it can adapt to context and give meaningful responses.
SenseNova 5.5 also includes a cost-effective edge-side large model, which cuts the cost per device to as low as RMB 9.90 per year and makes the technology accessible to a broader range of users and industries. SenseTime says its product matrix will receive regular updates, supporting generative AI applications across different contexts and industries. More than 3,000 government and corporate customers in the technology, healthcare, finance, and programming sectors have already deployed the SenseNova Large Model.
Dr. Xu Li, CEO of SenseTime, underscored the importance of the upgrade, calling it a turning point in the evolution of large models. According to Dr. Xu, the focus on interactivity and the continued development of multimodal streaming interaction will significantly transform how humans and AI interact.
SenseNova 5.5 is built on a hybrid cloud-edge collaborative expert architecture, which improves coordination between cloud and edge and reduces inference costs. Overall performance is 30% better than that of SenseNova 5.0, with gains in mathematical reasoning, English proficiency, and instruction-following.
SenseTime also launched SenseChat Lite-5.5, which offers shorter inference time and higher inference speed than its predecessor. The edge-side model product matrix includes tailored models such as the SenseChat Mini Writing Assistant, the Summary Assistant, and the Encyclopedia Assistant, each catering to specialized business needs.
Another new product is Vimi, SenseTime’s controllable AI avatar video generator. It can create short videos with control over facial expressions and upper body movements, which makes it suited to generating video content for entertainment and interactive applications.
SenseTime also unveiled an initiative named “Project $0 Go” to provide an onboarding bundle for enterprise users migrating from the OpenAI platform. This package includes 50 million tokens and API migration consulting services.
Founded in 2014, SenseTime has spent its first 10 years building a full-stack large model product matrix that covers cloud-to-edge applications. As it continues to develop and expand the SenseNova industry ecosystem, the company says it is committed to supporting more businesses and communities in their digital transformation initiatives.