Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini

Baidu Inc., China’s largest search engine company, released a new artificial intelligence model on Monday that its developers claim outperforms competitors from Google and OpenAI on several vision-related benchmarks despite using a fraction of the computing resources typically required for such systems.

The model, dubbed ERNIE-4.5-VL-28B-A3B-Thinking, is the latest salvo in an escalating competition among technology companies to build AI systems that can understand and reason about images, videos, and documents alongside traditional text. Those capabilities are increasingly critical for enterprise applications ranging from automated document processing to industrial quality control.

What sets Baidu’s release apart is its efficiency: the model activates just 3 billion parameters during operation while maintaining 28 billion total parameters through a sophisticated routing architecture. According to documentation released with the model, this design allows it to match or exceed the performance of much larger competing systems on tasks involving document understanding, chart analysis, and visual reasoning while consuming significantly less computational power and memory.
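To make the "3 billion active out of 28 billion total" idea concrete, here is a minimal Python sketch of top-k expert routing, the general technique behind sparse architectures of this kind. It is an illustration under stated assumptions, not Baidu's implementation: the article does not describe the router, so the expert count (8), the value of k (2), and the function names `route_top_k` and `active_fraction` are all hypothetical. Only the 3 billion and 28 billion figures come from the article.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Hypothetical helper: real routers differ in detail, but the core idea
    is that only the selected experts run for a given input.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def active_fraction(active_params, total_params):
    """Share of the model's parameters that participate in one forward pass."""
    return active_params / total_params


# Figures reported in the article.
ARTICLE_ACTIVE = 3e9
ARTICLE_TOTAL = 28e9

if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: 8 experts with random scores, purely for illustration.
    scores = [rng.gauss(0, 1) for _ in range(8)]
    for expert, weight in route_top_k(scores, k=2):
        print(f"expert {expert} gets weight {weight:.3f}")
    print(f"active share: {active_fraction(ARTICLE_ACTIVE, ARTICLE_TOTAL):.1%}")
```

With the article's numbers, the active share works out to roughly 10.7 percent of the total parameters. That is the sense in which the model can carry the capacity of a 28-billion-parameter system while paying a per-input compute cost closer to that of a 3-billion-parameter one.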

#performance