ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Driven by improved architectures and better representation learning frameworks, the field of visual recognition has enjoyed rapid modernization and a performance boost in the early 2020s. For example, modern ConvNets, represented by ConvNeXt, have demonstrated strong performance in various scenarios. While these models were originally designed for supervised learning with ImageNet labels, they can also potentially benefit from self-supervised learning techniques such as masked autoencoders (MAE). However, we found that simply combining these two approaches leads to subpar performance. In this paper, we propose a fully convolutional masked autoencoder framework and a new Global Response Normalization (GRN) layer that can be added to the ConvNeXt architecture to enhance inter-channel feature competition. This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation. We also provide pre-trained ConvNeXt V2 models of various sizes, ranging from an efficient 3.7M-parameter Atto model with 76.7% top-1 accuracy on ImageNet, to a 650M-parameter Huge model that achieves a state-of-the-art 88.9% accuracy using only public training data. — Read More

#image-recognition

Introducing ChatGPT Enterprise

We’re launching ChatGPT Enterprise, which offers enterprise-grade security and privacy, unlimited higher-speed GPT-4 access, longer context windows for processing longer inputs, advanced data analysis capabilities, customization options, and much more. We believe AI can assist and elevate every aspect of our working lives and make teams more creative and productive. Today marks another step towards an AI assistant for work that helps with any task, is customized for your organization, and protects your company data. — Read More

#chatbots

Alibaba opens AI model Tongyi Qianwen to the public

Alibaba said on Wednesday it would open its artificial intelligence model Tongyi Qianwen to the public, in a sign it has gained Chinese regulatory approval to mass-market the model.

Authorities in China have recently accelerated efforts to support companies developing AI as the technology increasingly becomes a focus of competition with the United States. — Read More

#big7

California lawmakers want to protect actors from being replaced by artificial intelligence

As Hollywood actors and writers continue to strike for better pay and benefits, California lawmakers are hoping to protect workers from being replaced by their digital clones.

On Wednesday, Assemblymember Ash Kalra (D-San José) was expected to introduce a bill that would give actors and artists a way to nullify provisions in vague contracts that allow studios and other companies to use artificial intelligence to digitally clone their voices, faces and bodies. — Read More

#legal, #vfx