Tag Archives: Image Recognition
Meta claims its new art-generating model is best-in-class
… Today, Meta announced CM3Leon (“chameleon” in clumsy leetspeak), an AI model that the company claims achieves state-of-the-art performance for text-to-image generation. CM3Leon is also distinguished by being one of the first image generators capable of generating captions for images, laying the groundwork for more capable image-understanding models going forward, Meta says.
“With CM3Leon’s capabilities, image generation tools can produce more coherent imagery that better follows the input prompts,” Meta wrote in a blog post shared with TechCrunch earlier this week. “We believe CM3Leon’s strong performance across a variety of tasks is a step toward higher-fidelity image generation and understanding.” — Read More
StyleDrop: Text-To-Image Generation in Any Style
We present StyleDrop, a method that enables the generation of images that faithfully follow a specific style, powered by Muse, a text-to-image generative vision transformer. StyleDrop is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. StyleDrop works by efficiently learning a new style by fine-tuning very few trainable parameters (less than 1% of total model parameters), and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive results even when the user supplies only a single image specifying the desired style. An extensive study shows that, for the task of style tuning text-to-image models, StyleDrop on Muse convincingly outperforms other methods, including DreamBooth and Textual Inversion on Imagen or Stable Diffusion. — Read More
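To give a feel for how "less than 1% of total model parameters" can come about, here is a rough back-of-the-envelope sketch. The excerpt above does not say which mechanism StyleDrop uses, so this assumes small bottleneck adapters added to a frozen transformer. The layer count, widths, and bottleneck size are hypothetical values chosen for illustration, not figures for Muse.

```python
def linear_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Number of weights (and optional biases) in one dense layer."""
    return d_in * d_out + (d_out if bias else 0)


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Assumed adapter: a down-projection followed by an up-projection."""
    return linear_params(d_model, bottleneck) + linear_params(bottleneck, d_model)


def trainable_fraction(num_layers: int, d_model: int, d_ff: int, bottleneck: int) -> float:
    """Share of parameters that are trained when only the adapters are updated."""
    # Frozen weights per layer: four attention projections plus a two-layer feed-forward block.
    frozen_per_layer = (
        4 * linear_params(d_model, d_model)
        + linear_params(d_model, d_ff)
        + linear_params(d_ff, d_model)
    )
    frozen = num_layers * frozen_per_layer
    trainable = num_layers * adapter_params(d_model, bottleneck)
    return trainable / (frozen + trainable)


# Hypothetical model shape, for illustration only.
fraction = trainable_fraction(num_layers=24, d_model=1024, d_ff=4096, bottleneck=8)
print(f"Trainable share: {fraction:.4%}")
```

With these made-up sizes, the adapters account for well under 1% of the weights, which is the general idea behind parameter-efficient fine-tuning: the base model stays frozen and only a small add-on is trained for each new style.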
Paragraphica – Context to image (AI) camera
Created by Bjørn Karmann, Paragraphica is a camera that utilizes location data and AI to visualize a “photo” of a specific place and moment. The camera exists both as a physical prototype and an online camera that you can try. — Read More
This AI-Powered, Point-Based Photo Manipulation System is Wild
Researchers have developed a point-based image manipulation system that uses generative artificial intelligence (AI) technology to allow users to precisely control the pose, shape, expression, and layout of objects.
The research outlines how users can control generative adversarial networks (GANs) with intuitive graphical control. The technology is called DragGAN. — Read More
StableStudio is Stability AI’s latest commitment to open-source AI
Stability AI has announced StableStudio, a new open-source variant of its DreamStudio AI text-to-image web app.
Stability AI is releasing an open-source version of DreamStudio, a commercial interface for the company’s AI image generator model, Stable Diffusion. In a press statement on Wednesday, Stability AI said the new release — dubbed StableStudio — “marks a fresh chapter” for the platform and will serve as a showcase for the company’s “dedication to advancing open-source development.” — Read More
Stability AI releases an open source text-to-animation tool
You’ve heard of text-to-image, but have you heard of text-to-animation?
From anime to childhood classics, animations have brought stories to life by combining still images. Now, with just a text prompt, you can generate your own animations using AI.
On Thursday, Stability AI, the AI company that created Stable Diffusion, unveiled a text-to-animation tool that allows developers and artists to use Stable Diffusion models to generate animations. — Read More
Google’s open-source AI tool let me play my favorite Dreamcast game with my face
Project Gameface is ready to install as a Windows app that makes gaming more accessible using only your webcam.
While Wednesday’s Google I/O event largely hyped the company’s biggest AI initiatives, the company also announced updates to the machine learning suite that powers Google Lens and Google Meet features like object tracking and recognition, gesture control, and of course, facial detection. The newest update enables app developers to, among other things, create Snapchat-like face filters and hand tracking, with the company showing off a GIF that’s definitely not a Memoji.
This update underpins a special project announced during the I/O developer keynote: an open-source accessibility application called Project Gameface, which lets you play games… with your face. During the keynote, Google played a very Wes Anderson-esque mini-documentary revealing a tragedy that prompted the company to design Gameface. — Read More
MidJourney Has Competition (And It’s Free To Use)!
Midjourney 5.1 Arrives – And It’s Another Leap Forward For AI Art
Midjourney 5.1 has been released, bringing another significant improvement in the quality of results from the generative AI art service.
The company claims that version 5.1 of the engine is “more opinionated”, bringing it closer to the kind of results that you would get with version 4 of Midjourney, but at a higher quality. There’s also a “raw” mode, for those who don’t want images that are as strongly opinionated.
Other claimed improvements include greater accuracy, fewer unwanted borders or text artifacts in images, and improved sharpness. Read More