How AI Will Completely Dominate the Animation Industry In Less Than 5 Years

If you’re looking to get into animation as a career, you have less than half a decade to do something meaningful.

Why?

  1. DALL-E 2 and other AI art models can now produce a near-infinite variety of illustrations using a simple text prompt. By 2025, they’ll outperform human artists on every metric.
  2. AI animation models already exist that can take a static illustration and “imagine” different movements, poses, and frames. You can make the Mona Lisa smile, laugh, or cry — and there’s nothing stopping you from doing that to other images, too.
  3. AI video models are right around the corner. Soon, studios will be able to create smooth videos of any framerate with nothing more than a text prompt. Short films will be next.

Read More

#vfx

Attention in the Human Brain and Its Applications in ML

Some objects grab our attention when we see them, even when we are not actively looking for them. How precisely does this happen? And, more importantly, how can we incorporate this phenomenon to improve our computer vision models? In this article, I will explain the process of paying attention to salient (i.e., noticeable) objects in the visual scene and its applications in Machine Learning, from the perspective of an AI researcher rather than from neuroscience alone.

Visual perception, saliency, and attention have been active research topics in neuroscience for decades. The discoveries and advancements that these researchers have made have helped AI researchers understand and mimic the process(es) in the human brain. Indeed, saliency and attention are active research topics in the AI community, too. The outcome is a wide spectrum of applications ranging from better language understanding to autonomous driving. But before we can understand the AI perspective on attention, we’ll first have to understand it from the neuroscience perspective. Read More
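
To ground the ML side of the story, here is a minimal sketch of scaled dot-product attention, the mechanism at the heart of Transformers. The shapes and random inputs are illustrative, not taken from the article:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over values V given queries Q and keys K.

    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
    Returns a (n_queries, d_v) weighted sum of values.
    """
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled for numerical stability.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into a distribution: the model "attends" most
    # strongly to the most relevant (salient) keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage with random vectors.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 8)), rng.normal(size=(5, 8)), rng.normal(size=(5, 4))
out = scaled_dot_product_attention(Q, K, V)  # shape: (2, 4)
```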

#human

Meta Introduces Make-A-Video: An AI system that generates videos from text

Today, we’re announcing Make-A-Video, a new AI system that lets people turn text prompts into brief, high-quality video clips. Make-A-Video builds on Meta AI’s recent progress in generative technology research and has the potential to open new opportunities for creators and artists. The system learns what the world looks like from paired text-image data and how the world moves from video footage with no associated text. As part of our continued commitment to open science, we’re sharing details in a research paper and plan to release a demo experience. Read More

#big7, #image-recognition, #nlp

Your ML setup is not unique: you don’t need more data scientists

We’ve long been working on a diverse set of ML projects, and we see the same decisions taken and the same mistakes made again and again. ML is commoditizing, and there is no way to escape it.

… As the industry gains more experience in the area, a few common open-source tools emerge. They might implement only 90% of the feature set of SaaS or in-house solutions, but that’s more than enough for 90% of the industry. Welcome to the commoditized world, where you can do ML without writing Python code. Read More

#strategy

NVIDIA Launches Large Language Model Cloud Services to Advance AI and Digital Biology

NVIDIA NeMo LLM Service Helps Developers Customize Massive Language Models; NVIDIA BioNeMo Service Helps Researchers Generate and Predict Molecules, Proteins, DNA

NVIDIA today announced two new large language model cloud AI services — the NVIDIA NeMo Large Language Model Service and the NVIDIA BioNeMo LLM Service — that enable developers to easily adapt LLMs and deploy customized AI applications for content generation, text summarization, chatbots, code development, protein structure prediction, biomolecular property prediction, and more.

The NeMo LLM Service allows developers to rapidly tailor a number of pretrained foundation models using a training method called prompt learning on NVIDIA-managed infrastructure. The NVIDIA BioNeMo Service is a cloud application programming interface (API) that expands LLM use cases beyond language and into scientific applications to accelerate drug discovery for pharma and biotech companies. Read More
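
The announcement doesn’t spell out the method, but prompt learning commonly means keeping the pretrained model frozen and training only a small set of “virtual token” embeddings prepended to the input. Below is a minimal PyTorch sketch of that idea; the class, names, and shapes are illustrative assumptions, not NeMo’s API:

```python
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Prompt tuning: only `soft_prompt` is trained; the base LM stays frozen."""

    def __init__(self, frozen_lm, embed_dim, n_virtual_tokens=20):
        super().__init__()
        self.lm = frozen_lm
        for p in self.lm.parameters():
            p.requires_grad = False  # base model weights are never updated
        # Trainable "virtual token" embeddings, one row per prompt position.
        self.soft_prompt = nn.Parameter(torch.randn(n_virtual_tokens, embed_dim) * 0.02)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim) from the LM's embedding layer.
        batch = token_embeddings.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the learned prompt, then run the frozen model as usual.
        return self.lm(torch.cat([prompt, token_embeddings], dim=1))

# Toy usage with a stand-in "LM" (a single linear layer over embeddings).
model = SoftPromptModel(nn.Linear(32, 32), embed_dim=32)
out = model(torch.randn(4, 10, 32))  # (4, 30, 32): 20 prompt tokens + 10 inputs
```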

#nvidia, #nlp

Robotic coffee barista maker led by ex-AWS engineer raises $8.3M to open more retail locations

A Seattle coffee company is aiming to change the way lattes and espressos are made.

No, it’s not Starbucks. It’s Artly, a 2-year-old startup that just raised $8.3 million to fuel growth of its robotic baristas.

Artly has developed an AI-powered machine that it claims makes a “perfect cup of coffee every time,” using computer vision algorithms to guide a robotic arm and monitor drink quality. It has five retail locations across the West Coast and will use the new funding to expand its model. Read More

#robotics

QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars

Real-time tracking of human body motion is crucial for interactive and immersive experiences in AR/VR. However, very limited sensor data about the body is available from standalone wearable devices such as HMDs (Head Mounted Devices) or AR glasses. In this work, we present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers, and simulates plausible and physically valid full body motions. Using high quality full body motion as dense supervision during training, a simple policy network can learn to output appropriate torques for the character to balance, walk, and jog, while closely following the input signals. Our results demonstrate surprisingly similar leg motions to ground truth without any observations of the lower body, even when the input is only the 6D transformations of the HMD. We also show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments. Read More
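
As a toy illustration of the input/output contract the abstract describes (not the paper’s actual architecture), a policy of this kind maps the 6D transforms of the headset and two controllers to per-joint torques for a simulated character:

```python
import torch
import torch.nn as nn

class SparseTrackingPolicy(nn.Module):
    """Toy policy: sparse sensor signals in, joint torques out.

    Illustrative sizes only: 3 tracked devices (HMD + 2 controllers),
    each a 6D transform (3D position + 3D rotation), driving a
    character with `n_joints` actuated joints.
    """

    def __init__(self, n_devices=3, n_joints=24, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_devices * 6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_joints),  # one torque per actuated joint
        )

    def forward(self, device_transforms):
        # device_transforms: (batch, n_devices * 6) flattened sensor input.
        return self.net(device_transforms)

policy = SparseTrackingPolicy()
torques = policy(torch.randn(1, 18))  # 3 devices x 6D -> (1, 24) torques
```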

#image-recognition

IARPA Kicks off Research Into Linguistic Fingerprint Technology

The Intelligence Advanced Research Projects Activity (IARPA), the research and development arm of the Office of the Director of National Intelligence, today announced the launch of a program that seeks to engineer novel artificial intelligence technologies capable of attributing authorship and protecting authors’ privacy.

The Human Interpretable Attribution of Text Using Underlying Structure (HIATUS) program represents the Intelligence Community’s latest research effort to advance human language technology. The resulting innovations could have far-reaching impacts, with the potential to counter foreign malign influence activities; identify counterintelligence risks; and help safeguard authors who could be endangered if their writing is connected to them. Read More
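
HIATUS’s techniques aren’t public, but a classical baseline for authorship attribution is stylometry: character n-gram features feeding a linear classifier. The scikit-learn sketch below is a hypothetical illustration; the corpus and author labels are made up:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: two "authors" with distinct styles (illustrative data only).
texts = [
    "Honestly, I reckon the results speak for themselves.",
    "Honestly, I reckon we ought to rerun the experiment.",
    "The data indicate a statistically significant effect.",
    "The data indicate that further replication is required.",
]
authors = ["A", "A", "B", "B"]

# Character n-grams capture punctuation and spelling habits that tend to
# persist across an author's documents.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(),
)
model.fit(texts, authors)
print(model.predict(["I reckon the effect is significant."]))  # predicted author label
```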

#privacy, #ic

Beyond Jupyter Notebooks: MLOps Environment Setup & First Deployment

Read More

#mlops, #videos

DeepMind’s new chatbot uses Google searches plus humans to give better answers

The lab trained a chatbot to learn from human feedback and search the internet for information to support its claims.

The trick to making a good AI-powered chatbot might be to have humans tell it how to behave—and force the model to back up its claims using the internet, according to a new paper by Alphabet-owned AI lab DeepMind. 

In a new non-peer-reviewed paper out today, the team unveils Sparrow, an AI chatbot that is trained on DeepMind’s large language model Chinchilla.

Sparrow is designed to talk with humans and answer questions, using a live Google search to inform those answers. Based on how useful people find those answers, it’s then trained using a reinforcement learning algorithm, which learns by trial and error to achieve a specific objective. This system is intended to be a step forward in developing AIs that can talk to humans without dangerous consequences, such as encouraging people to harm themselves or others. Read More
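
The article doesn’t give training details, but learning from human preferences typically starts by fitting a reward model on pairwise comparisons of answers, then optimizing the chatbot against that reward. Here is a minimal sketch of the pairwise (Bradley-Terry) loss, with illustrative names and shapes:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, preferred, rejected):
    """Bradley-Terry loss: the human-preferred response should score higher.

    `preferred` / `rejected`: batched feature tensors for two candidate
    answers to the same question, as judged by human raters.
    """
    r_pref = reward_model(preferred)  # (batch, 1) scalar rewards
    r_rej = reward_model(rejected)
    # Maximize the probability that the preferred answer wins the comparison.
    return -F.logsigmoid(r_pref - r_rej).mean()

# Toy usage: a linear reward model over 16-dim answer features.
reward_model = torch.nn.Linear(16, 1)
loss = preference_loss(reward_model, torch.randn(8, 16), torch.randn(8, 16))
loss.backward()  # gradients flow only into the reward model's parameters
```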

#chatbots