Rick's Cafe AI 9:29 am on April 6, 2023
Tags: Vision

Introducing Segment Anything: Working toward the first foundation model for image segmentation

Segmentation — identifying which image pixels belong to an object — is a core task in computer vision and is used in a broad array of applications, from analyzing scientific imagery to editing photos. But creating an accurate segmentation model for specific tasks typically requires highly specialized work by technical experts with access to AI training infrastructure and large volumes of carefully annotated in-domain data.
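To make "which image pixels belong to an object" concrete, here is a minimal sketch of a segmentation mask on a made-up 5x5 grayscale image. It uses a plain brightness threshold, which is an assumption chosen for illustration and far simpler than what a learned segmentation model does; the image values and the helper names `threshold_mask` and `mask_area` are hypothetical.

```python
# Toy 5x5 grayscale image (0-255); the bright 2x2 block stands in for an "object".
image = [
    [10, 12, 11, 9, 10],
    [11, 200, 210, 12, 10],
    [10, 205, 220, 11, 9],
    [12, 11, 10, 10, 11],
    [9, 10, 12, 11, 10],
]


def threshold_mask(img, threshold=128):
    """Return a binary mask: True where a pixel is brighter than the threshold."""
    return [[pixel > threshold for pixel in row] for row in img]


def mask_area(mask):
    """Count how many pixels the mask assigns to the object."""
    return sum(sum(row) for row in mask)


mask = threshold_mask(image)

# (row, column) coordinates of every pixel labelled as part of the object.
object_pixels = [
    (r, c) for r, row in enumerate(mask) for c, inside in enumerate(row) if inside
]
```

The mask has the same height and width as the image, with one yes/no label per pixel; that per-pixel labelling is the output a segmentation model produces, whatever method it uses to get there.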

Today, we aim to democratize segmentation by introducing the Segment Anything project: a new task, dataset, and model for image segmentation, as we explain in our research paper. We are releasing both our general Segment Anything Model (SAM) and our Segment Anything 1-Billion mask dataset (SA-1B), the largest ever segmentation dataset, to enable a broad set of applications and foster further research into foundation models for computer vision. We are making the SA-1B dataset available for research purposes and the Segment Anything Model is available under a permissive open license (Apache 2.0). Check out the demo to try SAM with your own images. Read More

#vision

Rick's Cafe AI 4:40 pm on November 10, 2021
Tags: Vision

The Future Direction And Vision For AI

This article traces the journey of Artificial Intelligence (AI) and its interrelationship with the arrival of the “era of Big Data” alongside 3G and 4G telecoms networks. It explores how we arrived at where we are now and where we are going next: an era of even bigger, albeit increasingly decentralised, data, in which AI meets the IoT (AIoT) and standalone 5G networks may arrive in the next few years. Read More

#vision

Rick's Cafe AI 3:23 pm on November 2, 2021
Tags: Robotics ( 205 ), Vision

New technology gives smart cars ‘x-ray’-like vision

Detects hidden pedestrians, cyclists

Australian researchers have developed a technology that allows autonomous vehicles to track moving pedestrians hidden behind buildings and cyclists obscured by cars, trucks, and buses.

The autonomous vehicle uses game-changing tools that allow it to ‘see’ the world around it using x-ray-style vision that penetrates through to pedestrian blind spots.

The technology has been developed as part of a project funded by the iMOVE Cooperative Research Centre in collaboration with the University of Sydney’s Australian Centre for Field Robotics and Australian connected vehicle company Cohda Wireless. iMOVE has today released its new findings in a final report following three years of research and development. Read More

#robotics, #vision

Rick's Cafe AI 9:07 pm on March 24, 2021
Tags: Image Recognition ( 313 ), Vision

AI backpack concept gives audio alerts to blind pedestrians

When Jagadish Mahendran heard about his friend’s daily challenges navigating as a blind person, he immediately thought of his artificial intelligence work.

“For years I had been teaching robots to see things,” he said. Mahendran, a computer vision researcher at the University of Georgia’s Institute for Artificial Intelligence, found it ironic that he had helped develop machines — including a shopping robot that could “see” stocked shelves and a kitchen robot — but nothing for people with low or no vision.

After exploring existing tech for blind and low vision people like camera-enabled canes or GPS-connected smartphone apps, he came up with a backpack-based AI design that uses cameras to provide instantaneous alerts. Read More

#image-recognition, #vision

Rick's Cafe AI 11:33 am on December 7, 2020
Tags: Image Recognition ( 313 ), Vision

Neuroscientists find a way to make object-recognition models perform better

Computer vision models known as convolutional neural networks can be trained to recognize objects nearly as accurately as humans do. However, these models have one significant flaw: Very small changes to an image, which would be nearly imperceptible to a human viewer, can trick them into making egregious errors such as classifying a cat as a tree.
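To make the idea of a tiny, targeted change concrete, here is a minimal sketch on a toy linear classifier rather than a convolutional network. The weights, input values, labels, and step size are all made up for illustration. The perturbation is a sign-of-gradient step in the style of the fast gradient sign method, which is an assumption on our part since the excerpt does not name a specific attack.

```python
# Hypothetical linear classifier: score > 0 means "cat", otherwise "tree".
weights = [1.0, -2.0, 0.5, -1.5]
bias = 0.0


def score(x):
    """Linear score; its gradient with respect to x is simply `weights`."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def predict(x):
    return "cat" if score(x) > 0 else "tree"


def sign(v):
    """Return 1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def perturb(x, eps):
    """Move every input value by at most eps in the direction that lowers the score."""
    return [v - eps * sign(w) for v, w in zip(x, weights)]


x = [0.52, 0.40, 0.60, 0.00]   # made-up "pixel" intensities in [0, 1]
x_adv = perturb(x, eps=0.02)   # no value changes by more than 0.02

original_label = predict(x)
adversarial_label = predict(x_adv)
```

Each input value moves by only 0.02, but because every move pushes the score the same way, the small changes add up across inputs and the predicted label flips. Real images have many thousands of pixels, so the per-pixel change needed can be far smaller.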

A team of neuroscientists from MIT, Harvard University, and IBM has developed a way to alleviate this vulnerability by adding to these models a new layer that is designed to mimic the earliest stage of the brain’s visual processing system. In a new study, they showed that this layer greatly improved the models’ robustness against this type of mistake. Read More

#image-recognition, #vision

Rick's Cafe AI 11:31 am on September 27, 2020
Tags: Vision

Computer Vision software for image and video identification

Computer vision often detects and locates objects in digital images and videos. As living organisms process images with their visual cortex, many researchers have taken the architecture of the mammalian visual cortex as a model for neural networks structured to perform image recognition.

Over the past 20 years, progress in computer vision has been remarkable. Read More

#vision

Rick's Cafe AI 11:15 am on September 25, 2020
Tags: Image Recognition ( 313 ), Vision

Computational Needs for Computer Vision (CV) in AI and ML Systems

Computer vision (CV) is a major task for modern Artificial Intelligence (AI) and Machine Learning (ML) systems. It’s accelerating nearly every domain in the tech industry, enabling organizations to revolutionize the way machines and business systems work.

… In this article, we briefly show you the common challenges associated with a CV system when it employs modern ML algorithms. Read More

#image-recognition, #vision

Rick's Cafe AI 10:06 am on August 4, 2020
Tags: Image Recognition ( 313 ), NLP ( 486 ), Vision

Sign language recognition using deep learning

TL;DR This work presents a dual-cam first-vision translation system using convolutional neural networks. A prototype was developed to recognize 24 gestures. The vision system is composed of a head-mounted camera and a chest-mounted camera, and the machine learning model is composed of two convolutional neural networks, one for each camera. Read More

#image-recognition, #nlp, #vision

Rick's Cafe AI 11:07 am on July 30, 2020
Tags: Image Recognition ( 313 ), Reinforcement Learning ( 78 ), Vision

Neuroevolution of Self-Interpretable Agents

Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of the selective attention in perception that lets us remain focused on important parts of our world without distraction from irrelevant details. Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck. By constraining access to only a small fraction of the visual input, we show that their policies are directly interpretable in pixel space. We find neuroevolution ideal for training self-attention architectures for vision-based reinforcement learning (RL) tasks, allowing us to incorporate modules that can include discrete, non-differentiable operations which are useful for our agent. We argue that self-attention has similar properties as indirect encoding, in the sense that large implicit weight matrices are generated from a small number of key-query parameters, thus enabling our agent to solve challenging vision-based tasks with at least 1000x fewer parameters than existing methods. Since our agent attends to only task-critical visual hints, it is able to generalize to environments where task-irrelevant elements are modified while conventional methods fail. Read More
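The claim that a large implicit weight matrix comes from a small number of key-query parameters can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the patch count, feature sizes, and random weights are made up, and picking patches by summing attention columns and keeping the top k is our simplified reading of a "self-attention bottleneck". All function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(patches, w_key, w_query):
    """Build an n x n attention matrix from n patches and two small projections."""
    keys = matmul(patches, w_key)        # n x d
    queries = matmul(patches, w_query)   # n x d
    d = len(w_key[0])
    queries_t = [list(col) for col in zip(*queries)]   # d x n
    scores = matmul(keys, queries_t)                   # n x n
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def top_k_patches(attn, k):
    """Rank patches by total attention received (column sums) and keep the top k."""
    importance = [sum(col) for col in zip(*attn)]
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return ranked[:k]


rng = random.Random(0)
n, d_in, d = 16, 12, 4   # 16 patches, 12 features each, projected down to 4
patches = [[rng.random() for _ in range(d_in)] for _ in range(n)]
w_key = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d_in)]
w_query = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d_in)]

attn = attention_matrix(patches, w_key, w_query)
selected = top_k_patches(attn, k=3)

learned_params = 2 * d_in * d   # entries in w_key plus w_query
implicit_entries = n * n        # entries in the generated attention matrix
```

Here 96 parameters generate a 256-entry attention matrix. The gap widens as the number of patches grows, because the matrix scales with n squared while the parameter count does not depend on n at all. The top-k selection is a discrete, non-differentiable step, which is the kind of operation the abstract says neuroevolution can accommodate.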

#image-recognition, #reinforcement-learning, #vision

Rick's Cafe AI 11:18 am on March 6, 2020
Tags: Image Recognition ( 313 ), Vision

Faster video recognition for the smartphone era

By one estimate, training a video-recognition model can take up to 50 times more data and eight times more processing power than training an image-classification model. That’s a problem as demand for processing power to train deep learning models continues to rise exponentially and concerns about AI’s massive carbon footprint grow. Running large video-recognition models on low-power mobile devices, where many AI applications are heading, also remains a challenge.

Song Han, an assistant professor at MIT’s Department of Electrical Engineering and Computer Science (EECS), is tackling the problem by designing more efficient deep learning models. Read More

#image-recognition, #vision

Rick's Cafe AI

The latest in Artificial Intelligence carefully curated into its own special blend

Tag Archives: Vision
