The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment. The Internet of Things (IoT) promises to inject machine learning into many of these everyday objects via tiny, cheap MCUs. However, these resource-impoverished hardware platforms severely limit the complexity of machine learning models that can be deployed. For example, although convolutional neural networks (CNNs) achieve state-of-the-art results on many visual recognition tasks, CNN inference on MCUs is challenging due to severe memory limitations. To circumvent the memory challenge associated with CNNs, various alternatives have been proposed that do fit within the memory budget of an MCU, albeit at the cost of prediction accuracy. This paper challenges the idea that CNNs are not suitable for deployment on MCUs. We demonstrate that it is possible to automatically design CNNs which generalize well, while also being small enough to fit onto memory-limited MCUs. Our Sparse Architecture Search method combines neural architecture search with pruning in a single, unified approach, which learns superior models on four popular IoT datasets. The CNNs we find are more accurate and up to 4.35× smaller than previous approaches, while meeting the strict MCU working memory constraint. Read More
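The core idea of the paper is folding pruning into the architecture search itself. As a loose illustration of just the pruning primitive that such a method builds on (not the paper's actual Sparse Architecture Search), here is a minimal PyTorch sketch that iteratively magnitude-prunes a toy CNN until its nonzero-weight footprint fits a memory budget; the model, the 4 KB budget, and the 5%-per-round schedule are all illustrative assumptions.

```python
# Minimal sketch of iterative magnitude pruning toward a memory budget.
# NOT the paper's Sparse Architecture Search -- only the pruning primitive
# it builds on. The model, budget, and schedule are hypothetical.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(                 # toy CNN, e.g. for 32x32 inputs
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)

BUDGET_BYTES = 4 * 1024                # hypothetical 4 KB weight budget

def nonzero_bytes(m, bytes_per_weight=4):
    """Bytes taken by nonzero weights (float32 here; int8 after quantization)."""
    total = 0
    for mod in m.modules():
        for attr in ("weight", "bias"):
            t = getattr(mod, attr, None)
            if isinstance(t, torch.Tensor):
                total += int((t != 0).sum())
    return total * bytes_per_weight

rounds = 0
while nonzero_bytes(model) > BUDGET_BYTES and rounds < 60:
    rounds += 1
    for mod in model.modules():
        if isinstance(mod, (nn.Conv2d, nn.Linear)):
            # prune 5% of the remaining weights by magnitude each round;
            # in practice one would fine-tune between rounds to recover accuracy
            prune.l1_unstructured(mod, name="weight", amount=0.05)

print(f"nonzero weight footprint after {rounds} rounds: {nonzero_bytes(model)} bytes")
```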
EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling
Convolutional neural networks (CNNs) are commonly developed at a fixed resource cost, and then scaled up in order to achieve better accuracy when more resources are made available. For example, ResNet can be scaled up from ResNet-18 to ResNet-200 by increasing the number of layers, and recently, GPipe achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline CNN by a factor of four. The conventional practice for model scaling is to arbitrarily increase the CNN depth or width, or to use larger input image resolution for training and evaluation. While these methods do improve accuracy, they usually require tedious manual tuning, and still often yield suboptimal performance. What if, instead, we could find a more principled method to scale up a CNN to obtain better accuracy and efficiency?
In our ICML 2019 paper, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks”, we propose a novel model scaling method that uses a simple yet highly effective compound coefficient to scale up CNNs in a more structured manner. Unlike conventional approaches that arbitrarily scale network dimensions, such as width, depth and resolution, our method uniformly scales each dimension with a fixed set of scaling coefficients. Powered by this novel scaling method and recent progress on AutoML, we have developed a family of models, called EfficientNets, which surpass state-of-the-art accuracy with up to 10x better efficiency (smaller and faster). Read More
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet.
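Concretely, the compound coefficient fixes per-dimension constants α, β, γ (found by a small grid search; the paper reports α = 1.2, β = 1.1, γ = 1.15, chosen so that α·β²·γ² ≈ 2, meaning each increment of the exponent φ roughly doubles FLOPS) and raises each to a user-chosen φ. A minimal Python sketch of that rule, using hypothetical baseline depth, width, and resolution values:

```python
# Compound scaling rule from the paper: depth, width, and resolution all
# grow together with one coefficient phi. The constants come from the
# paper's grid search; the baseline values below are hypothetical.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15    # alpha * beta**2 * gamma**2 ~= 2

def compound_scale(phi,
                   base_depth=18,       # hypothetical baseline layer count
                   base_width=64,       # hypothetical baseline channel count
                   base_resolution=224):
    """Scale depth/width/resolution by alpha^phi, beta^phi, gamma^phi."""
    depth = round(base_depth * ALPHA ** phi)
    width = round(base_width * BETA ** phi)
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution

# each +1 in phi costs roughly 2x the compute
for phi in range(4):
    print(phi, compound_scale(phi))
```

Note that the published EfficientNet models round these dimensions to hardware-friendly values, so the illustrative numbers above will not match them exactly.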
To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters. Read More
Why does Beijing suddenly care about AI ethics?
Did China and the US just agree on something?
This week, Chinese scientists and engineers released a code of ethics for artificial intelligence that might signal a willingness from Beijing to rethink how it uses the technology.
And while China’s government is widely criticized for using AI as a way to monitor citizens, the newly published guidelines seem remarkably similar to ethical frameworks laid out by Western companies and governments.
The Beijing AI Principles were announced last Saturday by the Beijing Academy of Artificial Intelligence (BAAI), an organization backed by the Chinese Ministry of Science and Technology and the Beijing municipal government. They spell out guiding principles for research and development in AI, including that “human privacy, dignity, freedom, autonomy, and rights should be sufficiently respected.” Read More
An Explicitly Relational Neural Network Architecture
With a view to bridging the gap between deep learning and symbolic AI, we present a novel end-to-end neural network architecture that learns to form propositional representations with an explicitly relational structure from raw pixel data. In order to evaluate and analyse the architecture, we introduce a family of simple visual relational reasoning tasks of varying complexity. We show that the proposed architecture, when pretrained on a curriculum of such tasks, learns to generate reusable representations that better facilitate subsequent learning on previously unseen tasks when compared to a number of baseline architectures. The workings of a successfully trained model are visualised to shed some light on how the architecture functions. Read More
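As background on what “explicitly relational” processing usually means: a shared function scores every ordered pair of object representations. The sketch below is a generic pairwise relational module in the spirit of Relation Networks (Santoro et al., 2017), not the architecture proposed in this paper; the dimensions are arbitrary.

```python
# Generic pairwise relational module (Relation Network style) -- an
# illustration of relational processing, NOT this paper's architecture.
import torch
import torch.nn as nn

class PairwiseRelations(nn.Module):
    def __init__(self, obj_dim, rel_dim=64):
        super().__init__()
        self.g = nn.Sequential(          # shared relation function g(o_i, o_j)
            nn.Linear(2 * obj_dim, rel_dim), nn.ReLU(),
            nn.Linear(rel_dim, rel_dim), nn.ReLU(),
        )

    def forward(self, objs):
        # objs: (batch, n_objects, obj_dim), e.g. cells of a CNN feature map
        b, n, d = objs.shape
        left = objs.unsqueeze(2).expand(b, n, n, d)    # object i
        right = objs.unsqueeze(1).expand(b, n, n, d)   # object j
        pairs = torch.cat([left, right], dim=-1)       # every ordered (i, j)
        return self.g(pairs).sum(dim=(1, 2))           # aggregate relations

relations = PairwiseRelations(obj_dim=32)
print(relations(torch.randn(2, 5, 32)).shape)          # torch.Size([2, 64])
```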
Jeff Kofman on how AI can empower newsrooms
The voice economy has found its way into newsrooms, changing journalists’ workflows and freeing up time for them to focus on their reporting. GEN talked to Jeff Kofman, Trint CEO and an Emmy award-winning correspondent, about the challenges he faced when launching Trint, how startups can compete in a market dominated by large tech giants, and how AI in general can empower newsrooms. Read More
DeepCount: Crowd Counting with WiFi via Deep Learning
Wireless sensing research has recently achieved increasingly intelligent results: human location and activity can now be sensed using commodity WiFi devices. However, most existing work on human perception is limited to single-person environments, because environments with multiple people are far more complicated than those with a single person. To address human behavior perception in multi-person environments, we propose DeepCount, a deep-learning solution for crowd counting in a closed environment using WiFi signals, as a first step toward sensing in multi-human environments. Because counting a crowd directly from WiFi signals is too complex to model analytically, we use deep learning: a Convolutional Neural Network (CNN) automatically extracts the relationship between the number of people and the wireless channel, and a Long Short-Term Memory (LSTM) network captures the dependencies between the number of people and the Channel State Information (CSI). To reduce the massive amount of labelled data that deep learning requires, we add an online learning mechanism: an activity recognition model detects whether someone is entering or leaving the room and corrects the deep learning model in a fine-tuning stage, which in turn reduces the required training data and lets our method evolve over time. DeepCount is implemented and evaluated on commodity WiFi devices. Trained on massive samples, our end-to-end learning approach achieves an average prediction accuracy of 86.4% in an environment with up to 5 people. With the amendment mechanism, which uses the activity recognition model to detect door events and correct the deep learning predictions, the accuracy rises to 90%. Read More
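A minimal sketch of the CNN-plus-LSTM pipeline the abstract describes: a per-timestep CNN extracts features from CSI measurements, an LSTM models their temporal dependencies, and a classification head predicts the head count. The layer sizes, the 3-antenna x 30-subcarrier CSI layout, and the 100-sample window are illustrative assumptions, not DeepCount's actual configuration.

```python
# Sketch of a CNN + LSTM crowd counter over WiFi CSI, loosely following the
# abstract. Dimensions are assumptions, not DeepCount's real configuration.
import torch
import torch.nn as nn

class CrowdCounter(nn.Module):
    def __init__(self, max_people=5):
        super().__init__()
        self.cnn = nn.Sequential(        # per-timestep CSI feature extractor
            nn.Conv1d(3, 16, 5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True)  # temporal dependencies
        self.head = nn.Linear(64, max_people + 1)      # counts 0..max_people

    def forward(self, csi):
        # csi: (batch, time, antennas, subcarriers) of CSI amplitudes
        b, t, a, s = csi.shape
        feats = self.cnn(csi.reshape(b * t, a, s)).reshape(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])     # predict the count from the last step

model = CrowdCounter()
logits = model(torch.randn(8, 100, 3, 30))  # 8 windows of 100 CSI samples
print(logits.shape)                         # torch.Size([8, 6])
```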
Cameras That Can See Through Walls!
This AI use echolocation to identify what you’re doing
Guo Xinhua wants to teach computers to echolocate. He and his colleagues have built a device, about the size of a thin laptop, that emits sound at frequencies 10 times higher than the shrillest note a piccolo can sustain. The pitches it produces are inaudible to the human ear. When Guo’s team aims the device at a person and fires an ultrasonic pitch, the gadget listens for the echo using its hundreds of embedded microphones. Then, employing artificial intelligence techniques, his team tries to decipher what the person is doing from the reflected sound alone.
The technology is still in its infancy, but the team has achieved some promising initial results. Based at the Wuhan University of Technology, in China, Guo’s team has tested its microphone array on four different college students and found that it can identify whether a person is sitting, standing, walking, or falling with complete accuracy, the researchers report in a paper published today in Applied Physics Letters. While they still need to show that the technique works on more people, and that it can identify a broader range of behaviors, this demonstration hints at a new technology for surveilling human behavior. Read More
A single feature for human activity recognition using two-dimensional acoustic array
Human activity recognition is widely used in many fields, such as smart-home monitoring, fire detection and rescue, and hospital patient management. Acoustic waves are an effective medium for human activity recognition. Traditionally, one or a few ultrasonic sensors receive the signals, and many feature quantities must be extracted from the received data to achieve good recognition accuracy. In this study, we propose an approach for human activity recognition based on a two-dimensional acoustic array and convolutional neural networks. A single feature quantity is used to characterize the sound of human activities and identify those activities. The results show that the total accuracy is 97.5% for time-domain data and 100% for frequency-domain data. The influence of the array size on recognition accuracy is discussed, and the accuracy of the proposed approach is compared with traditional recognition approaches such as k-nearest neighbors and support vector machines, both of which it outperforms. Read More
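A hedged sketch of the kind of classifier the abstract describes: a small CNN over a single 2D feature map derived from the acoustic array, one feature value per array element. The 8x8 array geometry and the four activity classes (borrowed from the news story above) are assumptions, not necessarily the paper's exact configuration.

```python
# Sketch of a small CNN classifying activities from a single 2D feature map
# of an acoustic array. Array size and class labels are assumptions.
import torch
import torch.nn as nn

ACTIVITIES = ["sitting", "standing", "walking", "falling"]  # assumed classes

classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 2 * 2, len(ACTIVITIES)),
)

# one feature value per array element (e.g. frequency-domain energy),
# laid out as an 8x8 "image" matching the assumed array geometry
feature_map = torch.randn(1, 1, 8, 8)
print(classifier(feature_map).shape)        # torch.Size([1, 4])
```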