This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.
Even though most machine learning conferences have gone virtual this year due to COVID-19, the AI community continues to drive the industry forward with novel work that will ultimately help people and enhance lives. The quality of papers continues to be high, since the acceptance criteria have remained just as strict or have become even stricter than in previous years. For example, the paper acceptance rate of Computer Vision and Pattern Recognition (CVPR) has gone from 26% in 2019 to 22% this year. The lower rate also reflects more research labs and companies publishing work in the field of artificial intelligence. We’re very happy to have four CVPR workshop papers and two papers in the main conference, including one oral presentation. It’s an exciting field to watch.
In this blog post, we’ll focus on state-of-the-art research that has been undertaken at Qualcomm Technologies in the first half of 2020 and accepted at major conferences in the industry. The research has broad implications in a variety of domains from power efficiency and compression to autonomous driving and more.
Leading research and development across the entire spectrum of AI.
Personalization and continuous learning
On the topic of personalization, the paper “Conditional Channel Gated Networks for Task-Aware Continual Learning,” which was accepted at CVPR (oral), tackles the issue of neural networks’ “forgetfulness” of previous tasks as they meet the objective of the current training examples. An example of this would be a network incrementally learning to recognize additional bird species or learning an entirely different task such as flower recognition, without forgetting the tasks it learned before.
The paper addresses the issue with a new type of conditional computation. Our method yields up to 24% improvement in accuracy compared to competing methods on a large set of labeled images.
Efficient learning is an important line of research for Qualcomm AI Research. Our paper “A Data and Compute Efficient Design for Limited Resources Deep Learning,” which was accepted at an International Conference on Learning Representations (ICLR) workshop, brings together previous work done in collaboration with the University of Amsterdam on the topic of equivariance, as well as our quantization breakthroughs. The paper shows the synergy between those two methods. The algorithm recognizes rotating images from very few examples, opening up opportunities for applications in the field of medical diagnosis.
Power efficiency is crucial for scaling AI across devices and a main focus for our research. The paper “LSQ+: Improving low-bit quantization through learnable offsets and better initialization,” which was accepted at a CVPR workshop, deals with the issue of loss in performance when all negative activations are quantized to zero. It proposes a general asymmetric quantization scheme that can learn to accommodate the negative activations, leading to significant performance improvements. For example, we saw a 1.8% gain with 4-bit quantization and up to 5.6% gain with 2-bit quantization on EfficientNet-B0 with a large dataset of images.
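To make the idea concrete, here is a minimal Python sketch of asymmetric uniform quantization with an offset. It is an illustration under stated assumptions, not the LSQ+ implementation: the scale and offset are fixed by hand here, whereas the paper learns them during training, and the function name `quantize` and the sample activations are hypothetical.

```python
def quantize(x, scale, offset, bits):
    """Asymmetric uniform quantization: shift, scale, round, clamp, map back.

    `scale` and `offset` are hand-picked in this sketch (assumption);
    a learned scheme would update them by gradient descent.
    """
    levels = 2 ** bits - 1                 # highest integer code
    q = round((x - offset) / scale)        # integer code before clamping
    q = max(0, min(levels, q))             # clamp to the representable range
    return q * scale + offset              # de-quantized value

# Hypothetical activations that include negative values.
activations = [-0.3, -0.1, 0.0, 0.4, 1.2]

# Unsigned grid starting at zero: negative activations collapse to 0.
no_offset = [quantize(x, scale=0.1, offset=0.0, bits=4) for x in activations]

# Same bit-width, but the grid is shifted so it starts at -0.3.
with_offset = [quantize(x, scale=0.1, offset=-0.3, bits=4) for x in activations]

print("no offset:  ", no_offset)
print("with offset:", with_offset)
```

With the offset set to the smallest activation, the same 4-bit grid covers the negative values instead of discarding them, which is the effect the asymmetric scheme is after.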
“Gradient l1 Regularization for Quantization Robustness,” which was accepted at ICLR, also builds on Qualcomm Technologies’ ongoing work in model quantization. The regularization-based method paves the way for “on-the-fly” post-training quantization to various bit-widths.
“Batch-Shaping for Learning Conditional Channel Gated Networks,” which was accepted at ICLR as an oral paper, presents a method that trains large capacity neural networks with significantly improved accuracy and lower dynamic computational cost. More specifically, on ImageNet, our ResNet50 and ResNet34 gated networks obtain 74.60% and 72.55% top-1 accuracy compared to the 69.76% accuracy of the baseline ResNet18 model, for similar complexity. We also show that the resulting networks automatically learn to use more features for difficult examples and fewer features for simple examples.
“Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices,” which was accepted at the Conference on Machine Learning and Systems (MLSys), presents a compiler that minimizes memory footprint. Compared to TensorFlow Lite (widely used as a standard in the industry), the memory-aware scheduling leads to a 1.86x reduction in memory footprint and a 1.76x reduction in off-chip traffic, reducing power significantly. It also saves engineering effort: manual work that previously took an engineer two days can now be done automatically in under a minute of compilation time.
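Why the execution order matters for memory can be shown with a small Python sketch. It is a toy illustration, not the paper's compiler: the graph, the tensor sizes, and the brute-force search over all orders are assumptions made for the example, and a real scheduler would need something far more efficient than enumeration.

```python
from itertools import permutations

# Hypothetical graph: node -> (output size in KB, list of input nodes).
graph = {
    "a": (4, []),
    "b": (8, ["a"]),
    "c": (8, ["a"]),
    "d": (1, ["b"]),
    "e": (1, ["c"]),
    "f": (2, ["d", "e"]),
}

def is_valid(order):
    """True if every node runs after all of its inputs."""
    seen = set()
    for node in order:
        if any(dep not in seen for dep in graph[node][1]):
            return False
        seen.add(node)
    return True

def peak_memory(order):
    """Peak sum of live output sizes; an output stays live until its last consumer runs."""
    position = {node: i for i, node in enumerate(order)}
    last_use = dict(position)              # an unused output dies at its own step
    for node in order:
        for dep in graph[node][1]:
            last_use[dep] = max(last_use[dep], position[node])
    peak = 0
    for step in range(len(order)):
        live = sum(graph[n][0] for n in order[: step + 1] if last_use[n] >= step)
        peak = max(peak, live)
    return peak

valid_orders = [o for o in permutations(graph) if is_valid(o)]
best = min(valid_orders, key=peak_memory)
worst = max(valid_orders, key=peak_memory)
best_peak, worst_peak = peak_memory(best), peak_memory(worst)
print(best, best_peak)
print(worst, worst_peak)
```

In this toy graph, finishing one branch before starting the other frees the large intermediate tensor early, so the best order needs noticeably less memory than the worst one even though both compute the same result.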
Data and source compression
Data and source compression technologies are widely used for a variety of use cases to save bandwidth and reduce file size. Using AI rather than traditional compression techniques has shown great promise. However, manual tuning is typically required for a neural network model to perform a realistic image compression task. The paper “Lossy Compression with Distortion Constrained Optimization,” which was accepted at a CVPR workshop, applies a method called “constrained optimization” as an alternative that is more practical than the traditional tuning method and also outperforms it.
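One common reading of constrained optimization in this setting is: minimize the bit rate subject to the distortion staying at or below a chosen target, and let a multiplier adjust itself instead of hand-tuning a trade-off weight. The Python sketch below illustrates that generic Lagrangian scheme on a toy problem. The rate and distortion functions, the single parameter `q`, and the learning rates are all hypothetical, and the update rule is not claimed to be the one used in the paper.

```python
def rate(q):
    """Toy rate model (assumption): coarser steps cost fewer bits."""
    return 1.0 / q

def distortion(q):
    """Toy distortion model (assumption): coarser steps distort more."""
    return q * q

def train(target, steps=2000, lr_q=0.01, lr_lam=1.0):
    """Minimize rate(q) subject to distortion(q) <= target."""
    q, lam = 1.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian rate(q) + lam * (distortion(q) - target) w.r.t. q.
        grad_q = -1.0 / (q * q) + lam * 2.0 * q
        q -= lr_q * grad_q
        # The multiplier grows while the constraint is violated and shrinks otherwise.
        lam = max(0.0, lam + lr_lam * (distortion(q) - target))
    return q, lam

q, lam = train(target=0.25)
print(q, distortion(q), rate(q), lam)
```

The only quantity set by hand is the distortion target, which is usually easier to reason about than a trade-off weight between rate and distortion.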
“Adversarial Distortion for Learned Video Compression,” which was accepted at a CVPR workshop, aims to optimize video compression with an adversarial method (algorithms competing against each other). This brings about a reduction of perceptual artifacts and reconstruction of detail previously lost.
Speech compression is another challenge that we tackle in our latest research. “Feedback Recurrent Autoencoder,” which was accepted at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), proposes a new architecture for online compression of sequential data with temporal dependency. This produces high-quality speech waveforms at a low, fixed bit rate, achieving up to a 2.5x rate reduction compared to Opus, a popular open-source codec.
Along with the QUVA lab (the lab Qualcomm Technologies co-founded with the University of Amsterdam), we’ve been conducting research in the field of video understanding and deep vision for the past five years. The paper “ActionBytes: Learning from Trimmed Videos to Localize Actions,” which was accepted at CVPR, tackles the problem of localizing actions in long untrimmed videos by segmenting videos into interpretable fragments called “ActionBytes.” This method achieves state-of-the-art results on long videos.
Small changes in system architecture can go a long way in improving model performance, as the following paper shows. In the paper “End-to-End Lane Marker Detection via Row-wise Classification,” which was accepted at a CVPR workshop, we propose a state-of-the-art lane marker detection method that is set to improve the way that autonomous cars “see” and make decisions. The paper proposes a system architecture (i.e., neural network architecture) that directly outputs the lane marker vertices, which can be used in lane detection.
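The row-wise idea can be sketched in a few lines of Python: for each image row, pick the column with the highest score and emit it as a lane marker vertex. This is a simplified illustration of the output step only. The score grid, the `min_score` threshold used to skip rows without a marker, and the function name `lane_vertices` are assumptions for the example, not details taken from the paper.

```python
def lane_vertices(row_scores, min_score=0.5):
    """Return (row, column) vertices, one per row whose best score is confident enough."""
    vertices = []
    for row, scores in enumerate(row_scores):
        # Row-wise classification: choose the most likely column in this row.
        col = max(range(len(scores)), key=scores.__getitem__)
        if scores[col] >= min_score:       # hypothetical confidence threshold
            vertices.append((row, col))
    return vertices

# Hypothetical per-row column scores for one lane (4 rows, 5 columns).
row_scores = [
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.05, 0.20, 0.60, 0.10, 0.05],
    [0.05, 0.10, 0.20, 0.60, 0.05],
    [0.20, 0.20, 0.20, 0.20, 0.20],       # no confident column: row is skipped
]

vertices = lane_vertices(row_scores)
print(vertices)
```

Because the vertices come straight out of the per-row decision, no separate post-processing step is needed to turn a pixel mask into lane coordinates in this sketch.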
We’re happy to see the amount of interesting work being done and the resilience shown by conference organizers in continuing to showcase valuable AI research. In a future blog post, we will discuss papers accepted at upcoming AI conferences.