IAP Webinar Series: Topics in AI

Join our 2020-2021 series of webinars featuring topics in AI.


September 29, 2020, 11am-12pm PT
Prof. Song Han, MIT Department of Electrical Engineering and Computer Science


“Once-for-All” DNNs: Simplifying Design of Efficient Models for Diverse Hardware

Pre-registration is required.


Abstract: We address the challenging problem of designing deep neural networks that can execute efficiently across a diverse range of hardware platforms, especially edge devices. Conventional approaches rely on manual design, or use automated neural architecture search (NAS) to find a specialized neural network and train it from scratch for each use case, which is computationally prohibitive. In June 2019, researchers released a startling report estimating that using NAS to create a single model resulted in emissions of roughly 626,000 pounds of carbon dioxide. That is equivalent to nearly five times the lifetime emissions of the average U.S. car, including its manufacturing. I will present the once-for-all network (OFA), a new NAS system for searching and running neural networks efficiently.


By decoupling model training from architecture search, OFA reduces the carbon emissions of neural architecture search by thousands of times. A single OFA network can yield a surprisingly large number of sub-networks (> 10^19) that fit different hardware platforms and latency constraints, from cloud GPUs to microcontrollers. By exploiting weight sharing and progressive shrinking, the produced models consistently outperform state-of-the-art efficient models such as MobileNetV3 and EfficientNet (up to 4.0% higher ImageNet top-1 accuracy than MobileNetV3, or the same accuracy while running 1.5x faster than MobileNetV3 and 2.6x faster than EfficientNet). In particular, OFA achieves a new state-of-the-art 80.0% ImageNet top-1 accuracy under the mobile setting (< 600M MACs). OFA was the winning solution for the 3rd and 4th IEEE Low Power Computer Vision Challenges (LPCVC). OFA has also been applied to efficient video recognition and 3D point cloud processing.
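The weight-sharing idea above can be illustrated with a minimal sketch: every sub-network borrows a slice of one "once-for-all" weight tensor, so deploying to a new hardware target means selecting a slice rather than retraining. This is a simplified assumption-laden toy (the class name, shapes, and the plain center-crop rule are illustrative; the actual OFA method additionally learns kernel transformation matrices and uses progressive shrinking during training).

```python
import numpy as np

class OnceForAllConv:
    """Toy sketch of an OFA-style conv layer: sub-kernels are shared
    slices of one full-size weight tensor (illustrative only)."""

    def __init__(self, max_out=8, max_in=8, max_k=7):
        rng = np.random.default_rng(0)
        # One set of weights serves every sub-network.
        self.weight = rng.standard_normal((max_out, max_in, max_k, max_k))

    def sub_weight(self, out_ch, in_ch, k):
        # Weight sharing: a smaller kernel is the central k x k patch of
        # the full kernel; narrower layers keep the leading channels.
        max_k = self.weight.shape[-1]
        start = (max_k - k) // 2
        return self.weight[:out_ch, :in_ch, start:start + k, start:start + k]

layer = OnceForAllConv()
# Sample sub-networks for different budgets, e.g. a large kernel for a
# cloud GPU and a small one for a microcontroller -- no retraining.
big = layer.sub_weight(out_ch=8, in_ch=8, k=7)    # shape (8, 8, 7, 7)
small = layer.sub_weight(out_ch=4, in_ch=4, k=3)  # shape (4, 4, 3, 3)
print(big.shape, small.shape)
```

Because the small kernel is literally a view into the large one, every configuration drawn this way shares parameters, which is what lets one trained network serve many latency targets.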


Bio: Song Han is an assistant professor in MIT's Department of Electrical Engineering and Computer Science. He received his PhD from Stanford University and his bachelor's degree from Tsinghua University. His research focuses on efficient deep learning computing. He proposed the "deep compression" technique, which can reduce neural network size by an order of magnitude without losing accuracy, and the "efficient inference engine," a hardware implementation that first exploited pruning and weight sparsity in deep learning accelerators. His recent work on hardware-aware neural architecture search has been highlighted by MIT News, Qualcomm News, VentureBeat, and IEEE Spectrum; integrated into PyTorch and AutoGluon; and recognized with many low-power computer vision contest awards at flagship AI conferences (CVPR'19, ICCV'19, and NeurIPS'19).


Song received Best Paper awards at ICLR'16 and FPGA'17, as well as the Amazon Machine Learning Research Award, the SONY Faculty Award, and the Facebook Faculty Award. He was named to MIT Technology Review's "35 Innovators Under 35" list for his contribution to the "deep compression" technique, which "lets powerful artificial intelligence (AI) programs run more efficiently on low-power mobile devices." He received the NSF CAREER Award for "efficient algorithms and hardware for accelerated machine learning." (Photo: Song receives the Best Poster Award at the Stanford Cloud Workshop, 2016.)