
Webinars

Join our 2020-2021 series of webinars on topics in AI.
Thursday, May 13, 2021, 11am-12pm PT
Prof. Manya Ghobadi, MIT
Optimizing AI Systems with Optical Technologies

Pre-registration is required. Please register here. 

Abstract: Our society is rapidly becoming reliant on deep neural networks (DNNs). New datasets and models are invented frequently, increasing the memory and computational requirements for training. The explosive growth has created an urgent demand for efficient distributed DNN training systems. In this talk, I will discuss the challenges and opportunities for building next-generation DNN training clusters. In particular, I will propose optical network interconnects as a key enabler for building high-bandwidth ML training clusters with strong scaling properties. Our design enables accelerating the training time of popular DNN models using reconfigurable topologies by partitioning the training job across GPUs with hybrid data and model parallelism while ensuring the communication pattern can be supported efficiently on an optical interconnect. Our results show that compared to similar-cost interconnects, we can improve the training iteration time by up to 5x.
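
To make the idea of matching a reconfigurable optical topology to a training job's communication pattern concrete, here is a toy Python sketch. It is an illustrative assumption, not the system from the talk: the edge list, traffic volumes, and greedy circuit-assignment heuristic are all made up. It estimates per-iteration traffic between GPU pairs under a hybrid data/model-parallel partition, then grants direct optical circuits to the heaviest pairs.

```python
# Toy sketch only: illustrates matching a reconfigurable optical topology
# to a training job's communication pattern. Not the actual system.
def build_traffic_matrix(edges, activation_bytes, gradient_bytes):
    """Estimate per-iteration traffic between GPU pairs.

    edges: (src_gpu, dst_gpu, kind) tuples, where kind is 'activation'
    (model-parallel boundary) or 'gradient' (data-parallel exchange).
    """
    traffic = {}
    for src, dst, kind in edges:
        volume = activation_bytes if kind == "activation" else gradient_bytes
        pair = tuple(sorted((src, dst)))
        traffic[pair] = traffic.get(pair, 0) + volume
    return traffic

def assign_optical_circuits(traffic, num_circuits):
    """Greedy heuristic: give direct optical circuits to the heaviest
    GPU pairs so the dominant traffic runs at full bandwidth."""
    ranked = sorted(traffic.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, _ in ranked[:num_circuits]]

# Hypothetical 4-GPU job: a 0->1->2->3 model-parallel pipeline plus
# data-parallel gradient exchange between replicas (0,2) and (1,3).
edges = [(0, 1, "activation"), (1, 2, "activation"), (2, 3, "activation"),
         (0, 2, "gradient"), (1, 3, "gradient")]
traffic = build_traffic_matrix(edges, activation_bytes=64e6, gradient_bytes=512e6)
print(assign_optical_circuits(traffic, num_circuits=2))  # [(0, 2), (1, 3)]
```

In this toy, the bandwidth-hungry gradient exchanges win the circuits; the point of co-optimizing the partition and the topology is to keep the heavy pairs few enough to serve optically.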

Bio: Manya Ghobadi is an assistant professor in the EECS department at MIT. Before MIT, she was a researcher at Microsoft Research and a software engineer at Google Platforms. Manya is a computer systems researcher with a networking focus and has worked on a broad set of topics, including data center networking, optical networks, transport protocols, and network measurement. Her work has won the Best Dataset Award and the Best Paper Award at the ACM Internet Measurement Conference (IMC), as well as a Google Research Excellent Paper Award.

Thursday, January 28, 2021, 11am-12pm PT
Prof. Christina Delimitrou, Cornell
Leveraging ML to Handle the Increasing Complexity of the Cloud
Webinar Video

Christina has received numerous awards for her research at Stanford and Cornell, most recently the 2020 TCCA Young Computer Architect Award.

Abstract: Cloud services are increasingly adopting new programming models, such as microservices and serverless compute. While these frameworks offer several advantages, including better modularity and ease of maintenance and deployment, they also introduce new hardware and software challenges.

In this talk, I will briefly discuss the challenges that these new cloud models introduce in hardware and software, and present some of our work on employing ML to improve the cloud’s performance predictability and resource efficiency. I will first discuss Seer, a performance debugging system that identifies root causes of unpredictable performance in multi-tier interactive microservices, and then Sage, which improves on Seer by taking a completely unsupervised learning approach to data-driven performance debugging, making it both practical and scalable.
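
As a rough illustration of the unsupervised flavor of this line of work, the sketch below fits an off-the-shelf anomaly detector to each microservice tier's latency history and flags the tier whose recent behavior looks most anomalous. This is a heavily simplified stand-in, not Seer or Sage themselves; the tier names, synthetic data, and detector choice are assumptions.

```python
# Simplified stand-in for unsupervised performance debugging: flag the
# microservice tier whose recent latency deviates most from its history.
# Real systems like Sage also reason over the service dependency graph.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical per-tier request latencies in ms; 'db' develops tail spikes.
latencies = {
    "frontend": rng.normal(5, 1, size=(500, 1)),
    "auth":     rng.normal(3, 0.5, size=(500, 1)),
    "cart":     rng.normal(8, 2, size=(500, 1)),
    "db":       np.concatenate([rng.normal(10, 2, size=(450, 1)),
                                rng.normal(80, 5, size=(50, 1))]),
}

scores = {}
for tier, samples in latencies.items():
    history, recent = samples[:400], samples[400:]
    detector = IsolationForest(random_state=0).fit(history)
    scores[tier] = detector.decision_function(recent).mean()  # lower = stranger

print(min(scores, key=scores.get))  # -> 'db', the likely root cause
```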

Bio: Christina Delimitrou is an Assistant Professor and the John and Norma Balen Sesquicentennial Faculty Fellow at Cornell University, where she works on computer architecture and computer systems. She specifically focuses on improving the performance predictability and resource efficiency of large-scale cloud infrastructures by revisiting the way these systems are designed and managed. Christina is the recipient of the 2020 TCCA Young Computer Architect Award, an Intel Rising Star Award, a Microsoft Research Faculty Fellowship, an NSF CAREER Award, a Sloan Research Fellowship, two Google Research Awards, and a Facebook Faculty Research Award. Her work has also received four IEEE Micro Top Picks awards and several best paper awards. Before joining Cornell, Christina received her PhD from Stanford University. She previously earned an MS, also from Stanford, and a diploma in Electrical and Computer Engineering from the National Technical University of Athens. More information can be found at: http://www.csl.cornell.edu/~delimitrou/

Below, Christina presents at the 2018 MIT Cloud Workshop.


Tuesday, September 29, 2020, 11am-12pm PT
Prof. Song Han, MIT Department of Electrical Engineering and Computer Science

“Once-for-All” DNNs: Simplifying Design of Efficient Models for Diverse Hardware
Webinar Video

Abstract: We address the challenging problem of designing deep neural networks that can execute efficiently across a diverse range of hardware platforms, especially on edge devices. Conventional approaches rely on manual design or use automated neural architecture search (NAS) to find a specialized neural network and train it from scratch for each use case, which is computationally prohibitive. Last June, researchers released a startling report estimating that using NAS to create a single model resulted in emissions of roughly 626,000 pounds of carbon dioxide. That’s equivalent to nearly five times the lifetime emissions of the average U.S. car, including its manufacturing. I will present a new NAS system for searching and running neural networks efficiently, the once-for-all network (OFA).

By decoupling model training from architecture search, OFA can reduce the carbon emissions resulting from neural architecture search by thousands of times. OFA can produce a surprisingly large number of sub-networks (> 10^19) that can fit different hardware platforms and latency constraints, from cloud GPUs to microcontrollers. By exploiting weight sharing and progressive shrinking, the produced models consistently outperform state-of-the-art NAS methods including MobileNetV3 and EfficientNet (up to 4.0% ImageNet top-1 accuracy improvement over MobileNetV3, or the same accuracy but 1.5x faster than MobileNetV3 and 2.6x faster than EfficientNet). In particular, OFA achieves a new state-of-the-art 80.0% ImageNet top-1 accuracy under the mobile setting (<600M MACs). OFA was the winning solution for the 3rd and 4th IEEE Low Power Computer Vision Challenge (LPCVC). OFA has also been applied to efficient video recognition and 3D point cloud processing.
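
The weight-sharing mechanism at the heart of this approach can be sketched in a few lines: one over-parameterized layer from which narrower sub-networks are sliced, with a random width sampled each training step so every sub-network trains through the shared weights. The toy below is an illustrative assumption about the mechanism, not the OFA codebase, which also varies kernel size, depth, and input resolution via progressive shrinking.

```python
# Toy sketch of OFA-style weight sharing: many sub-networks sliced from
# one set of weights. Illustrative only; the real system also shrinks
# kernel size, depth, and resolution, not just layer width.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ElasticLinear(nn.Module):
    """Linear layer whose effective output width can shrink at runtime;
    sub-networks reuse the leading slice of the full weight matrix."""
    def __init__(self, in_features, max_out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out_features))

    def forward(self, x, out_features):
        return F.linear(x, self.weight[:out_features], self.bias[:out_features])

layer = ElasticLinear(in_features=32, max_out_features=128)
x = torch.randn(4, 32)

# Sample a different width each step so all sub-networks receive gradients
# through the shared parameters ("train once, slice many").
for step in range(3):
    width = random.choice([32, 64, 128])
    out = layer(x, width)
    out.pow(2).mean().backward()   # placeholder loss; grads hit the shared slice
    print(step, width, out.shape)
```

After training, deployment reduces to picking the sub-network that meets a given platform's latency budget, with no retraining required.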

Bio: Song Han is an assistant professor in MIT’s Department of Electrical Engineering and Computer Science. He received his PhD degree from Stanford University and his bachelor’s degree from Tsinghua University. His research focuses on efficient deep learning computing. He proposed the “deep compression” technique, which can reduce neural network size by an order of magnitude without losing accuracy, and the “efficient inference engine” hardware implementation, which first exploited pruning and weight sparsity in deep learning accelerators. His recent work on hardware-aware neural architecture search was highlighted by MIT News, Qualcomm News, VentureBeat, and IEEE Spectrum, integrated into PyTorch and AutoGluon, and received many low-power computer vision contest awards at flagship AI conferences (CVPR’19, ICCV’19, and NeurIPS’19).

Song received Best Paper awards at ICLR’16 and FPGA’17, the Amazon Machine Learning Research Award, the SONY Faculty Award, and the Facebook Faculty Award. Song was named to MIT Technology Review’s “35 Innovators Under 35” list for his contribution of the “deep compression” technique that “lets powerful artificial intelligence (AI) programs run more efficiently on low-power mobile devices.” Song received the NSF CAREER Award for “efficient algorithms and hardware for accelerated machine learning”. Below, Song receives the Best Poster Award at the Stanford Cloud Workshop in 2016.

Thursday, March 25, 2021, 11am-12pm PT
Prof. Ana Klimovic, ETH Zurich
Ingesting and Processing Data Efficiently for Machine Learning

Pre-registration is required. Please register here. 


Abstract: Machine learning applications have sparked the development of specialized software frameworks and hardware accelerators. Yet, in today’s machine learning ecosystem, one important part of the system stack has received far less attention and specialization for ML: how we store and preprocess training data. This talk will describe the key challenges in implementing high-performance ML input data processing pipelines. We analyze millions of ML jobs running in Google's fleet and find that input pipeline performance significantly impacts end-to-end training performance and resource consumption. Our study shows that ingesting and preprocessing data on-the-fly during training consumes 30% of end-to-end training time, on average. Our characterization of input data pipelines motivates several systems research directions, such as disaggregating input data processing from model training and caching commonly reoccurring input data computation subgraphs. We present the multi-tenant input data processing service that we are building at ETH Zurich, in collaboration with Google, to improve ML training performance and resource usage.
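
For context, a typical training input pipeline of the kind the study measures looks like the tf.data sketch below: decode, cache, shuffle, augment, batch, prefetch. The shard pattern, feature schema, and augmentation are hypothetical; the point is that caching the deterministic decode work and prefetching with AUTOTUNE overlap input processing with training, which targets exactly the ~30% overhead the study reports.

```python
# Hedged sketch of a typical ML input pipeline in tf.data. The shard
# pattern, schema, and augmentation below are hypothetical.
import tensorflow as tf

def decode(record):
    features = tf.io.parse_single_example(record, {
        "image": tf.io.FixedLenFeature([784], tf.float32),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    return features["image"], features["label"]

def augment(image, label):
    # Randomized work stays after cache() so it differs every epoch.
    return image + tf.random.normal(tf.shape(image), stddev=0.01), label

dataset = (
    tf.data.TFRecordDataset(tf.io.gfile.glob("train-*.tfrecord"))
    .map(decode, num_parallel_calls=tf.data.AUTOTUNE)   # parallel decode
    .cache()                          # reuse decoded records across epochs
    .shuffle(10_000)
    .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)       # overlap input work with training
)
```

Disaggregating input processing, as the talk proposes, amounts to running pipelines like this on separate workers that can be scaled independently of the training accelerators.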

Bio: Ana Klimovic is an Assistant Professor in the Systems Group of the Computer Science Department at ETH Zurich. Her research interests span operating systems, computer architecture, and their intersection with machine learning. Ana's work focuses on computer system design for large-scale applications such as cloud computing services, data analytics, and machine learning. Before joining ETH in August 2020, Ana was a Research Scientist at Google Brain and completed her Ph.D. in Electrical Engineering at Stanford University in 2019. Her dissertation research was on the design and implementation of fast, elastic storage for cloud computing.

Below, Ana receives the Best Poster Award at the 2018 Stanford-UCSC Workshop.


Thursday, November 19, 2020, 11am-12pm PT
 Prof. Carole-Jean Wu, Arizona State and Facebook AI Research

Deep Learning: It’s Not All About Recognizing Cats and Dogs 
Webinar Video  

Abstract: In this webinar, I will talk about deep learning personalization and recommendation systems, an underinvested area in the overall research community. Training state-of-the-art industry-scale personalization and recommendation models consumes the highest number of compute cycles among all deep learning use cases. For AI inference, personalization and recommendation consume an even higher share of compute cycles, at 80%. What do state-of-the-art industry-scale neural personalization and recommendation models look like? I will present advances in the development of deep learning recommender systems, their implications for system and architectural design, and parallelism opportunities across the machine learning system stack over a variety of compute platforms. I will conclude with future directions in multi-scale system design and optimization.
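
What makes this workload so different from vision or language models is its shape: very large embedding tables (memory-capacity- and bandwidth-bound sparse lookups) paired with comparatively small MLPs (compute-bound dense math). Below is a minimal PyTorch sketch of that shape, following the general DLRM pattern rather than any production model; all sizes and feature counts are made up.

```python
# Toy recommendation model in the general DLRM shape: embedding tables
# for sparse categorical features + MLPs for dense features. All sizes
# are made up; production tables hold billions of rows.
import torch
import torch.nn as nn

class TinyRecModel(nn.Module):
    def __init__(self, num_sparse=3, ids_per_table=1000, dim=16):
        super().__init__()
        # Sparse side: one embedding table per categorical feature.
        self.tables = nn.ModuleList(
            [nn.Embedding(ids_per_table, dim) for _ in range(num_sparse)])
        # Dense side: a small MLP over continuous features.
        self.bottom_mlp = nn.Sequential(nn.Linear(4, dim), nn.ReLU())
        # Top MLP scores the concatenated representations (CTR-style output).
        self.top_mlp = nn.Sequential(
            nn.Linear(dim * (num_sparse + 1), 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, dense, sparse_ids):
        parts = [self.bottom_mlp(dense)]
        parts += [table(ids) for table, ids in zip(self.tables, sparse_ids.T)]
        return torch.sigmoid(self.top_mlp(torch.cat(parts, dim=1)))

model = TinyRecModel()
dense = torch.randn(8, 4)                    # batch of 8, 4 dense features
sparse_ids = torch.randint(0, 1000, (8, 3))  # 3 categorical features each
print(model(dense, sparse_ids).shape)        # torch.Size([8, 1])
```

At production scale, the embedding lookups dominate memory traffic while the MLPs dominate FLOPs, which is why the talk emphasizes parallelism opportunities across the whole system stack rather than raw compute alone.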
 
Bio: Carole-Jean Wu is a Research Scientist at Facebook AI Research. Her research focus lies in the domain of computer system architecture, with particular emphasis on energy- and memory-efficient systems. Her recent research has pivoted into designing systems for machine learning execution at scale, such as for personalized recommender systems and mobile deployment. Carole-Jean chairs the MLPerf Recommendation Benchmark Advisory Board and co-chairs MLPerf Inference. Carole-Jean holds tenure as an Associate Professor at ASU. She received her M.A. and Ph.D. from Princeton and her B.Sc. from Cornell. She is the recipient of the NSF CAREER Award, the Facebook AI Infrastructure Mentorship Award, the IEEE Young Engineer of the Year Award, the Science Foundation Arizona Bisgrove Early Career Scholarship, and the Intel PhD Fellowship, among a number of Best Paper awards. She is a senior member of both ACM and IEEE.
 
Below, Carole-Jean presents “Machine Learning at Scale” at the Cornell Cloud Workshop in 2019.
