The IAP UCSD Workshop on the Future of AI and Cloud Computing Applications and Infrastructure was conducted on Friday, April 21, 2023 at UC San Diego.
This event was co-organized by Professor Hadi Esmaeilzadeh and the IAP.
Agenda - Videos of Presentations
Please see Abstracts and Speaker Bios below the Agenda.
8:30-8:55 – Badge Pick-up – Coffee/Tea and Breakfast Food/Snacks
8:55-9:00 – Welcome - Prof. Hadi Esmaeilzadeh
9:00-9:30 – Dr. Doug Terry, Amazon Web Services, “Replicating Cloud Data Across Regions for Business Continuity”
9:30-10:00 – Prof. Salman Avestimehr, USC, “Collaborative Machine Learning at the Edge”
10:00-10:30 – Dr. Yunfei Ma, Alibaba, “Overcome the "Last-mile" Challenge through Wireless and Video Collaboration”
10:30-11:00 – Prof. Amy Ousterhout, UCSD, "Optimizing CPU Efficiency and Tail Latency in Datacenters"
11:00-11:30 – Lightning Round for Student Posters
11:30-12:30 – Lunch and Poster Viewing
12:30-1:00 – Prof. Yiying Zhang, UCSD, “An End-to-End Implementation of A “Server-less” Data Center”
1:00-1:30 – Dr. Amir Yazdanbakhsh, Google, "From Data to Success: Leveraging Machine Learning for Full Stack Optimization"
1:30-2:00 – Prof. Hung-Wei Tseng, UCR, “The Upcoming Revolution of General-purpose Computing”
2:00-2:30 – Break
2:30-3:00 – Prof. Hyoukjun Kwon, UCI, “XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse”
3:00-3:30 – Prof. Hadi Esmaeilzadeh, UCSD, “Towards Architecting Machines for Empathic Conscious Intelligence”
3:30-4:00 – Best Poster Award
Abstracts and Bios
Prof. Salman Avestimehr, USC, “Collaborative Machine Learning at the Edge”
Abstract: Large-scale ML models (such as GPT-4) promise to revolutionize many products and services. So, it is not surprising that every tech enterprise (or even individuals) would now desire to have their own customized models. However, the data that is needed for customization (or fine-tuning) of such models is often spread across many silos (e.g., edge nodes, users’ devices, multi-clouds, etc.), and can’t be pooled due to privacy, security, regulations, as well as cloud costs. In this talk, I will discuss how we ease this problem at FedML by providing a decentralized (or federated) machine learning ecosystem that enables collaborative training of state-of-the-art machine learning models across the edge and cloud. I will also highlight some of the key research challenges in this area, as well as recent progress toward them.
Bio: Salman Avestimehr (https://www.avestimehr.com) is the Dean’s Professor and the inaugural director of the USC-Amazon Center on Trustworthy AI (https://trustedai.usc.edu) at the ECE and CS Department of the University of Southern California. He is also the CEO and co-founder of FedML (https://fedml.ai). His research interests include decentralized and federated machine learning, information theory, security, and privacy. Dr. Avestimehr has received many awards for his research, including the Presidential PECASE award from the White House (President Obama), the James L. Massey Research & Teaching Award from IEEE Information Theory Society, an Information Theory Society and Communication Society Joint Paper Award, and several Best Paper Awards at conferences. He has been an Amazon Scholar in Alexa-AI, and is a fellow of the IEEE.
Prof. Hadi Esmaeilzadeh, UCSD, “Towards Architecting Machines for Empathic Conscious Intelligence”
Abstract: Recent advances in deep learning and computer architecture have achieved human-scale speed and accuracy for classification tasks. However, current machines are far from capabilities such as metathinking, creativity, and empathy. In this talk, I contend that such a paradigm shift is possible only through a fundamental change in the state of artificial intelligence toward consciousness, similar to what took place in biological systems through the process of natural selection and evolution. I propose that consciousness is an emergent phenomenon that primordially appears when two machines co-create their own language through which they can recall and communicate their internal state of time-varying symbol manipulations. Since, in our view, consciousness arises from the communication of inner states, it leads to empathy that can unlock unprecedented capabilities in machines. Such a fundamental shift requires architecting machines that can create languages, while we currently need massive scale-out systems just to recognize and translate already existing, pre-created languages. Clearly, we need to move beyond traditional approaches of computer architecture design to devise paradigms that break the current abstractions and deliver unparalleled performance and energy efficiency levels from edge to cloud. To that end, I sketch out a journey that takes an algorithm-driven full-stack approach to deliver orders of magnitude gains by architecting both digital and analog programmable machines. Then, I shift the focus towards the importance of data and its protection and highlight the role of hardware multi-tenancy in both democratizing artificial intelligence and its green development. I will also discuss our ongoing efforts to innovate holistic systems for neuroscience to enable brain-implantable devices that enhance memory in wet lab settings.
Bio: Dr. Esmaeilzadeh was awarded early tenure at the University of California, San Diego (UCSD), where he is the inaugural holder of the Halicioglu Chair in Computer Architecture with the rank of associate professor in Computer Science and Engineering. Prior to UCSD, he was an assistant professor in the School of Computer Science at the Georgia Institute of Technology from 2013 to 2017. There, he was the inaugural holder of the Allchin Family Early Career Professorship. Prof. Esmaeilzadeh is the co-founder and CTO of Protopia AI, which is commercializing one of his research projects. Gartner recognized Protopia AI in 2022 with a Cool Vendor Award, which highlights vendors who are bringing unique and new technology to the market. He is also the founding director of the Alternative Computing Technologies (ACT) Lab, where his team is developing new technologies and cross-stack solutions to build the next generation computer systems. Dr. Esmaeilzadeh obtained his Ph.D. from the Department of Computer Science and Engineering at the University of Washington in 2013, where his Ph.D. work received the 2013 William Chan Memorial Best Dissertation Award. Prof. Esmaeilzadeh received the IEEE Technical Committee on Computer Architecture (TCCA) “Young Architect” Award in 2018 and was inducted into the ISCA Hall of Fame in the same year. He has received the Air Force Office of Scientific Research Young Investigator Award (2017), College of Computing Outstanding Junior Faculty Research Award (2017), Google Research Faculty Award (2018, 2016 and 2014), Qualcomm Research Award (2020, 2017, and 2016), Microsoft Research Award (2017 and 2016), and Lockheed Inspirational Young Faculty Award (2016). His teams were awarded the Qualcomm Innovation Fellowship in 2014 and 2018, one of his students was a Microsoft Research Fellow, and two of his students won the National Center for Women & IT (NCWIT) Collegiate Award in 2017 and 2020.
Four of his undergraduate students have been awarded the Georgia Tech President’s Undergraduate Research Award (PURA). His research has been recognized by four Communications of the ACM Research Highlights, four IEEE Micro Top Picks, a nomination for Communications of the ACM Research Highlights, an honorable mention in IEEE Micro Top Picks, and a Distinguished Paper Award in HPCA 2016. Hadi’s work on dark silicon has also been profiled in The New York Times. More information is available on his webpage, http://cseweb.ucsd.edu/~hadi/.
Prof. Hyoukjun Kwon, UCI, “XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse”
Abstract: Real-time multi-task multi-model (RT-MTMM) workloads, a new form of deep learning inference workloads, are emerging for application areas like extended reality (XR) to support metaverse use cases. Compared to standard ML applications, these ML workloads present unique difficulties and constraints because these workloads combine user interactivity in various modalities with computationally complex machine learning (ML) activities in real time. As a result, RT-MTMM workloads impose heterogeneity and concurrency requirements on future ML systems and devices, necessitating the development of new capabilities.
In this talk, I will first discuss the various characteristics of such RT-MTMM workloads in detail. Based on the characteristics, I will motivate and present XRBench, a well-defined benchmark suite based on realistic use cases of an RT-MTMM workload, to facilitate future research that addresses new challenges from RT-MTMM workloads. Using XRBench, I will present some case study results that show the new implications to the ML System design for XR use cases, which motivates future ML system research in XR and other RT-MTMM workloads. Finally, I will discuss our plans for evolving XRBench into an open project and invite everyone to the open project.
Bio: Hyoukjun Kwon is an assistant professor at UC Irvine in the EECS department. Before he joined UC Irvine, he was a research scientist at Meta (Facebook) Reality Labs until 2022, where he worked on future AR/VR ML systems. He received his PhD degree in Computer Science from the Georgia Institute of Technology in 2020.
His primary research area is computer architecture focusing on machine learning (ML) accelerator systems. In particular, he has focused on a communication-centric approach to designing accelerator architectures with flexible dataflows. Based on the principles he explored, he is actively working on cross-stack co-design of ML accelerator systems for future applications such as augmented and virtual reality (AR/VR). His work has been recognized at major architecture and ML conferences such as MICRO, ASPLOS, HPCA, CVPR, and MLSys. In particular, MAERI (ASPLOS 2018) and MAESTRO (MICRO 2019) were recognized in IEEE Micro Top Picks from Computer Architecture Conferences as an honorable mention and a top pick in 2019 and 2020, respectively. His thesis work on the communication-centric approach to designing ML accelerators was also recognized with an honorable mention for the ACM SIGARCH/IEEE CS TCCA Outstanding Dissertation Award.
Dr. Yunfei Ma, Alibaba, “Overcome the "Last-mile" Challenge through Wireless and Video Collaboration”
Abstract: The COVID-19 pandemic has profoundly changed the landscape of video services and redefined how we watch, what we watch and why we watch. For example, hundreds of millions of users watch videos on Taobao to decide what to buy, collaborate with colleagues remotely via video conferencing on Dingtalk, enjoy precious leisure minutes on Youku, and receive education in streaming classrooms powered by AliCloud. These revolutions put new pressures and requirements on the network in terms of bandwidth, latency, scalability, and reliability. However, there is a big gap. The truth is, today's network is far from being the "ideal pipe", and we suffer, more or less, from the agony of unstable, slow, and failed network connectivity.
In this talk, I will discuss the lessons we learned from Alibaba video services over the past few years and the insights we gained. Finally, I will discuss how we leverage video and wireless collaboration to close this gap with two in-production systems we designed at Alibaba: XLINK [SIGCOMM’21], Alibaba’s multipath QUIC solution deployed in Taobao short videos, and GSO-simulcast [SIGCOMM’22], an SDN-like simulcast stream controller deployed in Alibaba’s Dingtalk.
Bio: Yunfei Ma is a researcher and Senior R&D manager at Alibaba, where he leads an incredible team that works on cutting-edge technologies in mobile and wide-area networking. Before joining Alibaba, he was a postdoctoral researcher at MIT Media Lab. He received his Ph.D. in ECE from Cornell University and his B.S. from USTC. His research has been deployed in Alibaba’s core services, such as Taobao and Dingtalk, and has been transformed into several products at AliCloud. He has published more than 10 papers at SIGCOMM, MOBICOM, and NSDI, and he holds more than 15 granted patents. His work has also been covered by a number of media outlets including BBC, The Verge, MIT Technology Review and IEEE Spectrum. He served on the TPC of ACM CoNEXT 2018, IEEE INFOCOM 2020/2019/2018 and IEEE Globecom 2021.
Prof. Amy Ousterhout, UCSD, "Optimizing CPU Efficiency and Tail Latency in Datacenters"
Abstract: The slowing of Moore’s Law and increased concerns about the environmental impacts of computing are exerting pressure on datacenter operators to use CPUs more efficiently. However, it is difficult to improve efficiency while maintaining low tail latency for applications. Shenango is a system that achieves both of these goals simultaneously by reallocating cores between applications on the same server very quickly, every few microseconds. In this talk I will describe a sequence of works that introduced the mechanisms and policies that made Shenango possible.
Bio: Amy Ousterhout is an Assistant Professor in Computer Science and Engineering at the University of California, San Diego. The goal of her research is to improve the efficiency, performance, and usability of datacenter applications. Before joining UCSD, she was a postdoctoral researcher at UC Berkeley. She received her PhD in Computer Science from MIT and her BSE in Computer Science from Princeton. While at MIT, she was awarded an NSF Fellowship and a Hertz Foundation Fellowship.
Dr. Doug Terry, Distinguished Scientist and Head of Database Systems Lab, Amazon Web Services, “Replicating Cloud Data Across Regions for Business Continuity”
Abstract: A typical cloud application relies on one or more database services, and the application is unable to continue its operation if these services become substantially degraded. Although the complete failure of a service within a region is extremely unlikely, customers increasingly ask for, and sometimes implement their own, cross-region replication and failover mechanisms. Difficult trade-offs arise when trying to achieve high availability with low latency reads and writes while avoiding data loss in failure situations. This talk presents the approach that was taken for global tables in Amazon DynamoDB and examines the broader challenges.
Bio: As a Distinguished Scientist and Vice President for AWS, Doug is driving innovation in new technologies for relational databases, NoSQL databases, analytics, and the data lake. His passion is to evolve cloud database services to better meet the needs of customers with globally distributed applications without compromising on the key tenets of availability, performance, and consistency. Prior to joining Amazon, Doug taught distributed systems at U. C. Berkeley and led advanced research at Xerox PARC, Microsoft, and Samsung.
Prof. Hung-Wei Tseng, UCR, “The Upcoming Revolution of General-purpose Computing”
Abstract: The significance of artificial intelligence (AI) and machine learning (ML) applications has changed the landscape of computer systems: AI accelerators have started to emerge in a wide range of devices, from mobile phones to data center servers. In addition to the direct contribution of performance gain in AI/ML workloads, the introduction of AI/ML accelerators brings a new flavor of computation model, the matrix processing model, that any matrix-based algorithm can leverage in theory. However, the highly application-specific designs of these accelerators place hurdles for a wider spectrum of workloads. In this talk, Hung-Wei will discuss state-of-the-art AI/ML accelerators. By transforming existing algorithms to AI/ML-specific functions, Hung-Wei’s research group has demonstrated that we can already achieve 2.5x speedup for linear algebra based kernels using edge TPUs and up to 288x speedup for database join operations using NVIDIA’s tensor cores, compared with modern CPUs. If we can extend the design of AI/ML accelerators to support more matrix operations, a set of matrix applications, including dynamic programming based algorithms, can achieve more than 10x speedup over conventional GPUs. Finally, Hung-Wei will discuss some of the potential extensions that are essential to make the upcoming revolution of general-purpose computing successful.
Bio: Hung-Wei is currently an associate professor in the Department of Electrical and Computer Engineering at the University of California, Riverside. He leads the Extreme Storage & Computer Architecture Laboratory and focuses on accelerating applications through generalized computing on tensor processors, AI/ML accelerators, and intelligent data storage systems. He was recognized with a Facebook faculty research award and IEEE Micro "Top Picks from Computer Architecture" in 2020 for his research in accelerating data-intensive applications through revisiting the storage system design. He received his PhD from the Department of Computer Science and Engineering at the University of California, San Diego.
Dr. Amir Yazdanbakhsh, Google, "From Data to Success: Leveraging Machine Learning for Full Stack Optimization"
Abstract: Custom accelerators, such as Google TPUs and Edge TPUs, have been key to the recent machine learning (ML) advancements, significantly increasing available compute power and unlocking capabilities such as AlphaGo, RankBrain, WaveNets, and Conversational Agents. To sustain these advances, the computing ecosystem must continue to innovate across the stack and acclimate to rapidly evolving ML models and applications. This talk broadly covers how to leverage state-of-the-art ML techniques to curtail the waning of Moore's Law. I first discuss the benefits of data-driven optimization methods in designing efficient hardware accelerators. Then, I explain some of our recent work in using large language models to automate algorithmic optimization at the top of the computing stack and increase the productivity of software development. I wrap up the talk by presenting some future directions for facilitating the adoption of machine learning methods into full stack optimization.
Bio: Amir received his Ph.D. degree in computer science from the Georgia Institute of Technology with Prof. Hadi Esmaeilzadeh on neuro-general and approximate computing. His Ph.D. work has been recognized by various awards, including Microsoft PhD Fellowship and Qualcomm Innovation Fellowship.
Amir is a Research Scientist at Google Research, Brain Team, and works at the intersection of computer architecture, systems, and machine learning. Amir is the co-founder and co-lead of the Machine Learning for Computer Architecture team at Google Brain, where they leverage recent machine learning methods and advancements to innovate and design better hardware accelerators. The work from their team has been covered by media outlets including WIRED and ZDNet. He was inducted into the ISCA Hall of Fame in 2023.
Prof. Yiying Zhang, UCSD, “An End-to-End Implementation of A “Server-less” Data Center”
Abstract: For decades, the unit of deployment, operation, and failure in datacenters has been a monolithic server, one that contains all the hardware resources that are needed to run a user program. This server-based architecture has several limitations including inefficient resource utilization and complicated resource management. To mitigate these limitations, researchers have explored a new data-center architecture of breaking monolithic servers into disaggregated hardware units that are organized as resource pools.
Disaggregation research so far has mainly taken a virtual or emulated approach, where disaggregated resources are implemented with regular servers. Is it feasible and beneficial to build real disaggregated data centers? With new hardware trends that make hardware devices "smarter" and data-center networks faster as well as new application trends that break a program into smaller pieces, we believe that it is time to build a truly disaggregated, or "server-less", data center. In this talk, I will go over our endeavor in the past few years in building an end-to-end server-less data center, including 1) a real hardware implementation of disaggregated devices [Clio, ASPLOS'22], 2) an operating system for managing disaggregated hardware devices [LegoOS, OSDI'18], 3) a new network system for connecting disaggregated devices and performing smart network processing [SuperNIC, arXiv:2109.07744], and 4) a new cloud system for executing applications in a serverless way on disaggregated hardware [Scad, arXiv:2206.13444].
Bio: Yiying Zhang is an associate professor in the Computer Science and Engineering Department at University of California, San Diego. Her research interests span operating systems, distributed systems, computer architecture, data-center networking, and Systems-ML. Her group builds large-scale, cross-layer real systems for next-generation data centers and clouds. She has won an OSDI best paper award, a SYSTOR best paper award, an NSF CAREER award, and various research awards from the industry including Google, Meta, VMware, Cisco, and Amazon. Yiying received her Ph.D. from the Department of Computer Sciences at the University of Wisconsin-Madison and worked as an assistant professor at Purdue University before joining UCSD.
QUOTES FROM PREVIOUS WORKSHOPS
Professor David Patterson, the Pardee Professor of Computer Science, UC Berkeley, “I saw strong participation at the Cloud Workshop, with some high energy and enthusiasm; and I was delighted to see industry engineers bring and describe actual hardware, representing some of the newest innovations in the data center.”
Professor Christos Kozyrakis, Professor of Electrical Engineering & Computer Science, Stanford University, “As a starting point, I think of these IAP workshops as ‘Hot Chips meets ISCA’, i.e., an intersection of industry’s newest solutions in hardware (Hot Chips) with academic research in computer architecture (ISCA); but more so, these workshops additionally cover new subsystems and applications, and in a smaller venue where it is easy to discuss ideas and cross-cutting approaches with colleagues.”
Professor Hakim Weatherspoon, Professor of Computer Science, Cornell University, “I have participated in three IAP Workshops since the first one at Cornell in 2013 and it is great to see that the IAP premise was a success now as it was then, bringing together industry and academia in a focused workshop and an all-day exchange of ideas. It was a fantastic experience and I look forward to the next IAP Workshop.”
Professor Ken Birman, the N. Rama Rao Professor of Computer Science, Cornell University, “I actually thought it was a fantastic workshop, an unquestionable success, starting from the dinner the night before, through the workshop itself, to the post-event reception for the student Best Poster Awards.”
Dr. Carole-Jean Wu, Research Scientist, AI Infrastructure, Facebook Research, and Professor of CSE, Arizona State University, “The IAP Cloud Computing workshop provides a great channel for valuable interactions between faculty/students and the industry participants. I truly enjoyed the venue learning about research problems and solutions that are of great interest to Facebook, as well as the new enabling technologies from the industry representatives. The smaller venue and the poster session fostered an interactive environment for in-depth discussions on the proposed research and approaches and sparked new collaborative opportunities. Thank you for organizing this wonderful event! It was very well run.”
Nathan Pemberton, PhD student, UC Berkeley, "IAP workshops provide a valuable chance to explore emerging research topics with a focused group of participants, and without all the time/effort of a full-scale conference. Instead of rushing from talk to talk, you can slow down and dive deep into a few topics with experts in the field."
Dr. Pankaj Mehra, Samsung, "Terrifically organized Workshops that give all parties -- students, faculty, industry -- valuable insights to take back"
Dr. Richard New, VP Research, Western Digital, “IAP workshops provide a great opportunity to meet with professors and students working at the cutting edge of their fields. It was a pleasure to attend the event – lots of very interesting presentations and posters.”
Professor Vishal Shrivastav, Purdue University, “Attending the IAP workshops as a PhD student at Cornell was a great experience and very rewarding. I really enjoyed the many amazing talks from both the industry and academia. My personal conversations with several industry leaders at the workshop will definitely guide some of my future research.”
Professor Ana Klimovic, ETH Zurich, “I attended three IAP workshops as a PhD student at Stanford, and I am consistently impressed by the quality of the talks and the breadth of the topics covered. These workshops bring top-tier industry and academia together to discuss cutting-edge research challenges. It is a great opportunity to exchange ideas and get inspiration for new research opportunities.”