Cornell 2019

The IAP Cornell Workshop on the Future of Cloud Computing was organized by Prof. Jose F. Martínez and conducted on Friday, May 3, 2019 in Upson Hall on the Cornell campus in Ithaca, NY.

Agenda - Videos of Presentations

8:00-8:30AM Pick-up – Coffee/Tea and Breakfast Food/Snacks

8:30-9:00AM Prof. Jose Martinez, Cornell, ECE, “PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services”

9:00-9:30AM Dr. Yang Seok Ki, Samsung, Sr. Director and Architect of Memory Solutions Lab, “Opportunities and Challenges in Computational Storage”

9:30-10:00AM Prof. Ken Birman, Cornell, Computer Science, “Using Derecho to Build Smart and Responsive Cloud Services for IoT Applications”

10:00-10:30AM Dr. Richard New, Western Digital, VP Engineering, “The Changing Landscape of Data – New Storage Device Technologies, Interface Models, and Storage Architectures”

10:30-11:00AM Lightning Round of Student Posters

11:00-12:30PM Lunch and Poster Viewing

12:30-12:50PM Prof. Hakim Weatherspoon, Cornell, Computer Science, “The Edge Supercloud: Blockchains for the Edge”

12:50-1:10PM Ed McLellan, Marvell, Distinguished Engineer, “Computing on the Mobile Edge”

1:10-1:30PM Prof. Robbert van Renesse, Cornell, Computer Science, “X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud Native Containers”

1:30-2:00PM H. K. Verma, Distinguished Engineer, Xilinx, “Database Acceleration using FPGAs”

2:00-2:20PM Prof. Kevin Tang, Cornell, ECE, “Towards Autonomous Fine-Grained Network Management”

2:20-3:00PM Break - Refreshments and Poster Viewing

3:00-3:20PM Prof. Christina Delimitrou, Cornell, ECE, “Leveraging Machine Learning to Improve Performance Predictability in Cloud Microservices”

3:20-3:40PM Dr. Jian Li, Huawei, Sr. Director of Research and Planning, “Case Studies of ICT Platform and Its Applications”

3:40-4:00PM Prof. Chris De Sa, Cornell, Computer Science, “Distributed Learning with Compressed Communication”

4:00-4:20PM Dr. Carole-Jean Wu, Facebook, “Machine Learning at Scale”

4:20-5:00PM Reception – Refreshments and Poster Awards

Presenter Abstracts and Bios
Ken Birman, Cornell, “Using Derecho to Build Smart and Responsive Cloud Services for IoT Applications”
Abstract: The Derecho platform was created to support a new generation of Internet-of-Things (IoT) applications with online machine-learning components. At cloud-scale, such applications require a new edge µ-service ecosystem, which I like to think of as a form of “smart memory”. I’m using this term to refer to a customizable service designed to be hosted in the cloud edge, where it would accept high-bandwidth data pipelines from sources, apply machine-learning tools to analyze and understand received content, perform initial data transformations such as image segmentation, tagging and other basic AI functions, and support ways to query the resulting knowledge base with minimal delay. Such services would also need to scale out, yet must maintain their rapid responsiveness and strong consistency. Derecho, which is now fully implemented (github.org/Derecho-Project), leverages persistent memory and RDMA to solve this problem with exceptional performance and scalability. Derecho is also interesting from a theoretical perspective. In particular, the core protocols implement Paxos state machine replication in a novel manner optimized for RDMA settings. These protocols have been proved correct, and are also highly efficient in terms of delay before message delivery, progress during failures and even the mapping to RDMA hardware.
Bio: Ken Birman is the N. Rama Rao Professor of Computer Science at Cornell. An ACM Fellow and the winner of the IEEE Tsutomu Kanai Award, Ken has written 3 textbooks and published more than 150 papers in prestigious journals and conferences. Software he developed operated the New York Stock Exchange for more than a decade without trading disruptions, and plays central roles in the French Air Traffic Control System and the US Navy AEGIS warship. Other technologies from his group found their way into IBM’s WebSphere product, Amazon’s EC2 and S3 systems, Microsoft’s cluster management solutions, and the US Northeast bulk power grid. His Vsync system (vsync.codeplex.com) has become a widely used teaching tool for students learning to create secure, strongly consistent and scalable cloud computing solutions. Derecho is intended for demanding settings such as the smart power grid, smart highways and homes, and scalable vision systems.

Christina Delimitrou, Cornell ECE, “Leveraging Machine Learning to Improve Performance Predictability in Microservices”
Abstract: Cloud applications have recently undergone a major redesign, switching from monolithic implementations that encompass the entire functionality of the application in a single binary, to large numbers of loosely-coupled microservices. This shift comes with several advantages, namely increased speed of development and deployment; however, it also introduces several new challenges. One of these challenges comes from the dependencies between microservices, which complicate scheduling and resource management, as any poorly-managed microservice can introduce end-to-end QoS violations. Manually discovering the impact of these dependencies becomes increasingly impractical as applications and systems scale. In this talk I will discuss Seer, a new cloud performance debugging system that leverages deep learning techniques to improve the performance predictability of large-scale interactive microservices in a practical way.
Bio: Christina Delimitrou is an Assistant Professor and the John and Norma Balen Sesquicentennial Faculty Fellow at Cornell University, where she works on computer architecture and computer systems. Specifically, Christina focuses on improving the resource efficiency of large-scale cloud infrastructures by revisiting the way these systems are designed and managed. She is the recipient of an NSF CAREER Award, a 2019 Google Faculty Research Award, a 2018 Facebook Faculty Award, 3 IEEE Micro Top Picks awards, a Facebook Fellowship, and a Stanford Graduate Fellowship. Before joining Cornell, Christina received her PhD from Stanford University. She had previously received an MS from Stanford, and a diploma in Electrical and Computer Engineering from the National Technical University of Athens. More information can be found at: http://sail.ece.cornell.edu

Chris De Sa, Cornell, “Distributed learning with compressed communication”
Abstract: Distributed machine learning involves many independent computers collaborating to train a model. Often, the performance of distributed ML is limited by the cost of communication, even on relatively well-provisioned networks, because of the large volumes of data involved in ML training. This is especially the case in learning applications at the cloud edge, where some of the learners are located outside the datacenter (for example, IoT devices from which data could be sourced). In this talk, I will present some recent work on how the messages sent between machines can be compressed, so they take up less network bandwidth, without affecting the convergence rate of the learning algorithm. This compression can significantly decrease the total time needed to train a model, especially when the model is trained over networks of limited capacity.
Bio: Chris De Sa is an Assistant Professor in the Computer Science department at Cornell University. His research interests include algorithmic, software, and hardware techniques for high-performance machine learning, with a focus on relaxed-consistency variants of stochastic algorithms such as asynchronous and low-precision stochastic gradient descent (SGD) and Markov chain Monte Carlo. His work builds towards using these techniques to construct data analytics and machine learning frameworks, including for deep learning, that are efficient, parallel, and distributed. He is a field member in ECE and Statistics, as well as a member of the Cornell Machine Learning Group.
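
As an illustration of the message-compression idea in the abstract above, the following is a minimal sketch of one generic approach: quantizing each gradient value to a single byte with stochastic rounding, so that the decoded value equals the original in expectation. It is not the method presented in the talk; the function names `compress` and `decompress` and the choice of 8-bit codes are assumptions made purely for illustration.

```python
import random


def compress(grad, levels=255, rng=random):
    """Quantize a non-empty list of floats to one byte each (hypothetical helper).

    Values are mapped onto an integer grid between min(grad) and max(grad)
    and rounded up or down at random, so the decoded value is unbiased.
    """
    assert 1 <= levels <= 255, "codes must fit in a single byte"
    lo, hi = min(grad), max(grad)
    # Grid spacing; fall back to 1.0 when every value is identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = []
    for g in grad:
        x = (g - lo) / scale      # position on the grid, in [0, levels]
        base = int(x)             # floor, because x is non-negative
        frac = x - base
        # Round up with probability frac so the expected code equals x.
        code = base + (1 if rng.random() < frac else 0)
        codes.append(min(code, levels))
    return lo, scale, bytes(codes)


def decompress(lo, scale, codes):
    """Map the byte codes back to approximate float values."""
    return [lo + scale * c for c in codes]


if __name__ == "__main__":
    gradient = [0.5, -1.25, 3.0, 0.0, 2.75]
    lo, scale, codes = compress(gradient)
    print(len(codes), "bytes sent for", len(gradient), "values")
    print(decompress(lo, scale, codes))
```

In this sketch each value travels as one byte instead of a 64-bit float, plus two floats of metadata per message, and the per-value reconstruction error is bounded by the grid spacing `scale`. Whether a scheme like this preserves the convergence rate depends on the learning algorithm and is exactly the kind of question the talk addresses.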

Yang Seok Ki, Samsung, “Opportunities and Challenges in Computational Storage”
Abstract: As data grows exponentially, the traditional von Neumann architecture has revealed many limitations. These new workloads prevent the CPU cache from functioning properly, while the cost of moving data over the network becomes a dominant factor as the system size increases. These observations motivated near-data processing in the network and storage fields. The combination of near-data processing technology and domain-specific architecture opens up new computing opportunities that enable better computing with lower power consumption and cost. In this talk, Dr. Ki introduces several approaches at Samsung and discusses cases and lessons.
Bio: Yang Seok Ki is a Sr. Director and architect of Memory Solutions Lab, Samsung Semiconductor Inc. America. He leads two research groups on computational storage and performance engineering whose main focus is to innovate SSDs and their ecosystem across datacenter hardware and software infrastructure. He is leading several data-centric computing projects such as Key Value SSD, SmartSSD, and more. Before joining Samsung, he worked for Oracle's server technology group, which builds a distributed database server system, and contributed to the Oracle 12c release. Prior to his industrial experience, he worked on HPDC (High Performance Distributed Computing), Grid, and Cloud research at the Information Sciences Institute of the University of Southern California and the Center of Networked Systems, University of California, San Diego. He received his Ph.D. degree in Electrical Engineering and Computer Science in parallel processing, his M.S. degree in Computer Engineering, and his B.S. degree in Computer Engineering from Seoul National University, Korea.

Jian Li, Huawei, “Case Studies of ICT Platform and Its Applications”
Abstract: Information and Communication Technologies (ICT) enable the future connected and intelligent world. In spite of constant and prominent progress in ICT, its infrastructure support for such a grand vision of future society still faces great challenges in scalability, performance, security, privacy and usability. In light of these mounting problems, we attempt to tackle the tip of the iceberg with a few initial case studies and call for attention from the research community.
Bio: Dr. Jian Li is a senior director of research and technology planning with Huawei Technologies, where he leads its technology planning and R&D efforts in North America, in collaboration with global teams around the world. Before joining Huawei, he was an executive architect and research scientist with IBM Research, where he worked on advanced R&D, multi-site product development and global customer engagements on systems and data solutions with significant revenue growth. He holds over 40 patents and has published over 40 peer-reviewed papers. He earned a Ph.D. in electrical and computer engineering from Cornell University. He has also held adjunct or visiting scholar positions at Texas A&M University, Chinese Academy of Sciences and Tsinghua University. In this capacity, he continues to work with academia and industry experts around the world.

Jose F. Martínez, Cornell, “PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services”
Abstract: Multi-tenancy in modern datacenters is currently limited to a single latency-critical, interactive service, running alongside one or more low-priority, best-effort jobs. This limits the efficiency gains from multi-tenancy, especially as an increasing number of cloud applications are shifting from batch jobs to services with strict latency requirements. We present PARTIES, a QoS-aware resource manager that enables an arbitrary number of interactive, latency-critical services to share a physical node without QoS violations. PARTIES leverages a set of hardware and software resource partitioning mechanisms to adjust allocations dynamically at runtime, in a way that meets the QoS requirements of each co-scheduled workload, and maximizes throughput for the machine. We evaluate PARTIES on state-of-the-art server platforms across a set of diverse interactive services. Our results show that PARTIES improves throughput under QoS by 61% on average, compared to existing resource managers, and it is able to adapt to varying load patterns.
Bio: José Martínez is Professor of Electrical and Computer Engineering, and member of the Computer Science and Systems Engineering graduate fields at Cornell. His research work has earned several awards; among them: two IEEE Micro Top Picks papers; an HPCA Best Paper award; Best Paper nominations at MICRO and HPCA; an NSF CAREER award; two IBM Faculty awards; and one of the inaugural UIUC Computer Science Outstanding Educator Alumnus awards. On the teaching side, he has been recognized with one Dorothy & Fred Chau MS’74 and two Kenneth A. Goldman '71 College teaching awards; a Ruth and Joel Spira Teaching Excellence award; as the most influential college professor of Merrill Presidential Scholars Andrew Tibbits (2007) and Gulnar Mirza (2016); and as the 2011 Professor of the Year by the Tau Beta Pi Engineering Honor Society.

Ed McLellan, Marvell, “Computing on the Mobile Edge”
Abstract: As cloud computing becomes increasingly pervasive, a capable edge computing infrastructure is necessary to deliver the anticipated services and provide a platform for innovation. Mobile endpoints are a primary growth driver of internet traffic, which makes the RAN (radio access network), or more likely the C-RAN (cloud radio access network), an ideal location to host new services if general-purpose compute capability is available. The presentation will review 5G requirements and describe hardware and software solution options.
Bio: Ed McLellan is a Principal Engineer at Marvell Semiconductor working on the OcteonTX2 multi-core infrastructure processor. In the past, he’s designed hardware and firmware for cores and networking SoCs spanning 7 different ISAs at DEC, C-Port/Motorola, AMD and Cavium/Marvell. He received a BS in Computer & Systems Engineering from RPI and holds about 25 patents.

Richard New, Western Digital, “The Changing Landscape of Data – New Storage Device Technologies, Interface Models, and Storage Architectures”
Abstract: The two workhorse technologies for data storage are hard disk drives based on magnetic recording and solid state drives based on NAND flash. Both technologies are evolving in ways that require changes to basic usage models and to the storage interface, as well as corresponding changes to the software storage stack. The Zoned Block Device (ZBC) interface has been introduced to enable Shingled Magnetic Recording (SMR) for HDD. Likewise, the Open Channel and Zoned Namespace (ZNS) interface standards have been proposed to address inefficiencies in the common usage model for NAND SSDs. This talk will review emerging use cases and interface models for storage, and the changes in system software and application design required to support them.
Bio: Richard New has served as the vice president of research at Western Digital since 2016, where he oversees the company’s advanced research across a range of topics relating to storage, memory and compute. In this role, he manages the Western Digital Research Lab, focusing on exploratory research and advanced technology development in emerging non-volatile memories, microprocessor architecture (RISC-V), storage and memory interface standards, advanced storage architecture concepts and prototyping, and open source software projects relating to storage and memory architecture. Prior to this, he was the lab director of the San Jose Research Center at Hitachi GST (now Western Digital), working on advanced magnetic recording technologies such as heat-assisted magnetic recording, as well as advanced storage architecture concepts. Previously, Richard held a variety of research and manager roles at IBM Research, where he worked in the areas of magnetic recording, signal processing and error correction. Richard received a BS in Electrical Engineering from the University of Waterloo, and an MS and PhD in Electrical Engineering from Stanford University. Richard was born in Cambridge, England and grew up in Waterloo, Canada. He currently resides in Palo Alto with his family and enjoys working on robotics and go-karts with his kids.

Kevin Tang, Cornell, “Towards Autonomous Fine-Grained Network Management”
Abstract: The current Internet is largely a heterogeneous network in the sense that it is formed by many different networks owned by different entities. Its ability to work with vastly different technologies under very different circumstances in parallel has certainly brought crucial flexibility and incentives for its growth. On the other hand, its heterogeneity has also led to rigid and coarse management, which makes the Internet’s performance insufficient to support fast-growing traffic with increasingly strict quality of service requirements. Today, with the rise of cloud computing and software-defined networking, it is possible to move towards autonomous fine-grained network management. In this talk, we examine opportunities that come from three different aspects: space (finer traffic split), time (faster network load balancing) and application (route different applications differently). We study both the idealized fine-grained network management, which sets a limit for the best possible outcome, and algorithms that take into account important practical factors to approach that limit. The talk ends with some example virtual network functions (VNFs) that can be built under fine-grained network management to further improve network performance.
Bio: Kevin Tang is an Associate Professor in ECE at Cornell University, where he conducts research on control and optimization of computer networks with a focus on network management and optimization. He cofounded MODE (https://www.mode.net/) with Nithin Michael to build the first fully autonomous, global software-defined private network for enterprise users. Kevin received his PhD degree in electrical engineering with a minor in applied and computational mathematics from Caltech in 2006. He received the Presidential Early Career Award for Scientists and Engineers (PECASE) from the White House in 2012.

Robbert van Renesse, Cornell, “X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers”
Abstract: “Cloud-native” container platforms, such as Kubernetes, have become an integral part of production cloud environments. One of the principles in designing cloud-native applications is the Single Concern Principle, which suggests that each container should handle a single responsibility well. We propose X-Containers as a new security paradigm for isolating single-concerned cloud-native containers. Each container is run with a Library OS (LibOS) that supports multi-processing for concurrency and compatibility. A minimal exokernel ensures strong isolation with a small kernel attack surface. We show an implementation of the X-Containers architecture that leverages Xen para-virtualization (PV) to turn the Linux kernel into a LibOS. Doing so results in a highly efficient LibOS platform that does not require hardware-assisted virtualization, improves inter-container isolation, and supports binary compatibility and multi-processing. X-Containers have up to 27× higher raw system call throughput compared to Docker containers, while also achieving competitive or superior performance on various benchmarks compared to recent container platforms such as Google’s gVisor and Intel’s Clear Containers.
Bio: Robbert van Renesse is a Research Professor in the Department of Computer Science at Cornell University, interested in the theory and practice of fault tolerant, secure, and scalable distributed systems. He has developed widely used distributed algorithms such as Chain Replication and Scuttlebutt (State Reconciliation for Gossip Protocols). He is a Fellow of the ACM, and currently serves as Chair of the ACM Special Interest Group on Operating Systems (SIGOPS).

H. K. Verma, Distinguished Engineer, Xilinx, “Database Acceleration using FPGAs”
Abstract: Ever larger datasets are leading to tremendous compute requirements for real-time analytics. Both centralized and distributed CPU solutions often fall short in performance. One solution is to offload compute to a field-customizable hardware implementation on an FPGA. This approach can provide 8-20x acceleration over other methods. In this talk, we provide a generic query offload acceleration architecture for the PostgreSQL RDBMS. The database acceleration stack is currently hosted on an Amazon AWS-F1 instance, and works with compute or integrated storage platforms provided by Xilinx. We will also provide query acceleration benchmarks using Xilinx HBM devices that overcome memory bandwidth bottlenecks. These software solutions and the platform framework can be used by researchers to break performance bottlenecks in big data applications.
Bio: HK Verma is a Distinguished Engineer at Xilinx, where he is developing FPGA-based data center accelerator solutions. His focus is on software for heterogeneous platforms to deliver breakthrough results in big data analytics, queries and machine learning. He has pioneered the successful adoption of FPGA acceleration in database and storage solutions, working with leading hyper-scale customers. Prior to this, he was an architect and designer of FPGAs at Xilinx and CPUs at Intel. He was also co-founder/VP at Velogix, a startup offering programmable compute acceleration with silicon and software. He holds 36 issued US patents and has presented tutorials and papers at leading conferences. He holds an MSEE from University of California at Santa Barbara and a Bachelor of Technology in EE from IIT Madras.

Hakim Weatherspoon, Cornell, “The Edge Supercloud: Blockchains for the Edge”
Abstract: While the intersection of blockchains and the Internet of Things (IoT) has received considerable research interest lately, Nakamoto-style blockchains possess a number of qualities that make them poorly suited for many IoT scenarios. Specifically, they require high network connectivity and are power-intensive. This is a drawback in IoT environments where battery-constrained nodes form an unreliable ad hoc network, such as in digital agriculture. In this talk, I will present Vegvisir, a partition-tolerant blockchain for use in power-constrained IoT environments with limited network connectivity. It is a permissioned, directed acyclic graph (DAG)-structured blockchain that can be used to create a shared, tamperproof data repository that keeps track of data provenance. I will discuss the use cases, architecture, and challenges of such a blockchain.
Bio: Hakim Weatherspoon is an Associate Professor in the Department of Computer Science at Cornell University and Associate Director for Cornell’s Initiative for Digital Agriculture (CIDA). His research interests cover various aspects of fault-tolerance, reliability, security, and performance of internet-scale data systems such as cloud and distributed systems. Weatherspoon received his Bachelor's from the University of Washington and his PhD from the University of California, Berkeley. Weatherspoon has received awards for his many contributions, including the University of Washington Allen School of Computer Science and Engineering Alumni Achievement Award; an Alfred P. Sloan Research Fellowship; a National Science Foundation CAREER Award; and a Kavli Fellowship from the National Academy of Sciences. He serves as Vice President of the USENIX Board of Directors and serves on the Steering Committee for the ACM Symposium on Cloud Computing. Weatherspoon has also been recognized for his work to promote diversity, earning Cornell's Zellman Warhaft Commitment to Diversity Award. Since 2011, he has organized the annual SoNIC Summer Research Workshop to help prepare students from underrepresented groups to pursue their Ph.D. in computer science.

Carole-Jean Wu, Facebook, “Machine Learning at Scale”
Abstract: Machine learning systems are being widely deployed in production datacenter infrastructure and over billions of edge devices. This talk seeks to address key system design challenges when scaling machine learning solutions to billions of people. What are key similarities and differences between cloud and edge infrastructure? The talk will conclude with open system research directions for deploying machine learning at scale.
Bio: Carole-Jean Wu is a Research Scientist at Facebook’s AI Infrastructure Research. She is also a tenured Associate Professor of CSE at Arizona State University. Carole-Jean’s research focuses on computer and system architectures. More recently, her research has pivoted to designing systems for machine learning. She is the lead author of “Machine Learning at Facebook: Understanding Inference at the Edge”, which presents unique design challenges faced when deploying ML solutions at scale to the edge, from billions of smartphones to Facebook’s virtual reality platforms. Carole-Jean received her M.A. and Ph.D. from Princeton and B.Sc. from Cornell.