Towards Accelerated Self-healing and Low-power System-on-Chip (SoC)

Date: 2020/08/26 - 2020/08/26

Academic Seminar: Towards Accelerated Self-healing and Low-power System-on-Chip (SoC)

Speaker: Xinfei Guo, NVIDIA

Time: 9:00-10:00 a.m. August 26th, 2020 (Wednesday)

Location: via Zoom (Meeting ID: 64555789745 Password: 1082)

Abstract

With the growing need to fit more functionality onto a single System on Chip (SoC) and the growing complexity of design rules at newer semiconductor process nodes, chip design constraints have become increasingly stringent. Applications such as AI and machine learning accelerators are shifting direction as more computing tasks move to the edge, adding a level of sophistication and functionality that was typically relegated to the cloud, but within a power envelope compatible with a battery. On top of the power requirements, reliability is becoming increasingly critical for intelligent computing such as autonomous systems, implantable devices and more. Reliability issues such as aging not only shorten the lifetime of a chip but also degrade its performance over time. The added margins or techniques for addressing reliability issues further strain the already limited power budget. As a consequence, designing a high-quality chip requires balancing a large number of decisions and tradeoffs.

In this talk, I will discuss my research on overcoming reliability and power issues for SoCs. In the first part, I will introduce a completely new research direction on fixing chip reliability issues through accelerated self-healing, in which chip aging follows a "circadian rhythm" behavior and can be recovered via active techniques. I will discuss experiments, key findings and on-chip design infrastructure to support this solution. In the second part, I will present various low-power techniques at different abstraction levels of the system stack, from semiconductor technology, circuit design and microarchitecture up to new computing paradigms, that further push the power envelope down. In the last part of my talk, I will briefly preview future research directions for tackling industrial chip design challenges to meet emerging computing needs.

Biography

Xinfei Guo received his Ph.D. in Computer Engineering from the University of Virginia in May 2018. He also holds an M.S. degree in Electrical and Computer Engineering from the University of Florida and a B.Eng. degree in Electronic Material Science from Xidian University. Dr. Guo has broad interests and experience in very-large-scale integration (VLSI) design, microarchitecture and computing chip development. His Ph.D. research focused on chip design techniques for reliable and low-power systems, represented by cross-layer accelerated self-healing techniques, asynchronous stochastic computing and reconfigurable security architectures for IoT applications. During his Ph.D., he also worked as a collaborating researcher at the IBM T.J. Watson Research Center for one and a half years. He continued to work in the semiconductor industry after his Ph.D. and is currently a Senior Chip Designer at NVIDIA in the US. His research interests include high-speed SoC design, EDA and machine learning hardware. He has authored a book and over 30 peer-reviewed publications, and has served as an organizing or technical committee member for over 40 IEEE/ACM conferences. His work has been recognized by 3 best paper awards and an IEEE Circuits and Systems Society (CASS) Pre-doctoral Fellowship.