| Start | End | Session Title | Speakers |
| --- | --- | --- | --- |
| 10:00 AM | 10:10 AM | Day 1: AM Plenary: Introduction | Yasaman Ghadar; Nick Romero |
| 10:10 AM | 10:30 AM | Day 1: AM Plenary: An Overview of the Argonne Aurora Exascale System (Slides), (Video) | Scott Parker |
| 10:30 AM | 10:50 AM | Day 1: AM Plenary: Oak Ridge Leadership Computing: Science on Summit and Preparing for Frontier (Slides), (Video) | Tjerk Straatsma |
| 10:50 AM | 11:10 AM | Day 1: AM Plenary: Preparing Apps for the Perlmutter System and beyond at NERSC (Slides), (Video) | Jack Deslippe |
| 11:10 AM | 11:30 AM | Session 1: Talk 1 – A Comparison of GPU Programming Models (Slides), (Video) | Trey White |
| 11:10 AM | 12:10 PM | Session 2: Talk 1 – Prioritizing OpenMP Features to Provide for Performance, Portability and Productivity (Slides), (Video) | Oscar Hernandez; Vivek Kale |
| 11:10 AM | 11:30 AM | Session 3: Talk 1 – AMReX in 2020: Porting for Performance to GPGPU Systems (Slides), (Video) | Kevin Gott |
| 11:30 AM | 11:50 AM | Session 1: Talk 2 – Wrong Way: Successes, Failures, and Lessons Learned from Using the “Wrong” Programming Approach for Summit (Slides), (Video) | Philip Roth |
| 11:30 AM | 11:50 AM | Session 3: Talk 2 – An overview of FleCSI: a compile-time configurable framework designed to support multi-physics application development (Slides), (Video) | Irina Demeshko |
| 11:50 AM | 12:10 PM | Session 1: Talk 3 – The Importance of Kernels for Performance Portability, or How I Learned to Stop Looping and Love the Kernel (Slides), (Video) | Tom Scogland |
| 11:50 AM | 12:10 PM | Session 3: Talk 3 – Performance Portable Implementations for Kernels from Geometric Multigrid Methods (Slides) | JaeHyuk Kwack |
| 12:10 PM | 12:30 PM | Session 1: Talk 4 – A Study of Cross-Platform, Cross-Compiler Performance and Portability (Slides), (Video) | Veronica G. Vergara Larrea |
| 12:10 PM | 12:30 PM | Session 2: Talk 2 – Experience with using OpenACC and OpenMP to achieve performance portability for the Grid lattice QCD library (Slides), (Video) | Meifeng Lin |
| 12:10 PM | 12:30 PM | Session 3: Talk 4 – Porting QUDA from CUDA to other backends (Slides), (Video) | Xiaoyong Jin |
| 12:30 PM | 01:25 PM | Lunch Break | |
| 01:25 PM | 01:45 PM | Day 1: PM Plenary: The Fruits of the Interplay between DOE Software Stacks and C++ Standardization (Video) | Daisy Hollman |
| 01:25 PM | 02:40 PM | Day 1: PM Plenary: Programming Methodologies for Performance Portability, Current State of Practice and Futures | Douglas Doerfler; Brandon Cook |
| 01:45 PM | 02:05 PM | Day 1: PM Plenary: LLNL Portability Abstractions Update (Slides), (Video) | Rich Hornung |
| 02:05 PM | 02:25 PM | Day 1: PM Plenary: Kokkos: Present and Future (Slides), (Video) | Christian Trott |
| 02:25 PM | 02:40 PM | Day 1: PM Plenary: Panel Q&A | |
| 02:40 PM | 03:00 PM | Session 4: Talk 1 – Update on the performance portability of a Wilson Dslash Kernel using Kokkos and SYCL (Slides), (Video) | Balint Joo |
| 02:40 PM | 03:00 PM | Session 5: Talk 1 – Preparing performance portable QMCPACK for Exascale (Slides), (Video) | Ye Luo |
| 02:40 PM | 03:00 PM | Session 6: Talk 1 – Evaluating the performance of a portable version of HPGMG benchmark for accelerators (Slides), (Video) | Christopher Daley |
| 03:00 PM | 03:20 PM | Session 4: Talk 2 – Early SYCL results from the Bristol Performance Portability Study (Slides), (Video) | Tom Deakin |
| 03:00 PM | 03:20 PM | Session 5: Talk 2 – Performance portability experience with CUDA and OpenMP offload in GAMESS (Slides), (Video) | Colleen Bertoni |
| 03:00 PM | 03:20 PM | Session 6: Talk 2 – SU3_bench, a Micro-benchmark for Exploring Exascale Era Programming Models, Compilers and Runtimes (Slides), (Video) | Douglas Doerfler |
| 03:20 PM | 03:40 PM | Session 4: Talk 3 – Experiences tuning SYCL libraries for varied hardware (Slides), (Video) | John Lawson |
| 03:20 PM | 03:40 PM | Session 5: Talk 3 – OpenMP Offloading For Density Matrix Renormalization Group Hamiltonian Application Kernel (Slides), (Video) | Wael Elwasif |
| 03:20 PM | 03:40 PM | Session 6: Talk 3 – Performance Portability Issues for a Large-Scale Computational Fluid Dynamics Application on Emerging High-Performance Architectures (Slides), (Video) | Eric Nielsen |
| 03:40 PM | 04:00 PM | Session 4: Talk 4 – Investigation of the performance of SYCL kernels across various architectures (Slides), (Video) | Brian Homerding |
| 03:40 PM | 04:00 PM | Session 5: Talk 4 – GPU I-TASSER: Replica Exchange Monte Carlo (Slides), (Video) | Elijah MacCarthy |
| 03:40 PM | 04:00 PM | Session 6: Talk 4 – Performance and portability of abstract algebra operations in C++, Python, and Julia (Slides), (Video) | Jess Woods |
| 04:00 PM | 04:20 PM | Afternoon Break | |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 1: Comparative Performance and Porting Effort of HIP and CUDA for an Implicit Monte Carlo Code (Slides), (Video) | Alex Long |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 2: Experiences with CUDA Streaming in Teton’s Linear Sweep (Slides), (Video) | Robert Chen |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 3: Kokkos-HClib: enabling high-performance and resiliency for HPC systems (Slides), (Video) | Akihiro Hayashi; Nicolas Morales |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 4: Parallel Training of Large Knowledge Graph Convolution Networks (Slides), (Video) | Hong-Jun Yoon |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 5: Productive Auto-Tuning of GPU Codes via Iterative Machine Learning | Wu Feng |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 6: Halo Exchange Performance on the Sierra Supercomputer (Slides), (Video) | Jason Burmark |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 7: Accelerating Your Application I/O on HPC Systems (Slides), (Video) | Kathryn Mohror; Cameron Stanavige |
| 04:20 PM | 05:00 PM | Poster Session 1 – Poster 8: Preparing the SUNDIALS Library for Heterogeneous Architectures (Slides), (Video) | Cody Balos |
| 10:00 AM | 10:35 AM | Day 2: AM Plenary: Intel Data Parallel C++ (Slides), (Video) | Jeff Hammond |
| 10:35 AM | 11:10 AM | Day 2: AM Plenary: Lessons Learned in the Sierra Center of Excellence Migrating to Heterogeneous Computing (Slides), (Video) | David Richards |
| 11:10 AM | 11:30 AM | Session 7: Talk 1 – XGC with Kokkos/Cabana: Plasma Physics on Summit and Beyond (Slides), (Video) | Aaron Scheinberg |
| 11:10 AM | 11:30 AM | Session 8: Talk 1 – Towards Performance Portability through an Integrated Programming Eco-System for Tensor Algebra (Slides), (Video) | Roberto Gioiosa |
| 11:10 AM | 11:30 AM | Session 9: Talk 1 – Application Development and Readiness for Sierra: An MPI Challenge (Slides), (Video) | James Elliott |
| 11:30 AM | 11:50 AM | Session 7: Talk 2 – Experiences incrementally porting a large legacy finite element application to Sierra using Kokkos (Slides), (Video) | Victor Brunini |
| 11:30 AM | 11:50 AM | Session 8: Talk 2 – Enabling High-level Parallel Abstractions to Dynamical Cluster Approximation (DCA++) using HPX and GPUDirect on the Summit Supercomputer (Slides), (Video) | Weile Wei |
| 11:30 AM | 11:50 AM | Session 9: Talk 2 – What workloads are coming: new workflow mashups, HPC*AI, HPC at the Edge (Slides), (Video) | CJ Newburn |
| 11:50 AM | 12:10 PM | Session 7: Talk 3 – Fortran Language Compatibility Library for Kokkos (Slides), (Video) | Geoff Womeldorff |
| 11:50 AM | 12:10 PM | Session 8: Talk 3 – From PeleC to PeleACC, to PeleC++: What we learned porting our AMReX application to two modern GPU programming models (Slides), (Video) | Jon Rood |
| 11:50 AM | 12:10 PM | Session 9: Talk 3 – Developing Applications for Aurora (Slides), (Video) | Scott Parker |
| 12:10 PM | 12:30 PM | Session 7: Talk 4 – Resilient Kokkos: Productive and Performance Portable User-Level Checkpointing for High Performance Computing (Slides), (Video) | Nicolas Morales |
| 12:10 PM | 12:30 PM | Session 8: Talk 4 – Porting EOSPAC 6 to Sierra (Slides), (Video) | Anna Pietarila Graham |
| 12:10 PM | 12:30 PM | Session 9: Talk 4 – Modern problems require modern solutions: How modern CMake supports modern C++ in performance portability (Slides), (Video) | Jeremiah Wilke |
| 12:30 PM | 01:25 PM | Lunch Break | |
| 01:25 PM | 01:35 PM | Day 2: PM Plenary: El Capitan (Slides), (Video) | Ian Karlin |
| 01:35 PM | 01:45 PM | Day 2: PM Plenary: Crossroads (Slides), (Video) | Hai Ah Nam |
| 01:45 PM | 02:20 PM | Day 2: PM Plenary: Best practices from the Summit application readiness efforts, with porting of GronOR as example (Slides), (Video) | Tjerk Straatsma |
| 02:20 PM | 02:40 PM | Session 10: Talk 1 – Achieving portability for a highly optimized GPU code for 3D Fourier Transforms at extreme problem sizes (Slides), (Video) | Kiran Ravikumar |
| 02:20 PM | 02:40 PM | Session 11: Talk 1 – Achieving Performance Portability on Hybrid GPU-CPU Architectures for a Large Scale Material Science Code: the BerkeleyGW Case Study (Slides), (Video) | Mauro Del Ben |
| 02:20 PM | 02:40 PM | Session 12: Talk 1 – Timemory: Modular Performance Analysis for HPC (Slides), (Video) | Jonathan Madsen |
| 02:40 PM | 03:00 PM | Session 10: Talk 2 – Asynchronous Programming in Modern C++ (Slides), (Video) | Hartmut Kaiser |
| 02:40 PM | 03:00 PM | Session 11: Talk 2 – Performance Portability of Remote I/O in Distributed Workflows (Slides), (Video) | Nathan Tallent |
| 02:40 PM | 03:00 PM | Session 12: Talk 2 – The Development and Uses of Metrics for Performance, Portability, and Productivity (Slides), (Video) | John Pennycook |
| 03:00 PM | 03:20 PM | Session 10: Talk 3 – Porting Numerical Linear Algebra Libraries across Exascale Hardware Platforms and Beyond (Slides), (Video) | Piotr Luszczek |
| 03:00 PM | 03:20 PM | Session 11: Talk 3 – Machine Learning Guided Optimal Use of GPU Unified Memory (Slides), (Video) | Murali Emani |
| 03:00 PM | 03:20 PM | Session 12: Talk 3 – Techniques for Holistic Quantification of Performance, Portability, and Productivity (Slides), (Video) | Jason Sewall |
| 03:20 PM | 03:40 PM | Concluding Remarks | Scott Parker |
| 03:40 PM | 04:00 PM | Afternoon Break | |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 1: Porting the NOAA global weather forecast system to the cloud (Slides) | Daniel Abdi |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 2: SHAD: Productive Programming for High Performance Systems in Standard C++ (Slides), (Video) | Vito Giovanni Castellana |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 3: A MetaCL Tool for Productive FPGA Programming via Automated Code Generation (Slides), (Video) | Paul Sathre |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 4: Development of GPU accelerated Smith-Waterman kernel for metagenomics workflow (Slides), (Video) | Muaaz Awan |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 5: Considerations for performance portability in a commercial particle-in-cell code (Slides), (Video) | Benjamin Cowan |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 6: CAMP: Compiler Agnostic MetaProgramming, or Portable Performance at Compile Time (Slides), (Video) | Tom Scogland |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 7: Performance Analysis of NALU on Arch64 (Slides), (Video) | Srinath Vadlamani |
| 04:00 PM | 04:40 PM | Poster Session 2 – Poster 8: A Cosine Similarity Methodology to Characterize Proxy-Parent Application Correspondence (Slides) | Omar Aaziz; Jeffery Kuehn |