Start | End | Session Title | Speakers |
10:00 AM | 10:10 AM | Day 1: AM Plenary: Introduction | Yasaman Ghadar; Nick Romero |
10:10 AM | 10:30 AM | Day 1: AM Plenary: An Overview of the Argonne Aurora Exascale System (Slides), (Video) | Scott Parker |
10:30 AM | 10:50 AM | Day 1: AM Plenary: Oak Ridge Leadership Computing: Science on Summit and Preparing for Frontier (Slides), (Video) | Tjerk Straatsma |
10:50 AM | 11:10 AM | Day 1: AM Plenary: Preparing Apps for the Perlmutter System and beyond at NERSC (Slides), (Video) | Jack Deslippe |
11:10 AM | 11:30 AM | Session 1: Talk 1 – A Comparison of GPU Programming Models (Slides), (Video) | Trey White |
11:10 AM | 12:10 PM | Session 2: Talk 1 – Prioritizing OpenMP Features to Provide for Performance, Portability and Productivity (Slides), (Video) | Oscar Hernandez; Vivek Kale |
11:10 AM | 11:30 AM | Session 3: Talk 1 – AMReX in 2020: Porting for Performance to GPGPU Systems (Slides), (Video) | Kevin Gott |
11:30 AM | 11:50 AM | Session 1: Talk 2 – Wrong Way: Successes, Failures, and Lessons Learned from Using the “Wrong” Programming Approach for Summit (Slides), (Video) | Philip Roth |
11:30 AM | 11:50 AM | Session 3: Talk 2 – An overview of FleCSI: a compile-time configurable framework designed to support multi-physics application development (Slides), (Video) | Irina Demeshko |
11:50 AM | 12:10 PM | Session 1: Talk 3 – The Importance of Kernels for Performance Portability, or How I Learned to Stop Looping and Love the Kernel (Slides), (Video) | Tom Scogland |
11:50 AM | 12:10 PM | Session 3: Talk 3 – Performance Portable Implementations for Kernels from Geometric Multigrid Methods (Slides) | JaeHyuk Kwack |
12:10 PM | 12:30 PM | Session 1: Talk 4 – A Study of Cross-Platform, Cross-Compiler Performance and Portability (Slides), (Video) | Veronica G. Vergara Larrea |
12:10 PM | 12:30 PM | Session 2: Talk 2 – Experience with using OpenACC and OpenMP to achieve performance portability for the Grid lattice QCD library (Slides), (Video) | Meifeng Lin |
12:10 PM | 12:30 PM | Session 3: Talk 4 – Porting QUDA from CUDA to other backends (Slides), (Video) | Xiaoyong Jin |
12:30 PM | 01:25 PM | Lunch Break | |
01:25 PM | 01:45 PM | Day 1: PM Plenary: The Fruits of the Interplay between DOE Software Stacks and C++ Standardization (Video) | Daisy Hollman |
01:25 PM | 02:40 PM | Day 1: PM Plenary: Programming Methodologies for Performance Portability, Current State of Practice and Futures | Douglas Doerfler; Brandon Cook |
01:45 PM | 02:05 PM | Day 1: PM Plenary: LLNL Portability Abstractions Update (Slides), (Video) | Rich Hornung |
02:05 PM | 02:25 PM | Day 1: PM Plenary: Kokkos: Present and Future (Slides), (Video) | Christian Trott |
02:25 PM | 02:40 PM | Day 1: PM Plenary: Panel Q&A | |
02:40 PM | 03:00 PM | Session 4: Talk 1 – Update on the performance portability of a Wilson Dslash Kernel using Kokkos and SYCL (Slides), (Video) | Balint Joo |
02:40 PM | 03:00 PM | Session 5: Talk 1 – Preparing performance portable QMCPACK for Exascale (Slides), (Video) | Ye Luo |
02:40 PM | 03:00 PM | Session 6: Talk 1 – Evaluating the performance of a portable version of HPGMG benchmark for accelerators (Slides), (Video) | Christopher Daley |
03:00 PM | 03:20 PM | Session 4: Talk 2 – Early SYCL results from the Bristol Performance Portability Study (Slides), (Video) | Tom Deakin |
03:00 PM | 03:20 PM | Session 5: Talk 2 – Performance portability experience with CUDA and OpenMP offload in GAMESS (Slides), (Video) | Colleen Bertoni |
03:00 PM | 03:20 PM | Session 6: Talk 2 – SU3_bench, a Micro-benchmark for Exploring Exascale Era Programming Models, Compilers and Runtimes (Slides), (Video) | Douglas Doerfler |
03:20 PM | 03:40 PM | Session 4: Talk 3 – Experiences tuning SYCL libraries for varied hardware (Slides), (Video) | John Lawson |
03:20 PM | 03:40 PM | Session 5: Talk 3 – OpenMP Offloading For Density Matrix Renormalization Group Hamiltonian Application Kernel (Slides), (Video) | Wael Elwasif |
03:20 PM | 03:40 PM | Session 6: Talk 3 – Performance Portability Issues for a Large-Scale Computational Fluid Dynamics Application on Emerging High-Performance Architectures (Slides), (Video) | Eric Nielsen |
03:40 PM | 04:00 PM | Session 4: Talk 4 – Investigation of the performance of SYCL kernels across various architectures (Slides), (Video) | Brian Homerding |
03:40 PM | 04:00 PM | Session 5: Talk 4 – GPU I-TASSER: Replica Exchange Monte Carlo (Slides), (Video) | Elijah MacCarthy |
03:40 PM | 04:00 PM | Session 6: Talk 4 – Performance and portability of abstract algebra operations in C++, Python, and Julia (Slides), (Video) | Jess Woods |
04:00 PM | 04:20 PM | Afternoon Break | |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 1: Comparative Performance and Porting Effort of HIP and CUDA for an Implicit Monte Carlo Code (Slides), (Video) | Alex Long |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 2: Experiences with CUDA Streaming in Teton’s Linear Sweep (Slides), (Video) | Robert Chen |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 3: Kokkos-HClib: enabling high-performance and resiliency for HPC systems (Slides), (Video) | Akihiro Hayashi; Nicolas Morales |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 4: Parallel Training of Large Knowledge Graph Convolution Networks (Slides), (Video) | Hong-Jun Yoon |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 5: Productive Auto-Tuning of GPU Codes via Iterative Machine Learning | Wu Feng |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 6: Halo Exchange Performance on the Sierra Supercomputer (Slides), (Video) | Jason Burmark |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 7: Accelerating Your Application I/O on HPC Systems (Slides), (Video) | Kathryn Mohror; Cameron Stanavige |
04:20 PM | 05:00 PM | Poster Session 1 – Poster 8: Preparing the SUNDIALS Library for Heterogeneous Architectures (Slides), (Video) | Cody Balos |
10:00 AM | 10:35 AM | Day 2: AM Plenary: Intel Data Parallel C++ (Slides), (Video) | Jeff Hammond |
10:35 AM | 11:10 AM | Day 2: AM Plenary: Lessons Learned in the Sierra Center of Excellence Migrating to Heterogeneous Computing (Slides), (Video) | David Richards |
11:10 AM | 11:30 AM | Session 7: Talk 1 – XGC with Kokkos/Cabana: Plasma Physics on Summit and Beyond (Slides), (Video) | Aaron Scheinberg |
11:10 AM | 11:30 AM | Session 8: Talk 1 – Towards Performance Portability through an Integrated Programming Eco-System for Tensor Algebra (Slides), (Video) | Roberto Gioiosa |
11:10 AM | 11:30 AM | Session 9: Talk 1 – Application Development and Readiness for Sierra: An MPI Challenge (Slides), (Video) | James Elliott |
11:30 AM | 11:50 AM | Session 7: Talk 2 – Experiences incrementally porting a large legacy finite element application to Sierra using Kokkos (Slides), (Video) | Victor Brunini |
11:30 AM | 11:50 AM | Session 8: Talk 2 – Enabling High-level Parallel Abstractions to Dynamical Cluster Approximation (DCA++) using HPX and GPUDirect on the Summit Supercomputer (Slides), (Video) | Weile Wei |
11:30 AM | 11:50 AM | Session 9: Talk 2 – What workloads are coming: new workflow mashups, HPC*AI, HPC at the Edge (Slides), (Video) | CJ Newburn |
11:50 AM | 12:10 PM | Session 7: Talk 3 – Fortran Language Compatibility Library for Kokkos (Slides), (Video) | Geoff Womeldorff |
11:50 AM | 12:10 PM | Session 8: Talk 3 – From PeleC to PeleACC, to PeleC++: What we learned porting our AMReX application to two modern GPU programming models (Slides), (Video) | Jon Rood |
11:50 AM | 12:10 PM | Session 9: Talk 3 – Developing Applications for Aurora (Slides), (Video) | Scott Parker |
12:10 PM | 12:30 PM | Session 7: Talk 4 – Resilient Kokkos: Productive and Performance Portable User-Level Checkpointing for High Performance Computing (Slides), (Video) | Nicolas Morales |
12:10 PM | 12:30 PM | Session 8: Talk 4 – Porting EOSPAC 6 to Sierra (Slides), (Video) | Anna Pietarila Graham |
12:10 PM | 12:30 PM | Session 9: Talk 4 – Modern problems require modern solutions: How modern CMake supports modern C++ in performance portability (Slides), (Video) | Jeremiah Wilke |
12:30 PM | 01:25 PM | Lunch Break | |
01:25 PM | 01:35 PM | Day 2: PM Plenary: El Capitan (Slides), (Video) | Ian Karlin |
01:35 PM | 01:45 PM | Day 2: PM Plenary: Crossroads (Slides), (Video) | Hai Ah Nam |
01:45 PM | 02:20 PM | Day 2: PM Plenary: Best practices from the Summit application readiness efforts, with porting of GronOR as example (Slides), (Video) | Tjerk Straatsma |
02:20 PM | 02:40 PM | Session 10: Talk 1 – Achieving portability for a highly optimized GPU code for 3D Fourier Transforms at extreme problem sizes (Slides), (Video) | Kiran Ravikumar |
02:20 PM | 02:40 PM | Session 11: Talk 1 – Achieving Performance Portability on Hybrid GPU-CPU Architectures for a Large Scale Material Science Code: the BerkeleyGW Case Study (Slides), (Video) | Mauro Del Ben |
02:20 PM | 02:40 PM | Session 12: Talk 1 – Timemory: Modular Performance Analysis for HPC (Slides), (Video) | Jonathan Madsen |
02:40 PM | 03:00 PM | Session 10: Talk 2 – Asynchronous Programming in Modern C++ (Slides), (Video) | Hartmut Kaiser |
02:40 PM | 03:00 PM | Session 11: Talk 2 – Performance Portability of Remote I/O in Distributed Workflows (Slides), (Video) | Nathan Tallent |
02:40 PM | 03:00 PM | Session 12: Talk 2 – The Development and Uses of Metrics for Performance, Portability, and Productivity (Slides), (Video) | John Pennycook |
03:00 PM | 03:20 PM | Session 10: Talk 3 – Porting Numerical Linear Algebra Libraries across Exascale Hardware Platforms and Beyond (Slides), (Video) | Piotr Luszczek |
03:00 PM | 03:20 PM | Session 11: Talk 3 – Machine Learning Guided Optimal Use of GPU Unified Memory (Slides), (Video) | Murali Emani |
03:00 PM | 03:20 PM | Session 12: Talk 3 – Techniques for Holistic Quantification of Performance, Portability, and Productivity (Slides), (Video) | Jason Sewall |
03:20 PM | 03:40 PM | Concluding Remarks | Scott Parker |
03:40 PM | 04:00 PM | Afternoon Break | |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 1: Porting the NOAA global weather forecast system to the cloud (Slides) | Daniel Abdi |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 2: SHAD: Productive Programming for High Performance Systems in Standard C++ (Slides), (Video) | Vito Giovanni Castellana |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 3: A MetaCL Tool for Productive FPGA Programming via Automated Code Generation (Slides), (Video) | Paul Sathre |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 4: Development of GPU accelerated Smith-Waterman kernel for metagenomics workflow (Slides), (Video) | Muaaz Awan |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 5: Considerations for performance portability in a commercial particle-in-cell code (Slides), (Video) | Benjamin Cowan |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 6: CAMP: Compiler Agnostic MetaProgramming, or Portable Performance at Compile Time (Slides), (Video) | Tom Scogland |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 7: Performance Analysis of NALU on AArch64 (Slides), (Video) | Srinath Vadlamani |
04:00 PM | 04:40 PM | Poster Session 2 – Poster 8: A Cosine Similarity Methodology to Characterize Proxy-Parent Application Correspondence (Slides) | Omar Aaziz; Jeffery Kuehn |