Upcoming Webinars
Webinars are free and open to the public, but registration is required.
-
Taking HACC into the Exascale Era: New Code Capabilities, and Challenges [Register]
- Date and Time: Wednesday, October 11, 2023, 01:00 pm EDT
- Presenter: Esteban Rangel (Argonne National Laboratory)
- Description: HACC (Hardware/Hybrid Accelerated Cosmology Code) is a well-established code with a long history within the US Department of Energy community, having run on every flagship computing system for over a decade. Because HACC often participates in early-access programs for upcoming systems, its developers face an ongoing challenge: contending not only with state-of-the-art architectures but also with their initially supported, and often novel, programming models. The increased computing power brought about by today’s exascale systems has allowed HACC to support additional baryonic physics through a newly developed Smoothed Particle Hydrodynamics (SPH) formalism called Conservative Reproducing Kernel (CRK). This webinar will discuss the challenges faced in preparing HACC for multiple exascale systems while simultaneously adding new code capabilities under ongoing development, all with a central focus on performance.
About the Webinar Series
The HPC Best Practices webinars address issues faced by developers of computational science and engineering (CSE) software on high-performance computing (HPC) systems. The sessions are independent, so join any or all.
Who should attend: Participation is free and open to the public, but registration is required for each event. This series is designed for HPC software developers who are seeking help in increasing their team’s productivity, as well as facility staff who interact extensively with users.
Schedule and format: The webinars occur approximately monthly and last about one hour each. They usually take place on a Wednesday at 1:00-2:00 pm ET (but this can change due to speaker availability). Audience questions and discussion are encouraged; because of the number of participants, we handle them in written form using the webinar tool’s chat capability and a shared Google Doc. Recordings of the webinars along with the presentation slides will be posted.
Notifications: If you’d like to receive announcements of upcoming webinars and other IDEAS-organized events, and follow-ups when recordings become available, please subscribe to our mailing list.
Organizers: These webinars have been organized by the IDEAS project in collaboration with the DOE/ASCR computing facilities (ALCF, NERSC, and OLCF), and the Exascale Computing Project (ECP).
Suggestions Welcome! Want to request another topic? Want to give a webinar? Email us at IDEASProductivity@gmail.com.
Past Webinars
Listed in reverse chronological order.
2023
-
Simplifying Scientific Python Package Installation and Usage (2023-09-13)
- Presenter: Amiya Maji (Purdue University)
- Artifacts: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: With the growing popularity of Python, installation and management of Python packages in HPC environments is emerging as a critical problem for researchers; the problem is exacerbated by the need to provide consistency across traditional batch workloads and interactive notebooks. This webinar will discuss how to simplify scientific Python package installation by streamlining environment management, dependency tracking, and runtime customizations through easy-to-use tools. The webinar will discuss challenges for installing Python packages in HPC environments and present the best practices suggested by various HPC centers. Many of these best practices have been incorporated into a tool, conda-env-mod, developed by the speaker and his collaborators. HPC centers can further customize the tool and its module templates to incorporate additional software dependencies and provide descriptive help messages. The deployment of the tool has significantly reduced errors and enabled sharing of Python package installations among users. The webinar will give an overview of installing Python packages with conda-env-mod.
-
Infrastructure for High-Fidelity Testing in HPC Facilities (2023-08-09)
- Presenter: Ryan Prout (Oak Ridge National Laboratory)
- Artifacts: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The Exascale Computing Project (ECP) is investing heavily in software for exascale systems, as can be seen in the many tools, libraries and software components within ECP. In order to boost software integration across computing facilities, ECP has developed infrastructure and tools for high-fidelity testing. This infrastructure is made accessible to ECP software technology developers to provide a trusted and efficient testing environment that employs continuous integration (CI). At the core of the ECP-enabled testing infrastructure is the Jacamar CI tool. This tool allows us to link multi-tenant HPC systems to GitLab CI workflows. This webinar will provide an overview of the ECP testing infrastructure, discuss what this could look like post-ECP, and how it could benefit other HPC facilities.
-
Writing Clean Scientific Software (2023-07-12)
- Presenter: Nick Murphy (Center for Astrophysics, Harvard & Smithsonian)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Most scientists are largely self-taught as programmers. Even many of us who spend most of our time coding have never had formal training in writing software. This webinar is intended for students and scientists who have some experience writing code but who have had to learn mostly on their own. The webinar will describe tips and strategies on how to write readable, reusable, and maintainable code. These tips include writing short functions that do exactly one thing with no side effects, and measuring the length of a variable name by the time needed to understand its meaning rather than by number of characters. The webinar will describe strategies for restructuring a complicated function into smaller and more manageable chunks, and provide tips on how to make the best use of comments and error messages. Overall, the webinar will embolden the Computational Science and Engineering (CS&E) community to think of code as communication.
-
The OpenSSF Best Practices Badge Program (2023-06-14)
- Presenter: Roscoe A. Bartlett (Sandia National Laboratories)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The Linux Foundation’s OpenSSF Best Practices Badge Program represents an impressive collection of the open source community’s knowledge base for creating, maintaining, and sustaining robust, high quality, and (most importantly) secure open source software. At its foundation is a featureful “Badge App” website, which provides a database of projects that document what best practices they have adopted and supporting evidence. This set of best practices (along with the detailed documentation and supporting justifications for each item) also serves as an incremental learning tool and as a foundation for incremental software process and quality improvement efforts. The webinar will provide an overview of this effort and describe some of its surprising benefits. The webinar will also describe how the OpenSSF Best Practices Badge Program can be used to help continue the recent advances in software quality and sustainability efforts in the computational science and engineering community going forward.
-
Lessons Learned Developing Performance Portable QMCPACK (2023-05-10)
- Presenter: Paul Kent (Oak Ridge National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: During DOE’s Exascale Computing Project, the open source QMCPACK code has been redesigned and reimplemented to run portably and performantly on multiple vendors’ GPUs as well as CPUs. The QMCPACK code implements Quantum Monte Carlo algorithms to predict the properties of materials with benchmark accuracy. The new implementation has now fully replaced the prior non-portable GPU solution. This webinar will outline some of the design considerations and new algorithms implemented both to run efficiently and to reduce burdens on the developers and maintainers. A key factor has been the adoption of modern development practices, including an extensive test suite. This has accelerated development, improved code quality, and also enabled isolation of problems in the wider HPC software stack, including in compilers and numerical libraries. The webinar will summarize these strategies and other recommendations for HPC application developers and facilities.
-
Facilitating Electronic Structure Calculations on GPU-based Exascale Platforms (2023-04-12)
- Presenter: Jean-Luc Fattebert (Oak Ridge National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: GPU accelerators offer the prospect of speeding up ab initio molecular dynamics and other large-scale first-principles atomistic simulations. Taking advantage of these devices is, however, not a trivial task given their specificities. Some algorithms struggle, while others thrive with the high level of thread concurrency available on modern GPUs. The PROGRESS and BML libraries, developed within ECP’s Co-design Center for Particle Applications (CoPA) project, allow electronic structure codes to offload their most expensive kernels, with a unified interface for various matrix formats and computer architectures. The webinar will focus on implementations and algorithmic choices made in those libraries, and lessons learned while trying to achieve performance portability on exascale platforms. Specifically, the webinar will discuss eigensolvers and their alternatives, as well as strong scaling in fast time-to-solution in molecular dynamics.
-
Our Road to Exascale: Particle Accelerator & Laser-Plasma Modeling (2023-03-15)
- Presenter: Axel Huebl (Lawrence Berkeley National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Particle accelerators, among the largest, most complex devices, demand increasingly sophisticated computational tools for the design and optimization of the next generation of accelerators that will meet the challenges of increasing energy, intensity, accuracy, compactness, complexity and efficiency. It is key that contemporary software take advantage of the latest advances in computer hardware and scientific software engineering practices, delivering speed, reproducibility and feature composability for the aforementioned challenges. The webinar will discuss the experience of the developers of WarpX in the US DOE Exascale Computing Project (ECP), which led to the 2022 ACM Gordon Bell Prize. WarpX uses GPUs and CPUs at massive scale, including on Frontier, the first exascale supercomputer; research efforts have advanced particle-in-cell algorithms such as dynamic load balancing, block-structured mesh-refinement, and modern relativistic Maxwell solvers. The webinar will present strategies and results in performance portability. In particular, the webinar will discuss the team-of-teams approach for software co-design in AMReX, software architecture, quality assurance, developer & user productivity, and ecosystem interplay that has lifted up accelerator modeling activities to be fast, open, modular and sustainable over the long term.
-
Openscapes: supporting better science for future us (2023-01-11)
- Presenter: Julia Stewart Lowndes (Openscapes)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Openscapes champions open practices in environmental science to help uncover data-driven solutions faster. In this webinar the speaker will share how she transitioned from doing her own marine ecology research to founding Openscapes to support other researchers and grow the global Open Science movement. The speaker will share lessons learned from her work mentoring government, non-profit, and academic environmental and Earth teams, with specific stories from projects with NASA and NOAA Fisheries. The webinar will reuse parts of a recent keynote at RStudio::conf that was the global launch of Quarto, a new, open-source, scientific and technical publishing system. The webinar will include a demo on some features of Quarto for R and Python users and highlight how more reusing and less reinventing is critical for science. The speaker will also discuss how open source/science is a daily practice, and an important avenue to increase inclusion in science and contribute to the climate movement.
2022
-
Lab Notebooks for Computational Mathematics, Sciences & Engineering (2022-12-14)
- Presenter: Jared O’Neal (Argonne National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: As computational mathematics, science, and engineering problems become larger, more ambitious, and more complex, it is increasingly important to develop and use tools and techniques that ensure that computational research is based on a strong foundation of general, low-level scientific best practices. In this webinar, the speaker will relate his experience of transitioning from working in the worlds of experimental and observational sciences to the world of computational sciences as well as his experience adapting experimental tools and techniques to computational research. In particular, the speaker will focus on the role of lab notebooks in experimental sciences and present concrete examples to address the challenges associated with adapting lab notebooks to computational research.
-
Managing Academic Software Development (2022-11-09)
- Presenter: Sam Mangham (University of Southampton)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Developing academic software can be an unusual exercise, especially compared to traditional software development. The goals and inputs can be undefined and fluctuating, whilst the code itself has traditionally been a stepping stone – a byproduct on the way to papers, ending up ad-hoc, unplanned and undocumented. Fortunately, things are changing. There are tools and techniques that make it easier to design, use, distribute and cite scientific software. This webinar discusses approaches to managing the development and release of academic software, ranging from coding best practices and project boards, to development environments and automated documentation that can help you write sustainable code that is easy to use, cite and collaborate with and on.
-
Investing in Code Reviews for Better Research Software (2022-10-12)
- Presenters: Thibault Lestang (Imperial College London), Dominik Krzemiński (University of Cambridge), and Valerio Maggio (Software Sustainability Institute)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Code review is a development practice that improves readability and maintainability of software projects, in addition to making collaboration easier and teamwork more effective. Typically, code review is a conversation between reviewer(s) and the author(s) of the code under review. The code is dissected and analyzed in order to find areas of improvement according to the focus of the review. Examples include, but are not limited to, readability, security or performance improvements. Despite code review being an effective tool for improving software quality, it is still not a standard practice within the scientific software development process. The webinar will detail the benefits that code review can bring to scientific software developers, particularly improvements in software quality, improved teamwork and knowledge transfer. The presenters will highlight common difficulties faced by researchers to set up, perform and maintain frequent code reviews, and they will discuss several approaches and good practices to mitigate these difficulties. The presenters will also describe common tools that make code reviews easier and give examples of how to use them effectively, while explaining a typical code development cycle with continuous integration and automatic code checks.
-
Software Packaging (2022-09-07)
- Presenter: David Rogers (Oak Ridge National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The ability to “import” a package is the critical enabling technology for software re-use. As a package developer, there are a variety of standards and tools we can adopt to make importing our work easier for our users. This webinar surveys packaging technologies and ideas popular in scientific software (C++, Python, and Fortran with Autoconf, CMake, Python builds, Spack, and containers). Good re-usability is a product of thoughtful program structure, build process, version control, and testing. By examining some real-world examples, we show how these steps build on each other in “live” projects to make easy connections between software deployment and package use.
-
Effective Strategies for Writing Proposal Work Plans for Research Software (2022-08-10)
- Presenter: Chase Million (Million Concepts)
- Archives: Recording (YouTube) | Slides (PDF)
- Description: Effective research proposals must persuade review panels that the project objectives can be achieved and that the requested resources are reasonable and sufficient for doing so. A clear, plausible work plan is central to this persuasive process. Despite the fact that many research projects require a great deal of software development, the true costs of software development tasks are often underappreciated and underestimated by both proposers and reviewers. Accurately judging and communicating these costs leads to better proposal and project outcomes. We will quickly survey software project scoping, requirements elicitation, and estimation methods appropriate for the pre-proposal phase, then explain how these can be used to generate a strong and convincing work plan. Topics will include vision and scope, concept of operations, and requirements specification documents; work breakdown structures; requirements / task matrices; and Gantt charts. Strategies for maximizing the impact of these artifacts within a research proposal will be discussed, with suggestions for further reading.
-
Growing preCICE from an as-is Coupling Library to a Sustainable, Batteries-included Ecosystem (2022-07-06)
- Presenter: Gerasimos Chourdakis (Technical University of Munich)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Starting humbly as a coupling library for fluid-structure interaction problems used by just a few academic groups in Germany, preCICE has grown to a complete coupling ecosystem used by more than 100 research groups worldwide, and for a wide range of multi-physics applications. How did that happen? Apart from the library itself, preCICE now maintains ready-to-use adapters for several open-source solvers, tutorial cases, documentation, and more. Users can thus easily couple popular open-source solvers (such as OpenFOAM, SU2, deal.II, or FEniCS) with their in-house simulation software (written in C++, C, Fortran, Python, Matlab, or Julia). In parallel to this, the developers of preCICE had to learn how to write more effective documentation (avoiding fragmentation and getting the user in the loop), how to manage the rapidly growing community (switching from a mailing list to a chatroom and then to a dedicated Discourse forum), and how to organize workshops and training courses. This webinar will focus on lessons learned that can help any research software project grow in a sustainable way.
-
Normalizing Inclusion by Embracing Difference (2022-06-15)
- Presenter: Mary Ann Leung (Sustainable Horizons Institute)
- Archives: Recording (YouTube) | Slides (PDF) | Chat Transcript (TXT)
- Description: Computational science and engineering (CSE) is an inter- and multidisciplinary field. Given the technical breadth of CSE, one might expect CSE communities to include a broad range of demographics, creating an ideal ecosystem for diversity, equity, and inclusion (DEI). However, while research indicates that social diversity results in greater innovation, the CSE workforce remains largely homogeneous. This interactive webinar will explore what it takes to achieve DEI, how DEI could increase innovation and developer productivity, as well as how cultivating respect and embracing difference could help to make inclusion the norm. The session will also include important activities for applying the concepts discussed, deepening understanding, and increasing potential impact.
This webinar is co-organized with the ECP’s newly established HPC Workforce Development and Retention Action Group, which organizes a webinar series on topics related to developing a diverse, equitable, and inclusive work culture in the computing sciences.
-
Acquisition and Analysis of Time Series of Satellite Data in the Cloud – Lessons from the Field (2022-05-11)
- Presenter: Marisol Garcia-Reyes (Farallon Institute)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Satellite data has grown and matured to levels that allow powerful and relevant analysis in climate science, which requires time series spanning decades. Acquiring such data has been a technical and coding challenge given the historical formats in which data is stored, and analyzing the data has required high levels of coding expertise. With technological advances, like the coding language Python and new storage and processing capabilities available in the cloud, there is great potential to increase the use of satellite data in new and diverse research areas. This requires, however, expanding the user base by building capacity in groups with limited coding or technological expertise. A challenge is the steep learning curve for these new technological advances, which can be intimidating and discouraging. To provide a taste of the new technologies and the opportunities they provide, the presenter has developed a tutorial to teach potential new users how to acquire, synthesize and analyze satellite and satellite-based time series of data, while learning and using Python and cloud advances in the process. In this webinar, the speaker will share the lessons learned in making and teaching the tutorial, which can be found at https://github.com/marisolgr/python_sat_tutorials.
-
Evaluating Performance Portability of HPC Applications and Benchmarks Across Diverse HPC Architectures (2022-04-13)
- Presenter: JaeHyuk Kwack (Argonne Leadership Computing Facility)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: As HPC communities move into the exascale era, GPU-accelerated systems become one of the primary HPC architectures, and major processor vendors proactively lead technical innovation in the GPU ecosystem. The U.S. DOE has successfully supported this transformation to the next generation of HPC infrastructure through the Exascale Computing Project (ECP). NVIDIA has played a leading role in deploying multiple pre-exascale GPU systems (Summit at OLCF, Sierra at LLNL, Perlmutter at NERSC, and Polaris at ALCF). AMD and Intel are playing critical roles in developing exascale GPU systems, such as Frontier at OLCF, Aurora at ALCF, and El Capitan at LLNL. Simultaneously with the dynamic shifts in hardware, application developer communities have endeavored to maintain or increase their scientific throughput by adopting performance portable programming models or frameworks, and it turns out that a smooth transition is one of the necessary conditions for maintaining productivity. In this webinar, the speaker will evaluate the progress being made on achieving performance portability by a subset of ECP applications or their related mini-apps, and approaches to achieving performance portability across diverse HPC architectures including AMD, Intel, and NVIDIA GPUs.
-
Software Design Patterns in Research Software with Examples from OpenFOAM (2022-03-09)
- Presenter: Tomislav Maric (Technische Universität Darmstadt)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Combining sub-algorithms to develop robust, scalable, and convergent numerical methods carries with it a high level of uncertainty. Extensive automatic testing reduces this uncertainty for methods whose properties cannot be proven mathematically in all application scenarios (basically, most numerical methods). Methods with a more solid theoretical basis still require extensive testing since the jump between theory and practice is often challenging. The ability to select numerical sub-algorithms and combine them easily at runtime speeds up research immensely. Software design patterns already very successfully address the requirements of runtime selection and algorithm combinations and are staples of modern software engineering. This webinar covers a handful of beneficial software design patterns that provide a solid basis for developing numerical methods in a modular way, drawing concrete examples from OpenFOAM, a highly modular open-source software for Computational Fluid Dynamics.
-
Wrong Way: Lessons Learned and Possibilities for Using the “Wrong” Programming Approach on Leadership Computing Facility Systems (2022-02-16)
- Presenter: Philip Roth (Oak Ridge National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF)
- Description: Large scale computing systems such as those deployed and being deployed at U.S. Department of Energy computing facilities rely greatly on compute accelerators (currently graphics processing units, GPUs) for their performance potential. Each of these systems has a small number of natural approaches for representing the code that runs on these accelerators. For instance, for the Oak Ridge Leadership Computing Facility’s Frontier system, the natural approaches include the Heterogeneous-Compute Interface for Portability (HIP) and OpenMP with target offload. But it is often interesting, and sometimes even useful, to consider the impact of using a “wrong” programming approach for a given system. In this webinar, the speaker will present a few of these “wrong” programming approaches for current and near-term future systems, including a discussion of the specific software packages that enable the approach, and lessons learned in cases where the approach has been attempted.
2021
-
Scientific software ecosystems and communities: Why we need them and how each of us can help them thrive (2021-12-08)
- Presenter: Lois Curfman McInnes (Argonne National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF)
- Description: HPC software is a cornerstone of long-term collaboration and scientific progress, but software complexity is increasing due to disruptive changes in computer architectures and the challenges of next-generation science. Thus, the HPC community has the unique opportunity to fundamentally change how scientific software is designed, developed, and sustained—embracing community collaboration toward scientific software ecosystems, while fostering a diverse HPC workforce who embody a broad range of skills and perspectives. This webinar will introduce work in the U.S. Exascale Computing Project, where a varied suite of scientific applications builds on programming models and runtimes, math libraries, data and visualization packages, and development tools that comprise the Extreme-scale Scientific Software Stack (E4S). The webinar will introduce crosscutting strategies that are increasing developer productivity and software sustainability, thereby mitigating technical risks by building a firmer foundation for reproducible, sustainable science. The webinar will also mention complementary community efforts and opportunities for involvement.
-
55+ years in High-Performance Computing: One Woman’s Experiences and Perspectives (2021-11-10)
- Presenter: Jean Shuler (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF)
- Description: This HPC webinar will differ from others in the series. We will have a Q&A session with Jean Shuler, a woman who has worked at the leading edge of High-Performance Computing for more than 55 years. Jean graduated with a degree in Mathematics from William and Mary in 1963 and taught herself programming on the job at NASA Langley. By 1972, she came to LLNL where she has worked ever since. She initially worked on early data storage and graphics systems. Challenges in learning to use computing center resources gave Jean a passion for helping others find their way in HPC. She eventually led User Services for the National Energy Research Scientific Computing (NERSC) Center. This role took Jean all over the world contributing to Cray User Group meetings. When NERSC moved from LLNL in 1996, Jean created and led the User Services Group for Livermore Computing. Throughout her career, Jean has supported various HPC systems from CDC, Cray, Meiko, and IBM on the march to Exascale. If you have an interest in computing history or in the experiences and impact of women in computing, or if you are early in your career and looking for some inspiration, you will want to attend this webinar and hear about Jean’s amazing career and stories.
-
Migrating to Heterogeneous Computing: Lessons Learned in the Sierra and El Capitan Centers of Excellence (2021-10-13)
- Presenter: David Richards (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The introduction of heterogeneous computing via GPUs from the Sierra architecture represented a significant shift in direction for computational science at Lawrence Livermore National Laboratory (LLNL), and therefore required significant preparation. The Sierra Center of Excellence (COE) brought employees with specific expertise from IBM and NVIDIA together with LLNL in a concentrated effort to prepare applications, system software, and tools for the Sierra supercomputer. To prepare for El Capitan, a new COE is currently operating in collaboration with HPE and AMD. This webinar will describe the operation of these COEs and document lessons learned, with the hope that others will be able to learn from both our success and intermediate setbacks. We describe what we have found to be best practices for managing the vendor collaborations, migrating algorithms and source code, working with the system software stack and tools, and optimizing application performance.
-
What I Learned from 20 Years of Leading Open Source Projects (2021-09-15)
- Presenter: Wolfgang Bangerth (Colorado State University)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Scientific software has grown from loose collections of individual routines working on relatively simple data structures to very large packages of 100,000s to millions of lines of code, with dozens of contributors, and hundreds or thousands of users. In the process, the approaches to software development have also drastically changed: both the software packages and their development are professionally managed, with version control, extensive test suites, and automatic regression checks for every patch. Maybe more interestingly, the approaches to managing the community of software developers and users have also dramatically changed. Having led two large, open source software projects (the finite element package deal.II, and the Advanced Solver for Problems in Earth's ConvecTion, ASPECT) for more than 20 years, the presenter will share lessons learned about both the technical management of scientific software projects and the social side of these projects.
-
Software Engineering Challenges and Best Practices for Multi-Institutional Scientific Software Development (2021-08-04)
- Presenter: Keith Beattie (Lawrence Berkeley National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Scientific software is increasingly becoming the backbone of obtaining and validating scientific results. This is no longer just the case for traditionally computationally intensive areas but is now true across a wide variety of scientific disciplines. This circumstance elevates how scientific software is developed, independent of the field, to a new level of importance. Further, the multi-institutional nature of many science projects presents unique challenges to how scientific software can be effectively developed and maintained over the long term. In this webinar we present the challenges faced in leading the development of scientific software across a distributed, multi-institutional team of contributors, and we describe a set of best practices we have found to be effective in producing impactful and trustworthy scientific software.
-
Mining Development Data to Understand and Improve Software Engineering Processes in HPC Projects (2021-07-07)
- Presenter: Boyana Norris (University of Oregon)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The webinar will explore the role of software-related data mining tools in supporting productive development of high-performance scientific software. The webinar will discuss a variety of existing and emerging tools for analyzing code, git, emails, issues, test results, and dependencies, with the long-term goal of improving the understanding of development processes and enhancing developer productivity. The webinar will include specific analysis examples by applying a subset of those tools to ECP projects.
-
Using the PSIP Toolkit to Achieve Your Goals – A Case Study at The HDF Group (2021-06-09)
- Presenters: Elena Pourmal (The HDF Group), Reed Milewicz (Sandia National Laboratories), and Elsa Gonsiorowski (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Productivity and Sustainability Improvement Planning (PSIP) is a lightweight, iterative workflow that allows software development teams to identify development bottlenecks and track progress toward goals to overcome them. In this talk, we present an overview of the PSIP methodology and toolkit, and describe how The HDF Group used PSIP to make improvements in three key areas of their software development process.
-
Automated Fortran–C++ Bindings for Large-Scale Scientific Applications (2021-05-12)
- Presenter: Seth Johnson (Oak Ridge National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Although many active scientific codes use modern Fortran, most contemporary scientific software libraries are implemented in C and C++. Providing their numerical, algorithmic, or data management features to Fortran codes requires writing and maintaining substantial amounts of glue code. In the same vein, some projects are actively moving key kernels from Fortran toward C++ to support performance portability models and other rapidly-developing, dynamic programming paradigms. How can a project smoothly connect existing Fortran code to new internal C++ kernels or external C++ libraries? SWIG-Fortran provides a solution with a wide range of flexibility, including support for performant data transfers, MPI support, and direct translation of C++ features to Fortran interfaces.
-
A Workflow for Increasing the Quality of Scientific Software (2021-04-07)
- Presenter: Tomislav Maric (Technische Universität Darmstadt)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The webinar will present a workflow that increases the quality of research software in Computational Science and Engineering (CSE) by applying established software engineering practices extended with CSE-specific testing and visualization, and periodical cross-linking of software with reports/publications and datasets. The workflow is minimalistic. It introduces a small amount of work overhead, which is crucial for research groups without dedicated funding for ensuring the quality of research software and reproducibility of scientific results.
-
An Overview of the RAJA Portability Suite (2021-03-10)
- Presenter: Arturo Vargas (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The RAJA Portability Suite is a collection of open-source software libraries that enable developers to write single-source applications that are portable across a wide range of HPC architectures. The Suite contains tools for portable loop execution (RAJA) and memory management (Umpire and CHAI). The development of the Suite is motivated by the needs of production multiphysics codes, which must run efficiently on laptops, commodity clusters, and massively parallel advanced technology systems at any point in time as well as across multiple platform generations. The scale and complexity of these applications require that they be able to employ system-appropriate native programming models, such as OpenMP, CUDA, and HIP, without significant source code modification. The abstractions that the RAJA Portability Suite provides enable such portable single-source application development. The Suite is used in a diverse range of production codes at Lawrence Livermore National Laboratory (LLNL). It is also funded as a Software Technology Project in DOE’s Exascale Computing Project, where the Suite supports a number of key applications. The webinar will provide an overview of the Suite and its capabilities and discuss status and plans to support applications on exascale platforms. The webinar will present code examples that illustrate basic usage and compare to programming with native programming models, and performance results for several applications that rely on the Suite for platform portability. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-81860
-
Good Practices for Research Software Documentation (2021-02-10)
- Presenters: Stephan Druskat (Friedrich Schiller University Jena), and Sorrel Harriet (Leeds Trinity University)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: This webinar aims to introduce the importance of software documentation and the different approaches that may be taken at various stages, and on various levels, in the software development life cycle. Through the sharing of examples and stimulative questions, the speakers aim to encourage the audience to reflect on the relationship between documentation and process, and to make informed choices about when and how to document their software.
-
Extreme-scale Scientific Software Stack (E4S) (2021-01-13)
- Presenters: Sameer Shende (University of Oregon and ParaTools), and David Honegger Rogers (Los Alamos National Laboratory)
- Archives: Recording (YouTube) | Slides Part 1 (PDF) | Slides Part 2 (PDF) | Q&A (PDF)
- Description: With the increasing complexity and diversity of the software stack and system architecture of high performance computing (HPC) systems, the traditional HPC community is facing a huge productivity challenge in software building, integration and deployment. Recently, this challenge has been addressed by new software build management tools such as Spack that enable seamless software building and integration. Container based solutions provide a versatile way to package software and are increasingly being deployed on HPC systems. The DOE Exascale Computing Project (ECP) Software Technology focus area is developing an HPC software ecosystem that will enable the efficient and performant execution of exascale applications. Through the Extreme-scale Scientific Software Stack (E4S), it is developing a curated, Spack-based, comprehensive and coherent software stack that will enable application developers to productively write highly parallel applications that can portably target diverse exascale architectures. E4S provides both source builds through the Spack platform and a set of containers that feature a broad collection of HPC software packages. E4S exists to accelerate the development, deployment, and use of HPC software, lowering the barriers for HPC and AI/ML users. It provides container images, build manifests, and turn-key, from-source builds of popular HPC software packages developed as Software Development Kits (SDKs). This effort includes a broad range of areas including programming models and runtimes (MPICH, Kokkos, RAJA, OpenMPI), development tools (TAU, PAPI), math libraries (PETSc, Trilinos), data and visualization tools (Adios, HDF5, Paraview), and compilers (LLVM), all available through the Spack package manager. 
The webinar will describe the community engagements and interactions that led to the many artifacts produced by E4S, and will introduce the E4S containers that are being deployed at the HPC systems at DOE national laboratories. The presenters will discuss the recent efforts and techniques to improve software integration and deployment for HPC platforms, and describe recent collaborative work on reproducible workflows between E4S and the Pantheon project. Pantheon provides a set of working examples of end-to-end workflows using ECP apps, infrastructure and postprocessing, focused on common vis/analysis operations and workflows of interest to application scientists. The presenters will also show a video of the workflow.
2020
-
Software Design for Longevity with Performance Portability (2020-12-09)
- Presenter: Anshu Dubey (Argonne National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: In the era of simultaneously increasing heterogeneity in hardware and application software, the topics of performance portability and longevity may seem at cross purposes. Key to achieving either objective individually is software design. Achieving both simultaneously is a much harder challenge, yet in today’s scientific computing landscape neither objective can be ignored. Questions that science is posing to computation are more complex, which implies greater investment in building science capabilities in the software, and therefore longevity is important. Those questions need more capable hardware, which can be obtained only through ever more heterogeneous platforms. This webinar will present a few basic principles of scientific software design that have been instrumental in mitigating some of the challenges that applications developers are facing. These principles represent a combination of experience from the presenter’s own project and from the Exascale Computing Project Performance Portability Panel Series that took place during summer of 2020.
-
Reducing Technical Debt with Reproducible Containers (2020-11-04)
- Presenter: Tanu Malik (DePaul University)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Computational experiments can be challenging to reproduce; researchers have to choose between pursuing a fast-paced research agenda and developing well-organized, sufficiently documented, and easily reproducible software. Like incurring fiscal debt, there are often tactical reasons to take on technical debt in scientific software, such as deferring documentation, organization, refactoring, and unit tests when pursuing a new idea or meeting a conference deadline. However, more often than not, researchers do not repay this technical debt, leading to irreproducible experiments. The webinar will describe different levels of technical debt and quantify the cost of not repaying the technical debt. The presenter will introduce isolation in containers as a powerful mechanism for reducing portability debt and describe limitations of current container tools. The presenter will introduce a vision of a reproducible container that aims to automate repayment of different types of technical debt, and will describe the current state of this vision with three tools that use isolation, encapsulation, and monitoring to include necessary and sufficient content in the container (both software and data) and to describe the contents of the container. Finally, the presenter will show results of using reproducible containers on domain science and HPC use cases, and provide guidance.
-
Scalable Precision Tuning of Numerical Software (2020-10-14)
- Presenter: Cindy Rubio-Gonzalez (University of California, Davis)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The use of numerical software has grown rapidly over the past few years, providing the foundation for a large variety of applications including scientific software and machine learning. Given the variety of numerical errors that can occur, floating-point programs are difficult to write, test and debug. One common practice among developers is to use the highest available precision when allocating variables. While more robust, this can degrade program performance significantly. This webinar describes our research on developing tools to assist programmers in tuning the precision of their floating-point programs. These tools take a data-driven approach, searching over the types of floating-point variables to lower their precision subject to accuracy constraints and performance goals. In the last part of the webinar, I will discuss challenges and opportunities for scalable precision tuning of large HPC applications.
-
Testing and Code Review Practices in Research Software Development (2020-09-09)
- Presenter: Nasir Eisty (California Polytechnic State University, San Luis Obispo)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Software quality in a research context is essential because research software is used in mission-critical situations, decision making, and computation of evidence for research publications. This webinar will cover the use of two software quality practices in the development of research software: software testing and peer code review. These practices in software development can lead to both improved scientific results through higher quality software in the short term and more maintainable software in the long term. While these practices are essential for any type of software, developers of research software typically do not use peer code review and software testing as frequently as they could for maximum impact. The presenter will discuss the motivation, challenges, barriers, and necessary improvements to make the practices effective for research software development, based on studies of the research software community conducted via interviews, surveys, workshops, and tutorials.
-
Colormapping Strategies for Large Multivariate Data in Scientific Applications (2020-08-12)
- Presenter: Francesca Samsel (Texas Advanced Computing Center)
- Archives: Recording (YouTube) | Slides (PDF)
- Description: In order for scientific visualizations to effectively convey insights of computationally-driven research, as well as to better engage the public in science, visualizations must effectively and affectively facilitate the exploration of information. The presenter and her team employ a transdisciplinary approach that includes insights from artistic color theory, perceptual science, the visualization community, and domain scientists, to move beyond basic default colormaps. While color has always been utilized and studied as a component of scientific data visualization, it has been demonstrated that its full potential for discovery and communication of scientific data remains untapped. The webinar will discuss how effective color use can reveal structures, relationships, and hierarchies between variables within a visualization, as well as practical strategies and workflows for tailoring color application to the goals of the visualization. The presenter’s work is documented and freely available for use at SciVisColor.org, a hub for research and resources related to color in scientific visualization. SciVisColor provides tools and strategies that allow scientists to use color as a tool to better understand and communicate their data. Users can explore and download colormaps, color sets, and ColorMoves, an interactive interface for using color in scientific visualization. The webinar will introduce concepts that can help developers make design decisions when writing simulation codes, to make better use of scientific visualization tools and visualize results more effectively.
-
What’s New in Spack? (2020-07-15)
- Presenter: Todd Gamblin (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Spack is a package manager for scientific computing, with a rapidly growing open source community. With over 500 contributors from academia, industry, and government laboratories, Spack has a wide range of use cases, from small-scale development on laptops and clusters, to software release management for the U.S. Exascale Computing Project, to user software deployment on 6 of the top 10 supercomputer sites in the world. Spack isn’t just for facilities, though! As a package manager, Spack is in a powerful position to impact DevOps and daily software development workflows. Spack has virtual environments that enable the “manifest and lock” model popularized by more mainstream dependency management tools. New releases of Spack include direct support for creating containers and GitLab CI pipelines for building environments. This webinar will cover new features as well as the near- and long-term roadmap for Spack.
-
SYCL – Introduction and Best Practices (2020-06-17)
- Presenter: Thomas Applencourt (Argonne National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: SYCL is a single-source heterogeneous programming model based on standard C++. It uses C++ templates and lambda functions for host and device code. SYCL builds on the underlying concepts of portability and efficiency of OpenCL that enable code for heterogeneous processors; however, it is less verbose than OpenCL. The single-source programming enables the host and kernel code for an application to be contained in the same source file, in a type-safe way and with the simplicity of a cross-platform asynchronous task graph. We will provide an overview of the SYCL concepts, compilation, and runtime. No prior knowledge of OpenCL is required for this webinar. Once we have reviewed the core concepts of SYCL, we will walk through several code examples to highlight its key features and illustrate best practices. SYCL by design is hardware agnostic and offers the potential to be portable across many of DOE’s largest machines.
-
Accelerating Numerical Software Libraries with Multi-Precision Algorithms (2020-05-13)
- Presenters: Hartwig Anzt (Karlsruhe Institute of Technology), and Piotr Luszczek (University of Tennessee)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: With the rise of machine learning, more hardware manufacturers are introducing low-precision special function units in processor designs, often achieving up to an order of magnitude higher performance than in the IEEE double precision that is typically used as working precision in scientific computing. At the same time, a rapidly expanding landscape of mixed- and multi-precision methods generates high-quality solutions that leverage the higher compute power of reduced precision. This webinar will introduce the concept of floating point formats and the IEEE standard. We will demonstrate how using an iterative or direct solver in lower precision impacts the solution quality. We will outline several strategies that aim to preserve numerical stability and high solution quality while still computing, at least partially, in lower precision. We will present several multi-precision algorithms that have proven particularly successful and elaborate on their realization and usage. We also will introduce open source production-quality multi-precision software packages and show their integration and efficiency for scientific applications. The webinar will focus on lessons learned and generally applicable strategies.
-
Best Practices for Using Proxy Applications as Benchmarks (2020-04-15)
- Presenters: David Richards (Lawrence Livermore National Laboratory), and Joe Glenski (Hewlett Packard Enterprise)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Proxy applications have many uses in software development and hardware/software co-design. Because most proxies are easy to build, run, and understand, they are especially appealing for use in benchmark suites and studies. This webinar will examine the role of proxy apps as benchmarks and explain why run rules and a figure of merit are essential for a proxy application to function as an effective benchmark. We will show how to evaluate the fidelity of benchmarks as a model for actual workloads and provide tips on creating problem specifications and other run rules. We will discuss what DOE facilities are looking for when they assemble benchmark suites for use in procurements. Finally, we will explain how system vendors use our benchmark suites and what practices they view as most (and least) effective.
-
Testing: Strategies When Learning Programming Models and Using High-Performance Libraries (2020-03-18)
- Presenter: Balint Joo (Jefferson Lab)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Software testing is an invaluable practice, although the level of testing in scientific applications can vary widely, from no testing at all to full continuous integration (as discussed in earlier webinars of the HPC-BP series). In this webinar I will consider a specific case: the use of unit testing when developing a mini-app as an approach to learn about new programming models such as Kokkos and SYCL, or when using (or contributing to) high-performance libraries. I will illustrate with an example from Lattice QCD, focusing on the integration of the QUDA optimized library with the Chroma application. The webinar will focus on lessons learned and generally applicable strategies.
-
Introduction to Kokkos (2020-02-19)
- Presenter: Christian Trott (Sandia National Laboratories)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The Kokkos C++ Performance Portability Ecosystem is a production-level solution for writing modern C++ applications in a hardware-agnostic way. It is part of the US Department of Energy’s Exascale Computing Project, the leading effort in the US to prepare the HPC community for the next generation of supercomputing platforms. Kokkos is now used by more than a hundred HPC projects, and Kokkos-based codes are running regularly at scale on at least five of the top ten supercomputers in the world. In this webinar, we will give a short overview of what the Kokkos Ecosystem provides, including its programming model, math kernels library, tools, and training resources, before providing an overview of the Kokkos team’s efforts surrounding the ISO-C++ standard, and how Kokkos both influences future standards and aligns with developments occurring in them. The webinar will include a status update on the progress in supporting the upcoming exascale class HPC systems announced by DOE.
-
Refactoring EXAALT MD for Emerging Architectures (2020-01-15)
- Presenters: Aidan Thompson (Sandia National Laboratories), Stan Moore (Sandia National Laboratories), and Rahulkumar Gayatri (National Energy Research Scientific Computing Center)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: As part of the DOE Exascale Computing Project, members of the EXAALT project are working to increase the accuracy, time, and length scales of molecular dynamics simulations of materials for fusion energy. Simulations rely on the SNAP machine-learning interatomic potential to accurately capture material properties. The SNAP kernel recursively evaluates a set of complex polynomial functions, requiring many deeply nested loops with irregular loop bounds. Last year, a worrisome trend in the SNAP force kernel was identified. With each new generation of emerging architectures, performance relative to theoretical peak was decreasing, particularly on GPUs. This webinar will discuss the approach used to rewrite the SNAP kernel from the ground up, using more compact memory representation, refactoring the main loop, using sub-kernels to reduce pressure on GPU threads, and improving coalesced memory accesses on the GPU. This work has enabled a spectacular increase of roughly 10x in performance over the baseline implementation of the SNAP benchmark running on NVIDIA V100 GPUs. Extrapolated to the full machine, this predicts an increase of over 100x in the Figure of Merit over the baseline on the ALCF/Mira system, putting EXAALT on track to meet, and even exceed, performance targets on exascale systems. The webinar will emphasize key strategies and lessons learned in code transitions for emerging architectures.
2019
-
Building Community through xSDK Software Policies (2019-12-11)
- Presenters: Ulrike Meier Yang (Lawrence Livermore National Laboratory), and Piotr Luszczek (University of Tennessee)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: The development of increasingly complex computer architectures and software ecosystems continues. Applications that incorporate multiphysics modeling as well as the coupling of simulation and data analytics increasingly require the combined use of software packages developed by diverse, independent teams throughout the HPC community. The Extreme-scale Scientific Software Development Kit (xSDK) is being developed to provide coordinated infrastructure for independent mathematical libraries to support the productive and efficient development of high-quality applications. This webinar will discuss the development and impact of xSDK community policies, which constitute an integral part of the project and have been defined to achieve improved code quality and compatibility across xSDK member packages and a sustainable software ecosystem.
-
Tools and Techniques for Floating-Point Analysis (2019-10-16)
- Presenter: Ignacio Laguna (Lawrence Livermore National Laboratory)
- Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
- Description: Scientific software is central to the practice of research computing. While software is widely used in many science and engineering disciplines to simulate real-world phenomena, developing accurate and reliable scientific software is notoriously difficult. One of the most serious difficulties comes from dealing with floating-point arithmetic to perform numerical computations. Round-off errors occur and accumulate at all levels of computation, while compiler optimizations and low-precision arithmetic can significantly affect the final computational results. With accelerators such as GPUs dominating high-performance computing systems, computational scientists are faced with even bigger challenges, given that ensuring numerical reproducibility in these systems poses a very difficult problem. This webinar provides highlights from a half-day tutorial discussing tools that are available today to analyze floating-point scientific software. We focus on tools that allow programmers to get insight about how different aspects of floating-point arithmetic affect their code and how to fix potential bugs.
-
Discovering and Addressing Social Challenges in the Evolution of Scientific Software Projects (2019-09-11)
- Presenter: Rene Gassmoeller (UC Davis)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: In recent years scientific software projects have increasingly incorporated state-of-the-art technical best practices like continuous integration into their development cycle. However, many projects still struggle to create and maintain an active and welcoming user/developer community, and there exists little documentation on what makes a scientific software community successful. In this webinar I will introduce my work — as a Better Scientific Software Fellow — to collect typical social challenges and potential solutions that arise during the evolution of a scientific software project. Aimed at current and prospective software maintainers and community leaders, I will discuss topics such as building and maintaining a welcoming community atmosphere, overcoming skepticism of sharing science and software, mediating between users working on conflicting topics or publications, and providing credit and growth opportunities for community members. Finally, I hope to initiate a conversation among project and community leaders about what makes communities successful so that we can learn from each other and improve scientific software development together.
-
Software Management Plans in Research Projects (August 2019)
- Presenter: Shoaib Sufi (Software Sustainability Institute)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Software is a necessary by-product of research. Software in this context can range from small shell scripts to complex and layered software ecosystems. Dealing with software as a first-class citizen at the time of grant formulation is aided by the development of a Software Management Plan (SMP). An SMP can help to formalize a set of structures and goals that ensure your software is accessible and reusable in the short, medium and long term. SMPs aim to become for software what Data Management Plans (DMPs) have become for research data (DMPs are mandatory for National Science Foundation grants). This webinar takes you through the questions you should consider when developing a Software Management Plan, how to manage the implementation of the plan, and some of the current motivation driving discussion in this area of research management.
-
When 100 Flops/Watt was a Giant Leap: The Apollo Guidance Computer Hardware, Software and Application in Moon Missions (July 2019)
- Presenter: Mark Miller (Lawrence Livermore National Laboratory)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Commemorating the 50th Anniversary of the Apollo Moon landings, this webinar will describe the revolutionary computer, the Apollo Guidance Computer (AGC). The AGC not only made autonomous travel to the Moon and back possible but also added profoundly to crew safety and flight profile accuracy, and even optimized propellant use to such an extent that final mission plans traded fuel for added weight in equipment and lunar samples. The webinar will give an overview of the AGC hardware architecture, the guidance software it executed as well as the pioneering efforts in developing both. HPC/CSE code teams will discover many familiar themes such as flops/watt power constraints and performance portability challenges. The webinar will conclude with several user stories about the actual operation of the AGC in various Apollo missions.
-
Modern C++ for High-Performance Computing (June 2019)
- Presenter: Andrew Lumsdaine (PNNL and University of Washington)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Since its creation by Bjarne Stroustrup in the early 1980s, C++ has steadily evolved to become a multi-paradigm programming language that fully supports the needs of modern programmers. Because C++ had its roots in the C programming language, conventional wisdom (and longstanding practice) had been to use C++ in a dichotomous fashion: abstractions for productivity with escape to C for performance. However, C++ today is best viewed holistically, as it is today, rather than as an extension of C or even of earlier versions of C++. In this webinar I will give a tour of features from modern C++ relevant to HPC, along with guidelines for their use, and demonstrate that C++ can offer productivity and elegance while sacrificing nothing in performance.
-
So, You Want to Be Agile? Strategies for Introducing Agility Into Your Scientific Software Project (2019-05-08)
- Presenter: Mike Heroux (SNL)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Scientific software team cultures have natural consistencies with agile practices. Discovery-driven development, a focus on regular delivery of results, in-person discussions within and across research teams, and a focus on long-term sustainable research programs are commonplace dynamics on computational science teams that develop software. These dynamics are also particular expressions of core agile principles.
Many scientific software teams have already assimilated industry best practices in some aspects of their work. The advent of open software development platforms such as GitHub and GitLab has accelerated awareness and adoption, as have numerous online resources that enable a motivated person to continue learning new ideas and approaches. Even so, we propose that a healthy team habit is continued exploration and improvement of software practices, processes and skills.
In this webinar, we discuss a few agile practices and strategies that are readily adapted and adopted by scientific software teams. In addition, we describe an attitude and strategy for continual process improvement that enables computational science teams to deliver science results while dedicating a slice of time to improving software practices on their way to delivering those results.
-
Testing Fortran Software with pFUnit (2019-04-10)
- Presenter: Thomas Clune (NASA/Goddard)
- Archives: Slides (PDF) | Videos: Webinar : Extended Q&A (YouTube) | Q&A (PDF)
- Description: Over the past two decades, the emergence of highly effective software testing frameworks has greatly simplified the development and use of unit tests and has led to new software development paradigms such as test driven development (TDD). However, technical computing introduces a number of unique testing challenges, including distributed parallelism and numerical accuracy. This webinar will begin with a basic introduction to the use of pFUnit to develop tests for MPI+Fortran software and then present some of the new capabilities in the latest release. We will also discuss some specialized methodologies for testing numerical algorithms and speculate about future framework capabilities that may improve our ability to test at exascale.
-
Parallel I/O with HDF5: Overview, Tuning, and New Features (2019-03-13)
- Presenter: Quincey Koziol (NERSC)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: HDF5 is a data model, file format, and I/O library that has become a de facto standard for HPC applications to achieve scalable I/O and for storing and managing big data from computer modeling, large physics experiments and observations. This webinar will give an introduction to using the HDF5 library, with a focus on parallel I/O and performance tuning options. The webinar will also provide an overview of the latest performance and productivity enhancement features being developed as part of the DOE’s Exascale Computing Project (ECP) ExaHDF5 effort, and will present optimizations used in improving I/O performance of ECP applications.
-
Containers in HPC (2019-02-13)
- Presenter: Shane Canon (LBNL)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Containers have gained adoption in the HPC and scientific computing space through specialized runtimes like Shifter, Singularity and Charliecloud. Containers enable reproducible, shareable, portable execution of applications. In this webinar, we will give a brief introduction on how to build images and run containers on HPC systems. We will also discuss some best practices to ensure containers can take full advantage of HPC systems.
-
Quantitatively Assessing Performance Portability with Roofline (2019-01-23)
- Presenters: John Pennycook (Intel), Charlene Yang (Lawrence Berkeley National Laboratory) and Jack Deslippe (Lawrence Berkeley National Laboratory)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Wouldn’t it be great if we could port a code to a new high-performance architecture without substantially changing the code while still achieving a similar level of performance as hand-optimized code? This webinar will frame the discussion around ‘performance portability’, why it is important and desirable, and how to quantitatively measure it. The webinar will start with a background check on how the concept of performance portability came about and past attempts to define it and quantify it. Then we will introduce a simple yet powerful metric and an empirical methodology to quantitatively assess a code’s performance portability across multiple platforms. The methodology uses the Roofline performance model to measure an ‘architectural efficiency’ term in the metric proposed by Pennycook et al. We will dive into a few nuances of this methodology, for example, how and why empirical ceilings should be used for performance bounds, how to accurately account for complex instructions such as divides, how to model strided memory accesses, and how to select the appropriate Roofline ceilings and application performance points to make sure that the performance portability analysis is not erroneously skewed. We will also show some results of measuring performance portability using the aforementioned metric and methodology on two modern architectures, Intel Xeon Phi and NVIDIA V100 GPUs.
2018
-
Introduction to Software Licensing (2018-12-05)
- Presenter: David E. Bernholdt, ORNL
- Archives: Slides (PDF @ FigShare) | Video (YouTube) | Q&A (PDF)
- Description: Software licensing and related matters of intellectual property can often seem confusing or hopelessly complicated, especially when many present their opinions as dogma. This presentation takes a different approach: getting you to think about software licensing from the standpoint of what you want others to be able to do (or not do) with your software. We will start by developing a common understanding of the terminology used around software licenses. Then we’ll consider various scenarios of what you might want to accomplish with a software license, and what to look for in the license. We’ll also discuss some pragmatic issues around actually applying a license to your software. A list of resources will be provided to help with further exploration of these topics.
-
Open Source Best Practices: From Continuous Integration to Static Linters (2018-10-17)
- Presenters: Daniel Smith and Ben Pritchard, Molecular Sciences Software Institute (MolSSI)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: This webinar will continue the discussion of open source software (OSS) opportunities within the scientific ecosystem to include the many cloud and local services available to OSS free of charge. The services to be discussed include continuous integration, code coverage, and static analysis. The presenters will demonstrate the usefulness of these tools and how a small time investment at the beginning is traded for long-term benefits. These services and ideas are agnostic to software language or HPC software application and should apply to any party interested in tools that help ease the burden of software maintenance.
-
Modern CMake (2018-09-19)
- Presenter: Bill Hoffman, Kitware
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Bill Hoffman, the creator of the CMake project, will give an introduction to development with modern CMake constructs. CMake is 17 years old and has evolved over time into the most widely used C++ build tool in the world. In the past 5 years, many new features have been added to CMake to make the creation of cross-platform build files easier. This webinar will provide best practices for development and maintenance of a CMake build system. The webinar will cover the “target centric” approach to writing CMake files. In addition, testing and quality dashboards with CDash will be covered. Kitware’s experience with HPC systems and CMake will also be discussed.
-
Software Sustainability — Lessons Learned from Different Disciplines (2018-08-21)
- Presenter: Neil Chue Hong, Software Sustainability Institute (University of Edinburgh)
- Archives: Slides (PDF @ FigShare) | Video (YouTube) | Q&A (PDF)
- Description: How do you make software sustainable? How much is it about process and how much about practice? Does it vary between countries or disciplines? In this webinar, I’ll present what the UK’s Software Sustainability Institute has learned from 8 years of work in this area including efforts around understanding the scale of software use in research, raising the profile of software as a key part of the research ecosystem, and how we can enable researchers and developers to build better software.
-
How Open Source Software Supports the Largest Computers on the Planet (2018-07-18)
- Presenter: Ian Lee, Lawrence Livermore National Laboratory
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: This talk will provide an overview of the work at Lawrence Livermore National Laboratory to re-vamp our open source project offerings, release processes, and engagements across the Department of Energy and the US government through efforts such as DOECode and Code.gov. We will also discuss ongoing work to make it easier for our staff to engage with open source communities, via both the creation of new projects and contributions to existing open source projects. We believe that these experiences and insights may be useful to a wide range of developers of high-performance scientific software.
-
Popper: Creating Reproducible Computational and Data Science Experimentation Pipelines (2018-06-13)
- Presenter: Ivo Jimenez, UC Santa Cruz
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Current approaches used in computational and data science research may require significant time without necessarily advancing scientific understanding. For example, researchers may spend countless hours reformatting data and writing code to attempt to reproduce previously published research. What if the scientific community could find a better way to create and publish workflows, data, and models that are easy to reproduce, thus streamlining scientific analysis? Popper is a protocol and command language interpreter (CLI) tool for implementing scientific exploration pipelines following a DevOps approach of unifying software development and operation in order to handle complexity in large codebases. Popper repurposes DevOps practices in the context of scientific explorations, so that researchers can leverage existing tools and technologies to enable reproducibility. This webinar will introduce the Popper protocol, including a demo of the CLI tool and HPC examples.
-
On-demand Learning for Better Scientific Software: How to Use Resources & Technology to Optimize your Productivity (2018-05-09)
- Presenter: Elaine Raybourn, Sandia National Laboratories
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Continual advances in new technologies for computational science often require members of the HPC community to learn new tools, techniques, or processes on-demand, or outside of a formal education setting. While the variety of media and deluge of content make on-demand learning a reality, very few learners apply guiding principles from learning science to set themselves up for success. Applying on-demand learning strategies for self-paced “learning in the wild” can augment professional learning courses from EdX, Udacity, and YouTube. Employing use cases and examples from Python and Git, this webinar will demonstrate how to develop a personalized learning framework leveraging massive open online courses (MOOCs), podcasts, social media, videos, and more. A walkthrough of relevant learning applications will be provided. Participants of this webinar will take away practical strategies, resources, and tools that can be applied toward learning more productively in general, and specifically to software development.
-
Software Citation Today and Tomorrow (2018-04-18)
- Presenter: Daniel S. Katz, NCSA and University of Illinois at Urbana-Champaign
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Software is increasingly important in research, and some of the scholarly communications community, for example in FORCE11, has been pushing the concept of software citations as a method to allow software developers and maintainers to get academic credit for their work: software releases are published and assigned DOIs, and software users then cite these releases when they publish research that uses the software. This webinar will discuss the state of software citation, starting with the history of work done by the FORCE11 Software Citation Working Group, leading to a published set of software citation principles (https://doi.org/10.7717/peerj-cs.86), as well as other prior work. It will also talk about where the community is going, what the obstacles to progress are, and how they may be overcome.
-
Scientific Software Development with Eclipse (2018-03-28)
- Presenter: Greg Watson, ORNL
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: The Eclipse IDE is one of the most popular IDEs available, and its support for multiple languages, particularly C, C++ and Fortran, has made it the go-to IDE for scientific software development. Although an IDE like Eclipse can provide advanced development capabilities such as code recommendation and refactoring, these features can be difficult to utilize for complex code bases. Other challenges, such as ease of installation and use, reliability, and compatibility with existing development practices also play a role. Ultimately the usefulness of the tool is a tradeoff between the capabilities it provides and the challenges of incorporating it into the development workflow. This webinar will demonstrate some of the latest features available in Eclipse that are particularly useful for scientific application development, and examine how they can be used in a variety of different scenarios using realistic sample codes.
- Jupyter and HPC: Current State and Future Roadmap (2018-02-28)
- Presenters: Matthias Bussonnier (UC Berkeley), Suhas Somnath (ORNL), and Shreyas Cholia (NERSC)
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: During the last few years the Jupyter notebook has become one of the tools of choice for the data science and high-performance computing (HPC) communities. This webinar will provide an overview of why Jupyter is gaining traction in education, data science, and HPC, with emphasis on how notebooks can be used as interactive documents for exploration and reporting. We will present an overview of how Jupyter works and how the network protocol can be leveraged for both a local single machine and remote-cluster work. We will discuss the nuts and bolts of how Jupyter has been deployed at NERSC as a case study in implementation of Jupyter in an HPC environment. This work implies learning the Jupyter ecosystem to take advantage of its powerful abstractions to develop custom infrastructure to satisfy policies and user needs.
The webinar will show, as a use case, how Jupyter notebooks have transformed data discovery, visualization, and interactive analysis for the scanning probe and electron microscopy communities at Oak Ridge National Laboratory. It will also show how notebooks can seamlessly accommodate measurements from a wide variety of instruments through Pycroscopy, a framework for instrument agnostic data storage and analysis.
- Bringing Best Practices to a Long-Lived Production Code (2018-01-17)
- Presenter: Charles R. Ferenbaugh, LANL
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: How can you introduce best software practices to a long-lived scientific production code, with a significant user base, that has “gotten along fine” for years doing things its own way? Often developers in such projects must struggle with overly complex code, inadequate documentation, little or no software process, and a “just write the code fast” culture; these are challenges to software quality that are generally not issues for new projects. In this presentation we’ll discuss some of the peculiar problems faced by long-lived scientific codes, and present a case study of how we’re dealing with these issues at LANL in the xRage radiation-hydrodynamics simulation code.
2017
- Better Scientific Software (https://bssw.io): So your code will see the future (2017-12-06)
- Presenters: Mike Heroux, SNL, and Lois McInnes, ANL
- Archives: Slides (PDF) | Video (YouTube)
- Description: Better Scientific Software (BSSw) is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This presentation will introduce a new website (https://bssw.io), a community exchange for scientific software improvement. We’re creating a clearinghouse to gather, discuss, and disseminate experiences, techniques, tools, and other resources to improve software productivity and sustainability for CSE. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. The backend enables collaborative content development using standard GitHub tools and processes. We need your contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. Join us!
- Managing Defects in HPC Software Development (2017-11-01)
- Presenter: Tom Evans, ORNL
- Archives: Slides (PDF) | Video (YouTube)
- Description: Software Quality Engineering (SQE) and methods research and scientific investigation are often thought to be incompatible. However, in reality they are not only compatible, but required in order to have confidence in the results of even basic scientific computations. This is especially true for parallel software. In this talk we will look at methods for performing software verification. Software verification is a method for removing defects at code construction time; these techniques can help in both algorithm and method development, as well as in increasing productivity.
- Barely Sufficient Project Management: A few techniques for improving your scientific software development efforts (2017-09-13)
- Presenter: Michael A. Heroux, SNL
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Software development is an essential activity for many scientific teams. Modeling, simulation and data analysis, using team-developed software, are increasingly valuable for scientific discovery and engineering. Many teams use informal, ad hoc approaches for managing their software efforts. While sufficient for many efforts, a modest emphasis on team models and processes can substantially improve developer productivity and software sustainability. In this presentation, we discuss several light-weight techniques for managing scientific software efforts. Using checklists, policy statements and a Kanban workflow system, we emphasize techniques for managing the initiation and exit of team members, approaches to synthesizing team culture, and ways to improve communication within a team and with its stakeholders.
- Using the Roofline Model and Intel Advisor (2017-08-16)
- Presenters: Sam Williams, LBNL, and Tuomas Koskela, NERSC
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: In this webinar, we will begin by introducing the Roofline Model and its “Cache-Aware” variant. We will proceed with some general guidelines and historical approaches to Roofline-based program analysis. Next, we will provide a short discussion of how changes in data locality and arithmetic intensity of two canonical benchmarks visually manifest in the context of these two Roofline formulations. Subsequently, we will provide two demonstrations of using Intel Advisor and the Roofline model within Intel Advisor. The first demo will be primarily instructive on how to compile, benchmark, and use Advisor. The second demo will focus on using variants of a simple benchmark to highlight changes in the Roofline model as well as providing correlation to Advisor’s other capabilities. We will conclude with a few comments on future directions.
- Intermediate Git (2017-07-12)
- Presenter: Roscoe A. Bartlett, SNL
- Archives: Slides (PDF) | Git Tutorial and Reference Collection (PDF) | Video (YouTube) | Q&A (PDF)
- Description: This presentation will emphasize intermediate-level tutorial and reference information about the Git version control (VC) system. This overview takes the view that the best way to learn to use Git effectively is to learn it as a data structure and a set of algorithms to manipulate that data structure. This perspective is important because the Git command-line interface is widely considered to be overly complex and confusing. For example, a Git command like ‘checkout’ can do wildly different things depending on the other arguments passed into the command or the state of the Git repository. But Git is still the dominant VC system; many people consider that Git has won the version control wars due to its power and flexibility.
- Python in HPC (2017-06-07)
- Presenters: Rollin Thomas, NERSC; William Scullin, ANL; Matt Belhorn, ORNL
- Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
- Description: Python’s powerful elegance has driven its adoption at HPC centers for job orchestration, visualization, exploratory data analysis, and even simulation. But maximizing performance from Python applications can be challenging especially on supercomputing architectures. This webinar will explain those challenges with a practical emphasis on using Python at NERSC, ALCF, and OLCF. We will outline a variety of performance optimization strategies, tools for measuring and addressing performance problems, and establish best practices for Python in HPC.
2016
- Basic Performance Analysis and Optimization – An Ant Farm Approach (2016-08-09)
- Presenter: Jack Deslippe, NERSC
- Archives: Slides (PDF) | Video (YouTube)
- Description: How is optimizing HPC applications like an Ant Farm? Attend this presentation to find out. We’ll discuss the basic concepts around optimizing code for the HPC systems of today and tomorrow. These systems require codes to effectively exploit both parallelism between nodes and an ever-growing amount of parallelism on-node. We’ll discuss profiling strategies, tools (for profiling and debugging) and common issues with both internode communication and on-node parallelism. We will give an overview of traditional optimization areas in HPC applications like parallel IO and MPI strong and weak scaling as well as topics relevant for modern GPU and many-core systems like threading, SIMD/AVX, SIMT and effectively using cache and memory hierarchies. The “Ant Farm” approach places a heavy emphasis on the roofline performance model and on encouraging users to understand the compute, bandwidth and latency sensitivity of their applications and kernels through a series of easy-to-perform experiments and an easy-to-follow flow chart. Finally, we’ll discuss what we expect to change in the optimization process as we move towards exascale computers.
- An Introduction to High-Performance Parallel I/O (2016-07-28)
- Presenter: Feiyi Wang, OLCF. Feiyi Wang received his Ph.D. in Computer Engineering from North Carolina State University (NCSU). Before he joined Oak Ridge National Laboratory as research scientist, he worked at Cisco Systems and Microelectronic Center of North Carolina (MCNC) as a lead developer and principal investigator for several DARPA-funded projects. His current research interests include high performance storage system, parallel I/O and file systems, fault tolerance and system simulation, and scientific data management and integration. Dr. Wang is a Joint Faculty Professor at EECS Department of University of Tennessee and a senior member of IEEE.
- Archives: Slides (PDF) | Video (YouTube)
- Description: Parallel data management is a complex problem in large-scale HPC environments. The HPC I/O stack can be viewed as a multi-layered cake and presents a high-level abstraction to the scientists. While this abstraction shields the users from many of the I/O system details, it is very hard to obtain parallel I/O performance or functionality without understanding the end-to-end hierarchical I/O stack in today’s modern complex HPC environments. This talk will introduce the basic parallel I/O concepts and will provide guidelines on obtaining better I/O performance on large-scale parallel platforms.
- How the HPC Environment is Different from the Desktop (and Why) (2016-07-14)
- Presenter: Katherine Riley, ALCF
- Archives: Slides (PDF) | Video (YouTube)
- Description: High performance computing has transformed how science and engineering research is conducted. Answering a question in 30 minutes that used to take 6 months can quickly change the way one asks questions. Large computing facilities provide access to some of the world’s largest computing, data, and network resources. Indeed, the DOE complex has the highest concentration of supercomputing capability in the world. However, by nature of their existence, making use of the largest computers in the world can be a challenging and unique task. This talk will discuss how supercomputers are unique and explain how that impacts their use.
- Testing and Documenting your Code (2016-06-15)
- Presenter: Alicia Klinvex, SNL
- Archives: Slides (PDF) | Video (YouTube)
- Description: Software verification and validation are needed for high-quality and reliable scientific codes. For software with moderate to long lifecycles, a strong automated testing regime is indispensable for continued reliability. Similarly, comprehensive and comprehensible documentation is vital for code maintenance and extensibility. This presentation will provide guidelines on testing and documentation that can help to ensure high-quality and long-lived HPC software. We will present methodologies, with examples, for developing tests and adopting regular automated testing. We also will provide guidelines for minimum, adequate, and good documentation practices depending on the available resources of the development team.
- Distributed Version Control and Continuous Integration Testing (2016-06-02)
- Presenter: Jeff Johnson, LBNL
- Archives: Slides (PDF) | Video (YouTube)
- Description: Recently, many tools and workflows have emerged in the software industry that have greatly enhanced the productivity of development teams. GitHub, a site that hosts projects in Git repositories, is a popular platform for open source and closed source projects. GitHub has encoded several best practices into easily followed procedures such as pull requests, which enrich the software engineering vocabularies of non-professionals and professionals alike. GitHub also provides integration to other services (for example, continuous integration such as Travis CI, which allows code changes to be automatically tested before they are merged into a master development branch). This presentation will discuss how to set up a project on GitHub, illustrate the use of pull requests to incorporate code changes, and show how Travis CI can be used to boost confidence that changes will not break existing code.
- Developing, Configuring, Building, and Deploying HPC Software (2016-05-18)
- Presenter: Barry Smith, ANL
- Archives: Slides (PDF) | Video (YouTube)
- Description: The process of developing HPC software requires consideration of issues in software design as well as practices that support the collaborative writing of well-structured code that is easy to maintain, extend, and support. This presentation will provide an overview of development environments and how to configure, build, and deploy HPC software using some of the tools that are frequently used in the community. We will also discuss ways in which these and other tools are best utilized by various categories of scientific software developers, ranging from small teams (for example, a faculty member and graduate students who are writing research code intended primarily for their own use) through moderate/large teams (for example, collaborating developers spread among multiple institutions who are writing publicly distributable code intended for use by others in the community).
- What All Codes Should Do: Overview of Best Practices in HPC Software Development (2016-05-04)
- Presenter: Anshu Dubey, ANL
- Archives: Slides (PDF) | Video (YouTube)
- Description: Scientific code developers have increasingly been adopting software processes derived from the mainstream (non-scientific) community. Software practices are typically adopted when continuing without them becomes impractical. However, many software best practices need modification and/or customization, partly because the codes are used for research and exploration, and partly because of the combined funding and sociological challenges. This presentation will describe the lifecycle of scientific software and important ways in which it differs from other software development. We will provide a compilation of software engineering best practices that have generally been found to be useful by science communities, and we will provide guidelines for adoption of practices based on the size and the scope of the project.