Best Practices for HPC Software Developers (Webinars)

Jump to: About the Series | Upcoming webinars | Past webinars | 2020 | 2019 | 2018 | 2017 | 2016

Upcoming Webinars

Webinars are free and open to the public, but registration is required.

  1. Scalable Precision Tuning of Numerical Software [Register]

    • Date and Time:  Wednesday, October 14, 2020, 01:00 pm ET
    • Presenter: Cindy Rubio-Gonzalez (University of California, Davis)
    • Description: The use of numerical software has grown rapidly over the past few years, providing the foundation for a large variety of applications including scientific software and machine learning. Given the variety of numerical errors that can occur, floating-point programs are difficult to write, test and debug. One common practice among developers is to use the highest available precision when allocating variables. While more robust, this can degrade program performance significantly. This webinar describes our research on developing tools to assist programmers in tuning the precision of their floating-point programs. These tools take a data-driven approach, searching over the types of floating-point variables to lower their precision subject to accuracy constraints and performance goals. In the last part of the webinar, I will discuss challenges and opportunities for scalable precision tuning of large HPC applications.

About the Webinar Series

The HPC Best Practices webinars address issues faced by developers of computational science and engineering (CSE) software on high-performance computing (HPC) systems. The sessions are independent, so join any or all.

Who should attend: Participation is free and open to the public, but registration is required for each event. This series is designed for HPC software developers who are seeking help in increasing their team’s productivity, as well as facility staff who interact extensively with users.

Schedule and format: The webinars occur approximately monthly and last about one hour each. They usually take place on a Wednesday at 1:00-2:00 pm ET (but this can change due to speaker availability). Audience questions and discussion are encouraged; however, due to the number of participants, we handle them in written form using the webinar tool’s chat capability and a shared Google Doc. Recordings of the webinars, along with the presentation slides, will be posted.

Notifications: If you’d like to receive announcements of upcoming webinars and other IDEAS-organized events, and follow-ups when recordings become available, please subscribe to our mailing list.

Organizers: These webinars have been organized by the IDEAS project in collaboration with the DOE/ASCR computing facilities (ALCF, NERSC, and OLCF), and the Exascale Computing Project (ECP).

Suggestions Welcome! Want to request another topic? Want to give a webinar? Email us at IDEASProductivity@gmail.com.

Past Webinars

Listed in reverse chronological order.

2020

  1. Testing and Code Review Practices in Research Software Development (2020-09-09)

    • Presenter: Nasir Eisty (California Polytechnic State University, San Luis Obispo)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: Software quality in a research context is essential because research software is used in mission-critical situations, decision making, and computation of evidence for research publications. This webinar will cover the use of two software quality practices in the development of research software: software testing and peer code review. These practices in software development can lead to both improved scientific results through higher quality software in the short term and more maintainable software in the long term. While these practices are essential for any type of software, developers of research software typically do not use peer code review and software testing as frequently as they could for maximum impact. The presenter will discuss the motivation, challenges, barriers, and necessary improvements to make the practices effective for research software development, based on studies of the research software community conducted via interviews, surveys, workshops, and tutorials.
  2. Colormapping Strategies for Large Multivariate Data in Scientific Applications (2020-08-12)

    • Presenter: Francesca Samsel (Texas Advanced Computing Center)
    • Archives: Recording (YouTube) | Slides (PDF)
    • Description: In order for scientific visualizations to effectively convey insights of computationally-driven research, as well as to better engage the public in science, visualizations must effectively and affectively facilitate the exploration of information. The presenter and her team employ a transdisciplinary approach that includes insights from artistic color theory, perceptual science, the visualization community, and domain scientists, to move beyond basic default colormaps. While color has always been utilized and studied as a component of scientific data visualization, it has been demonstrated that its full potential for discovery and communication of scientific data remains untapped. The webinar will discuss how effective color use can reveal structures, relationships, and hierarchies between variables within a visualization, as well as practical strategies and workflows for tailoring color application to the goals of the visualization. The presenter’s work is documented and freely available for use at SciVisColor.org, a hub for research and resources related to color in scientific visualization. SciVisColor provides tools and strategies that allow scientists to use color as a tool to better understand and communicate their data. Users can explore and download colormaps, color sets, and ColorMoves, an interactive interface for using color in scientific visualization. The webinar will introduce concepts that can help developers make design decisions when writing simulation codes, to make better use of scientific visualization tools and visualize results more effectively.
  3. What’s New in Spack? (2020-07-15)

    • Presenter: Todd Gamblin (Lawrence Livermore National Laboratory)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: Spack is a package manager for scientific computing, with a rapidly growing open source community. With over 500 contributors from academia, industry, and government laboratories, Spack has a wide range of use cases, from small-scale development on laptops and clusters, to software release management for the U.S. Exascale Computing Project, to user software deployment on 6 of the top 10 supercomputer sites in the world. Spack isn’t just for facilities, though! As a package manager, Spack is in a powerful position to impact DevOps and daily software development workflows. Spack has virtual environments that enable the “manifest and lock” model popularized by more mainstream dependency management tools. New releases of Spack include direct support for creating containers and GitLab CI pipelines for building environments. This webinar will cover new features as well as the near- and long-term roadmap for Spack.
  4. SYCL – Introduction and Best Practices (2020-06-17)

    • Presenter: Thomas Applencourt (Argonne National Laboratory)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: SYCL is a single-source heterogeneous programming model based on standard C++. It uses C++ templates and lambda functions for host and device code. SYCL builds on the underlying concepts of portability and efficiency of OpenCL that enable code for heterogeneous processors; however, it is less verbose than OpenCL. The single-source programming enables the host and kernel code for an application to be contained in the same source file, in a type-safe way and with the simplicity of a cross-platform asynchronous task graph. We will provide an overview of the SYCL concepts, compilation, and runtime. No prior knowledge of OpenCL is required for this webinar. Once we have reviewed the core concepts of SYCL, we will walk through several code examples to highlight its key features and illustrate best practices. SYCL by design is hardware agnostic and offers the potential to be portable across many of DOE’s largest machines.
  5. Accelerating Numerical Software Libraries with Multi-Precision Algorithms (2020-05-13)

    • Presenters: Hartwig Anzt (Karlsruhe Institute of Technology), and Piotr Luszczek (University of Tennessee)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: With the rise of machine learning, more hardware manufacturers are introducing low-precision special function units in processor designs, often achieving up to an order of magnitude higher performance than in the IEEE double precision that is typically used as working precision in scientific computing. At the same time, a rapidly expanding landscape of mixed- and multi-precision methods generates high-quality solutions that leverage the higher compute power of reduced precision. This webinar will introduce the concept of floating-point formats and the IEEE standard. We will demonstrate how using an iterative or direct solver in lower precision impacts the solution quality. We will outline several strategies that aim to preserve numerical stability and high solution quality while still computing, at least partially, in lower precision. We will present several multi-precision algorithms that have proven particularly successful and elaborate on their realization and usage. We also will introduce open source production-quality multi-precision software packages and show their integration and efficiency for scientific applications. The webinar will focus on lessons learned and generally applicable strategies.
  6. Best Practices for Using Proxy Applications as Benchmarks (2020-04-15)

    • Presenters: David Richards (Lawrence Livermore National Laboratory), and Joe Glenski (Hewlett-Packard Enterprise)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: Proxy applications have many uses in software development and hardware/software co-design. Because most proxies are easy to build, run, and understand, they are especially appealing for use in benchmark suites and studies. This webinar will examine the role of proxy apps as benchmarks and explain why run rules and a figure of merit are essential for a proxy application to function as an effective benchmark. We will show how to evaluate the fidelity of benchmarks as a model for actual workloads and provide tips on creating problem specifications and other run rules. We will discuss what DOE facilities are looking for when they assemble benchmark suites for use in procurements. Finally, we will explain how system vendors use our benchmark suites and what practices they view as most (and least) effective.
  7. Testing: Strategies When Learning Programming Models and Using High-Performance Libraries (2020-03-18)

    • Presenter: Balint Joo (Jefferson Lab)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: Software testing is an invaluable practice, although the level of testing in scientific applications can vary widely, from no testing at all to full continuous integration (as discussed in earlier webinars of the HPC-BP series). In this webinar I will consider a specific case: the use of unit testing when developing a mini-app as an approach to learn about new programming models such as Kokkos and SYCL, or when using (or contributing to) high-performance libraries. I will illustrate with an example from Lattice QCD, focusing on the integration of the QUDA optimized library with the Chroma application. The webinar will focus on lessons learned and generally applicable strategies.
  8. Introduction to Kokkos (2020-02-19)

    • Presenter: Christian Trott (Sandia National Laboratories)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: The Kokkos C++ Performance Portability Ecosystem is a production-level solution for writing modern C++ applications in a hardware-agnostic way. It is part of the US Department of Energy’s Exascale Computing Project, the leading effort in the US to prepare the HPC community for the next generation of supercomputing platforms. Kokkos is now used by more than a hundred HPC projects, and Kokkos-based codes are running regularly at scale on at least five of the top ten supercomputers in the world. In this webinar, we will give a short overview of what the Kokkos Ecosystem provides, including its programming model, math kernels library, tools, and training resources, before providing an overview of the Kokkos team’s efforts surrounding the ISO-C++ standard, and how Kokkos both influences future standards and aligns with developments occurring in them. The webinar will include a status update on the progress in supporting the upcoming exascale-class HPC systems announced by DOE.
  9. Refactoring EXAALT MD for Emerging Architectures (2020-01-15)

    • Presenters: Aidan Thompson (Sandia National Laboratories), Stan Moore (Sandia National Laboratories), and Rahulkumar Gayatri (National Energy Research Scientific Computing Center)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: As part of the DOE Exascale Computing Project, members of the EXAALT project are working to increase the accuracy, time, and length scales of molecular dynamics simulations of materials for fusion energy. Simulations rely on the SNAP machine-learning interatomic potential to accurately capture material properties. The SNAP kernel recursively evaluates a set of complex polynomial functions, requiring many deeply nested loops with irregular loop bounds. Last year, a worrisome trend in the SNAP force kernel was identified. With each new generation of emerging architectures, performance relative to theoretical peak was decreasing, particularly on GPUs. This webinar will discuss the approach used to rewrite the SNAP kernel from the ground up, using more compact memory representation, refactoring the main loop, using sub-kernels to reduce pressure on GPU threads, and improving coalesced memory accesses on the GPU. This work has enabled a spectacular increase of roughly 10x in performance over the baseline implementation of the SNAP benchmark running on NVIDIA V100 GPUs. Extrapolated to the full machine, this predicts an increase of over 100x in the Figure of Merit over the baseline on the ALCF/Mira system, putting EXAALT on track to meet, and even exceed, performance targets on exascale systems. The webinar will emphasize key strategies and lessons learned in code transitions for emerging architectures.

2019

  1. Building Community through xSDK Software Policies (2019-12-11)

    • Presenters: Ulrike Meier Yang (Lawrence Livermore National Laboratory), and Piotr Luszczek (University of Tennessee)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: The development of increasingly complex computer architectures and software ecosystems continues. Applications that incorporate multiphysics modeling as well as the coupling of simulation and data analytics increasingly require the combined use of software packages developed by diverse, independent teams throughout the HPC community. The Extreme-scale Scientific Software Development Kit (xSDK) is being developed to provide coordinated infrastructure for independent mathematical libraries to support the productive and efficient development of high-quality applications. This webinar will discuss the development and impact of xSDK community policies, which constitute an integral part of the project and have been defined to achieve improved code quality and compatibility across xSDK member packages and a sustainable software ecosystem.
  2. Tools and Techniques for Floating-Point Analysis (2019-10-16)

    • Presenter: Ignacio Laguna (Lawrence Livermore National Laboratory)
    • Archives: Recording (YouTube) | Slides (PDF) | Q&A (PDF)
    • Description: Scientific software is central to the practice of research computing. While software is widely used in many science and engineering disciplines to simulate real-world phenomena, developing accurate and reliable scientific software is notoriously difficult. One of the most serious difficulties comes from dealing with floating-point arithmetic to perform numerical computations. Round-off errors occur and accumulate at all levels of computation, while compiler optimizations and low-precision arithmetic can significantly affect the final computational results. With accelerators such as GPUs dominating high-performance computing systems, computational scientists are faced with even bigger challenges, given that ensuring numerical reproducibility in these systems poses a very difficult problem. This webinar provides highlights from a half-day tutorial discussing tools that are available today to analyze floating-point scientific software. We focus on tools that allow programmers to get insight about how different aspects of floating-point arithmetic affect their code and how to fix potential bugs.
  3. Discovering and Addressing Social Challenges in the Evolution of Scientific Software Projects (2019-09-11)

    • Presenter: Rene Gassmoeller (UC Davis)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: In recent years scientific software projects have increasingly incorporated state-of-the-art technical best practices like continuous integration into their development cycle. However, many projects still struggle to create and maintain an active and welcoming user/developer community, and there exists little documentation on what makes a scientific software community successful. In this webinar I will introduce my work — as a Better Scientific Software Fellow — to collect typical social challenges and potential solutions that arise during the evolution of a scientific software project. Aimed at current and prospective software maintainers and community leaders, I will discuss topics such as building and maintaining a welcoming community atmosphere, overcoming skepticism of sharing science and software, mediating between users working on conflicting topics or publications, and providing credit and growth opportunities for community members. Finally, I hope to initiate a conversation among project and community leaders about what makes communities successful so that we can learn from each other and improve scientific software development together. 
  4. Software Management Plans in Research Projects  (August 2019)

    • Presenter: Shoaib Sufi (Software Sustainability Institute)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Software is a necessary by-product of research. Software in this context can range from small shell scripts to complex and layered software ecosystems. Dealing with software as a first-class citizen at the time of grant formulation is aided by the development of a Software Management Plan (SMP). An SMP can help to formalize a set of structures and goals that ensure your software is accessible and reusable in the short, medium and long term. SMPs aim to become for software what Data Management Plans (DMPs) have become for research data (DMPs are mandatory for National Science Foundation grants). This webinar takes you through the questions you should consider when developing a Software Management Plan, how to manage the implementation of the plan, and some of the current motivation driving discussion in this area of research management.
  5. When 100 Flops/Watt was a Giant Leap: The Apollo Guidance Computer Hardware, Software and Application in Moon Missions (July 2019)

    • Presenter: Mark Miller (Lawrence Livermore National Laboratory)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Commemorating the 50th Anniversary of the Apollo Moon landings, this webinar will describe the revolutionary computer, the Apollo Guidance Computer (AGC). The AGC not only made autonomous travel to the Moon and back possible but also added profoundly to crew safety and flight profile accuracy, and even optimized propellant use to such an extent that final mission plans traded fuel for added weight in equipment and lunar samples. The webinar will give an overview of the AGC hardware architecture, the guidance software it executed, and the pioneering efforts in developing both. HPC/CSE code teams will discover many familiar themes such as flops/watt power constraints and performance portability challenges. The webinar will conclude with several user stories about the actual operation of the AGC in various Apollo missions.
  6. Modern C++ for High-Performance Computing (June 2019)

    • Presenter: Andrew Lumsdaine (PNNL and University of Washington)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Since its creation by Bjarne Stroustrup in the early 1980s, C++ has steadily evolved to become a multi-paradigm programming language that fully supports the needs of modern programmers. Because C++ had its roots in the C programming language, conventional wisdom (and longstanding practice) had been to use C++ in a dichotomous fashion: abstractions for productivity with escape to C for performance. However, C++ is best viewed holistically, as it is today, rather than as an extension of C or even of earlier versions of C++. In this webinar I will give a tour of features from modern C++ relevant to HPC, along with guidelines for their use, and demonstrate that C++ can offer productivity and elegance while sacrificing nothing in performance.
  7. So, You Want to Be Agile? Strategies for Introducing Agility Into Your Scientific Software Project (2019-05-08)

    • Presenter: Mike Heroux (SNL)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Scientific software team cultures have natural consistencies with agile practices. Discovery-driven development, a focus on regular delivery of results, in-person discussions within and across research teams, and a focus on long-term sustainable research programs are commonplace dynamics on computational science teams that develop software. These dynamics are also particular expressions of core agile principles.

      Many scientific software teams have already assimilated industry best practices in some aspects of their work. The advent of open software development platforms such as GitHub and GitLab has accelerated awareness and adoption, as have numerous online resources that enable a motivated person to continue learning new ideas and approaches. Even so, we propose that a healthy team habit is continued exploration and improvement of software practices, processes and skills.

      In this webinar, we discuss a few agile practices and strategies that are readily adapted and adopted by scientific software teams. In addition, we describe an attitude and strategy for continual process improvement that enables computational science teams to simultaneously deliver science results and, at the same time, dedicate a slice of time to improving software practices on their way to delivering those results.

  8. Testing Fortran Software with pFUnit (2019-04-10)

    • Presenter: Thomas Clune (NASA/Goddard)
    • Archives: Slides (PDF) | Videos: Webinar : Extended Q&A (YouTube) | Q&A (PDF)
    • Description: Over the past two decades, the emergence of highly effective software testing frameworks has greatly simplified the development and use of unit tests and has led to new software development paradigms such as test-driven development (TDD). However, technical computing introduces a number of unique testing challenges, including distributed parallelism and numerical accuracy. This webinar will begin with a basic introduction to the use of pFUnit to develop tests for MPI+Fortran software and then present some of the new capabilities in the latest release. We will also discuss some specialized methodologies for testing numerical algorithms and speculate about future framework capabilities that may improve our ability to test at exascale.
  9. Parallel I/O with HDF5: Overview, Tuning, and New Features (2019-03-13)

    • Presenter: Quincey Koziol (NERSC)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: HDF5 is a data model, file format, and I/O library that has become a de facto standard for HPC applications to achieve scalable I/O and for storing and managing big data from computer modeling, large physics experiments and observations. This webinar will give an introduction to using the HDF5 library, with a focus on parallel I/O and performance tuning options. The webinar will also provide an overview of the latest performance and productivity enhancement features being developed as part of the DOE’s Exascale Computing Project (ECP) ExaHDF5 effort, and will present optimizations used in improving I/O performance of ECP applications.
  10. Containers in HPC (2019-02-13)

    • Presenter: Shane Canon (LBNL)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Containers have gained adoption in the HPC and scientific computing space through specialized runtimes like Shifter, Singularity and Charliecloud. Containers enable reproducible, shareable, portable execution of applications. In this webinar, we will give a brief introduction on how to build images and run containers on HPC systems. We will also discuss some best practices to ensure containers can take full advantage of HPC systems.
  11. Quantitatively Assessing Performance Portability with Roofline (2019-01-23)

    • Presenters: John Pennycook (Intel), Charlene Yang (Lawrence Berkeley National Laboratory) and Jack Deslippe (Lawrence Berkeley National Laboratory)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Wouldn’t it be great if we could port a code to a new high-performance architecture without substantially changing the code while still achieving a similar level of performance as hand-optimized code? This webinar will frame the discussion around ‘performance portability’, why it is important and desirable, and how to quantitatively measure it. The webinar will start with background on how the concept of performance portability came about and past attempts to define it and quantify it. Then we will introduce a simple yet powerful metric and an empirical methodology to quantitatively assess a code’s performance portability across multiple platforms. The methodology uses the Roofline performance model to measure an ‘architectural efficiency’ term in the metric proposed by Pennycook et al. We will dive into a few nuances of this methodology, for example, how and why empirical ceilings should be used for performance bounds, how to accurately account for complex instructions such as divides, how to model strided memory accesses, and how to select the appropriate Roofline ceilings and application performance points to make sure that the performance portability analysis is not erroneously skewed. We will also show some results of measuring performance portability using the aforementioned metric and methodology on two modern architectures, Intel Xeon Phi and NVIDIA V100 GPUs.

2018

  1. Introduction to Software Licensing (2018-12-05)

    • Presenter: David E. Bernholdt, ORNL
    • Archives: Slides (PDF @ FigShare) | Video (YouTube) | Q&A (PDF)
    • Description: Software licensing and related matters of intellectual property can often seem confusing or hopelessly complicated, especially when many present their opinions as dogma. This presentation takes a different approach: getting you to think about software licensing from the standpoint of what you want others to be able to do (or not do) with your software. We will start by developing a common understanding of the terminology used around software licenses. Then we’ll consider various scenarios of what you might want to accomplish with a software license, and what to look for in the license. We’ll also discuss some pragmatic issues around actually applying a license to your software. A list of resources will be provided to help with further exploration of these topics.
  2. Open Source Best Practices: From Continuous Integration to Static Linters (2018-10-17)

    • Presenters: Daniel Smith and Ben Pritchard, Molecular Sciences Software Institute (MolSSI)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: This webinar will continue the discussion of open source software (OSS) opportunities within the scientific ecosystem to include the many cloud and local services available to OSS free of charge. The services to be discussed include continuous integration, code coverage, and static analysis. The presenters will demonstrate the usefulness of these tools and how a small time investment at the beginning is traded for long-term benefits. These services and ideas are agnostic to software language or HPC software application and should apply to any party interested in tools that help ease the burden of software maintenance.
  3. Modern CMake (2018-09-19)

    • Presenter: Bill Hoffman, Kitware
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Bill Hoffman, the creator of the CMake project, will give an introduction to development with modern CMake constructs. CMake is 17 years old and has evolved over time into the most widely used C++ build tool in the world. In the past 5 years, many new features have been added to CMake to make the creation of cross-platform build files easier. This webinar will provide best practices for development and maintenance of a CMake build system. The webinar will cover the “target centric” approach to writing CMake files. In addition, testing and quality dashboards with CDash will be covered. Kitware’s experience with HPC systems and CMake will also be discussed.
  4. Software Sustainability — Lessons Learned from Different Disciplines (2018-08-21)

    • Presenter: Neil Chue Hong, Software Sustainability Institute (University of Edinburgh)
    • Archives: Slides (PDF @ FigShare) | Video (YouTube) | Q&A (PDF)
    • Description: How do you make software sustainable? How much is it about process and how much about practice? Does it vary between countries or disciplines? In this webinar, I’ll present what the UK’s Software Sustainability Institute has learned from 8 years of work in this area including efforts around understanding the scale of software use in research, raising the profile of software as a key part of the research ecosystem, and how we can enable researchers and developers to build better software.
  5. How Open Source Software Supports the Largest Computers on the Planet (2018-07-18)

    • Presenter: Ian Lee, Lawrence Livermore National Laboratory
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: This talk will provide an overview of the work at Lawrence Livermore National Laboratory to revamp our open source project offerings, release processes, and engagements across the Department of Energy and the US government through efforts such as DOECode and Code.gov. We will also discuss ongoing work to make it easier for our staff to engage with open source communities, via both the creation of new projects and contributions to existing open source projects. We believe that these experiences and insights may be useful to a wide range of developers of high-performance scientific software.
  6. Popper: Creating Reproducible Computational and Data Science Experimentation Pipelines (2018-06-13)

    • Presenter: Ivo Jimenez, UC Santa Cruz
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Current approaches used in computational and data science research may require significant time without necessarily advancing scientific understanding. For example, researchers may spend countless hours reformatting data and writing code to attempt to reproduce previously published research. What if the scientific community could find a better way to create and publish workflows, data, and models that are easy to reproduce, thus streamlining scientific analysis? Popper is a protocol and command language interpreter (CLI) tool for implementing scientific exploration pipelines following a DevOps approach of unifying software development and operation in order to handle complexity in large codebases. Popper repurposes DevOps practices in the context of scientific explorations, so that researchers can leverage existing tools and technologies to enable reproducibility. This webinar will introduce the Popper protocol, including a demo of the CLI tool and HPC examples.
  7. On-demand Learning for Better Scientific Software: How to Use Resources & Technology to Optimize your Productivity (2018-05-09)

    • Presenter: Elaine Raybourn, Sandia National Laboratories
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Continual advances in new technologies for computational science often require members of the HPC community to learn new tools, techniques, or processes on demand, or outside of a formal education setting. While the variety of media and deluge of content make on-demand learning a reality, very few learners apply guiding principles from learning science to set themselves up for success. Applying on-demand learning strategies for self-paced “learning in the wild” can augment professional learning courses from edX, Udacity, and YouTube. Employing use cases and examples from Python and Git, this webinar will demonstrate how to develop a personalized learning framework leveraging massive open online courses (MOOCs), podcasts, social media, videos, and more. A walkthrough of relevant learning applications will be provided. Participants in this webinar will take away practical strategies, resources, and tools that can be applied toward learning more productively in general, and specifically to software development.
  8. Software Citation Today and Tomorrow (2018-04-18)

    • Presenter: Daniel S. Katz, NCSA and University of Illinois at Urbana-Champaign
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Software is increasingly important in research, and parts of the scholarly communications community (for example, FORCE11) have been pushing the concept of software citation as a way for software developers and maintainers to get academic credit for their work: software releases are published and assigned DOIs, and software users then cite these releases when they publish research that uses the software. This webinar will discuss the state of software citation, starting with the history of work done by the FORCE11 Software Citation Working Group, which led to a published set of software citation principles (https://doi.org/10.7717/peerj-cs.86), as well as other prior work. It will also discuss where the community is going, what the obstacles to progress are, and how they may be overcome.
  9. Scientific Software Development with Eclipse (2018-03-28)

    • Presenter: Greg Watson, ORNL
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: The Eclipse IDE is one of the most popular IDEs available, and its support for multiple languages, particularly C, C++, and Fortran, has made it the go-to IDE for scientific software development. Although an IDE like Eclipse can provide advanced development capabilities such as code recommendation and refactoring, these features can be difficult to utilize for complex code bases. Other challenges, such as ease of installation and use, reliability, and compatibility with existing development practices, also play a role. Ultimately, the usefulness of the tool is a tradeoff between the capabilities it provides and the challenges of incorporating it into the development workflow. This webinar will demonstrate some of the latest features available in Eclipse that are particularly useful for scientific application development, and examine how they can be used in a variety of different scenarios using realistic sample codes.
  10. Jupyter and HPC: Current State and Future Roadmap (2018-02-28)
    • Presenters: Matthias Bussonnier (UC Berkeley), Suhas Somnath (ORNL), and Shreyas Cholia (NERSC)
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: During the last few years, the Jupyter notebook has become one of the tools of choice for the data science and high-performance computing (HPC) communities. This webinar will provide an overview of why Jupyter is gaining traction in education, data science, and HPC, with emphasis on how notebooks can be used as interactive documents for exploration and reporting. We will present an overview of how Jupyter works and how the network protocol can be leveraged for both local single-machine and remote-cluster work. We will discuss the nuts and bolts of how Jupyter has been deployed at NERSC as a case study in implementing Jupyter in an HPC environment. This work involves learning the Jupyter ecosystem in order to take advantage of its powerful abstractions and develop custom infrastructure that satisfies policies and user needs.
      The webinar will show, as a use case, how Jupyter notebooks have transformed data discovery, visualization, and interactive analysis for the scanning probe and electron microscopy communities at Oak Ridge National Laboratory. It will also show how notebooks can seamlessly accommodate measurements from a wide variety of instruments through Pycroscopy, a framework for instrument-agnostic data storage and analysis.
  11. Bringing Best Practices to a Long-Lived Production Code (2018-01-17)
    • Presenter: Charles R. Ferenbaugh, LANL
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: How can you introduce best software practices to a long-lived scientific production code, with a significant user base, that has “gotten along fine” for years doing things its own way? Often developers in such projects must struggle with overly complex code, inadequate documentation, little or no software process, and a “just write the code fast” culture; these are challenges to software quality that are generally not issues for new projects. In this presentation we’ll discuss some of the peculiar problems faced by long-lived scientific codes, and present a case study of how we’re dealing with these issues at LANL in the xRage radiation-hydrodynamics simulation code.

2017

  1. Better Scientific Software (https://bssw.io): So your code will see the future (2017-12-06)
    • Presenters: Mike Heroux, SNL, and Lois McInnes, ANL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Better Scientific Software (BSSw) is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This presentation will introduce a new website (https://bssw.io), a community exchange for scientific software improvement. We’re creating a clearinghouse to gather, discuss, and disseminate experiences, techniques, tools, and other resources to improve software productivity and sustainability for CSE. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. The backend enables collaborative content development using standard GitHub tools and processes. We need your contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. Join us!
  2. Managing Defects in HPC Software Development (2017-11-01)
    • Presenter: Tom Evans, ORNL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Software Quality Engineering (SQE) is often thought to be incompatible with methods research and scientific investigation. In reality, SQE is not only compatible with them but required in order to have confidence in the results of even basic scientific computations. This is especially true for parallel software. In this talk we will look at methods for performing software verification. Software verification is a method for removing defects at code construction time; these techniques can help in both algorithm and method development, as well as increase productivity.
  3. Barely Sufficient Project Management: A few techniques for improving your scientific software development efforts (2017-09-13)
    • Presenter: Michael A. Heroux, SNL
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Software development is an essential activity for many scientific teams. Modeling, simulation, and data analysis using team-developed software are increasingly valuable for scientific discovery and engineering. Many teams use informal, ad hoc approaches for managing their software efforts. While sufficient for many efforts, a modest emphasis on team models and processes can substantially improve developer productivity and software sustainability. In this presentation, we discuss several lightweight techniques for managing scientific software efforts. Using checklists, policy statements, and a Kanban workflow system, we emphasize techniques for managing the initiation and exit of team members, approaches to synthesizing team culture, and ways to improve communication within a team and with its stakeholders.
  4. Using the Roofline Model and Intel Advisor (2017-08-16)
    • Presenters: Sam Williams, LBNL, and Tuomas Koskela, NERSC
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: In this webinar, we will begin by introducing the Roofline Model and its “Cache-Aware” variant. We will proceed with some general guidelines and historical approaches to Roofline-based program analysis. Next, we will provide a short discussion of how changes in data locality and arithmetic intensity of two canonical benchmarks visually manifest in the context of these two Roofline formulations. Subsequently, we will provide two demonstrations of using Intel Advisor and the Roofline model within Intel Advisor. The first demo will be primarily instructive on how to compile, benchmark, and use Advisor. The second demo will focus on using variants of a simple benchmark to highlight changes in the Roofline model as well as providing correlation to Advisor’s other capabilities. We will conclude with a few comments on future directions.
  5. Intermediate Git (2017-07-12)
    • Presenter: Roscoe A. Bartlett, SNL
    • Archives: Slides (PDF) | Git Tutorial and Reference Collection (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: This presentation will emphasize intermediate-level tutorial and reference information about the Git version control (VC) system. This overview takes the view that the best way to learn to use Git effectively is to learn it as a data structure and a set of algorithms to manipulate that data structure. This perspective is important because the Git command-line interface is widely considered to be overly complex and confusing. For example, a Git command like ‘checkout’ can do wildly different things depending on the other arguments passed into the command or the state of the Git repository.  But Git is still the dominant VC system; many people consider that Git has won the version control wars due to its power and flexibility. 
  6. Python in HPC (2017-06-07)
    • Presenters: Rollin Thomas, NERSC; William Scullin, ANL; Matt Belhorn, ORNL
    • Archives: Slides (PDF) | Video (YouTube) | Q&A (PDF)
    • Description: Python’s powerful elegance has driven its adoption at HPC centers for job orchestration, visualization, exploratory data analysis, and even simulation. But maximizing performance from Python applications can be challenging, especially on supercomputing architectures. This webinar will explain those challenges with a practical emphasis on using Python at NERSC, ALCF, and OLCF. We will outline a variety of performance optimization strategies and tools for measuring and addressing performance problems, and establish best practices for Python in HPC.

2016

  1. Basic Performance Analysis and Optimization – An Ant Farm Approach (2016-08-09)
    • Presenter: Jack Deslippe, NERSC
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: How is optimizing HPC applications like an Ant Farm? Attend this presentation to find out. We’ll discuss the basic concepts around optimizing code for the HPC systems of today and tomorrow. These systems require codes to effectively exploit both parallelism between nodes and an ever-growing amount of parallelism on-node. We’ll discuss profiling strategies, tools (for profiling and debugging), and common issues with both internode communication and on-node parallelism. We will give an overview of traditional optimization areas in HPC applications, like parallel I/O and MPI strong and weak scaling, as well as topics relevant for modern GPU and many-core systems, like threading, SIMD/AVX, SIMT, and effectively using cache and memory hierarchies. The “Ant Farm” approach places a heavy emphasis on the roofline performance model and encourages users to understand the compute, bandwidth, and latency sensitivity of their applications and kernels through a series of easy-to-perform experiments and an easy-to-follow flow chart. Finally, we’ll discuss what we expect to change in the optimization process as we move towards exascale computers.
  2. An Introduction to High-Performance Parallel I/O (2016-07-28)
    • Presenter: Feiyi Wang, OLCF. Feiyi Wang received his Ph.D. in Computer Engineering from North Carolina State University (NCSU). Before he joined Oak Ridge National Laboratory as a research scientist, he worked at Cisco Systems and the Microelectronics Center of North Carolina (MCNC) as a lead developer and principal investigator for several DARPA-funded projects. His current research interests include high-performance storage systems, parallel I/O and file systems, fault tolerance and system simulation, and scientific data management and integration. Dr. Wang is a Joint Faculty Professor in the EECS Department at the University of Tennessee and a senior member of IEEE.
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Parallel data management is a complex problem in large-scale HPC environments. The HPC I/O stack can be viewed as a multi-layered cake that presents a high-level abstraction to scientists. While this abstraction shields users from many of the I/O system details, it is very hard to obtain good parallel I/O performance or functionality without understanding the end-to-end hierarchical I/O stack in today’s complex HPC environments. This talk will introduce the basic parallel I/O concepts and will provide guidelines on obtaining better I/O performance on large-scale parallel platforms.
  3. How the HPC Environment is Different from the Desktop (and Why) (2016-07-14)
    • Presenter: Katherine Riley, ALCF
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: High-performance computing has transformed how science and engineering research is conducted. Answering a question in 30 minutes that used to take 6 months can quickly change the way one asks questions. Large computing facilities provide access to some of the world’s largest computing, data, and network resources. Indeed, the DOE complex has the highest concentration of supercomputing capability in the world. However, by nature of their existence, making use of the largest computers in the world can be a challenging and unique task. This talk will discuss how supercomputers are unique and explain how that impacts their use.
  4. Testing and Documenting your Code (2016-06-15)
    • Presenter: Alicia Klinvex, SNL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Software verification and validation are needed for high-quality and reliable scientific codes. For software with moderate to long lifecycles, a strong automated testing regime is indispensable for continued reliability. Similarly, comprehensive and comprehensible documentation is vital for code maintenance and extensibility. This presentation will provide guidelines on testing and documentation that can help to ensure high-quality and long-lived HPC software. We will present methodologies, with examples, for developing tests and adopting regular automated testing. We also will provide guidelines for minimum, adequate, and good documentation practices depending on the available resources of the development team.
  5. Distributed Version Control and Continuous Integration Testing (2016-06-02)
    • Presenter: Jeff Johnson, LBNL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Recently, many tools and workflows have emerged in the software industry that have greatly enhanced the productivity of development teams. GitHub, a site that hosts projects in Git repositories, is a popular platform for open source and closed source projects. GitHub has encoded several best practices into easily followed procedures such as pull requests, which enrich the software engineering vocabularies of non-professionals and professionals alike. GitHub also provides integration with other services (for example, continuous integration services such as Travis CI, which allow code changes to be automatically tested before they are merged into a master development branch). This presentation will discuss how to set up a project on GitHub, illustrate the use of pull requests to incorporate code changes, and show how Travis CI can be used to boost confidence that changes will not break existing code.
  6. Developing, Configuring, Building, and Deploying HPC Software (2016-05-18)
    • Presenter: Barry Smith, ANL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: The process of developing HPC software requires consideration of issues in software design as well as practices that support the collaborative writing of well-structured code that is easy to maintain, extend, and support.  This presentation will provide an overview of development environments and how to configure, build, and deploy HPC software using some of the tools that are frequently used in the community.  We will also discuss ways in which these and other tools are best utilized by various categories of scientific software developers, ranging from small teams (for example, a faculty member and graduate students who are writing research code intended primarily for their own use) through moderate/large teams (for example, collaborating developers spread among multiple institutions who are writing publicly distributable code intended for use by others in the community).
  7. What All Codes Should Do: Overview of Best Practices in HPC Software Development (2016-05-04)
    • Presenter: Anshu Dubey, ANL
    • Archives: Slides (PDF) | Video (YouTube)
    • Description: Scientific code developers have increasingly been adopting software processes derived from the mainstream (non-scientific) community.  Software practices are typically adopted when continuing without them becomes impractical. However, many software best practices need modification and/or customization, partly because the codes are used for research and exploration, and partly because of the combined funding and sociological challenges. This presentation will describe the lifecycle of scientific software and important ways in which it differs from other software development.  We will provide a compilation of software engineering best practices that have generally been found to be useful by science communities, and we will provide guidelines for adoption of practices based on the size and the scope of the project.