By Rob Farber, contributing writer for the Exascale Computing Project
Clacc is a software development effort funded by the PROTEAS-TUNE project of the US Exascale Computing Project (ECP) to develop production-quality OpenACC compiler support for Clang and the LLVM Compiler Infrastructure Project (LLVM). The Clacc project page states: “The OpenACC support in Clang and LLVM makes it easy to program GPUs and other accelerators in DOE applications and provides a popular compiler platform on which research and development for related optimizations and tools (e.g., static analyzers, debuggers, editor extensions) can be performed.” [i] OpenACC continues to be the second most popular programming model for GPUs on the ORNL Summit supercomputer.
Joel Denny, computer scientist at ORNL and member of the ECP software technology development tools team, stated: “When the Clacc project began, NVIDIA was the dominant OpenACC compiler provider. The Clacc project was initiated to bring a new, open source, production-quality OpenACC compiler option to the HPC and science communities.” Denny also noted that “DOE has given a strong push towards LLVM. It makes sense to use this ecosystem to support DOE and the OpenACC users.” Currently, the Clacc project is focused on functional completeness. Although compiler-based performance optimizations are not a current focus, preliminary benchmark results show that Clacc can deliver acceptable GPU performance.
OpenACC is a relatively new programming standard that was introduced in 2010 to provide a portable, directive-based programming model for the C, C++, and Fortran computer languages. The OpenACC standard, developed jointly by Cray, NVIDIA, and PGI, is intended to simplify the parallel programming of heterogeneous CPU/GPU systems. [iii] [iv] The OpenACC organization notes that one goal of OpenACC is to help the research and development communities advance science by expanding their accelerated and parallel computing skills. [v] The OpenACC support for the open source Clang and LLVM projects described below leverages the extensive efforts these projects have made in recent years to provide a production-quality, open source parallel compiler and runtime system in support of the OpenMP standard.
There is a natural synergy between the OpenACC and OpenMP compiler front ends and runtime systems. There are differences, of course, but by and large, both OpenMP and OpenACC are directive-based standards. Each defines programming directives, called pragmas in C and C++, that programmers use to create applications that exploit the parallel capabilities of multicore CPUs and massively parallel accelerators such as GPUs. The Clacc project highlights the generality of the Clang and LLVM projects, as compiler writers can leverage the work of others when they support one or both of the OpenACC and OpenMP programming standards.
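To make the idea of a directive concrete, here is a minimal C sketch that is not taken from the Clacc project. It annotates the same loop once with an OpenACC pragma and once with an OpenMP pragma. The function names saxpy_acc and saxpy_omp are illustrative, and the data clauses shown are one reasonable choice among several. A compiler built without OpenACC or OpenMP support ignores the pragmas and runs the loops serially, so the code stays portable either way.

```c
#include <assert.h>

/* Computes y[i] = a * x[i] + y[i], annotated with an OpenACC directive. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x && y);
    /* Data clauses: x is only read on the device, y is read and written. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* The same computation, annotated with an OpenMP offload directive. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x && y);
    /* map(to:) and map(tofrom:) play the role of copyin and copy above. */
    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

In both versions the loop body is untouched; only the single pragma line differs, which is what makes a translation between the two standards plausible.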
Use of LLVM compiler and toolchain technologies
The LLVM Compiler Infrastructure Project (LLVM) is an open source collection of compiler and toolchain technologies. Doug Kothe, director of the US Department of Energy’s (DOE) Exascale Computing Project, believes LLVM compiler technology will become the hub for compiler development and advancement by vendors and communities.
LLVM is becoming so widespread that Johannes Doerfert, a researcher at Argonne National Laboratory, states, “A lot of companies and organizations work together on LLVM, which is one of the many advantages of using the LLVM compiler infrastructure. LLVM-based compilers from the system manufacturers are widespread in the entire HPC community.” Doerfert adds, “Collaboration improvements and LLVM improvements benefit the entire HPC community, including system builders, software suppliers and end users.”
The PROTEAS-TUNE project complements and works together with the SOLLVE project. [vi],[vii] SOLLVE is an ECP effort focused on standardizing HPC functionality in OpenMP and developing an efficient, portable, and complete implementation in the LLVM compiler framework. [viii]
These projects are possible because of the highly permissive terms of the LLVM license, which allows the HPC community to create and share software built on the LLVM compiler infrastructure. This includes profilers, parallel compilers, debuggers, domain-specific languages (DSLs), and new programming models. It also means the HPC community doesn’t have to go through the process of filing a bug report with a compiler or hardware vendor and then waiting and hoping for a fix. Instead, HPC developers can write fixes themselves and contribute them to the open source code base.
Use of the Clang compiler front-end to support OpenACC
Clang is a compiler front end for the C family of computer languages, which includes both C and C++. Clang fully supports the OpenMP 4.5 standard. The ECP-funded SOLLVE project is working to provide the functionality of the OpenMP 5.1 specification for LLVM-based compilers. Like OpenACC, the OpenMP 5.1 specification is intended to strengthen and refine the features that support accelerators such as GPUs. [ix]
Clacc offers two paths to an executable binary file
An essential feature of the Clacc design is the translation of OpenACC into OpenMP, which takes advantage of the extensive effort that has gone into the LLVM OpenMP compiler and runtime support in recent years. The two paths for generating an executable file are shown in Figure 1. Depending on what the programmer wants, Clacc can follow a direct path from the source code to the LLVM intermediate representation (CodeGen) or be used as a source-to-source translator (RewriteOpenACC) that converts the OpenACC code into OpenMP source code.
Both modes have advantages:
- CodeGen: When this option is selected, Clacc translates the OpenACC source directly into an executable binary file. This is similar to the behavior of the NVIDIA and GCC compilers. The programmer only sees the creation of a binary file, so they are not exposed to OpenMP, which Clacc uses internally as an intermediate representation.
- RewriteOpenACC: In this mode, Clacc translates the OpenACC source into OpenMP source, which is then compiled with an OpenMP compiler to generate the executable file. This mode has several possible uses. It is intended for targeting OpenMP compilers and tools other than upstream Clang. It is also intended for porting applications. To better suit these use cases, the source-to-source translation avoids the preprocessor expansion and the loss of comments and formatting that sometimes occur when translating C-like languages.
According to Denny, Clacc currently takes a straightforward approach when mapping OpenACC code to OpenMP, be it for Clacc’s internal use or for later compilation by an OpenMP compiler. He explains that the three levels of OpenACC parallelism (gang, worker, and vector lane) are mapped to their OpenMP equivalents (teams, threads, and SIMD lanes). Denny notes that alternative approaches are also being explored.
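The mapping Denny describes can be sketched by hand as follows. This is an illustration under stated assumptions, not actual Clacc output: the function names scale3d_acc and scale3d_omp are hypothetical, and the OpenMP directives chosen for each level (target teams distribute, parallel for, and simd) are one plausible way to express teams, threads, and SIMD lanes.

```c
#include <assert.h>

/* OpenACC version: one loop per level of parallelism. */
void scale3d_acc(int nz, int ny, int nx, float s, float *v)
{
    assert(nz >= 0 && ny >= 0 && nx >= 0 && v);
    /* Outer loop: iterations are distributed across gangs. */
    #pragma acc parallel loop gang copy(v[0:nz*ny*nx])
    for (int k = 0; k < nz; ++k) {
        /* Middle loop: iterations are shared among workers in a gang. */
        #pragma acc loop worker
        for (int j = 0; j < ny; ++j) {
            /* Inner loop: iterations are spread over vector lanes. */
            #pragma acc loop vector
            for (int i = 0; i < nx; ++i) {
                v[(k * ny + j) * nx + i] *= s;
            }
        }
    }
}

/* Hand-written OpenMP counterpart: gang -> teams, worker -> threads,
   vector lane -> SIMD lane. */
void scale3d_omp(int nz, int ny, int nx, float s, float *v)
{
    assert(nz >= 0 && ny >= 0 && nx >= 0 && v);
    /* Outer loop: iterations are distributed across teams. */
    #pragma omp target teams distribute map(tofrom: v[0:nz*ny*nx])
    for (int k = 0; k < nz; ++k) {
        /* Middle loop: iterations are shared among threads in a team. */
        #pragma omp parallel for
        for (int j = 0; j < ny; ++j) {
            /* Inner loop: iterations are executed in SIMD lanes. */
            #pragma omp simd
            for (int i = 0; i < nx; ++i) {
                v[(k * ny + j) * nx + i] *= s;
            }
        }
    }
}
```

The two functions compute the same result. When built without OpenACC or OpenMP enabled, both fall back to the plain serial loop nest.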
Use of profiling infrastructure
Profiling support is a prerequisite for any parallel programming model, especially one that is meant to support a variety of hardware platforms or to be used across the DOE complex.
Clacc supports the OpenACC Profiling Interface, a key component of the OpenACC specification, which standardizes an interface on which profiling tools and libraries can depend across all OpenACC implementations. [x] Profiling information can be collected and viewed using powerful tools such as the Tuning and Analysis Utilities (TAU) performance system. [xi] TAU is also funded by ECP’s PROTEAS-TUNE project.
In a 2020 IEEE paper, [xii] the Clacc and TAU teams explain Clacc’s profiling support for OpenACC in more detail and present sample visualizations for several SPEC ACCEL OpenACC benchmarks using the TAU performance tool. The paper reports that the performance overhead involved is negligible.
With TAU, HPC programmers can measure performance and identify bottlenecks through a single profiling tool that works well across a wide range of very different HPC systems, architectures, languages, and software / hardware execution models.
TAU can be installed through Spack and is distributed in the Extreme-Scale Scientific Software Stack (E4S). Simply install TAU with the desired target back end: CUDA, ROCm, Level Zero (L0) for oneAPI, and so on.
The directive-based OpenACC programming model offers a simple yet powerful approach to programming accelerators without significant programming effort. The Clacc project is working to provide an open source OpenACC compiler and source-to-source translation facility for the HPC and scientific communities.
Rob Farber is a global technology consultant and writer with an extensive background in HPC and machine learning technology development for use in national laboratories and commercial organizations. Rob can be reached at [email protected]
[ii] As of March 1st, 2021
[xii] OpenACC profiling support for Clang and LLVM with Clacc and TAU, Camille Coti, Joel E. Denny, Kevin Huck, Seyong Lee, Allen D. Malony, Sameer Shende and Jeffrey S. Vetter, ProTools, GA, USA (November 2020)