[SPONSORED CONTENT] How many researchers can say that they performed their scientific work not only on the AMD-powered Frontier supercomputer, the world’s #1 HPC system and the first exascale-class machine, but also on Fugaku, Summit and Perlmutter, the second-, fifth- and eighth-ranked HPC systems worldwide (see TOP500 list)?
But such is the case with an international group of researchers working on particle-in-cell simulations who have developed code that won this year’s Association for Computing Machinery (ACM) Gordon Bell Prize (see related news item) for outstanding achievement in HPC.
Another team of researchers from the US Department of Energy’s Oak Ridge National Laboratory (home of the Oak Ridge Leadership Computing Facility, which hosts Frontier), the Georgia Institute of Technology, and the University of California San Francisco was nominated for the prize for using Frontier and machine-learning-based data mining techniques to search millions of medical and scientific papers and publications for overlooked potential treatments for illnesses and diseases.
Frontier, powered by 3rd Gen AMD EPYC processors and AMD Instinct MI250X accelerators and built on the HPE Cray EX supercomputing architecture, was shipped to the US Department of Energy’s Oak Ridge National Laboratory late last year. It was then tested and tuned for months, and became the first system to exceed the exascale threshold (one billion billion calculations per second) in time for last spring’s twice-yearly TOP500 list.
Frontier tuning is still ongoing and full user readiness is expected early next year. But the two Gordon Bell nominees show that important scientific research on the system is already underway.
First, let’s look at the award-winning team from the DOE’s Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, and the French Alternative Energies and Atomic Energy Commission (CEA), which developed a new particle accelerator simulation code called WarpX. The software is the first mesh-refined, particle-in-cell (MR-PIC) code for kinetic plasma simulations optimized for parallel computing on Frontier, Fugaku, Summit and Perlmutter.
“The MR-PIC code enabled 3D simulations of laser-matter interactions…” the researchers explained in the abstract of an article about their work, “that were previously beyond the reach of standard codes. These simulations have helped to overcome a major limitation of compact laser-based electron accelerators, which are promising candidates for next-generation high-energy physics experiments and ultra-high-dose-rate FLASH radiotherapy.”
Accelerators smash accelerated particles into target materials to enable the study of the properties of matter. Jean-Luc Vay, senior scientist and project leader at Berkeley, said WarpX was developed to simulate plasma-based accelerators used in a range of fields, including cancer treatments and semiconductor manufacturing. The goal is to study whether small, relatively inexpensive particle accelerators can do some of the work of large accelerators such as the 16-mile-long Large Hadron Collider in Switzerland or the Spallation Neutron Source at Oak Ridge, a mammoth project that took years to construct.
We spoke to four of the researchers about their work on Frontier.
On the software front, Berkeley Lab’s Axel Huebl said that researchers worked on an early iteration of Frontier under the auspices of the Exascale Computing Project (ECP) to tune the system for full WarpX runs. This meant that by July, the team was able to run the code on around 8,500 of Frontier’s 9,400 nodes.
“We went through the ECP to provide feedback to the vendors, and once Frontier was ready, we were ready to go,” Huebl said.
As for performance, he said, “We ran (WarpX) on every major machine in the world we could get our hands on, and Frontier is more powerful than any predecessor we’ve had before. We were very grateful that we could use it. In terms of performance, we needed large 3D simulations with very fine resolution for the physics at play… At 8,500 nodes, Frontier performed really efficiently, which means we can now run much larger scientific cases.”
The second team of Frontier users nominated for the Gordon Bell Prize – for advanced data mining of medical information – delivered another tour de force for the new supercomputer. In fact, the researchers said Frontier exceeded one exaFLOPS while running the application.
A major problem in medicine is that the volume of published research exceeds any human’s capacity to absorb it. So the research team, led by Ramakrishnan Kannan, group leader for discrete algorithms at ORNL, began working on a graph-theoretic approach to data mining. The goal: search scientific articles, especially biomedical literature, to discover unknown connections between concepts. An example is an environmental agency discovering a previously unnoticed link between a toxin and a disease. Another is a pharmaceutical lab discovering a previously overlooked drug candidate for a disease.
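To make the graph idea concrete, here is a minimal Python sketch of searching a concept graph for an indirect connection. It is an illustration only, not the ORNL team’s code: the toy graph, its placeholder concept names (drug_A, protein_X and so on) and the helpers `shortest_path` and `is_indirect_link` are all hypothetical, and a plain breadth-first search stands in for whatever method the researchers actually used.

```python
from collections import deque

# Hypothetical toy knowledge graph. Each key is a concept; each value is the
# set of concepts it is directly linked to. All names are placeholders.
GRAPH = {
    "drug_A": {"protein_X"},
    "protein_X": {"drug_A", "pathway_Y"},
    "pathway_Y": {"protein_X", "disease_B"},
    "disease_B": {"pathway_Y"},
    "toxin_C": set(),
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return the shortest chain of concepts, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # sorted() only makes the visiting order deterministic.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def is_indirect_link(graph, a, b):
    """True when a and b are connected only through intermediate concepts."""
    path = shortest_path(graph, a, b)
    return path is not None and len(path) > 2


print(shortest_path(GRAPH, "drug_A", "disease_B"))
```

In this toy graph, drug_A and disease_B share no direct edge, yet a chain through protein_X and pathway_Y connects them; that kind of indirect link is what a literature search of this sort would surface as a candidate for closer study.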
Efforts began by focusing on COVID-19 research.
“We started with the CORD-19 dataset, which contains publications and preprints on COVID-19 and other coronaviruses like SARS and MERS,” Kannan said. “This data set contains over 1 million papers, which alone represents a very large amount of knowledge. But then we combine it with the PubMed dataset, which contains 34 million articles on life science and biomedical topics, some of which date back to 1809. In the end, we were faced with the problem of examining over 300,000 concepts, connected by over 100 million relationships, a far larger corpus of data than any single human being can explore.”
Kannan said one goal of the project is to improve existing medical knowledge graphs like SPOKE (Scalable PrecisiOn Medicine Knowledge Engine), operated by the University of California San Francisco. SPOKE combines data from over 30 sources and contains 3 million nodes and 15 million edges.
“The scientific question is whether we can automatically find links that haven’t been discovered yet,” Kannan said. “The answer is yes. We discovered 181 paths that partially existed in SPOKE, and 159 paths that didn’t exist at all in SPOKE. Each new path discovered means that the algorithm has identified an important connection that hasn’t been explicitly captured from human-curated sources. Each new path can be a new connection between a symptom and a disease or point to a new drug candidate and can therefore have life-saving consequences.”
Kannan said that Frontier performed superbly – the application running on the system exceeded the performance of one exaFLOPS.
“The highest performance we know of for this type of problem was achieved in 2020 with the Summit system at ORNL — 136 petaFLOPS,” he said, “which means we’ve improved performance by more than sevenfold. The 7x improvement gives us the ability to complete in days the amount of work that used to take weeks. This is also a historic milestone for a real application in graph analytics.”
He added that Frontier’s power is hard to grasp.
“Crossing an exaFLOPS still feels a little surreal,” Kannan said, “even more so crossing an exaFLOPS when you’re actually doing useful science. A billion billion operations per second is beyond human imagination. If all 8 billion people on Earth performed one calculation per second, it would take them over 4 years to do the amount of work that Frontier can do in one second. But that’s exactly why we can find all the shortest paths in a graph with 30 million nodes and 120 million edges.”
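Finding all the shortest paths in a graph is the classic all-pairs shortest-path problem. The Python sketch below uses the textbook Floyd-Warshall algorithm on a hypothetical four-node graph. It is an assumption chosen for illustration: the article does not say which algorithm the team ran, and this is not their GPU kernel. The function name `floyd_warshall` and the sample edge list are likewise made up for the example.

```python
import math


def floyd_warshall(n, edges):
    """Return an n x n matrix of shortest distances for an undirected graph.

    edges is a list of (u, v, weight) tuples with node ids in range(n).
    """
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, w in edges:
        # Keep the lightest edge if the same pair appears more than once.
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node k in turn to act as an intermediate stop.
    for k in range(n):
        for i in range(n):
            dik = dist[i][k]
            for j in range(n):
                if dik + dist[k][j] < dist[i][j]:
                    dist[i][j] = dik + dist[k][j]
    return dist


# Hypothetical sample graph: the direct 0-2 edge is longer than going via 1.
EDGES = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0), (2, 3, 1.0)]
DIST = floyd_warshall(4, EDGES)
print(DIST[0][3])  # shortest route 0 -> 1 -> 2 -> 3
```

With n nodes, the triple loop performs on the order of n³ distance updates, which gives a rough sense of why a dense all-pairs computation on a graph with tens of millions of nodes calls for a machine of Frontier’s scale.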
Kannan was also impressed with Frontier’s ease of use. “It still surprises me at times how similar the x86 software environment on Frontier is to my Threadripper plus Radeon VII Linux box. Many mainstream development tools are readily available on Frontier because it is an x86 system running Linux.”
Part of this can be attributed to AMD’s ROCm software stack for programming GPUs, which is completely open source. “On a couple of occasions we’ve been asked for a version of a library with a secret sauce and ended up directing people to GitHub because it’s all openly out there. Notably, the code we developed for the Gordon Bell submission does not contain any secret ingredient. The GPU kernel that made the exaFLOPS possible is written in pure C++, no assembly, no proprietary extensions of any kind. The Clang compiler in the public release of ROCm has done its job. We hope it leaves no doubt as to the strength of AMD’s commitment to open source software.”