Berkeley Lab Debuts Perlmutter, World’s Fastest AI Supercomputer

A ribbon-cutting ceremony held virtually at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) today marked the official launch of Perlmutter – aka NERSC-9 – the GPU-accelerated supercomputer built by HPE in partnership with Nvidia and AMD. The HPE Cray EX supercomputer harnesses 6,159 Nvidia A100 GPUs and ~1,500 AMD Milan CPUs to deliver nearly 3.8 exaflops of theoretical “AI performance” (see endnote) or about 60 petaflops of peak double-precision (standard FP64) HPC performance.

The system is named for Saul Perlmutter, an astrophysicist at Berkeley Lab who shared the 2011 Nobel Prize in Physics for his contributions to research showing that the expansion of the universe is accelerating. So it’s fitting that one of the initial use cases for the Perlmutter supercomputer will be in support of the Dark Energy Spectroscopic Instrument (DESI), which is probing the effect of dark energy on the universe’s expansion.

The Perlmutter system will help map the visible universe spanning 11 billion light years by processing data from DESI, which is capable of capturing as many as 5,000 galaxies in a single exposure.

In order to know where to point this costly instrument each night, researchers have to analyze the data from the night before. Perlmutter can analyze dozens of exposures fast enough to provide this feedback in time for the next nightly cycle.

In early benchmarking, NERSC researchers have reported up to 20X performance speedups using the GPUs, which they say will speed up their workflows from a matter of weeks or months down to hours.

Materials science is expected to see similar benefits, paving the way for advances in batteries and biofuels. Applications such as Quantum Espresso leverage Perlmutter’s traditional simulation and machine learning capabilities, enabling scientists to consider more atoms over a longer period of time.

“In the past it was impossible to do fully atomistic simulations of big systems like battery interfaces, but now scientists plan to use Perlmutter to do just that,” said Brandon Cook, an applications performance specialist at NERSC.

Nvidia reported that Quantum Espresso, BerkeleyGW and NWChem are all capable of leveraging Nvidia’s FP64 Tensor Cores, unlocking double the performance of the standard FP64 format: 19.5 teraflops versus 9.7 teraflops (peak theoretical) per GPU. (Nvidia reports that Perlmutter offers 120 petaflops of peak FP64 Tensor Core performance.)
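The system-wide figures follow from simple multiplication of the per-GPU peaks by the GPU count. The sketch below is only a back-of-the-envelope check using the numbers quoted in this article; it assumes the aggregate counts GPUs alone (no CPU contribution), and the function and variable names are illustrative.

```python
# Figures quoted in the article (peak theoretical, per A100 GPU).
NUM_GPUS = 6_159
FP64_TFLOPS_PER_GPU = 9.7          # standard FP64
FP64_TENSOR_TFLOPS_PER_GPU = 19.5  # FP64 Tensor Core


def aggregate_petaflops(num_gpus: int, tflops_per_gpu: float) -> float:
    """Return the peak aggregate in petaflops (1 PF = 1,000 TF).

    Assumption: only the GPUs are counted; CPU flops are ignored.
    """
    return num_gpus * tflops_per_gpu / 1_000


fp64_pf = aggregate_petaflops(NUM_GPUS, FP64_TFLOPS_PER_GPU)
fp64_tensor_pf = aggregate_petaflops(NUM_GPUS, FP64_TENSOR_TFLOPS_PER_GPU)

print(f"Standard FP64:    {fp64_pf:.1f} PF")
print(f"FP64 Tensor Core: {fp64_tensor_pf:.1f} PF")
```

The two results come out near 60 and 120 petaflops, in line with the rounded figures cited for Perlmutter.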

NERSC’s new “platform integrated storage” replaces NERSC’s previous burst buffer tier and disk-based scratch tier. Slide courtesy of Glenn Lockwood, SC20 presentation (link to HPCwire coverage)

The first phase of Perlmutter spans 12 GPU-accelerated Cray EX cabinets (aka “Shasta”) housing more than 1,500 nodes and 35 petabytes of all-flash parallel file system (HPE E1000). The Lustre filesystem will move data at a rate of more than 5 terabytes/sec, making it the fastest storage system of its kind, according to NERSC.

The Perlmutter system is direct liquid cooled and uses HPE’s Cray-developed Slingshot interconnect technology.

A second, CPU-only phase is planned for later this year. Phase 2 adds 12 CPU cabinets with more than 3,000 nodes, each equipped with two AMD Milan CPUs and 512GB of memory. The Phase 2 system also adds 20 more login nodes and four large memory nodes, according to NERSC.

Perlmutter is the successor to Cori (named in honor of Nobel Prize-winning biochemist Gerty Cori), which was also built as two partitions: the Phase 1 Intel Haswell-based “Data Partition” and the Phase 2 Intel Knights Landing (Xeon Phi) partition. Cori is the largest supercomputing system for open science based on KNL processors. NERSC will continue to operate Cori through at least 2022.

On the software side, Perlmutter users will have access to the standard NVIDIA HPC SDK toolkit, and support for OpenMP is coming through a joint development effort with NERSC.

Python programmers will be able to use RAPIDS, Nvidia’s open software suite for GPU-enabled data science.

Phase 1 cabinets were deployed over the last few months, but even before installation began in November 2020, the NERSC Exascale Science Applications Program (NESAP) was engaged in readiness activities in order to leverage the GPU nodes for simulation, data, and learning applications starting on day one. NERSC reports that these NESAP readiness teams will be the first to access the system. Support for Exascale Computing Project (ECP) software is also planned on the new system.

Perlmutter high-level architecture diagram

AI for Science

With its strong AI capabilities, Perlmutter ties into the DOE’s AI for Science focus area, an exascale-like initiative for advancing the use of AI in science.

“AI for science is a growth area at the U.S. Department of Energy, where proof of concepts are moving into production use cases in areas like particle physics, materials science and bioenergy,” said Wahid Bhimji, acting lead for NERSC’s data and analytics services group, in an Nvidia blog post.

“People are exploring larger and larger neural-network models and there’s a demand for access to more powerful resources, so Perlmutter with its A100 GPUs, all-flash file system and streaming data capabilities is well timed to meet this need for AI,” he added.

Presenting in a pre-recorded video as part of today’s virtual launch event, Nvidia CEO Jensen Huang underscored growing HPC and AI synergies.

“Perlmutter’s ability to fuse AI and high performance computing will lead to breakthroughs in a broad range of fields from materials science and quantum physics to climate projections, biological research and more,” Huang said.

Looking Ahead (to Quantum)

Planning is already underway for the follow-ons to Perlmutter, codenamed NERSC-10 and NERSC-11.

“Systems take years and years for us to design and deploy,” said NERSC Director Sudip Dosanjh during today’s virtual dedication ceremony.

“It’s pretty clear that we’ll have more heterogeneous systems as we enter the post-Moore’s Law era. We’re looking at a variety of types of accelerators. I don’t think that it’s likely that NERSC-10 will have a quantum accelerator, but NERSC-11 certainly might. Half the codes that run at NERSC solve some form of quantum mechanical problem, and that half of the workload could really benefit from a quantum accelerator.

“With NERSC-10, we’re really going to focus on end-to-end DOE Office of Science workflows, and hopefully enable new modes of scientific discovery through the integration of experiment, data analysis and simulation. And so not only do we want to make sure that the scientists can use AI to analyze the data, but we also want to use AI to manage the system to increase the reliability of the system and the energy efficiency of the system. And then we have a goal of using AI to reconfigure NERSC-10 to speed up workflows,” said Dosanjh.

Hello Perlmutter – Saul Perlmutter inaugurates Perlmutter in a live demo:

Note: Perlmutter’s “AI performance” is based on Nvidia’s half-precision numerical format (FP16 Tensor Core) with Nvidia’s sparsity feature enabled.
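As a rough sanity check on that figure, the sketch below multiplies the GPU count from this article by a per-GPU peak. The 624-teraflops value is an assumption drawn from Nvidia’s published A100 specification for FP16 Tensor Core throughput with sparsity; it does not appear in this article, and the variable names are illustrative.

```python
NUM_GPUS = 6_159  # GPU count quoted in the article

# ASSUMPTION (not from the article): Nvidia's published A100 peak for
# FP16 Tensor Core with structured sparsity, in teraflops per GPU.
FP16_SPARSE_TFLOPS_PER_GPU = 624.0

# 1 exaflop = 1,000,000 teraflops; GPUs only, CPUs ignored.
ai_exaflops = NUM_GPUS * FP16_SPARSE_TFLOPS_PER_GPU / 1_000_000

print(f"Theoretical peak 'AI performance': {ai_exaflops:.2f} EF")
```

Under that assumption the product lands a little above 3.8 exaflops, close to the headline number.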
