Technology

The jaw-dropping engineering behind America’s next world-beating supercomputer

by Jeff Francis
September 12, 2017

The United States is falling behind. No, I’m not talking about quality of life, medical outcomes, or median incomes; I’m talking about what may be the 21st century’s version of the space race.

I’m talking about supercomputing. And it could be every bit as important to our future as space exploration was more than 50 years ago.

At the moment, the two fastest supercomputers in the world both reside in China. Supercomputer processing power is measured in floating point operations per second (FLOPS). The top supercomputers in the world are measured in petaflops, or 10¹⁵ FLOPS (otherwise known as a quadrillion FLOPS… or a million billion FLOPS; at some point, the ways we talk about large numbers get somewhat ridiculous, I know).

The Sunway TaihuLight has 10.6 million cores pumping out 93+ petaflops in the Linpack Benchmark test. That’s more than the next four supercomputers combined. In the 24-year history of the TOP500 rankings, the U.S. has only ever been outside the top 3 twice — once in November of 1996, when three Japanese supercomputers helmed the list, and right now.

The second fastest supercomputer is also in China; the Tianhe-2 cranks at 33.8 petaflops, while the top Swiss system, Piz Daint, comes in third at 19.6.

When it was built, the Department of Energy’s Titan supercomputer at Oak Ridge National Laboratory was tops in the world, reaching speeds of 17.6 petaflops via 560,000+ cores, but that was back in 2012. And as supercomputers go, it had serious staying power, remaining in the top three in the world until June of this year, five years on. But compared with the otherworldly speed the two Chinese systems boast, Titan is no longer the monster it once was.

And while this may seem like trivial, bragging-rights-based superlative chasing, controlling the world’s fastest supercomputers confers massive advantages on the country or entity that holds them. Answering the most complex, important and impactful questions in quantum physics, materials engineering, weather forecasting, genome sequencing, climate science, nuclear reaction processing, molecular modeling, and next generation cryptanalysis depends on the computing power and simulation abilities of the world’s most capable supercomputers. Having that power at your fingertips truly opens a world of knowledge and innovation for the country that controls it.

After slipping to a disappointing 4th on the list, the U.S. is poised to retake its place at the peak in the near future.

Enter Summit, Titan’s successor at Oak Ridge, due for completion next year. It’s slated to be 5-10 times more powerful than Titan, topping out somewhere near 200 petaflops (which, if you’ll recall, is more than twice as fast as China’s top machine today).
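To put those numbers side by side, here is a rough Python sketch that converts petaflops to raw FLOPS and works out how Summit's projected figure compares with the machines mentioned above. It uses only the figures quoted in this article; the function and variable names are my own, and Summit's 200 petaflops is a projection, not a measured Linpack result.

```python
# Rough comparison using only the figures quoted in this article.
# Names here are illustrative; Summit's number is a projection.

PETA = 10**15  # one petaflop = 10^15 floating point operations per second

# Linpack results as quoted in the article, in petaflops.
LINPACK_PETAFLOPS = {
    "Sunway TaihuLight": 93.0,  # the article says "93+", so treat as a floor
    "Tianhe-2": 33.8,
    "Piz Daint": 19.6,
    "Titan": 17.6,
}

SUMMIT_PROJECTED_PETAFLOPS = 200.0  # "somewhere near 200 petaflops"


def to_flops(petaflops):
    """Convert petaflops to plain floating point operations per second."""
    return petaflops * PETA


def speedup(new_petaflops, old_petaflops):
    """How many times faster the new figure is than the old one."""
    return new_petaflops / old_petaflops


if __name__ == "__main__":
    for name, petaflops in LINPACK_PETAFLOPS.items():
        ratio = speedup(SUMMIT_PROJECTED_PETAFLOPS, petaflops)
        print(f"Summit vs {name}: about {ratio:.1f}x")
```

On these inputs the ratio against TaihuLight comes out a little over 2, which matches the "more than twice as fast" claim. The ratio against Titan's 17.6 petaflops lands slightly above 11, so the 5-10x estimate is presumably stated against a different baseline; that is my reading, not something the sources spell out.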

Summit will follow the same general architecture as Titan: cabinet after cabinet of so-called ‘nodes.’ The interesting thing about Summit, though, is the build-out within each of those nodes. With Titan, each of the nearly 19,000 nodes consisted of one CPU and one GPU. Summit, on the other hand, will have nodes consisting of two CPUs working alongside six GPUs.

Stacking GPUs to turbocharge complex computing systems mirrors what we’re seeing from most of the private sector leaders in artificial intelligence, machine learning and neural networks, which shouldn’t be all that surprising. Not all supercomputers use this design, called heterogeneous architecture, but the boost in performance is remarkable.
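A quick back-of-the-envelope sketch shows how much work each node is being asked to do under the two designs. It assumes Titan's "nearly 19,000" nodes can be rounded to 19,000 and uses the 4,600-node figure this article gives for Summit; the `Machine` class is purely illustrative. Keep in mind it mixes a measured Linpack number (Titan) with a projection (Summit), so treat the per-node comparison as indicative only.

```python
# Back-of-the-envelope node arithmetic from the figures in this article.
# The Machine class is a hypothetical helper, not any official model.
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    nodes: int
    cpus_per_node: int
    gpus_per_node: int
    petaflops: float

    @property
    def total_cpus(self):
        return self.nodes * self.cpus_per_node

    @property
    def total_gpus(self):
        return self.nodes * self.gpus_per_node

    @property
    def teraflops_per_node(self):
        # 1 petaflop = 1,000 teraflops
        return self.petaflops * 1000 / self.nodes


# "Nearly 19,000" nodes rounded up to 19,000 (assumption for illustration).
titan = Machine("Titan", nodes=19_000, cpus_per_node=1, gpus_per_node=1, petaflops=17.6)
# Summit's 200 petaflops is a projection, not a benchmark result.
summit = Machine("Summit", nodes=4_600, cpus_per_node=2, gpus_per_node=6, petaflops=200.0)

if __name__ == "__main__":
    for m in (titan, summit):
        print(
            f"{m.name}: {m.total_cpus:,} CPUs, {m.total_gpus:,} GPUs, "
            f"about {m.teraflops_per_node:.1f} teraflops per node"
        )
```

The takeaway is that Summit gets there with roughly a quarter of the nodes but more GPUs overall, so each node has to deliver far more than a Titan node did.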

The other remarkable aspect of this project? Cooling the beast.

Nothing saps performance faster than heat — you overheat the machine, it breaks down, simple as that. Each of Summit’s 4,600 nodes has to be cooled individually given the immense heat each throws off. Jim Rogers, director for computing and facilities, said to Wired: “Every one of those nodes is using a cold plate technology, where we’re putting water through a cold plate that’s directly on top… So 70 percent of the heat that’s generated by this thing can be absorbed by that cold plate.”

Matt Simon, the article author, continues:

Curiously, this isn’t super-chilled water―it’s a comfortable 70 degrees Fahrenheit. Why? Because if you drop the temperature too much, you’ll form dew, which is a great way to ruin a supercomputer. “You have to have higher flow rates to carry the heat away,” Rogers says (we’re talking a max flow of nearly 8,000 gallons per minute), “but that tradeoff is good in terms of energy efficiency and operating cost.”
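The flow-rate tradeoff Rogers describes can be sketched with the standard relation for heat carried by a liquid: mass flow times specific heat times temperature rise. The Python below is a minimal illustration of that relation, not a model of Summit's actual cooling loop. The 8,000 gallons per minute comes from the quote; the temperature-rise values are hypothetical, since neither the article nor the quote says how much the water warms as it passes over the cold plates.

```python
# Minimal sketch of heat carried away by cooling water:
#   heat (W) = mass flow (kg/s) * specific heat (J/kg.K) * temperature rise (K)
# The temperature rise values used below are hypothetical.

LITERS_PER_US_GALLON = 3.785411784
WATER_DENSITY_KG_PER_L = 0.998          # approximate, near room temperature
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate


def heat_removed_megawatts(flow_gpm, temp_rise_kelvin):
    """Heat carried off by water at a given flow rate and temperature rise."""
    kg_per_second = flow_gpm * LITERS_PER_US_GALLON * WATER_DENSITY_KG_PER_L / 60.0
    watts = kg_per_second * WATER_SPECIFIC_HEAT_J_PER_KG_K * temp_rise_kelvin
    return watts / 1e6


def flow_needed_gpm(heat_megawatts, temp_rise_kelvin):
    """Flow rate required to carry a given heat load at a given temperature rise."""
    kg_per_second = heat_megawatts * 1e6 / (
        WATER_SPECIFIC_HEAT_J_PER_KG_K * temp_rise_kelvin
    )
    return kg_per_second * 60.0 / (WATER_DENSITY_KG_PER_L * LITERS_PER_US_GALLON)


if __name__ == "__main__":
    max_flow_gpm = 8000  # "nearly 8,000 gallons per minute" from the quote
    for rise in (2.5, 5.0, 10.0):  # hypothetical temperature rises, in kelvin
        mw = heat_removed_megawatts(max_flow_gpm, rise)
        print(f"{rise:>4} K rise at {max_flow_gpm} gpm carries about {mw:.1f} MW")
```

The relation is linear, so if the allowable temperature rise is halved, the flow has to double to carry the same heat. That is the general shape of the tradeoff: more pumping, presumably in exchange for not having to chill the water.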

Summit is an engineering masterpiece that will power some of the most exciting science, engineering, discovery and innovation in the world. Accomplishments like it are a testament to the human spirit and necessary investments in our collective future. And while 200 petaflops might not mean much as an abstraction, what it represents is sacrosanct — the primal quest for knowledge and understanding.
